by Antonio Mele
April 2016
© by A. Mele
Preface
These Lectures on Financial Economics are based on notes I wrote in support of advanced
undergraduate and graduate lectures in financial economics, macroeconomic dynamics, financial
econometrics and financial engineering.
Part I, Foundations, develops the fundamental tools of analysis used in Part II and Part III.
These tools span such disparate topics as classical portfolio selection, dynamic consumption- and
production-based asset pricing, in both discrete and continuous time, the intricacies underlying
incomplete markets and other market imperfections and, finally, econometric tools comprising
maximum likelihood, methods of moments, and the relatively more modern simulation-based
inference methods.
Part II, Applied asset pricing theory, is about identifying the main empirical facts in finance
and the challenges they pose to financial economists: from excess price volatility and countercyclical stock market volatility, to cross-sectional puzzles such as the value premium. This
second part reviews the main models aiming to take these puzzles on board.
Part III, Asset pricing and reality, aims to do just this: to use the main tools in Part I and the
lessons drawn from Part II, so as to cope with the main challenges occurring in actual capital
markets, arising from option pricing and trading, interest rate modeling and credit risk and
their associated derivatives. In a sense, Part II is about the big puzzles we face in fundamental
research, while Part III is about how to live within our current and certainly unsatisfactory
paradigms, so as to cope with demand for intellectual expertise.
These notes are still underground. Economic motivation and intuition are not always developed
as they would deserve, some derivations are inelegant, and sometimes, the English is a bit
informal. Moreover, I still have to include material on monetary models of asset prices, theories
of the nominal and the real term structure of interest rates, bubbles, asset price implications of
overlapping generations models, or financial frictions and their interconnections with business
cycle developments. Finally, I need to include more extensive surveys for each topic I cover,
especially in Chapters 1, 3, 5, 6, and 10. Of the 13 Chapters I have already drafted, I believe
Chapters 1 and 6 are those in need of the most serious revamp. I plan to revise these notes to
fill all these gaps. Meanwhile, any comments on this version are more than welcome.
Antonio Mele
April 2016
Antonio Mele does not accept any liability for any losses related to the use of the
models, data, and methods described or developed in these lectures.
Contents
Foundations
1.6 Stochastic dominance
1.7 Appendix 1: Analytical details relating to portfolio choice
1.7.1 The primal program
1.7.2 The dual program
1.8 Appendix 2: The market portfolio
1.8.1 The tangent portfolio is the market portfolio
1.8.2 Tangency condition
1.9 Appendix 3: An alternative derivation of the SML
1.10 Appendix 4: Demand for money and liquidity traps
References
References
3 Infinite horizon economies
3.1 Introduction
3.2 Consumption-based asset evaluation
3.2.1 Recursive plans: introduction
3.2.2 Asset pricing: the marginalist argument
3.2.3 Intertemporal elasticity of substitution
3.2.4 Lucas' model
3.3 Production: foundational issues
3.3.1 Decentralized economy
3.3.2 The social planner solution
3.3.3 Dynamics
3.3.4 Stochastic economies
3.4 Production-based asset pricing
3.4.1 Firms
3.4.2 Consumers
3.4.3 Equilibrium
3.5 Money, production and asset prices in overlapping generations models
3.5.1 Introduction: endowment economies
3.5.2 Diamond's model
3.5.3 Money
3.5.4 Money in a model with real shocks
3.6 Optimality and bubbles
3.6.1 Economies with production
3.6.2 Over-accumulation of capital
3.6.3 Money
3.7 Appendix 1: Finite difference equations, with economic applications
3.8 Appendix 2: Neoclassic growth in continuous time
3.8.1 Convergence from discrete time
3.8.2 The model
3.9 Appendix 3: Notes on optimization of continuous time systems
References
4 Continuous time models
4.1 Introduction
4.2 An introduction to no-arbitrage and equilibrium
4.2.1 Time
4.2.2 The origins: Black & Scholes
4.2.3 Asset prices as Feynman-Kac representations
4.2.4 The Girsanov theorem
4.2.5 The APT in continuous time
4.2.6 Example: no-arbitrage in Lucas tree
4.3 Distortions and numeraires
4.3.1 Leading example: consumption-based probabilities
4.3.2 Numeraire pricing
4.4 Martingales and arbitrage
4.4.1 The information framework
4.4.2 Viability
4.4.3 Market completeness
4.5 Equilibrium with a representative agent
4.5.1 Merton's approach: dynamic programming
4.5.2 Martingale methods
4.5.3 Continuous time Consumption-CAPM
4.6 Partial hedging in incomplete markets: introduction
4.7 Inaction: the economics of American options
4.7.1 Early exercise premiums: an introductory example
4.7.2 Gambles and securities again
4.7.3 Real options theory
4.7.4 Perpetual puts
4.7.5 Perpetual calls
4.8 Further topics on real options and controlled Brownian motions
4.8.1 Irreversible investments and the decision to invest
4.8.2 A model of determination of exchange rates in target zones
4.8.3 Liquidity constraints and optimal dividend policy
4.9 Portfolio constraints
4.9.1 Technical background
4.9.2 Artificial markets
4.10 Jumps
4.10.1 Poisson jumps
4.10.2 Interpretation
4.10.3 Properties and related distributions
4.10.4 Asset pricing implications
4.10.5 An option pricing formula
4.11 Continuous time Markov chains
4.12 Appendix 1: An introduction to stochastic calculus for finance
4.12.1 Stochastic integrals
4.12.2 Stochastic differential equations
4.13 Appendix 2: Self-financed strategies, from discrete to continuous time
4.13.1 The basic dynamics
4.13.2 Models with final consumption only
4.14 Appendix 3: Proof of selected results
4.14.1 Proof of Theorem 4.3
4.14.2 Proof of Eq. (4.82)
4.14.3 Walras's consistency tests
4.15 Appendix 4: The Green's function
4.15.1 Setup
4.15.2 The PDE connection
4.16 Appendix 5: Portfolio constraints
4.17 Appendix 6: Topics on jumps
4.17.1 The Radon-Nikodym derivative
4.17.2 Arbitrage restrictions
4.17.3 State price density: introduction
4.17.4 State price density: general case
References
II
7.12 Appendix 6: Stochastic dominance beyond Rothschild and Stiglitz
7.12.1 Dynamic stochastic dominance
7.12.2 Proof of Theorem 7.1
7.13 Appendix 7: Dynamics of habit in Campbell and Cochrane (1999)
7.14 Appendix 8: An algorithm to simulate discrete-time pricing models
7.15 Appendix 9: Heuristic details of learning in continuous time
7.16 Appendix 10: Linear regime-switching economies
7.17 Appendix 11: Bond price convexity revisited
References
8 Macrofinance
8.1 Introduction
8.2 Non-expected utility
8.2.1 Recursive formulations
8.2.2 Testable restrictions
8.2.3 Risk premiums and interest rates
8.2.4 Campbell-Shiller approximation
8.2.5 Risks for the long-run
8.3 Heterogeneous agents and "catching up with the Joneses"
8.4 Idiosyncratic risk
8.4.1 A static model
8.4.2 Self-insurance and persistence of idiosyncratic shocks
8.4.3 A model with countercyclical income inequality
8.5 Incomplete markets with homogeneous and heterogeneous agents
8.5.1 Idiosyncratic shocks unrelated to aggregate risk
8.5.2 A two-agents economy
8.6 Disagreement and learning
8.6.1 Learning with multiple signals
8.6.2 Overconfidence and bubbles
8.6.3 General equilibrium without frictions
8.7 Coping with Knightian uncertainty
8.7.1 Prelude
8.7.2 Uncertainty aversion and Ellsberg paradox
8.7.3 Portfolio selection and market participation
8.7.4 A model of multiple likelihoods
8.8 Production
8.9 Government spending and asset prices
8.10 Leverage and volatility
8.10.1 Primitives
8.10.2 Equity volatility: a decomposition formula
8.10.3 Bankruptcy
8.11 Multiple trees and the cross-section of asset returns
8.12 The term-structure of interest rates
III
10.11 Appendix 1: The original arguments of Black & Scholes
10.12 Appendix 2: Black (1976)
10.13 Appendix 3: Stochastic volatility
10.13.1 Hull & White equation
10.13.2 Extensions
10.13.3 Smile analytics
10.14 Appendix 4: Local volatility
10.15 Appendix 5: Variance contracts
10.16 Appendix 6: Skewness contracts
References
11.7.1 Definitions and rationale
11.7.2 Callable bonds
11.7.3 Convertible bonds
11.8 Appendix 1: Bootstrapping and no-arbitrage restrictions
11.9 Appendix 2: Proof of Eq. (11.17)
11.10 Appendix 2: The Ho and Lee price representation
References
12 Interest rates
12.1 Introduction
12.2 Bond prices and interest rates
12.2.1 A first representation of bond prices
12.2.2 Forward rates
12.2.3 A second representation of bond prices
12.3 Stylized facts
12.3.1 The expectation hypothesis
12.3.2 Bond returns predictability
12.3.3 The yield curve and the business cycle
12.3.4 Additional stylized facts about the US yield curve
12.3.5 Common factors affecting the yield curve
12.4 Models of the short-term rate: Introduction
12.4.1 Models versus representations
12.4.2 The bond pricing equation
12.4.3 Stochastic duration
12.4.4 Some famous models
12.4.5 The Monetary Experiment and interest rate volatility
12.4.6 Short-term rates as jump-diffusion processes
12.5 Multifactor models of the short-term rate
12.5.1 Stochastic volatility
12.5.2 Three-factor models
12.5.3 Affine and quadratic term-structure models
12.5.4 Unspanned stochastic volatility
12.5.5 Topics regarding estimation and trading strategies
12.6 No-arbitrage models: early formulations
12.6.1 Fitting the yield-curve, perfectly
12.6.2 Ho & Lee
12.6.3 Hull & White
12.7 The Heath-Jarrow-Morton framework
12.7.1 Framework
12.7.2 The model
12.7.3 The dynamics of the short-term rate
12.7.4 Embedding
12.7.5 Stochastic string shocks models
13.5.2 Backtesting
13.5.3 Stress testing
13.5.4 Credit risk and VaR
13.5.5 Expected shortfall and measures of systemic risk
13.6 Procyclicality, credit crunches and quantitative easing
13.6.1 Regulatory framework
13.6.2 The 2007 subprime crisis
13.6.3 Top tier capital ratio targets and endogenous volatility
13.6.4 Credit crunches and quantitative easing
13.7 Appendix 1: Present values contingent on future bankruptcy
13.8 Appendix 2: Proof of selected results
13.9 Appendix 3: Transition probability matrices and pricing
13.10 Appendix 4: Bond spreads in markets with stochastic default intensity
13.11 Appendix 6: Conditional probabilities of survival
13.12 Appendix 7: Details regarding CDS index swaps and swaptions
13.13 Appendix 8: Modeling correlation with copulae functions
13.14 Appendix 9: Details on CDO pricing with imperfect correlation
References
Part I
Foundations
1 The classic capital asset pricing model
1.1 Introduction
An investor is considering investing in a portfolio of securities. How would he choose the proportion of each asset based on his attitude towards risk? What are the asset pricing implications
of these portfolio choices? Are these choices consistent with absence of arbitrage opportunities?
This chapter deals with these issues in the context of the static market idealized in the first
contributions that would give birth to financial economics. The next section deals with portfolio
selection when our investor maximizes a mean-variance criterion, as in the seminal approach of
Markowitz (1952). We shall see that optimal portfolio choices like these lead to a notion of a
market portfolio as well as a first theory of asset prices, known as the Capital Asset Pricing
Model, the celebrated CAPM (see Section 1.3). The CAPM predicts that each asset's expected
excess return is proportional to the expected excess return on the market portfolio. It is, of course, a quite
coarse description of asset markets. Section 1.4 develops the Arbitrage Pricing Theory, or
APT, model. The APT provides refinements of the CAPM, in that it predicts that each asset's
expected return relates to a number of factors. [In progress]
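Next, we define $r \equiv x_0/S_0 - 1$ as the safe interest rate; $\tilde b = [\tilde b_1 \cdots \tilde b_m]^\top$, where $\tilde b_i \equiv x_i/S_i - 1$ is the rate of
return on the $i$-th asset; and $b \equiv E(\tilde b)$, the vector of the expected returns on the risky assets.
Here $S_i$ and $x_i$ denote the price and the payoff of asset $i$, and $\theta_i$ the number of units held, with $i = 0$ indexing the safe asset.
Finally, we let $\pi = [\pi_1 \cdots \pi_m]^\top$, where $\pi_i \equiv \theta_i S_i$ is the wealth invested in the $i$-th asset. We
have

$$w^+ = x_0\theta_0 + \sum_{i=1}^m x_i\theta_i = R\pi_0 + \sum_{i=1}^m (\tilde b_i + 1)\pi_i \quad\text{and}\quad w = \pi_0 + \sum_{i=1}^m \pi_i, \tag{1.1}$$

where $R \equiv 1 + r$ and $\pi_0 \equiv \theta_0 S_0$. Combining the two expressions for $w^+$ and $w$ leaves:

$$w^+ = \pi^\top(b - 1_m r) + Rw + \pi^\top(\tilde b - b) = \pi^\top(b - 1_m r) + Rw + \pi^\top a\tilde\epsilon, \tag{1.2}$$

where $1_m$ is an $m$-dimensional vector of ones, and $\tilde b - b = a\tilde\epsilon$, with $a$ a volatility matrix and $\tilde\epsilon$ a random vector with zero mean and variance-covariance matrix equal to the identity matrix. Let $\Sigma \equiv aa^\top$. The expected value and the variance of terminal wealth are

$$E[w^+(\pi)] = \pi^\top(b - 1_m r) + Rw \quad\text{and}\quad \mathrm{var}[w^+(\pi)] = \pi^\top\Sigma\pi. \tag{1.3}$$

Let $\sigma_i^2 \equiv \Sigma_{ii}$. We assume that $\Sigma$ has full rank and that $\sigma_i^2 > \sigma_j^2 \Rightarrow b_i > b_j$, which implies that $r < \min_j(b_j)$.

The investor maximizes the expected value of his terminal wealth, given a level $w^2 v_p^2$ of its variance:

$$\hat\pi(v_p) = \arg\max_{\pi\in\mathbb{R}^m} E[w^+(\pi)] \quad\text{s.t.}\quad \mathrm{var}[w^+(\pi)] = w^2 v_p^2. \tag*{[1.P1]}$$

The first order conditions for [1.P1] are

$$\hat\pi(v_p) = (2\nu)^{-1}\Sigma^{-1}(b - 1_m r) \quad\text{and}\quad \hat\pi^\top\Sigma\hat\pi = w^2 v_p^2,$$

where $\nu$ is a Lagrange multiplier for the variance constraint. By plugging the first condition
into the second, we obtain, $(2\nu)^{-1} = \mp\, w\, v_p/\sqrt{\mathrm{Sh}}$, where

$$\mathrm{Sh} \equiv (b - 1_m r)^\top\Sigma^{-1}(b - 1_m r) \tag{1.4}$$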
is the Sharpe market performance. To ensure efficiency, we take the positive solution. Substituting the positive solution for $(2\nu)^{-1}$ into the first order condition, we obtain that the portfolio
that solves [1.P1] is

$$\frac{\hat\pi(v_p)}{w} = \frac{\Sigma^{-1}(b - 1_m r)}{\sqrt{\mathrm{Sh}}}\, v_p. \tag{1.5}$$

We are ready to determine the value of [1.P1], $E[w^+(\hat\pi(v_p))]$ and, hence, the expected portfolio
return, defined as,

$$\mu_p(v_p) \equiv \frac{E[w^+(\hat\pi(v_p))] - w}{w} = r + \sqrt{\mathrm{Sh}}\; v_p, \tag{1.6}$$

where the last equality follows by a simple calculation. Eq. (1.6) describes what is known as
the Capital Market Line (CML).
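
The CML can be checked numerically. The sketch below is only an illustration: it assumes two risky assets and made-up values for the expected returns, volatilities, correlation, safe rate and target volatility (all variable names are chosen here, not taken from the text). It builds the wealth shares of Eq. (1.5), i.e. $\Sigma^{-1}(b - 1_m r)\,v_p/\sqrt{\mathrm{Sh}}$, and confirms that the resulting portfolio has variance $v_p^2$ and an expected return equal to $r + \sqrt{\mathrm{Sh}}\,v_p$, as in Eq. (1.6).

```python
import math

# Illustrative inputs (assumptions made for this sketch).
b = [0.10, 0.15]        # expected returns on the two risky assets
sigma = [0.20, 0.25]    # their volatilities
rho = 0.5               # their correlation
r = 0.03                # safe interest rate
v_p = 0.10              # target volatility of the portfolio return

# Variance-covariance matrix and its inverse (closed form for a 2x2 matrix).
s12 = rho * sigma[0] * sigma[1]
Sigma = [[sigma[0] ** 2, s12], [s12, sigma[1] ** 2]]
det = Sigma[0][0] * Sigma[1][1] - s12 ** 2
Sigma_inv = [[Sigma[1][1] / det, -s12 / det],
             [-s12 / det, Sigma[0][0] / det]]

# Excess returns b - 1_m r, and Sh = (b - 1_m r)' Sigma^{-1} (b - 1_m r), Eq. (1.4).
excess = [b[0] - r, b[1] - r]
z = [Sigma_inv[i][0] * excess[0] + Sigma_inv[i][1] * excess[1] for i in range(2)]
Sh = excess[0] * z[0] + excess[1] * z[1]

# Wealth shares in the risky assets, Eq. (1.5); the remainder goes to the safe asset.
shares = [z_i * v_p / math.sqrt(Sh) for z_i in z]
safe_share = 1.0 - sum(shares)

# Variance and expected return of the resulting portfolio.
var_p = sum(shares[i] * Sigma[i][j] * shares[j] for i in range(2) for j in range(2))
mu_p = r + shares[0] * excess[0] + shares[1] * excess[1]

# Capital Market Line, Eq. (1.6).
mu_cml = r + math.sqrt(Sh) * v_p

print("risky shares:", shares, "safe share:", safe_share)
print("variance:", var_p, "expected return:", mu_p, "CML:", mu_cml)
```

Raising `v_p` scales the risky shares proportionally and moves the portfolio up the line, with slope $\sqrt{\mathrm{Sh}}$.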
1.2.3 Without the safe asset
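Next, assume that the investor's choice space does not include the riskless asset. In this case,
his current wealth is $w = \sum_{i=1}^m \pi_i$, and his terminal wealth is $w^+ = \sum_{i=1}^m x_i\theta_i$. By the definition
of $\tilde b_i \equiv x_i/S_i - 1$, and a few basic calculations,

$$w^+ = \sum_{i=1}^m (\tilde b_i + 1)\pi_i = \pi^\top b + w + \pi^\top a\tilde\epsilon, \tag{1.7}$$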
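where $a$ and $\tilde\epsilon$ are as defined in Eq. (1.2). We can use Eq. (1.7) to determine the expected
return and the variance of the portfolio value, which are

$$E[w^+(\pi)] = \pi^\top b + w, \text{ where } w = \pi^\top 1_m, \quad\text{and}\quad \mathrm{var}[w^+(\pi)] = \pi^\top\Sigma\pi. \tag{1.8}$$

The investor now minimizes the variance of his terminal wealth for a given expected portfolio return $\mu_p$:

$$\hat\pi(\mu_p) = \arg\min_{\pi\in\mathbb{R}^m} \mathrm{var}[w^+(\pi)] \quad\text{s.t.}\quad E[w^+(\pi)] = w(1 + \mu_p) \ \text{and}\ w = \pi^\top 1_m. \tag*{[1.P2]}$$

In the Appendix, we show that the solution to [1.P2] is

$$\frac{\hat\pi(\mu_p)}{w} = \frac{\gamma\mu_p - \beta}{\alpha\gamma - \beta^2}\,\Sigma^{-1}b + \frac{\alpha - \beta\mu_p}{\alpha\gamma - \beta^2}\,\Sigma^{-1}1_m, \tag{1.9}$$

where $\alpha \equiv b^\top\Sigma^{-1}b$, $\beta \equiv 1_m^\top\Sigma^{-1}b$ and $\gamma \equiv 1_m^\top\Sigma^{-1}1_m$, and $\mu_p(v_p)$ is the expected portfolio
return, defined as in Eq. (1.6). In the Appendix, we also show that

$$v_p^2 = \frac{1}{\gamma}\left[1 + \frac{(\gamma\mu_p(v_p) - \beta)^2}{\alpha\gamma - \beta^2}\right]. \tag{1.10}$$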
Based on Eq. (1.10), we define the global minimum variance portfolio as that portfolio that
achieves a variance equal to $v_p^2 = \gamma^{-1}$ and an expected return equal to $\mu_p = \beta/\gamma$. We shall
return to this portfolio below.
Note that for each $v_p$, there are two values of $\mu_p(v_p)$ that solve Eq. (1.10). The optimal choice
for our investor is that with the highest $\mu_p$. We define the efficient portfolio frontier as the set
of values $(v_p, \mu_p)$ that solve Eq. (1.10) with the highest $\mu_p$. It has the following expression

$$\mu_p(v_p) = \frac{\beta}{\gamma} + \frac{1}{\gamma}\sqrt{(\gamma v_p^2 - 1)(\alpha\gamma - \beta^2)}. \tag{1.11}$$

The efficient portfolio frontier is increasing and concave in risk, $v_p$. It can be interpreted
as a production function, i.e., one that produces expected returns using varying
levels of risk as inputs (see, e.g., Figure 1.1). Which portfolio on this frontier is selected by
an investor depends on the investor's attitudes towards risk.
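
The frontier formulas lend themselves to a quick numerical check. The sketch below is an illustration under assumed inputs (two risky assets with made-up moments; function and variable names are chosen here). It computes $\alpha = b^\top\Sigma^{-1}b$, $\beta = 1_m^\top\Sigma^{-1}b$ and $\gamma = 1_m^\top\Sigma^{-1}1_m$, forms the wealth shares of Eq. (1.9) for a target expected return, and compares the variance of that portfolio with Eq. (1.10) and with the efficient branch in Eq. (1.11). With only two assets the frontier is pinned down by the budget constraint, so this is a consistency check of the formulas rather than a genuine optimization.

```python
import math

# Illustrative inputs (assumptions made for this sketch).
b = [0.10, 0.15]        # expected returns
sigma = [0.20, 0.25]    # volatilities
rho = 0.5               # correlation

s12 = rho * sigma[0] * sigma[1]
Sigma = [[sigma[0] ** 2, s12], [s12, sigma[1] ** 2]]
det = Sigma[0][0] * Sigma[1][1] - s12 ** 2
Sigma_inv = [[Sigma[1][1] / det, -s12 / det],
             [-s12 / det, Sigma[0][0] / det]]
ones = [1.0, 1.0]


def mat_vec(A, x):
    """Product of a 2x2 matrix and a 2-vector."""
    return [A[i][0] * x[0] + A[i][1] * x[1] for i in range(2)]


def dot(x, y):
    """Inner product of two 2-vectors."""
    return x[0] * y[0] + x[1] * y[1]


alpha = dot(b, mat_vec(Sigma_inv, b))
beta = dot(ones, mat_vec(Sigma_inv, b))
gamma = dot(ones, mat_vec(Sigma_inv, ones))
delta = alpha * gamma - beta ** 2   # positive unless b is proportional to 1_m


def frontier_shares(mu_p):
    """Wealth shares with minimum variance for expected return mu_p, Eq. (1.9)."""
    c_b = (gamma * mu_p - beta) / delta
    c_1 = (alpha - beta * mu_p) / delta
    s_b, s_1 = mat_vec(Sigma_inv, b), mat_vec(Sigma_inv, ones)
    return [c_b * s_b[i] + c_1 * s_1[i] for i in range(2)]


def frontier_variance(mu_p):
    """Variance on the portfolio frontier, Eq. (1.10)."""
    return (1.0 + (gamma * mu_p - beta) ** 2 / delta) / gamma


def efficient_return(v_p):
    """Efficient portfolio frontier, Eq. (1.11)."""
    return beta / gamma + math.sqrt((gamma * v_p ** 2 - 1.0) * delta) / gamma


mu_target = 0.13                      # above beta / gamma, hence on the efficient branch
shares = frontier_shares(mu_target)
var_direct = dot(shares, mat_vec(Sigma, shares))

print("alpha, beta, gamma:", alpha, beta, gamma)
print("shares:", shares, "variance:", var_direct)
print("global minimum variance:", 1.0 / gamma, "at expected return:", beta / gamma)
```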
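Example 1.1. Let the number of risky assets $m = 2$. In this case, the efficient portfolio
frontier is obtained without optimizing as above: the budget constraint, $\pi_1 + \pi_2 = 1$ (that is, $w = 1$), pins down
a unique relation between the expected portfolio return and the variance of the portfolio's
value. Precisely, we have: $\mu_p = E[w^+(\pi)] - w = \pi_1 b_1 + \pi_2 b_2$, or,

$$\mu_p = b_1 + (b_2 - b_1)\pi_2, \qquad v_p^2 = (1 - \pi_2)^2\sigma_1^2 + 2(1 - \pi_2)\pi_2\sigma_{12} + \pi_2^2\sigma_2^2,$$

whence:

$$v_p = \frac{1}{b_2 - b_1}\sqrt{(b_2 - \mu_p)^2\sigma_1^2 + 2(b_2 - \mu_p)(\mu_p - b_1)\rho\sigma_1\sigma_2 + (\mu_p - b_1)^2\sigma_2^2},$$

where $\sigma_{12} = \rho\sigma_1\sigma_2$ and $\rho$ is the correlation between the two asset returns. When $\rho = 1$,

$$\mu_p = b_1 + \frac{(b_2 - b_1)(v_p - \sigma_1)}{\sigma_2 - \sigma_1}.$$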
But diversification pays, provided asset returns are not perfectly positively correlated. Figure
1.1 actually reveals that there are portfolios that are even less risky than the less risky asset.
Moreover, risk can be zeroed when $\rho = -1$, in which case $\pi_1\sigma_1 = \pi_2\sigma_2$ or, alternatively, $\pi_1 = \sigma_2/(\sigma_1 + \sigma_2)$ and $\pi_2 = \sigma_1/(\sigma_1 + \sigma_2)$. Section 1.2.5 relies on a simplified version of this
model, which was at the basis of Tobin's (1958) famous reformulation of the Keynesian theory
of money demand.
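
The two-asset case is easy to trace out numerically. The sketch below uses the parameter values reported in the caption of Figure 1.1 (expected returns 0.10 and 0.15, volatilities 0.20 and 0.25); the function names and the grid are choices made here for illustration. It computes the expected return and volatility for a given share $\pi_2$ and correlation $\rho$, evaluates the volatility as a function of the expected return, and checks that with $\rho = -1$ the shares $\pi_1 = \sigma_2/(\sigma_1 + \sigma_2)$ and $\pi_2 = \sigma_1/(\sigma_1 + \sigma_2)$ remove risk altogether.

```python
import math

# Parameter values from the caption of Figure 1.1.
b1, b2 = 0.10, 0.15     # expected returns
s1, s2 = 0.20, 0.25     # volatilities


def frontier_point(pi2, rho):
    """Expected return and volatility when a share pi2 is held in asset 2."""
    pi1 = 1.0 - pi2
    mu = pi1 * b1 + pi2 * b2
    var = pi1 ** 2 * s1 ** 2 + 2.0 * pi1 * pi2 * rho * s1 * s2 + pi2 ** 2 * s2 ** 2
    return mu, math.sqrt(max(var, 0.0))   # guard against tiny negative rounding errors


def volatility_from_mu(mu, rho):
    """Volatility as a function of the expected return (requires b2 > b1)."""
    x, y = b2 - mu, mu - b1
    var = x * x * s1 * s1 + 2.0 * x * y * rho * s1 * s2 + y * y * s2 * s2
    return math.sqrt(max(var, 0.0)) / (b2 - b1)


# Lowest volatility attainable without short sales, for each correlation.
grid = [k / 100.0 for k in range(101)]
min_vol = {rho: min(frontier_point(p, rho)[1] for p in grid)
           for rho in (-1.0, -0.5, 0.0, 0.5, 1.0)}

# With rho = -1, these shares make the two risks offset each other exactly.
pi2_hedge = s1 / (s1 + s2)
mu_hedge, v_hedge = frontier_point(pi2_hedge, -1.0)

for rho, v in min_vol.items():
    print("rho =", rho, "lowest volatility on the grid:", round(v, 4))
print("rho = -1 hedge: expected return", mu_hedge, "volatility", v_hedge)
```

Only for $\rho = 1$ does the lowest volatility on the grid equal that of the less risky asset; for every lower correlation some mix of the two assets does better.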
1.2.4 The global minimum variance portfolio
Note that the portfolio in Eq. (1.9) can be decomposed into two components, as follows:

$$\frac{\hat\pi(\mu_p)}{w} = \ell(\mu_p)\,\frac{\pi_d}{w} + [1 - \ell(\mu_p)]\,\frac{\pi_g}{w}, \qquad \ell(\mu_p) \equiv \frac{\beta(\gamma\mu_p - \beta)}{\alpha\gamma - \beta^2}, \tag{1.12}$$

where $\pi_d/w \equiv \Sigma^{-1}b/\beta$ and $\pi_g/w \equiv \Sigma^{-1}1_m/\gamma$. Therefore, $\pi_g$ is the global minimum variance portfolio, for we know from Eq. (1.10) that the
minimum variance occurs at $(v_p, \mu_p) = (\sqrt{\gamma^{-1}}, \beta/\gamma)$, in which case $\ell(\mu_p) = 0$.[1] More generally, we
can span any portfolio on the frontier by just choosing a convex combination of $\pi_d$ and $\pi_g$, with
weight equal to $\ell(\mu_p)$. It's a mutual fund separation theorem. We shall use this representation of
the portfolios on the efficient portfolio frontier while deriving the zero-beta CAPM in Section
1.3.3.

[1] It is easy to show that the covariance of the global minimum variance portfolio with any other portfolio equals $\gamma^{-1}$.

FIGURE 1.1. From top to bottom: portfolio frontiers corresponding to $\rho = -1, -0.5, 0, 0.5, 1$. Parameters are set to $b_1 = 0.10$, $b_2 = 0.15$, $\sigma_1 = 0.20$, $\sigma_2 = 0.25$. For each portfolio frontier, the efficient
portfolio frontier includes those portfolios that yield the highest expected return for a given volatility. (Axes: volatility, $v_p$, horizontal; expected return, $\mu_p$, vertical.)
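
The two-fund decomposition can be verified numerically. The sketch below is an illustration with assumed inputs and with labels chosen here: `pi_g` stands for the global minimum variance portfolio, $\Sigma^{-1}1_m/\gamma$, `pi_d` for the second fund, $\Sigma^{-1}b/\beta$, and `weight` for the loading $\beta(\gamma\mu_p - \beta)/(\alpha\gamma - \beta^2)$. It checks that mixing the two funds reproduces the frontier portfolio of Eq. (1.9), and that the covariance of the global minimum variance portfolio with an arbitrary fully invested portfolio equals $\gamma^{-1}$.

```python
# Illustrative inputs (assumptions made for this sketch).
b = [0.10, 0.15]        # expected returns
sigma = [0.20, 0.25]    # volatilities
rho = 0.5               # correlation

s12 = rho * sigma[0] * sigma[1]
Sigma = [[sigma[0] ** 2, s12], [s12, sigma[1] ** 2]]
det = Sigma[0][0] * Sigma[1][1] - s12 ** 2
Sigma_inv = [[Sigma[1][1] / det, -s12 / det],
             [-s12 / det, Sigma[0][0] / det]]
ones = [1.0, 1.0]


def mat_vec(A, x):
    """Product of a 2x2 matrix and a 2-vector."""
    return [A[i][0] * x[0] + A[i][1] * x[1] for i in range(2)]


def dot(x, y):
    """Inner product of two 2-vectors."""
    return x[0] * y[0] + x[1] * y[1]


alpha = dot(b, mat_vec(Sigma_inv, b))
beta = dot(ones, mat_vec(Sigma_inv, b))
gamma = dot(ones, mat_vec(Sigma_inv, ones))
delta = alpha * gamma - beta ** 2

# The two funds (wealth shares, each summing to one).
pi_g = [x / gamma for x in mat_vec(Sigma_inv, ones)]   # global minimum variance portfolio
pi_d = [x / beta for x in mat_vec(Sigma_inv, b)]       # second fund


def weight(mu_p):
    """Loading on pi_d for a frontier portfolio with expected return mu_p."""
    return beta * (gamma * mu_p - beta) / delta


def frontier_shares(mu_p):
    """Frontier portfolio computed directly from Eq. (1.9)."""
    c_b = (gamma * mu_p - beta) / delta
    c_1 = (alpha - beta * mu_p) / delta
    s_b, s_1 = mat_vec(Sigma_inv, b), mat_vec(Sigma_inv, ones)
    return [c_b * s_b[i] + c_1 * s_1[i] for i in range(2)]


mu_target = 0.14
ell = weight(mu_target)
combo = [ell * pi_d[i] + (1.0 - ell) * pi_g[i] for i in range(2)]
direct = frontier_shares(mu_target)

# Covariance of the global minimum variance portfolio with an arbitrary portfolio.
arbitrary = [0.3, 0.7]
cov_g = dot(pi_g, mat_vec(Sigma, arbitrary))

print("weight on pi_d:", ell)
print("two-fund mix:", combo, "direct:", direct)
print("covariance with pi_g:", cov_g, "1/gamma:", 1.0 / gamma)
```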
1.2.5 The market portfolio
The market portfolio is the portfolio at which the CML in Eq. (1.6) and the efficient portfolio
frontier in Eq. (1.11) intersect. In fact, the market portfolio is the point at which the CML is
tangent to the efficient portfolio frontier. For this reason, the market portfolio is also referred to
as the tangent portfolio. In Figure 1.2, the market portfolio is at point $M$ and has volatility
equal to $v_M$ and expected return equal to $\mu_M$. At this point, the CML is tangent to the efficient
portfolio frontier.[2]

As Figure 1.2 illustrates, the CML dominates the efficient portfolio frontier. This is
because the CML is the value of the investor's problem, [1.P1], obtained using all the risky
assets and the riskless asset, and the efficient portfolio frontier is the value of the investor's
problem, [1.P2], obtained using only the risky assets.[3] For the same reason, the CML and
the efficient portfolio frontier can only be tangent to each other. For suppose not. Then, there
would exist a point on the efficient portfolio frontier that dominates some portfolio on the CML,
a contradiction. Likewise, the CML must have a portfolio in common with the efficient portfolio
frontier: the portfolio that does not include the safe asset. Below, we shall use this insight to
characterize the market portfolio analytically.

[2] The existence of the market portfolio requires a restriction on $r$, derived in Eq. (1.13) below.

[3] Figure 1.2 also depicts a dotted line, which is the value of the investor's problem when he invests a proportion higher
than 100% in the market portfolio, leveraged at an interest rate for borrowing higher than the interest rate for lending. In this case,
the CML coincides with the line joining the safe asset to the market portfolio, up to the point $M$. From $M$ onwards, the CML coincides with the highest between the dotted line and the efficient portfolio frontier.
Why is the market portfolio called this way? Figure 1.2 reveals that any portfolio on the CML can be obtained as a combination of the safe asset and the market portfolio $M$ (a portfolio containing only the risky assets). An investor with high risk-aversion would like to choose a point on the CML close to the safe asset. An investor with low risk-aversion would like to choose a point farther up the CML. But no matter how risk averse an individual is, the optimal solution for him is to choose a combination of the safe asset and the market portfolio $M$. Thus, the market portfolio plays a mere instrumental role. It obviously does not depend on the risk attitudes of any investor: it is a mere convex combination of all the existing assets in the economy. Instead, the optimal course of action for any investor is to use those proportions of this portfolio that make his overall exposure to risk consistent with his risk appetite. It is a two fund separation theorem.
Are these predictions observed in practice? Financial advisors are known to recommend that young investors hold riskier positions, and that less conservative investors increase their stock holdings, compared to bonds. Instead, according to Figure 1.2, the stocks/bonds mix in $M$ should be the same, independently of risk attitudes. These are asset allocation puzzles, described for example by Campbell and Viceira (2002), which can be addressed through extensions of the CAPM, e.g., assuming that agents have access to stochastic opportunity sets including stochastic volatility, as discussed in later chapters of Part II. [In progress]
The equilibrium implications of the previous separation theorem help clarify why we refer to the tangent portfolio as the market portfolio. As explained, any portfolio can be attained by lending or borrowing funds in zero net supply, and by investing in the portfolio $M$. In equilibrium, then, every investor must hold some proportion of $M$. But since, in aggregate, there is no net borrowing or lending, all investors must have portfolio holdings that sum up to the market portfolio, which is therefore the value-weighted portfolio of all the existing assets in the economy. This argument is formally developed in the appendix.
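We now characterize the market portfolio. We need to assume that the interest rate is sufficiently low to allow the CML to be tangent to the efficient portfolio frontier. The technical condition that ensures this is that the return on the safe asset be less than the expected return on the global minimum variance portfolio, viz

$$r < \frac{\mathbf{1}^\top\Sigma^{-1}b}{\mathbf{1}^\top\Sigma^{-1}\mathbf{1}} = \frac{\beta}{\gamma} \qquad (1.13)$$

Let $\pi_M$ be the market portfolio. It contains risky assets only, $\mathbf{1}^\top\pi_M = w$, and it belongs to the CML, where

$$\frac{\pi_M}{w} = \frac{\Sigma^{-1}(b - \mathbf{1}r)}{\sqrt{\text{Sh}}}\,v_M \qquad (1.14)$$

with $\text{Sh} \equiv (b - \mathbf{1}r)^\top\Sigma^{-1}(b - \mathbf{1}r)$, so that $\sqrt{\text{Sh}}$ is the slope of the CML. Therefore, we look for the value $v_M$ that solves $1 = \mathbf{1}^\top\Sigma^{-1}(b - \mathbf{1}r)\,v_M/\sqrt{\text{Sh}}$, i.e.,

$$v_M = \frac{\sqrt{\text{Sh}}}{\beta - \gamma r} \qquad (1.15)$$

[FIGURE 1.2. Labels: CML; points A, Z, Q, C; intercept r; volatility $v_M$.]

Then, we plug this value of $v_M$ in Eq. (1.14) and obtain,$^{4}$

$$\frac{\pi_M}{w} = \frac{\Sigma^{-1}(b - \mathbf{1}r)}{\beta - \gamma r} \qquad (1.16)$$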
Once again, the market portfolio belongs to the efficient portfolio frontier. Indeed, consider the following reasoning. On the one hand, the market portfolio cannot be above the efficient portfolio frontier, as this would contradict the efficiency of the $AMC$ curve (obtained by investing in the risky assets only). On the other hand, by construction, the market portfolio belongs to the CML and so it cannot be below the efficient portfolio frontier, as the CML dominates the efficient portfolio frontier. The Appendix makes this reasoning rigorous, and shows that the market portfolio does indeed satisfy the tangency condition.
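The tangency property can be illustrated numerically. The following is a minimal sketch for two risky assets, assuming the standard mean-variance result that the tangent portfolio weights are proportional to $\Sigma^{-1}(b - \mathbf{1}r)$; the function names and the numerical inputs are illustrative assumptions, not taken from the notes.

```python
import math


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def market_portfolio(b, sigma, r):
    """Tangent portfolio weights, expected return, volatility and Sh."""
    si = inv2(sigma)
    excess = [bi - r for bi in b]
    # z = Sigma^{-1} (b - 1 r)
    z = [si[i][0] * excess[0] + si[i][1] * excess[1] for i in range(2)]
    denom = sum(z)  # equals 1' Sigma^{-1} (b - 1 r)
    if denom <= 0:
        raise ValueError("r is too high: no tangent portfolio exists")
    weights = [zi / denom for zi in z]
    sh = sum(e * zi for e, zi in zip(excess, z))
    v_m = math.sqrt(sh) / denom
    mu_m = sum(wi * bi for wi, bi in zip(weights, b))
    return weights, mu_m, v_m, sh


if __name__ == "__main__":
    sigma = [[0.04, 0.025], [0.025, 0.0625]]  # volatilities 0.20 and 0.25, correlation 0.5
    w, mu_m, v_m, sh = market_portfolio([0.10, 0.15], sigma, 0.05)
    print("weights:", w)
    print("slope of the CML:", math.sqrt(sh), "Sharpe ratio of M:", (mu_m - 0.05) / v_m)
```

The Sharpe ratio of the resulting portfolio coincides with $\sqrt{\text{Sh}}$, the slope of the CML, and the weights depend on $r$ while summing to one, so that no wealth is held in the safe asset.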
$^{4}$ While the market portfolio depends on $r$, this portfolio obviously does not include any share in the safe asset.

1.2.6 Tobin's re-interpretation of Keynesian speculative demand for money
Tobin (1958) relies on portfolio selection and shows that demand for money can be explained by making reference to the agents' attitude vis-à-vis the risk of alternative ways to invest savings. His contribution aims to revise some foundational issues regarding the monetary theory in Keynes (1936). Tobin explains that the Keynesian explanation for money demand may imply that agents end up making dichotomic choices: they hold either money or bonds. That is, at the individual level, a given agent holds either money or bonds, based on his own expectations of future interest rate levels. However, at the aggregate level, money demand is inversely related to the nominal interest rate, albeit flat in correspondence of small values of this rate: a liquidity trap. The Appendix provides a parametric example that clarifies the details of how these mechanisms operate.
Tobin formulates a theory of money demand in which agents do not make the previous dichotomic choices. Consider the following specification of Example 1.1. We interpret the safe asset as money, which is therefore such that its return and volatility are $b_1 \equiv 0$ and $\sigma_1 \equiv 0$; instead, bonds are risky, in that they provide a superior return but at the cost of some volatility. Therefore, the expected return and volatility of a portfolio comprising money and bonds are $\mu_p = \pi_2 b_2$ and $v_p = \pi_2\sigma_2$, with straightforward notation. We have

$$\mu_p = \frac{b_2}{\sigma_2}\,v_p, \qquad \pi_2 = \frac{1}{\sigma_2}\,v_p \qquad (1.17)$$

The top panels of the next two pictures plot the first of Eqs. (1.17), along with the indifference curves of a hypothetical representative agent. The agent finds his optimum is achieved at $P$, which is the point of tangency between the first of Eqs. (1.17) and the indifference curve $UU$. Money demand, $1 - \pi_2$, is determined by the second of Eqs. (1.17), and is shown in the bottom panels. As the expected return on the bond, $b_2$, increases, the new optimum shifts to $P'$, the tangency point between the new relation in Eqs. (1.17) and the indifference curve $U'U'$. Money demand decreases as a result.

[Two pictures, I and II. Top panels: $\mu_p$ against $v_p$, a line with slope $b_2/\sigma_2$, the indifference curves $UU$ and the optimum $P$. Bottom panels: $\pi_2$ against $v_p$, a line with slope $1/\sigma_2$.]

This framework of analysis can be used to study the effects of a decreased interest rate volatility. Suppose the central bank has the power to lower both interest rates and interest rate volatility in such a way as to keep the ratio $b_2/\sigma_2$ unchanged. As the second picture above illustrates, the optimum is still $P$, although the second of Eqs. (1.17) then becomes steeper than before the policy action. Money demand decreases as a result. We can say more. As is clear, money demand decreases as interest rate volatility, $\sigma_2$, decreases. Therefore, the central bank might keep money supply constant and achieve lower interest rates by simply targeting low interest rate volatility.
[In progress]
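The comparative statics of this section can be reproduced with a small numerical sketch. The notes only refer to generic indifference curves; the code below assumes, purely for illustration, a mean-variance utility $\mu_p - \frac{a}{2}v_p^2$ with a hypothetical risk-aversion coefficient $a$, and rules out leverage. The function name and the numbers are illustrative assumptions.

```python
def tobin_allocation(b2, sigma2, risk_aversion):
    """Optimal volatility, bond share and money share under mu - (a/2) v^2.

    The opportunity line is mu_p = (b2 / sigma2) * v_p, so the unconstrained
    optimum satisfies a * v_p = b2 / sigma2 (assumed utility, not the notes').
    """
    v_unconstrained = (b2 / sigma2) / risk_aversion
    # Second of Eqs. (1.17): pi_2 = v_p / sigma2, kept within [0, 1] (no leverage).
    bond_share = min(max(v_unconstrained / sigma2, 0.0), 1.0)
    v_star = bond_share * sigma2
    return v_star, bond_share, 1.0 - bond_share


if __name__ == "__main__":
    base = tobin_allocation(0.04, 0.10, 8.0)
    higher_return = tobin_allocation(0.05, 0.10, 8.0)
    # Lower b2 and sigma2 by the same factor, leaving b2 / sigma2 unchanged.
    lower_volatility = tobin_allocation(0.032, 0.08, 8.0)
    for label, res in (("base", base), ("higher b2", higher_return),
                       ("lower b2 and sigma2", lower_volatility)):
        print(f"{label:>20}: v_p={res[0]:.4f}  bonds={res[1]:.3f}  money={res[2]:.3f}")
```

Under these assumptions, money demand falls both when the expected bond return rises and when return and volatility are lowered proportionally; in the second case the optimal volatility is unchanged, as in the second picture.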
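Consider a portfolio that invests a proportion $x$ of wealth in asset $i$ and the remaining proportion $1 - x$ in the market portfolio. Its expected return and volatility are

$$\mu_p = x\,b_i + (1 - x)\,\mu_M, \qquad v_p = \sqrt{(1-x)^2 v_M^2 + 2(1-x)\,x\,\sigma_{iM} + x^2\sigma_i^2} \qquad (1.18)$$

where $\sigma_{iM}$ is the covariance between the return on asset $i$ and the return on the market portfolio. We have

$$\frac{d\mu_p}{dx} = b_i - \mu_M, \qquad \left.\frac{dv_p}{dx}\right|_{x=0} = \left.\frac{-(1-x)\,v_M^2 + (1-2x)\,\sigma_{iM} + x\,\sigma_i^2}{v_p}\right|_{x=0} = \frac{\sigma_{iM} - v_M^2}{v_M}$$

Therefore,

$$\left.\frac{d\mu_p}{dv_p}\right|_{x=0} = \frac{(b_i - \mu_M)\,v_M}{\sigma_{iM} - v_M^2} \qquad (1.19)$$

[FIGURE 1.3. Labels: CML; points A, M, i, C; intercept r; volatility $v_M$.]

The slope in Eq. (1.19) must equal the slope of the CML at $M$, $(\mu_M - r)/v_M$. Rearranging terms leaves,

$$b_i - r = \beta_i\,(\mu_M - r), \qquad \beta_i \equiv \frac{\sigma_{iM}}{v_M^2}, \qquad i = 1,\dots,m \qquad (1.20)$$

Eq. (1.20) is the celebrated Security Market Line (SML). Appendix 3 contains an alternative derivation of the SML.
The SML can be interpreted as a projection of the excess return on asset $i$ (i.e. $\tilde b_i - r$) on the excess return on the market portfolio (i.e. $\tilde b_M - r$). In other words,

$$\tilde b_i - r = \beta_i\,(\tilde b_M - r) + \epsilon_i, \qquad i = 1,\dots,m \qquad (1.21)$$

The previous relation leads to the following decomposition of the variance of the return on the $i$-th asset:

$$\sigma_i^2 = \beta_i^2 v_M^2 + var(\epsilon_i), \qquad i = 1,\dots,m$$

The quantity $\beta_i^2 v_M^2$ is referred to as systematic risk. The quantity $var(\epsilon_i) \geq 0$ does, instead, capture the notion of idiosyncratic risk. In the next section, we shall show that idiosyncratic risk can be eliminated through a well-diversified portfolio: roughly, a portfolio that contains a large number of assets.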
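The Security Market Line and the split between systematic and idiosyncratic risk can be verified on a small example. The sketch below is an illustration under stated assumptions: two risky assets, a tangent portfolio with weights proportional to $\Sigma^{-1}(b - \mathbf{1}r)$, and hypothetical function names and inputs.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def sml_table(b, sigma, r):
    """Beta, excess return, SML prediction and idiosyncratic variance per asset."""
    si = inv2(sigma)
    excess = [bi - r for bi in b]
    z = [si[i][0] * excess[0] + si[i][1] * excess[1] for i in range(2)]
    w = [zi / sum(z) for zi in z]                       # tangent portfolio weights
    mu_m = sum(wi * bi for wi, bi in zip(w, b))
    cov_im = [sigma[i][0] * w[0] + sigma[i][1] * w[1] for i in range(2)]
    var_m = sum(wi * ci for wi, ci in zip(w, cov_im))
    rows = []
    for i in range(2):
        beta = cov_im[i] / var_m
        rows.append({
            "beta": beta,
            "excess_return": b[i] - r,
            "sml": beta * (mu_m - r),
            "idiosyncratic_var": sigma[i][i] - beta**2 * var_m,
        })
    return rows


if __name__ == "__main__":
    sigma = [[0.04, 0.025], [0.025, 0.0625]]
    for row in sml_table([0.10, 0.15], sigma, 0.05):
        print(row)
```

For each asset the excess return coincides with beta times the market excess return, and the residual variance left after subtracting systematic risk is non-negative.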
1.3.2 Zero-beta CAPM
Suppose the risk-free asset is not available for trading, and consider a portfolio that only contains
risky assets in the portfolio frontier, which can be generated by Eq. (1.12), with a fixed weight
( )
>( )
, the return
(1.22)
(1.23)
and then,
( ) =
( ) =
(1.24)
Eqs. (1.23) and (1.24) can be solved for and 1 , which can then be plugged into Eq. (1.22),
leaving, after simple calculations,
( )
( ) 1
( ) (
)
=
( )
( )
We can simplify this expression, once we assume that $cov(\tilde b_p, \tilde b_z) = 0$, in which case,

$$b_i = \mu_z + \frac{cov(\tilde b_i, \tilde b_p)}{var(\tilde b_p)}\,(\mu_p - \mu_z) \qquad (1.25)$$

where $\tilde b_p$ and $\tilde b_z$ denote the returns on the benchmark portfolio and on a portfolio uncorrelated with it, with expected returns $\mu_p$ and $\mu_z$.
Eq. (1.25) is the zero-beta, or Black's (1972), CAPM. Let us summarize its meaning. Even if there are no riskless assets, we can express expected returns in terms of benchmark returns, as in the security market line of Eq. (1.20). First, let us be given a benchmark portfolio return, and, second, an asset return that has zero correlation with it (whence, the "zero-beta" qualification). Then, Eq. (1.25) is the counterpart to the security market line in the case where we don't have riskless assets to invest in. Note, however, that the derivation of Eq. (1.25) only uses portfolios of risky assets, so that Eq. (1.25) holds regardless of whether a riskless asset exists. The next section provides more equilibrium foundations to Eqs. (1.20) and (1.25).
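The zero-beta relation can be illustrated with two risky assets, a case in which every portfolio lies on the portfolio frontier. The sketch below is an illustration under that assumption, with hypothetical function names and inputs: given a benchmark portfolio $p$, it constructs the portfolio $z$ that has zero covariance with $p$ and recovers each expected return from Eq. (1.25).

```python
def zero_beta_check(b, sigma, x):
    """Zero-covariance companion of the portfolio with weight x in asset 1.

    Returns the weights of z, its expected return, the betas of the two
    assets with respect to p, and the expected returns implied by Eq. (1.25).
    """
    w_p = [x, 1.0 - x]
    # cov(asset i, p) for i = 1, 2
    c = [sigma[i][0] * w_p[0] + sigma[i][1] * w_p[1] for i in range(2)]
    var_p = w_p[0] * c[0] + w_p[1] * c[1]
    if abs(c[1] - c[0]) < 1e-14:
        # p is the global minimum variance portfolio: its covariance with
        # every portfolio is the same positive number, so no z exists.
        raise ValueError("no zero-beta portfolio for the minimum variance portfolio")
    y = c[1] / (c[1] - c[0])           # solves y*c[0] + (1-y)*c[1] = 0
    w_z = [y, 1.0 - y]
    mu_p = w_p[0] * b[0] + w_p[1] * b[1]
    mu_z = w_z[0] * b[0] + w_z[1] * b[1]
    betas = [ci / var_p for ci in c]
    implied = [mu_z + beta * (mu_p - mu_z) for beta in betas]
    return w_z, mu_z, betas, implied


if __name__ == "__main__":
    sigma = [[0.04, 0.025], [0.025, 0.0625]]
    print(zero_beta_check([0.10, 0.15], sigma, 0.5))
```

The implied expected returns coincide with the inputs, with the expected return on $z$ playing the role that the safe rate plays in Eq. (1.20).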
1.3.3 Equilibrium with expected utility
Equilibrium with mean-variance & exponential utility. [In progress] [...]
1.3.4 Applications
1.3.4.1 Hedging
Interestingly, the CAPM can be interpreted by having regard to a classical hedging framework. Suppose we hold an asset that delivers a return equal to $\tilde b_i$ (perhaps, a nontradable asset). We wish to hedge against this stochastic return through a position in a portfolio comprising a proportion $\alpha$ in the market portfolio, and a proportion $1 - \alpha$ in a safe asset. We use as a hedging criterion the variance of the overall exposure of the position, $\min_\alpha var[\tilde b_i - ((1-\alpha)\,r + \alpha\,\tilde b_M)]$. The solution to this basic problem is $\hat\alpha = cov(\tilde b_i, \tilde b_M)/v_M^2$. That is, the proportion to hold is simply the beta of the asset.
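The CAPM is a model for the required return for any asset. As such, it might be used as a very first tool to assess risky projects. Let $\tilde C$ denote the future, random, cash flow of a certain project, and let $V$ be its current value, so that the return on the project is $\tilde b = \tilde C/V - 1$. The CAPM predicts that $E(\tilde b) = r + \beta_C(\mu_M - r)$, with $\beta_C = cov(\tilde b, \tilde b_M)/v_M^2$, which is, then, interpreted as the risk-adjusted discount rate for this project. Hence,

$$V = \frac{E(\tilde C)}{1 + r + \beta_C(\mu_M - r)} \qquad (1.26)$$

Furthermore, we can express the value of the project, $V$, as the present value of the certainty equivalent of the random cash flow, $\tilde C$. We have:

$$\frac{E(\tilde C)}{V} = 1 + r + \beta_C(\mu_M - r) = 1 + r + \frac{cov(\tilde C/V - 1,\ \tilde b_M)}{v_M^2}(\mu_M - r) = 1 + r + \frac{1}{V}\,\lambda\,cov(\tilde C, \tilde b_M)$$

where $\lambda \equiv (\mu_M - r)/v_M^2$. Rearranging terms leaves:

$$V = \frac{E(\tilde C) - \lambda\,cov(\tilde C, \tilde b_M)}{1 + r} \qquad (1.27)$$

Next, consider a safe project with the same value as the original, $V$, but a constant cash flow $\bar C$: $V = \frac{E(\bar C)}{1+r} = \frac{\bar C}{1+r}$. By Eq. (1.27), then, $\bar C = E(\tilde C) - \lambda\,cov(\tilde C, \tilde b_M)$.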
asset returns we observe are generated by the following linear factor model,
+
1
( )[
( )]
(1.28)
where $a$ and $B$ are a vector and a matrix of constants, and $f$ is a $k$-dimensional vector of factors supposed to affect the asset returns, with $k \leq m$. The vector $f$ is the vector of risks that affects developments in asset returns.
Let us normalize [ ( )] 1 = , so that =
( ). With this normalization, we have,
(1
..
.
(
= +
= +
=1
=1
(1
..
.
(
)
(1.29)
)
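Next, consider a portfolio $\pi$ of risky assets and a riskless asset. The wealth generated by this portfolio is given by Eq. (1.2), which applied to this model leaves,

$$w^{+} = \pi^\top(a - \mathbf{1}r) + R\,w + \pi^\top B f \qquad (1.30)$$

An arbitrage opportunity arises, in this context, if there exists some portfolio $\pi$ such that the wealth generated by this portfolio, $w^{+}$ in Eq. (1.30), is certain, and different from that obtained by investing at the safe gross interest rate $R$, i.e. if $\exists\,\pi: \pi^\top B = 0$ and $\pi^\top(a - \mathbf{1}r) \neq 0$. Mathematically, this is ruled out whenever $\exists\,\lambda \in \mathbb{R}^k: a = \mathbf{1}r + B\lambda$. Substituting this relation into Eq. (1.28) leaves,

$$b_i \equiv E(\tilde b_i) = r + \sum_{j=1}^{k}\underbrace{cov(\tilde b_i, f_j)}_{\equiv\,\beta_{i,j}}\lambda_j, \qquad i = 1,\dots,m \qquad (1.31)$$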
Project evaluation under the exact APT obtains under a straightforward generalization of Eq. (1.26). For any project with random cash-flow equal to $\tilde C$, we have that its random return is $\tilde b = \tilde C/V - 1$, such that the value, $V$, is given by

$$V = \frac{E(\tilde C)}{1 + b} \qquad (1.32)$$

where $b = E(\tilde b)$ is as in Eq. (1.31). In Section 1.4.3, we shall explain how to represent the value of any project, in terms of an alternative probability.
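The APT collapses to the CAPM, once we assume that the only factor affecting the returns is the market portfolio. To show this, we must normalize the market portfolio return so that its variance equals one, consistently with Eq. (1.31). So let $\tilde b_M^{*}$ be the normalized market return, defined as $\tilde b_M^{*} \equiv v_M^{-1}\tilde b_M$, so that $var(\tilde b_M^{*}) = 1$. We have,

$$b_i = r + \beta_i\lambda, \qquad i = 1,\dots,m$$

where $\beta_i = cov(\tilde b_i, \tilde b_M^{*}) = v_M^{-1}cov(\tilde b_i, \tilde b_M)$. In particular, $\beta_M = v_M^{-1}var(\tilde b_M) = v_M$, such that $\mu_M = r + v_M\lambda$, or

$$\lambda = \frac{\mu_M - r}{v_M} \qquad (1.33)$$

which is known as the Sharpe ratio for the market portfolio, or the market price of risk. By replacing $\beta_i = v_M^{-1}cov(\tilde b_i, \tilde b_M)$ and the expression for $\lambda$ in Eq. (1.33) into $b_i = r + \beta_i\lambda$, we obtain,

$$b_i = r + \frac{cov(\tilde b_i, \tilde b_M)}{v_M^2}\,(\mu_M - r), \qquad i = 1,\dots,m$$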
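Let us develop intuition on such a beautiful result, by elaborating a simple example. Assume that each of the risks, $f_j$, is standard normal, with density $\phi(f_j) = \frac{1}{\sqrt{2\pi}}\exp(-\frac{1}{2}f_j^2)$, and next, that we tilt their densities by a factor equal to $\xi(f_j) = \exp(-\frac{1}{2}\lambda_j^2 - \lambda_j f_j)$, where $\lambda_j$ is the unit-risk premium, as defined in Section 1.4.1. This tilt defines a new probability, $Q$, under which each factor $f_j$ is distributed. Let us determine this probability, by tilting $\phi$ through $\xi$,

$$\phi_Q(f_j) = \xi(f_j)\,\phi(f_j) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}\lambda_j^2 - \lambda_j f_j\right)\exp\left(-\frac{1}{2}f_j^2\right) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{1}{2}\tilde f_j^2\right) \qquad (1.34)$$

where

$$\tilde f_j = f_j + \lambda_j \qquad (1.35)$$

Note that the new density, $\phi_Q$, is still that of a standard normal variate. Yet under $Q$, it is $\tilde f_j$ that has zero expectation, not $f_j$. In other words, we have that under $Q$, (i) $\tilde f_j$ is standard normal and (ii) $f_j$ is normal with unit variance, but expectation equal to $-\lambda_j$. That is, assuming that $\lambda_j > 0$, $f_j$ has a lower expectation under $Q$ than under $P$: under $P$, this expectation is zero, and under $Q$, it is $-\lambda_j$.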
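The effect of an exponential tilt on a standard normal density can be checked by numerical integration. The following is a minimal sketch, assuming the tilt $\xi(f) = \exp(-\frac{1}{2}\lambda^2 - \lambda f)$; the function names, the integration grid and the value of $\lambda$ are illustrative choices. It verifies that the tilted density integrates to one, has mean $-\lambda$ and has unit variance.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def xi(x, lam):
    """Exponential tilt applied to the standard normal density."""
    return math.exp(-0.5 * lam * lam - lam * x)


def tilted_moments(lam, lo=-12.0, hi=12.0, n=24000):
    """Mass, mean and variance of the tilted density, by the trapezoid rule."""
    h = (hi - lo) / n
    mass = first = second = 0.0
    for k in range(n + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        q = weight * xi(x, lam) * phi(x) * h
        mass += q
        first += x * q
        second += x * x * q
    mean = first / mass
    return mass, mean, second / mass - mean**2


if __name__ == "__main__":
    print(tilted_moments(0.4))
```

The tilt leaves the shape of the density unchanged and only shifts its location by the unit-risk premium.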
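We label the new probability $Q$ the risk-neutral probability, for the following reasons. Consider Eqs. (1.29) and (1.31), which say that under $P$,

$$\tilde b_i = r + \sum_{j=1}^{k}\beta_{i,j}\lambda_j + \sum_{j=1}^{k}\beta_{i,j}f_j = r + \sum_{j=1}^{k}\beta_{i,j}\tilde f_j \qquad (1.36)$$

To summarize, the return on each asset $i$ has the following distributions under $P$ and under $Q$,

$$\phi_P(\tilde b_i) = \phi\Big(\tilde b_i;\ r + \sum_{j=1}^{k}\beta_{i,j}\lambda_j,\ \sigma_i^2\Big), \qquad \phi_Q(\tilde b_i) = \phi\big(\tilde b_i;\ r,\ \sigma_i^2\big) \qquad (1.37)$$

where $\sigma_i^2 = \sum_{j=1}^{k}\beta_{i,j}^2$, as in Eq. (1.3), and $\phi_P$ and $\phi_Q$ denote the Normal densities under $P$ and under $Q$. That is, under $Q$, the expected return on each asset equals $r$, whence the "risk-neutral probability" label. In later chapters, we shall use the celebrated Girsanov's theorem to elaborate on these topics, and label the tilt $\xi$ in Eq. (1.34) as the Radon-Nikodym derivative of the probability $Q$ against $P$ (see, e.g., Chapter 4, Section 4.3.3).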
Why complicate everything through the previous probability changes? In fact, the entire building block underlying asset evaluation relies on similar risk-neutral tilts. Consider the following example of derivative evaluation. We wish to price a quadratic derivative, i.e. one that pays off the square of the cash-flow promised by the first asset, $\tilde C_1^2$. It is challenging to evaluate this derivative through the APT in this Gaussian market, because $\tilde C_1^2$ is obviously not normally distributed, which complicates the reasoning underlying its exposure to the factors $f$. In fact, in this Gaussian market, we cannot restrict the behavior of the expected return on this derivative, without assuming something more. Let us explain.
Suppose we want to construct a portfolio of the existing assets to replicate the payoff of the quadratic derivative, for each possible value this derivative could take. Can we do this? The answer is in the negative. We cannot use a finite number of assets to span an asset payoff, which could take a continuum of values, such as $\tilde C_1^2$. We say markets are incomplete in this context. In the next chapter, we shall see that the price of such, and related, derivatives, can be found in a preference-free format, as soon as the number of assets is at least as large as the number of states: markets are, then, complete. In this case, a portfolio that replicates the derivative's payoff can be found, and its value is the same as the derivative's, for two assets are worth the same whenever they promise the same payoff.
Chapter 2 explains that in a world with complete markets, the price of the existing traded assets can be inverted for the shadow value of some elementary assets, those that pay off one unit of numeraire in a given state of the world, and zero otherwise. The price of these elementary securities can then be used to price any derivative, which is redundant indeed. Chapter 4 of these Lectures shall explain how these results can be generalized to markets with a continuum of states, as soon as we assume that there exist a number of sufficiently diverse elementary securities, which guarantee a payoff could be delivered for each state of nature. To illustrate through the example of this section, consider the following elementary security, which promises the following payoff,

$$1_c(\tilde C_1) = \begin{cases} 1 & \text{if } \tilde C_1 \in (c,\ c + dc) \\ 0 & \text{otherwise} \end{cases} \qquad (1.38)$$

and let $p(c)\,dc$ be its current price. We shall refer to these securities as Arrow-Debreu securities in these Lectures, for reasons explained in the next chapter.
We could utilize all of these Arrow-Debreu securities, i.e. for all $c \in \mathbb{R}$, and replicate any generic function of the state, $g(c)$, including our original payoff, $g(c) = c^2$, $c \in \mathbb{R}$. Indeed, note that by purchasing $g(c)$ units of the security that pays off $1_c(\tilde C_1)$ in Eq. (1.38), we pay $g(c)\,p(c)\,dc$ today, and are guaranteed to receive $1 \cdot g(c)$ tomorrow in state $\tilde C_1 \in (c,\ c + dc)$, and zero otherwise. Therefore, by purchasing all the securities that span $\mathbb{R}$, we shall receive $g(\tilde C_1)$ for any possible value of $\tilde C_1$, for sure, tomorrow, and pay, today,

$$\mathcal{C} \equiv \int g(c)\,p(c)\,dc \qquad (1.39)$$

We call $\Pi_{\mathcal{C}}$ such a portfolio. We claim that the value of the derivative, $V_g$ say, is just $\mathcal{C}$. For suppose not, and assume, for instance, that $V_g > \mathcal{C}$. Then, we could sell short the derivative for $V_g$, invest $\mathcal{C}$ into the portfolio $\Pi_{\mathcal{C}}$, and retain an arbitrage profit equal to $V_g - \mathcal{C}$. It is an arbitrage profit, because the portfolio $\Pi_{\mathcal{C}}$ delivers the exact payoff we need to honour the short-sale of the derivative.
The crucial point is to determine the value of $p(c)$. We claim that,

$$p(c) = \frac{1}{1+r}\,\phi_Q\big(c;\ V_1(1+r),\ V_1^2\sigma_1^2\big) \qquad (1.40)$$

where $\phi_Q(\cdot\,;\ \cdot,\ \cdot)$ denotes the density of the risk-neutral normal distribution in (1.37), but with mean and variance $V_1(1+r)$ and $V_1^2\sigma_1^2$ as given in Eq. (1.40).$^{5}$ Indeed, let us take expectations of $\tilde C_1 = V_1(1 + \tilde b_1)$ under $Q$, such that,

$$V_1 = \frac{E_Q(\tilde C_1)}{1+r} \qquad (1.41)$$

On the other hand, let us apply Eq. (1.39) to determine the value of the derivative that pays off $g(\tilde C_1) = \tilde C_1$, which is

$$\int c\,p(c)\,dc \qquad (1.42)$$

The two expressions in Eqs. (1.41) and (1.42) coincide once $p(c)$ is as in Eq. (1.40). By the same reasoning, the value of the quadratic derivative, which pays off $g(\tilde C_1) = \tilde C_1^2$, is

$$\int c^2\,p(c)\,dc = V_1^2\left[(1+r) + \frac{\sigma_1^2}{1+r}\right]$$

where $V_1$ is the value of the first asset, determined as usual through Eq. (1.32). Note how simple this formula is. It links the value of the derivative to the square of the value of the underlying risk, $V_1^2$, and the discounted value of $\sigma_1^2$, reflecting that after all, a quadratic derivative is about a play in volatility.$^{6}$
Remarkably, the price of this derivative does not require any knowledge of the risk-premium components, $\lambda_j$. It is, thus, a preference-free formula. It is the task of the next chapter to develop the deep reasons why derivatives can sometimes be expressed in such a simple way. To anticipate, the derivative relies on a risk, which is already traded in the market, in that the cash flow $\tilde C_1$ is traded at $V_1$. All risks have, then, already been embedded into the market price, $V_1$.
1.4.3 The APT with idiosyncratic risk and a large number of assets
[Ross (1976), and Connor (1984), Huberman (1983).]
How can idiosyncratic risk be eliminated? Consider, for example, Eq. (1.21). Intuitively, we may form portfolios with a large number of assets, so as to make idiosyncratic risk negligible, by the law of large numbers. But would the beta-relation still hold, in this case? More generally, would the APT relation in Eq. (1.31) still be valid? The answer is in the affirmative, although it deserves some qualifications.
$^{5}$ We can check that Eq. (1.40) is consistent with the pricing of a pure discount bond. Such a bond has a payoff equal to $g(c) = 1$ for all $c$, such that by Eq. (1.39), its price must satisfy $\frac{1}{1+r} = \int p(c)\,dc$, which it does, by Eq. (1.40).
$^{6}$ In continuous time, the price at time $t$ of a quadratic derivative, which promises to pay off the square of an asset price, $S^2(T)$ at $T$, is given by $S^2(t)\,e^{(r+\sigma^2)(T-t)}$, provided $S$ is a Geometric Brownian motion with volatility parameter equal to $\sigma$ (see Chapter 4, Section 4.3.3.1).
Consider the APT equation (1.28), and add a vector of idiosyncratic returns, $\epsilon$, which are independent of $f$, and have mean zero and variance $\sigma_\epsilon^2$:

$$\tilde b = a + B f + \epsilon$$

We wish to show that in the absence of some appropriate notion of arbitrage, to be defined below, it must be that the number of assets such that Eq. (1.31) does not hold, $N(m)$ say, is bounded as $m$ gets large, i.e.:
|
((
) + )|
= 1
( )
(1.44)
where
lim
( )
(1.45)
In other words, we wish to show that in a large market, Eq. (1.31) does indeed hold for most
of the assets, an approach close to that in Huang and Litzenberger (1988, p. 106-108).
By the same arguments leading to Eq. (1.1), the wealth generated by a portfolio of the assets
satisfying (1.44), 0 ( ) say, is,
0
>
>
=
1
+
+
+
(
)
(
)
(
)
(
)
(
)
( )
( )
( )
where $a_N$, $B_N$ and $\epsilon_N$ are (i) the vector of the expected returns, (ii) the return volatility (or factor exposures) matrix and (iii) the vector of idiosyncratic return components affecting these assets, and, finally, $\pi_N$ and $w_N$ are the portfolio and the initial wealth invested in these assets.
In this context, we may define an arbitrage as a portfolio $\pi_N$ that in the limit, as the number of the existing assets gets large, is riskless and yet delivers an expected return strictly larger than the safe interest rate, viz
lim
( )]
and
lim
( )
( )]
(1.46)
We want to show that this situation does not arise, under the condition in (1.45), thereby establishing that the linear APT relation in Eq. (1.31) is valid for most of the assets, in a large market.
So suppose the linear relation, $a_N - \mathbf{1}r = B_N\lambda$, doesn't hold. Then, there exists a portfolio $\pi_N$ such that,
>
>
= 0 and
(
1 ) 6= 0.
(1.47)
Consider the portfolio:
=
>
>
[ 0 ] = >
+ 2 = 2 >
where
[ 0 ( )]
where the second equality follows by the first relation in (1.47). Clearly, lim
as ( )
. Hence, in the absence of arbitrage, the condition in (1.45) must hold.
Well-diversified portfolios.
Consider the time-series regressions,

$$\tilde b_{i,t} - r_t = \alpha_i + \beta_i\,(\tilde b_{M,t} - r_t) + \epsilon_{i,t}, \qquad t = 1,\dots,T$$

where $\epsilon_{i,t}$ denote time-series residuals. Fama and MacBeth (1973) consider the following procedure. In a first step, one obtains estimates of the exposures to the market, $\hat\beta_i$ say, for all stocks, using, for example, monthly returns, and approximating the market portfolio with some broad stock market index.$^{7}$ In a second step, one runs cross-sectional regressions, one for each month,

$$\tilde b_{i,t} - r_t = \alpha_t + \lambda_t\,\hat\beta_i + \eta_{i,t}, \qquad i = 1,\dots,N$$

where $N$ is the sample size and $\eta_{i,t}$ denote cross-sectional residuals. The time-series of cross-sectional estimates of the intercept $\alpha_t$ and the price of risk $\lambda_t$, $\hat\alpha_t$ and $\hat\lambda_t$ say, are, then, used to make statistical inference. For example, time-series averages and standard errors of $\hat\alpha_t$ and $\hat\lambda_t$ lead to point estimates and standard errors for $\alpha$ and $\lambda$. If the CAPM holds, estimates of $\alpha$ should not be significantly different from zero.
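The two-step procedure can be sketched in a few lines. The code below is an illustrative implementation under simplifying assumptions: a single market factor, full-sample betas estimated in the first step, and plain least squares in both steps. The function names and the data layout are hypothetical choices made for this example.

```python
import math


def ols(x, y):
    """Intercept and slope of a univariate least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def mean_and_se(values):
    """Time-series average and its standard error."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)
    return m, math.sqrt(var / n)


def fama_macbeth(excess, market):
    """excess[t][i]: excess return of asset i in period t; market[t]: market excess return."""
    n_periods, n_assets = len(excess), len(excess[0])
    # Step 1: time-series regressions, one per asset, to estimate the betas.
    betas = [ols(market, [excess[t][i] for t in range(n_periods)])[1]
             for i in range(n_assets)]
    # Step 2: cross-sectional regressions, one per period, on the estimated betas.
    alphas, lambdas = [], []
    for t in range(n_periods):
        a_t, l_t = ols(betas, excess[t])
        alphas.append(a_t)
        lambdas.append(l_t)
    return betas, mean_and_se(alphas), mean_and_se(lambdas)


if __name__ == "__main__":
    true_betas = [0.5, 1.0, 1.5]
    market = [0.02, -0.01, 0.03, 0.01]
    excess = [[b * f for b in true_betas] for f in market]  # noiseless CAPM data
    print(fama_macbeth(excess, market))
```

With noiseless data generated by the CAPM, the average intercept is zero and the average price of risk equals the average market excess return; with actual data, the standard errors returned by the second step are what one would use for inference.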
1.5.2 Macroeconomic forces
Chen, Roll and Ross (1986) use the Fama-MacBeth two-step procedure to estimate a multifactor APT model, such as that in Section 1.4. They identify macroeconomic forces driving asset returns with the innovations in variables such as the term spread, expected and unexpected inflation, industrial production growth, or the corporate spread. They find that these sources of variation in the cross-section of asset returns are significantly priced.
1.5.3 Fama & French
Consider the Security Market Line in Eq. (1.20), which predicts that each asset displays an average excess return lying precisely on the SML. Assets delivering average excess returns and betas above the SML, as the points A, B, C, and D in Figure 1.4 below, would be simply evidence that this single factor version of the APT does not work. Consider, for example, the asset corresponding to one of these points. A regression of the excess return of this asset onto the excess return on the market would produce a positive intercept, some $\alpha > 0$, such that its average excess return would equal $\alpha + \beta(\mu_M - r)$, thereby invalidating Eq. (1.20). There exist at least two pieces of evidence against the one-factor CAPM, which were systematically pointed out by Fama and French (1992, 1993):
(i) Size effect (Banz, 1981): Average returns for small firms, or low capitalized firms (in terms of market equity, defined as stock price times outstanding shares) are too high given their beta.
(ii) Value effect (Stattman, 1980; Rosenberg, Reid and Lanstein, 1985): Average returns on stocks of firms with high book-to-market (BM, henceforth) ratios, or "value stocks," are too high given their beta. In general, average returns on value stocks are higher than those on "growth stocks," i.e. those stocks with low BM ratios. As an example, the points in Figure 1.4 might typically refer to stocks sorted from low to high BM ratios.
A third piece of evidence against the standard CAPM is the momentum effect:
(iii) Momentum effect (Jegadeesh and Titman, 1993): Stocks with the highest returns in the previous twelve months will outperform in the near future.

[FIGURE 1.4. Vertical axis: average excess return. The picture shows the Security Market Line, the points A, B, C and D, and the market excess return $\mu_M - r$.]

$^{7}$ In tests of the CAPM, one uses proxies of the market portfolio, such as, say, the S&P 500. However, the market portfolio is unobservable. Roll (1977) points out that as a result, the CAPM is inherently untestable, as any test of the CAPM is a joint test of the model itself and of the closeness of the proxy to the market portfolio.
The one-factor CAPM has no power in explaining the cross-section of asset returns, sorted by size, BM or momentum. Assets sorted in this way command a size premium, a value premium, and a momentum premium. For example, one can create portfolios sorted by size and BM, say 25 portfolios, out of a 5 × 5 matrix with dimensions given by size and BM. The puzzle, then, at least from the standard CAPM perspective, is that this model cannot explain the returns on these portfolios. Fama and French (1993) show that the returns on these portfolios can be much better understood by means of a multifactor model, where both size and value premiums are explicitly taken into account. They consider three factors: (i) the excess return on the market; (ii) an HML factor, defined as the monthly difference between the returns on assets with high and low BM ratios ("high minus low"); (iii) an SMB factor, defined as the difference between the asset returns of firms with small and big size ("small minus big"). The HML and SMB factors are defined as the differences between the returns on the appropriate cells of a 2 × 3 matrix, obtained through percentiles of the distribution of asset returns over the previous year.

                 Book-to-Market
               L       M       H
Size     S
         L

The resulting model is the celebrated Fama-French three factor model. Carhart (1997) extends this model to a four-factor model with a momentum factor: the monthly difference between the returns on the high and low prior return portfolios.
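The construction of the two factors from the six cells of the 2 × 3 matrix can be sketched as follows. The averaging convention used here (SMB as the average of the three small-size cells minus the average of the three big-size cells, HML as the average of the two high-BM cells minus the average of the two low-BM cells) is the usual Fama-French convention and is stated as an assumption, since the text above only refers to "the appropriate cells". The cell labels, with "B" standing for big size, and the returns are hypothetical.

```python
BM_GROUPS = ("L", "M", "H")   # low, medium, high book-to-market
SIZE_GROUPS = ("S", "B")      # small, big (labelled "B" here to avoid clashing with low BM)


def fama_french_factors(cells):
    """SMB and HML from a dict mapping (size, bm) to the portfolio return."""
    small = sum(cells[("S", bm)] for bm in BM_GROUPS) / 3.0
    big = sum(cells[("B", bm)] for bm in BM_GROUPS) / 3.0
    high = sum(cells[(size, "H")] for size in SIZE_GROUPS) / 2.0
    low = sum(cells[(size, "L")] for size in SIZE_GROUPS) / 2.0
    return {"SMB": small - big, "HML": high - low}


if __name__ == "__main__":
    # Hypothetical monthly returns on the six size/BM portfolios.
    month = {
        ("S", "L"): 0.010, ("S", "M"): 0.013, ("S", "H"): 0.016,
        ("B", "L"): 0.008, ("B", "M"): 0.009, ("B", "H"): 0.011,
    }
    print(fama_french_factors(month))
```

Repeating the calculation month by month yields the factor time series that enter the three-factor regressions alongside the market excess return.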
1.5.4 The high-beta stocks anomaly
High-beta stocks should command higher returns, to compensate for their higher volatility. Yet historically, it is low-beta stocks that have performed better, on a risk-adjusted basis (i.e., in terms of alphas). Theoretically, then, one could buy low-beta stocks and leverage them through debt. It is indeed feasible when one has the opportunity to do so. Two papers to read are Frazzini, Kabiller and Pedersen (2012), and Frazzini and Pedersen (2012).
( )
( )= ( )
( ) ( )
]. First, we show
[ (2 )]
( ) = 0 and
[ (1 )] =
( ) = 1. Therefore,
0
( )[
1(
)
R
2(
)]
( )( 2( )
R
(
1 ( )) =
1
2
( )
( ))
0 for
1 ( )) =
(1.48)
Proof. As for ( )
( ), consider the function,
( ) = max {
0}. It is increasing
and concave and, hence, a candidate utility function. Therefore, it satisfies,
Z
( )) [
)[
= [ 1( )
Z
=
[ 1( )
( )
( )
( )]
2
( )]
( )]
[
( )
( )]
( )]
where the last equality follows by an integration by parts. Next we prove that ( )
( ). We
have:
Z
[ (1 )]
[ (2 )] =
( ) [ 1( )
2 ( )]
Z
0
= ( ) [ 1( )
( ) [ 1( )
2 ( )]|
2 ( )]
Z
0
=
( ) [ 1( )
2 ( )]
0
00
=
( ) 1 ( ) 2 ( )
( ) 1 ( ) 2 ( )
=
=
where ( ) =
( )
Z
00
00
( ) 1 ( )
2 ( )
( ) 1 ( )
2 ( )
( ) 1 ( )
2 ( )
( )
( )]
( )
( )] = 0
have the same mean. Now,
[ (1 )]
[ (2 )], i.e., 1
We can now consider random variables that add risk without affecting the mean: suppose that there exists a random variable $\eta$ such that $\tilde x_1$ has the same distribution as $\tilde x_2 + \eta$, and $E(\eta \mid \tilde x_2 = x_2) = 0$. We can think of an experiment in which, after receiving a payoff $\tilde x_2$, another payoff could be added which has conditional expectation zero, and which therefore adds noise. Clearly $\tilde x_1$ is a mean-preserving spread of $\tilde x_2$. It is easy to show that this mean-preserving spread implies that
00
[ (2 + )]
[ ( (2 + )| 2 =
2 )]
[ ( ( 2 + | 2 =
2 ))]
[ ( ( 2 | 2 =
2 ))]
[ (2 )]
=
where
and
1(
>
>
2(
1
2 1
> =
21
>1 =
(1A.1)
= 1> =
We can solve for
2,
obtaining,
>
21
1 )
{z }
1
(
2 1
1
1
1
1 +
1
=
2 1
(1A.2)
( )
>
1 > 1
1
+
(
| {z } 2 1 | {z }
>
>
1
2
1
)=
| {z }
1
+
2 1
(1A.3)
= >
=
>
1
+
2 1
2
1
2 1
>
>
1
2
1
1 +
2 1
(1A.4)
( )
2
1
2
+
2
2
1
1
(1A.5)
11
( )
(1A.6)
( )
By rearranging terms in the previous equation, we obtain Eq. (1.9) in the main text.
Finally, we substitute Eq. (1A.6) into the second equation in (1A.5), and obtain:
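$$v_p^2 = \frac{1}{\gamma}\left[1 + \frac{(\gamma\mu_p - \beta)^2}{\alpha\gamma - \beta^2}\right]$$

with $\alpha = b^\top\Sigma^{-1}b$, $\beta = \mathbf{1}^\top\Sigma^{-1}b$ and $\gamma = \mathbf{1}^\top\Sigma^{-1}\mathbf{1}$, which is Eq. (1.10) in the main text. Note, also, that the second condition in (1A.5) reveals that,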
1
2
2
Given that $\alpha\gamma - \beta^2 > 0$, the previous equation confirms the properties of the global minimum variance portfolio stated in the main text.
0( )
0
s.t.
( ) =
= arg min
and
[1A.P2-dual]
where 1 and
second one,
>
> =
= >1 ;
(1A.7)
are two Lagrange multipliers. By replacing the first condition in (1A.7) into the
= > =
1 >
+ 1> 1 )
2 | {z }
2 | {z }
(1A.8)
(1A.9)
Next, let
1 >
2|
1 + 1> 1 1 )
{z }
2 | {z }
and
2
are,
0 ( )
1
1
= 2 > = >
1
2
1
+ >
1 =
2
+
2
(
(
)2 1
+
2)
2
1
1
2
1
2
1
2
2 1
2
1>
2
0
0
1
0
0
It is negative (semi) definite whenever the leading principal minors (formed through the last $i$ columns and corresponding rows, for $i = 4, \dots, m+2$) have determinants with signs that alternate, with the first one (formed with the last 4 rows and corresponding columns) having the sign of $(-1)^2 = +1$.
2.
This is possible whenever 1 0, which is true, by Eq. (1A.6), whenever
= 1
Cap
where $m$ is the number of assets outstanding in the market. The market capitalization of all the assets is simply

$$\text{Cap}_M \equiv \sum_{i=1}^{m}\text{Cap}_i$$

The market portfolio, then, is the portfolio with relative weights given by,

$$\bar\pi_{M,i} = \frac{\text{Cap}_i}{\text{Cap}_M}, \qquad i = 1,\dots,m$$

Next, suppose there are $N$ investors and that each investor $j$ has wealth $w_j$, which he invests in two funds, a safe asset and the tangent portfolio. Let $w_j^{f}$ be the wealth investor $j$ invests in the safe asset and $w_j - w_j^{f}$ the remaining wealth the investor invests in the tangent portfolio. The tangent portfolio is defined as $\bar\pi_T \equiv \pi_T/w$, for some $\pi_T$ solution to [1.P2], and is obviously independent of $w$ (see Eq. (1.16) in the main text). The equilibrium in the stock market requires that

$$\text{Cap}_i = \sum_{j=1}^{N}(w_j - w_j^{f})\,\bar\pi_{T,i} = \bar\pi_{T,i}\sum_{j=1}^{N}w_j = \bar\pi_{T,i}\,\text{Cap}_M$$

where the second equality follows because the safe asset is in zero net supply and, hence, $\sum_{j=1}^{N}w_j^{f} = 0$; and the third equality holds because all the wealth in the economy is invested in stocks, in equilibrium.
Sh =
(1A.10)
The left hand side of this equation is the slope of the CML, obtained through Eq. (1.6). The right hand side is the slope of the efficient portfolio frontier, obtained by differentiating $\mu_p(v_p)$ in the expression for the portfolio frontier in Eq. (1.11), and setting $v_p = v_M$ in
q
2
( )
2) =
= ( 2 1) 1 (
( )
and where the second equality follows, again, by Eq. (1.11). By Eqs. (1A.10) and (1.15), we need to
show that,
1
=
2
By plugging
= +
Sh
1
=
(
1 )
( ) =
=
(1A.11)
where we have used the expression for the market portfolio given in Eq. (1.16). Next, premultiply the
previous equation by
>
or
to obtain:
=
>
>
)=
1
)2
Sh
(1A.12)
)=
Sh
)=
where the last two equalities follow by Eq. (1A.12) and by the relation, Sh = ( ). By rearranging
terms, we obtain Eq. (1.20).
( 0)
+1
1+
( )
( 0 ) = 0, given by,
( )
1+ ( )
0 ( )
( )
'
1+ ( )
( )
Given this approximation, when 0 = , every agent believes rates can only rise within the reference
period, such that no one is willing to purchase any bond, as this purchase would lead to a sure loss.
This situation is known as a liquidity trap: when 0 = , changes in money supply, be they positive or
negative, do not affect interest rates. Indeed, at 0 = , the only investor holding bonds is simply the
marginal investor, who is indifferent about whether to hold money or bonds. If the central bank
increases money supply by purchasing the bonds, this marginal investor would be perfectly ready to
accept this new money and tender the bonds, as he is obviously indifferent between investing in bonds
or hoarding money. Likewise, if the central bank decreases money supply through a bond sale, the
marginal investor would buy these bonds.
Yet an important point of Keynesian theory is that money demand is negatively sloped, at a
macroeconomic level. We now develop an analytical example where this property holds true. Assume
there is a continuum of agents on [0, 1], ordered such that the distribution of ( ) is uniform:
( ) = + (
[0 1]
+ (
)
1 + + (
)
1
( )=
0
( )
I0 (
(1+ ) 0
( )(1
=
{ :0 ( )
0}
0)
=1
( 0)
( 0) =
(
1+ 0
)(1
0)
1+
1+
. The
interest rate relating to the liquidity trap is 1+ . The purpose of Section 1.2.6 is to explain how Tobin
(1958) coped with this degeneracy of interest rate expectations.
References
Banz, R.W. (1981): The Relationship Between Return and Market Value of Common Stocks.
Journal of Financial Economics 9, 3-18.
Black, F. (1972): Capital Market Equilibrium with Restricted Borrowing. Journal of Business 45, 444-454.
Campbell, J.Y. and L.M. Viceira (2002): Strategic Asset Allocation. Oxford: Oxford University
Press.
Carhart, M. (1997): On Persistence of Mutual Fund Performance. Journal of Finance 52,
57-82.
Chen, N-F., R. Roll and S.A. Ross (1986): Economic Forces and the Stock Market. Journal
of Business 59, 383-403.
Connor, G. (1984): A Unified Beta Pricing Theory. Journal of Economic Theory 34, 13-31.
Fama, E.F. and J.D. MacBeth (1973): Risk, Return, and Equilibrium: Empirical Tests.
Journal of Political Economy 81, 607-636.
Fama, E. F. and K. R. French (1992): The Cross-Section of Expected Stock Returns. Journal
of Finance 47, 427-465.
Fama, E. F. and K. R. French (1993): Common Risk Factors in the Returns on Stocks and
Bonds. Journal of Financial Economics 33, 3-56.
Frazzini, A., D. Kabiller and L. Pedersen (2012): Buffett's Alpha. Working paper.
Frazzini, A. and L. Pedersen (2012): Betting Against Beta. Working paper.
Huang, C-f. and R.H. Litzenberger (1988): Foundations for Financial Economics. New York:
North-Holland.
Huberman, G. (1983): A Simplified Approach to Arbitrage Pricing Theory. Journal of Economic Theory 28, 183-191.
Jegadeesh, N. and S. Titman (1993): Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency. Journal of Finance 48, 65-91.
Keynes, J. M. (1936): The General Theory of Employment, Interest and Money. London:
Palgrave Macmillan.
Markowitz, H. (1952): Portfolio Selection. Journal of Finance 7, 77-91.
Roll, R. (1977): A Critique of the Asset Pricing Theory's Tests Part I: On Past and Potential
Testability of the Theory. Journal of Financial Economics 4, 129-176.
Rosenberg, B., K. Reid and R. Lanstein (1985): Persuasive Evidence of Market Inefficiency.
Journal of Portfolio Management 11, 9-17.
Ross, S. (1976): Arbitrage Theory of Capital Asset Pricing. Journal of Economic Theory
13, 341-360.
Rothschild, M. and J. Stiglitz (1970): Increasing Risk: I. A Definition. Journal of Economic
Theory 2, 225-243.
Rothschild, M. and J. Stiglitz (1971): Increasing Risk: II. Its Economic Consequences. Journal of Economic Theory 3, 66-84.
Sharpe, W. F. (1964): Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk. Journal of Finance 19, 425-442.
Stattman, D. (1980): Book Values and Stock Returns. The Chicago MBA: A Journal of
Selected Papers 4, 25-45.
Tobin, J. (1958): Liquidity Preference as Behavior Towards Risk. Review of Economic Studies 25, 65-86.
2 Arbitrage, equilibrium and pricing
2.1 Introduction
This chapter develops asset pricing implications arising whilst requiring that markets are free from arbitrage opportunities. An important distinction between financial securities and gambles rests on the way we price them. Typically, but not exclusively, gambles regard risks that are not traded. Their value is determined by supply and demand, resting then on factors such as risk-aversion or the bargaining power of the parties involved in them. A few gambles, such as some we see in a casino, might be repeated, so to speak. For these gambles, the Law of Large Numbers might be a rough guidance to what we should expect their value to be, although additional factors determining supply and demand still play a critical role, as we shall explain.
Financial securities work in a radically different fashion. The principle to price them relies on absence of arbitrage. Suppose there is a security which could be replicated by a portfolio of existing securities. In the absence of any frictions, the price of this security equals the value of the replicating portfolio, regardless of supply and demand, or any Law of Large Numbers: it is preference-free, as we say. The reason is that two portfolios delivering the same payoff must be worth the same, for otherwise an arbitrage opportunity would arise, that is, the possibility to implement financial transactions and make money without any risk. Granted, there certainly are securities which are not so easy to replicate, as they might link to new risks, compared to the existing securities. However, it might be that in practice, and for all purposes, the existing assets can be bunched into a portfolio that mimics this security sufficiently well, such that we can design worst-case and best-case scenarios for the value of the security to evaluate. Naturally, an alternative to all this might be financial innovation, the process of creation of new securities that have the potential to fill in the initial incomplete market structure.
This chapter aims to formalize these ideas, and relies on the very definition and role of financial securities in a world with uncertainty. We start from afar, and review the classical general equilibrium model in a context without uncertainty. This model has profound implications, leading to idealized outcomes for society as a whole, where allocations are optimal, according to Pareto's criterion. Financial securities play a critical role, once we plug uncertainty into this model. A sufficiently high number of these securities might actually lead to the same idealized outcomes predicted by the classical general equilibrium model. Intuitively, a high number of sufficiently diverse financial securities has the potential to deliver the payoffs we need in each future contingency of the world, thereby making markets function as if we were in a static world.
The purest kind of security we could imagine is one that only pays off a fixed amount of numeraire, say $1, in a pre-specified state of the world, and zero otherwise. These securities are known as Arrow-Debreu, in recognition of the founders of general equilibrium theory with uncertainty, as explained in detail in Section 2.3. Arrow-Debreu securities are conceptually very useful, as the knowledge of their prices can be utilized to price any other asset. Not surprisingly, then, these idealized securities are also a useful tool in the market practice of asset pricing, as explained in Part III of these Lectures. They link to what we usually term the risk-neutral probability, similarly as in the previous chapter, as we shall explain.
Needless to mention, many of the previous optimistic conclusions rely on a number of assumptions, such as agents' symmetric information about the assets' payoffs, perfect competition in the goods markets, the presence of frictionless capital markets or, finally, market completeness, i.e. the circumstance that there is an Arrow-Debreu security for each state of the world. We shall deal with capital market imperfections in Chapter 4, and with information problems in Chapter 9, of these Lectures. These imperfections will allow us to think about quite interesting aspects of modern markets and economies in Part II of the Lectures. However, already in this chapter, we develop an introduction to the theory and methods arising in the context of incomplete markets.
In fact, even in the presence of incomplete markets, there exist shadow prices for any Arrow-Debreu security. In principle, we could use these prices to evaluate any asset. The challenging issue is that in an incomplete markets setting, requiring absence of arbitrage does not lead to a unique set of Arrow-Debreu prices, as it does in the complete markets case. Further assumptions are needed to help characterize these shadow prices. For example, we may naturally imagine a market economy, with agents optimizing over consumption choices, and pin down these prices in the general equilibrium of this economy.
It is indeed an important task of this chapter to link Arrow-Debreu security prices to optimal consumption in general equilibrium. It is natural. As financial economists, we would like to understand asset prices from the perspective of an economy where households allocate their endowments across consumption and savings. Our households' objective is to maximize utility of consumption, in an intertemporal context subject to uncertainty. This chapter only considers two-period economies, but many of its insights lead to asset pricing equations, which are the logical antecedents to the Euler equations arising in the multiperiod, and possibly infinite horizon, economies dealt with in Chapter 3 (the Consumption-CAPM).
The chapter is organized as follows. The next section contains a succinct description of the general equilibrium model and its properties in a static context, and abstracts from decisions taken within the production sphere of the economy. (Production-based economies are studied in more detail in Chapters 3 and 8 of these Lectures.) Section 2.3 illustrates the role financial securities can play in economies with uncertainty, and provides the very first examples of the meaning and use of Arrow-Debreu securities, and their relations to the risk-neutral probability. Sections 2.4 and 2.5 provide theory, based on an extension of the general equilibrium model of Section 2.2 that includes uncertainty. The focus in Section 2.4 is absence of arbitrage, and the implications of this assumption on asset prices, in both complete and incomplete markets. Naturally, absence of arbitrage does not imply equilibrium, although the converse is true, as explained in Section 2.5. Section 2.5 then relates Arrow-Debreu security prices to risk-neutral probabilities, in both complete and incomplete markets, and provides discussion on topics such as the role of financial markets as vehicles of risk-sharing for a society, or financial innovation. Section 2.6 provides a very first introduction to the theory of the Consumption-CAPM as well as its predictions on the equity premium. Section 2.7 provides a framework to think about budget constraints in infinite horizon markets. Finally, Section 2.8 develops a few more topics about the theme of incomplete markets, and the appendixes contain material omitted from the main text.
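Let $B_j(p_1, \dots, p_m) = \{(c_{1j}, \dots, c_{mj}) : \sum_{i=1}^{m} p_i c_{ij} \le \sum_{i=1}^{m} p_i w_{ij}\}$, a bounded, closed and convex set, hence a compact set. Each agent maximizes his utility function subject to the budget constraint:
$$\max_{\{c_{ij}\}} u_j(c_{1j}, \dots, c_{mj}) \quad \text{subject to } (c_{1j}, \dots, c_{mj}) \in B_j(p_1, \dots, p_m). \qquad \text{[2.P1]}$$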
=1
= =
(2.1)
=1
)=
X
=1
= 1
We emphasize the economy we consider in this chapter is one that completely abstracts from production. Here, prices are the key determinants of how resources are allocated in the end. The perspective is, of course, radically different from that taken by the Classical school (Ricardo, Marx and Sraffa), for which prices and resource allocation cannot be disentangled from the production side of the economy. In the next chapter and more advanced parts of the Lectures, we consider the asset pricing implications of production, following the Neoclassical perspective.
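2.2.1 Walras' Law
Let us plug the demand functions of the $j$-th agent into the constraint of [2.P1], to obtain,
$$0 = \sum_{i=1}^{m} p_i \left( c_{ij}(p) - w_{ij} \right). \qquad (2.2)$$
Next, define the total excess demand for the $i$-th commodity as $e_i(p) = \sum_{j=1}^{n} \left( c_{ij}(p) - w_{ij} \right)$. By aggregating the budget constraint across all the agents,
$$0 = \sum_{j=1}^{n} \sum_{i=1}^{m} p_i \left( c_{ij}(p) - w_{ij} \right) = \sum_{i=1}^{m} p_i e_i(p).$$
Walras' law holds by the mere aggregation of the agents' constraints. But the agents' constraints are accounting identities. In particular, Walras' law holds for any price vector and, a fortiori, it holds for the equilibrium price vector $\bar{p}$,
$$0 = \sum_{i=1}^{m} \bar{p}_i e_i(\bar{p}) = \sum_{i=1}^{m-1} \bar{p}_i e_i(\bar{p}) + \bar{p}_m e_m(\bar{p}). \qquad (2.3)$$
Now suppose that the first $m-1$ markets are in equilibrium, or $e_i(\bar{p}) \le 0$, for $i = 1, \dots, m-1$. By the definition of an equilibrium, we have that $\mathrm{sign}(e_i(\bar{p})) \, \bar{p}_i = 0$. Therefore, by Eq. (2.3), we conclude that if $m-1$ markets are in equilibrium, then, the remaining market is also in equilibrium.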
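Because Walras' law is an accounting identity, it can be verified at prices that do not clear any market. The sketch below assumes, purely for illustration, agents with Cobb-Douglas (log) preferences, for which demand is $c_{ij} = \alpha_{ij} R_j / p_i$ with $R_j = \sum_i p_i w_{ij}$; the preference shares, endowments and prices are hypothetical numbers.

```python
def excess_demand(prices, alphas, endowments):
    """Total excess demand e_i(p) in an exchange economy with Cobb-Douglas agents.

    alphas[j][i]     : expenditure share of agent j on good i (shares sum to one)
    endowments[j][i] : endowment of good i held by agent j
    """
    m = len(prices)
    e = [0.0] * m
    for a_j, w_j in zip(alphas, endowments):
        income = sum(p * w for p, w in zip(prices, w_j))
        for i in range(m):
            demand = a_j[i] * income / prices[i]
            e[i] += demand - w_j[i]
    return e


# Hypothetical economy: two agents, three goods.
alphas = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3]]
endowments = [[1.0, 2.0, 3.0], [4.0, 1.0, 0.5]]

# Arbitrary strictly positive prices, not an equilibrium.
prices = [1.7, 0.4, 2.9]

e = excess_demand(prices, alphas, endowments)
walras = sum(p * e_i for p, e_i in zip(prices, e))
print(e)
print(walras)
```

Individual excess demands differ from zero at these prices, yet their value, summed across markets, vanishes up to rounding error.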
2.2.2.2 The notion of numeraire
The excess demand functions are homogeneous of degree zero. Walras' law implies that if $m-1$ markets are in equilibrium, then, the $m$-th remaining market is also in equilibrium. We wish to link these two results. A first remark is that by Walras' law, the equations that define a competitive equilibrium are not independent. Once $m-1$ of these equations are satisfied, the $m$-th remaining equation is also satisfied. In other words, there are $m-1$ independent relations and $m$ unknowns in the equations that define a competitive equilibrium. So, there exists an infinity of solutions.
Suppose, then, that we choose the $m$-th price to be a sort of exogenous datum. The result is that we obtain a system of $m-1$ equations with $m-1$ unknowns. Provided it exists, such a solution is a function of the $m$-th price, $p_i = f_i(p_m)$, $i = 1, \dots, m-1$. Then, we may refer to the $m$-th commodity as the numeraire. In other words, general equilibrium can only determine a structure of relative prices. The scale of these relative prices depends on the price level of the numeraire. It is easily checked that if the functions $f_i$ are homogeneous of degree one, multiplying $p_m$ by a strictly positive number $\lambda$ does not change the relative price structure. Indeed, by the equilibrium condition, for all $i = 1, \dots, m$,
$$0 = e_i(f_1(\lambda p_m), f_2(\lambda p_m), \dots, \lambda p_m) = e_i(\lambda f_1(p_m), \lambda f_2(p_m), \dots, \lambda p_m) = e_i(\lambda p_1, \lambda p_2, \dots, \lambda p_m) = e_i(p_1, p_2, \dots, p_m),$$
where the second equality is due to the homogeneity property of the functions $f_i$, and the last equality holds because the excess demand functions are homogeneous of degree zero. In particular, by defining relative prices as $\hat{p}_i = p_i / p_m$, one has that $p_i = \hat{p}_i \cdot p_m$ is a function that is homogeneous of degree one. In other words, if $\lambda \equiv p_m^{-1}$, then,
$$0 = e_i(p_1, \dots, p_m) = e_i(p_m^{-1} p_1, \dots, 1) = e_i(\hat{p}_1, \dots, 1).$$
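A two-good illustration may help fix ideas on the numeraire. The sketch below assumes Cobb-Douglas agents (an assumption made only for tractability, with hypothetical parameter values), for which the market-clearing price of good 1 is linear in the numeraire price $p_2$, namely $p_1 = p_2 \sum_j \alpha_{1j} w_{2j} / \sum_j (1 - \alpha_{1j}) w_{1j}$. Rescaling the numeraire rescales $p_1$ by the same factor and leaves excess demands unchanged.

```python
def excess_demand(prices, alphas, endowments):
    """Total excess demand with Cobb-Douglas agents (alphas[j][i], endowments[j][i])."""
    e = [0.0] * len(prices)
    for a_j, w_j in zip(alphas, endowments):
        income = sum(p * w for p, w in zip(prices, w_j))
        for i, p in enumerate(prices):
            e[i] += a_j[i] * income / p - w_j[i]
    return e


def equilibrium_p1(p2, alphas, endowments):
    """Market-clearing price of good 1 as a function f_1 of the numeraire price p2."""
    num = sum(a[0] * w[1] for a, w in zip(alphas, endowments))
    den = sum((1.0 - a[0]) * w[0] for a, w in zip(alphas, endowments))
    return p2 * num / den


# Hypothetical economy: two agents, two goods, good 2 is the numeraire.
alphas = [[0.3, 0.7], [0.6, 0.4]]
endowments = [[2.0, 1.0], [1.0, 3.0]]

p1 = equilibrium_p1(1.0, alphas, endowments)
p1_scaled = equilibrium_p1(3.0, alphas, endowments)

e_base = excess_demand([p1, 1.0], alphas, endowments)
e_scaled = excess_demand([p1_scaled, 3.0], alphas, endowments)
print(p1, p1_scaled / 3.0)
print(e_base, e_scaled)
```

Only the ratio $p_1 / p_2$ is pinned down: both price vectors clear both markets.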
2.2.3 Optimality
Let $c^j = (c_{1j}, \dots, c_{mj})$ be the allocation to agent $j$, $j = 1, \dots, n$. The following definition is the well-known concept of a desirable resource allocation within a society, according to Pareto.
Definition 2.2 (Pareto optimum). An allocation $\bar{c} = (\bar{c}^1, \dots, \bar{c}^n)$ is a Pareto optimum if it is feasible, $\sum_{j=1}^{n} (\bar{c}^j - w^j) \le 0$, and if there are no other feasible allocations $c = (c^1, \dots, c^n)$ such that $u_j(c^j) \ge u_j(\bar{c}^j)$, $j = 1, \dots, n$, with one strict inequality for at least one agent.
The previous theorem can be interpreted as one that supports an equilibrium with transfer payments. For any given Pareto optimum $\bar{c}$, a social planner can always give $\bar{p} w^j$ to each agent (with $\bar{p} \bar{c}^j = \bar{p} w^j$, where $w^j$ is chosen by the planner), and agents choose $\bar{c}^j$. Figure 2.1 illustrates such a decentralization procedure within the Edgeworth box. Suppose that the objective is to achieve $\bar{c}$. Given an initial allocation $w$ chosen by the planner, each agent is given $\bar{p} w^j$. Under laissez faire, $\bar{c}$ will obtain. In other words, agents are given a constraint of the form $p c^j = \bar{p} w^j$. If $w^j$ and $\bar{p}$ are chosen so as to induce each agent to choose $\bar{c}^j$, then $\bar{p}$ is a supporting equilibrium price. In this case, the marginal rates of substitution are identical, as established by the following celebrated result:
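Theorem 2.5 (Characterization of Pareto optima: I). A feasible allocation $\bar{c} = (\bar{c}^1, \dots, \bar{c}^n)$ is a Pareto optimum if and only if there exists a $\tilde{\phi} \in \mathbb{R}^{m-1}_{++}$ such that
$$\tilde{\nabla} u_j = \tilde{\phi}, \quad j = 1, \dots, n, \quad \text{where } \tilde{\nabla} u_j \equiv \left( \frac{\partial u_j / \partial c_{2j}}{\partial u_j / \partial c_{1j}}, \dots, \frac{\partial u_j / \partial c_{mj}}{\partial u_j / \partial c_{1j}} \right). \qquad (2.4)$$
Proof. A Pareto optimum satisfies
$$\bar{c} \in \arg\max_{c \in \mathbb{R}^{mn}_{+}} u_1(c^1) \quad \text{subject to} \quad \begin{cases} u_j(c^j) \ge \bar{u}_j, & j = 2, \dots, n \quad (\lambda_j, \; j = 2, \dots, n) \\ \sum_{j=1}^{n} (c_{ij} - w_{ij}) \le 0, & i = 1, \dots, m \quad (\phi_i, \; i = 1, \dots, m) \end{cases}$$
where $\lambda_j$ and $\phi_i$ are the multipliers attached to the constraints. The Lagrangian is
$$L = u_1(c^1) + \sum_{j=2}^{n} \lambda_j \left( u_j(c^j) - \bar{u}_j \right) - \sum_{i=1}^{m} \phi_i \sum_{j=1}^{n} (c_{ij} - w_{ij}),$$
and the first order conditions are
$$\frac{\partial u_1}{\partial c_{i1}} = \phi_i, \quad i = 1, \dots, m,$$
and for $j = 2, \dots, n$,
$$\lambda_j \frac{\partial u_j}{\partial c_{ij}} = \phi_i, \quad i = 1, \dots, m.$$
Dividing each condition by that for the first commodity yields Eq. (2.4), with $\tilde{\phi} = (\phi_2 / \phi_1, \dots, \phi_m / \phi_1)$.
In a competitive equilibrium, the first order conditions of the consumers lead to $\tilde{\nabla} u_j = (p_2 / p_1, \dots, p_m / p_1)$. But because a competitive equilibrium is also a Pareto optimum, then, by Theorem 2.5, $\tilde{\nabla} u_j = \tilde{\phi}$. Hence, $\tilde{\phi}$ represents the vector of relative, shadow prices arising within the centralized allocation process.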
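We provide a further characterization of Pareto optimal allocations.
Theorem 2.6 (Characterization of Pareto optima: II). A feasible allocation $\bar{c} = (\bar{c}^1, \dots, \bar{c}^n)$ is a Pareto optimum if and only if there exists $\theta > 0$ such that $\bar{c}$ is solution to the following program:
$$\bar{u}(w, \theta) = \max_{c^1, \dots, c^n} \sum_{j=1}^{n} \theta_j u_j(c^j) \quad \text{subject to} \quad \sum_{j=1}^{n} c_{ij} = \sum_{j=1}^{n} w_{ij}, \quad i = 1, \dots, m. \qquad \text{[2.P2]}$$
Proof. The "if" part is simple and at the same time instructive. Let us solve the program in [2.P2]. The Lagrangian is,
$$L = \sum_{j=1}^{n} \theta_j u_j(c^j) - \sum_{i=1}^{m} \phi_i \sum_{j=1}^{n} (c_{ij} - w_{ij}),$$
and the first order conditions lead to,
$$\theta_j \nabla u_j(c^j) = \phi^\top, \quad j = 1, \dots, n, \qquad (2.5)$$
where $\phi = (\phi_1, \dots, \phi_m)^\top$. That is, $\tilde{\nabla} u_j$ equals the same vector of constants for all the agents, just as in Theorem 2.5. The converse to this theorem follows by an application of the usual separating theorem, as in Duffie (2001, Chapter 1). ‖
Note, if $\theta_1 = 1$ and $\theta_j = \lambda_j$ for $j = 2, \dots, n$, then, the multipliers $\phi_i$ ($i = 1, \dots, m$) coincide with those in the proof of Theorem 2.5, and so the first order conditions in Theorems 2.5 and 2.6 would lead to the same allocation. More generally, we have: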
Theorem 2.7 (Centralization of competitive equilibrium through Pareto weightings). The outcome of any competitive equilibrium can be obtained through a central planner who maximizes the program in [2.P2], with system of social weights equal to $\theta_j = 1/\lambda_j$, where $\lambda_j$ is the marginal utility of income for agent $j$.
So agents with high marginal utility of income for a given price vector will receive little social weight in the centralized planner allocation procedure. This result is particularly useful when it comes to studying financial markets in economies with heterogeneous agents. Theorem 2.7 is also a point of reference from which to move when it comes to studying asset prices in a world of incomplete markets. Chapter 8 contains several examples of these applications.
Proof of Theorem 2.7. In the competitive equilibrium,
$$\nabla u_j(c^j) = \lambda_j p^\top, \qquad (2.6)$$
where $\lambda_j$ are the Lagrange multipliers for the agents' budget constraint, so that $\lambda_j$ is the agent's marginal utility of income:
$$\lambda_j = \frac{\partial u_j(c_{1j}(p, R_j), \dots, c_{mj}(p, R_j))}{\partial R_j}, \quad R_j \equiv \sum_{i=1}^{m} p_i w_{ij}.$$
By comparing the competitive equilibrium solution in Eq. (2.6) with the Pareto optimality property of the equilibrium in Eq. (2.5), we deduce that a competitive equilibrium $(\bar{c}, \bar{p})$ can be implemented by a social planner acting as in Theorem 2.6, when $\theta_j = 1/\lambda_j$. Then, it also follows that, necessarily, $\phi = \bar{p}$, by the aggregate resource constraint, $\sum_{j=1}^{n} c^j = \sum_{j=1}^{n} w^j$, which, intuitively, has to hold both in the competitive and the centralized economy. Indeed, we have:
$$\sum_{j=1}^{n} c^j(\lambda_j p) = \sum_{j=1}^{n} w^j, \qquad \sum_{j=1}^{n} \hat{c}^j(\theta_j^{-1} \phi) = \sum_{j=1}^{n} w^j, \qquad (2.7)$$
where $c^j$ and $\hat{c}^j$ are the inverse functions for consumption, as implied by the private allocation in Eq. (2.6) and the social, in Eq. (2.5). The first of Eqs. (2.7) determines the general equilibrium price vector, $\bar{p}$ say. The second of Eqs. (2.7) is the aggregate constraint faced by the central planner with $\theta_j = 1/\lambda_j$, and clearly, this constraint is satisfied by a Lagrange multiplier $\bar{\phi}$, say, which exactly matches $\bar{p}$, viz $\bar{\phi} = \bar{p}$, in which case $\hat{c}^j = c^j$ by construction. Moreover, this is the unique solution for $\phi$ as $\hat{c}^j$ is monotonically decreasing. ‖
A commodity is characterized by its physical properties, the date and the place at which
it will be available.
For example, by freezing time and physical properties, we have a theory of international commerce, and by freezing places and physical properties, we have a theory of finance. The previous definition does not include the notion of uncertainty. To cope with uncertainty, Debreu (1959, Chapter 7) extends the previous definition, highlighting that a commodity should be described through a list of physical properties, with the structure of dates and places being replaced by an event structure. The following example illustrates the difference between two contracts underlying the delivery of corn arising under conditions of certainty (case A) and uncertainty (case B):
A The first agent will deliver 5000 tons of corn of a specified type to the second agent, who will accept the delivery at a given date and in a given place.
B The first agent will deliver 5000 tons of corn of a specified type to the second agent, who will accept the delivery in a given place and in a given event at a given time. If the event does not occur at that time, no delivery will take place.
In both cases, the contract is paid at the time it is agreed. There are instances where we can actually use the model of the previous section to deal with contracts such as those in case B. Consider, for example, a two-period economy, and suppose that in the second period, a number of mutually exhaustive and exclusive states of nature may occur. We can, then, recover the model of the previous section, once we replace $m$ (the number of commodities described by physical properties, dates and places) with the number of state-contingent commodities, that is, $m$ times the number of states. With this larger number replacing $m$, the competitive equilibrium in this economy is defined as the competitive equilibrium in the economy of the previous section.
The assumption underlying this trick is that markets exist, where commodities for all states of nature are traded. Such contingent markets are complete, in that a market is open for each commodity in each state of nature. Therefore, agents can implement any feasible action plan. In particular, resource allocation is Pareto-optimal. However, the existence of contingent markets is a strong assumption. Next, we show how financial securities help mitigate this assumption.
2.3.2 Financial securities
What role could financial securities play in an uncertain world? Arrow (1953) develops the following interpretation. Rather than signing contracts for delivery of commodities that are contingent on the realization of events, agents might agree on contracts generating payoffs that are contingent on the realization of events, i.e. financial securities, or assets. The payoffs delivered by the assets in the various states of the world might then be collected to finance state-contingent consumption plans.
Let us illustrate. Consider a two-period economy. An asset $A_i$ is a contract that promises to pay a payoff $v_i(\omega_s)$ in some state $\omega_s \in S$ in the second period, where $S$ denotes the set of all possible future events. We assume there exist $a$ assets, and that markets re-open for each commodity in each state of nature in the second period. Assets and commodities are linked as follows. At time zero I purchase $\vartheta_i$ units of the asset $A_i$. If state of nature $\omega_s$ occurs in the second period, I will utilize the payoff $\vartheta_i v_i(\omega_s)$ and finance net transactions on the commodity markets re-opening in the second period, viz
$$p(\omega_s) \, e(\omega_s) = \sum_{i=1}^{a} \vartheta_i v_i(\omega_s), \qquad (2.8)$$
where $p(\omega_s)$ and $e(\omega_s)$ denote the commodity price and excess demand vectors, which are both contingent on the realization of state $\omega_s$.
Thus, financial assets transfer value across states of nature. Security markets shrink this uncertain economy to one similar to that of the previous section, with one added feature: there are no such complicated markets to open at time zero for each state of nature in the future. There are, simply, commodity and asset markets in the first period, and commodity markets that re-open in the second period. This economy is more realistic than that of the previous section, although perfectly isomorphic to it once Eq. (2.8) holds true. However, Eq. (2.8) does not hold, in general. It would, should the number of assets be equal to the number of events in $S$. We say security markets are complete in this case.
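To see what Eq. (2.8) asks of the asset menu, consider a sketch with two states and two assets. The payoffs, the target net expenditures and the function name below are hypothetical. The code solves for the holdings that finance a given state-contingent net expenditure, and fails when the two payoff vectors are proportional, a case in which counting assets is not enough: the sketch assumes, beyond what the text states, that the payoffs must also be linearly independent.

```python
def financing_portfolio(payoffs, target):
    """Holdings (t1, t2) such that, in each state s = 0, 1,
    t1 * payoffs[0][s] + t2 * payoffs[1][s] == target[s].

    payoffs[i][s] is the payoff of asset i in state s.
    """
    (v11, v12), (v21, v22) = payoffs
    det = v11 * v22 - v12 * v21
    if det == 0:
        raise ValueError("payoffs are linearly dependent: the target cannot be spanned")
    t1 = (target[0] * v22 - target[1] * v21) / det
    t2 = (v11 * target[1] - v12 * target[0]) / det
    return t1, t2


# Hypothetical assets: a bond paying 1 in both states, a stock paying 2 or 0.5.
payoffs = [(1.0, 1.0), (2.0, 0.5)]
# Hypothetical net expenditure on commodities: 3 in state 0, nothing in state 1.
target = (3.0, 0.0)

t_bond, t_stock = financing_portfolio(payoffs, target)
financed = [t_bond * payoffs[0][s] + t_stock * payoffs[1][s] for s in range(2)]
print(t_bond, t_stock, financed)
```

With one asset only, or with two proportional assets, a generic pair of state-contingent expenditures could not be financed, which is the sense in which Eq. (2.8) fails in general.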
The natural question arises as to what is the fair value of these financial securities. Section 2.4 develops a comprehensive theory of security evaluation, which relies on both a precise notion of arbitrage opportunities and how absence of arbitrage links to contexts with maximizing agents, and the existence of prices for a specific class of securities, known as Arrow-Debreu securities, and their relation to the notion of complete markets. The remainder of this section aims to develop in detail the notion of Arrow-Debreu securities, and a number of introductory examples illustrating their importance. We begin with two examples, which aim to draw attention to what we really mean by absence of arbitrage and how this notion can help differentiate between gambles and securities.
2.3.3 Gambles and securities
Part of the reasoning underlying the very fundamental asset pricing formulae in these Lectures relies on the notion of absence of arbitrage: two portfolios yielding the same payoff should be worth the same, in the presence of frictionless markets. The aim of this section is to draw a distinction between gambles and securities, by emphasizing the idea that some securities can be priced by requiring absence of arbitrage.
As usual, securities are contracts that are traded at a certain price, which might reflect a number of factors such as supply and demand, bargaining power, the market microstructure and, of course, absence of arbitrage, as we shall explain soon. Instead, gambles are risks that are typically not traded, like in a casino, with some nuances to be made below. Naturally, the fair value of a gamble also depends on many factors, such as the bargaining power of the parties entering into play, and their risk-attitudes.
If we were capable of slicing a gamble into elementary risks that are actually traded, we would be able to price the gamble through no-arbitrage: the fair value of the gamble would equal that of a portfolio that has one unit of each of the risks which the gamble is split into, as in the examples of the previous section. In fact, such a gamble would not be like those in a casino anymore, but a derivative, which we could price through the price of the underlying risks. Note, this gamble might not be traded per se (only its constituent risks are), although then, it might be replicated.
All in all, there are securities that can, and securities that cannot, be replicated through the set of existing assets. Those that can be replicated are priced without regard to anything else but the prices of the assets that add up to the replicating portfolio. Arbitrage is the possibility to profit from price inconsistencies, by trading the risk constituents, and absence of arbitrage imposes discipline on the set of economically viable security prices. Likewise, there are gambles that can actually be replicated, as soon as their constituent risks are traded. What these gambles have in common with replicable securities is that their price is free from anything relating to factors such as risk-aversion, or supply and demand, say, being then determined by no-arbitrage. Gambles that cannot be sliced in this way have a value that likely depends on the gamblers' risk-aversion or the bookmakers' bargaining power, say. Securities that cannot be replicated have this trait in common with unreplicable gambles: their value is not tied down to anything that is already traded, and is subject to a number of factors that we may only speculate about. Still, the price of traded securities cannot be anything. To avoid arbitrage, security prices satisfy a quite fundamental economic restriction, known as the martingale restriction, which is the focus of much of this chapter.
2.3.3.1 Short-run bets and asymptotic arbitrage
What is the price of a coin-tossing game? If we could only toss the coin many times, then, by the Law of Large Numbers, the value of this gamble would approach 50% of a stake as the number of draws grows, to avoid an asymptotic arbitrage, i.e. the possibility to gain from betting for tails, if the price for the tails is less than 50% of the stake over such a large number of trials, or selling tails, whenever this price is higher than 50%. Note that by allowing for the possibility to sell tails, we are meaning that this gamble is, in fact, a tradeable asset.
Yet assume, and realistically, that the number of trials is small, in which case the value of this gamble might well deviate from 50% of a stake, reflecting for instance preferences, bargaining power, or in general, supply and demand. The previous asymptotic arbitrage could not take place because tossing a coin only once, say, is far from guaranteeing an almost sure 50% frequency of tails outcomes.
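The contrast between few and many trials is easy to simulate. The following is a minimal sketch, assuming a fair coin and an arbitrary random seed (both are choices made here for illustration): it reports the frequency of tails over samples of increasing size.

```python
import random


def tails_frequency(n_tosses, seed):
    """Fraction of tails in n_tosses independent tosses of a fair coin."""
    rng = random.Random(seed)
    tails = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return tails / n_tosses


for n in (1, 10, 1_000, 100_000):
    print(n, tails_frequency(n, seed=1))
```

With a single toss the frequency can only be zero or one, so a price equal to 50% of the stake is not enforced by any trading strategy; only over a very large number of tosses does the frequency settle near one half.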
This example shows that the value of an unreplicable gamble might well depend on preferences and other factors, in short, on supply and demand. In fact, in the next section, we show that factors such as bettors' preferences are not only necessary to value a gamble. These preferences would sometimes even need to be restricted for us to conclude about the logical consistency of the very same gamble.
2.3.3.2 St Petersburg paradox
Consider the following, celebrated gamble, leading to the so-called St Petersburg paradox. We toss a coin a number of times, and the gamble ends as soon as a tail (say) arises as an outcome, with a payoff doubling all the time, being equal to $2^n$ dollars if the gamble stops at the $n$-th trial. The probability to receive $2^n$ dollars at the $n$-th trial is, naturally, the probability that $n-1$ heads occur over the first $n-1$ trials and one tail in one trial, which in total is $2^{-n}$, given the independence of the trials. Therefore, the fair value of this gamble is,
$$\sum_{n=1}^{\infty} (2^n) \, 2^{-n} = \infty.$$
This gamble is obviously not trivial, as there is a risk anyway, in that the payoff, although positive, is not certain. Because the payoff is positive, we have to pay something to enter this gamble, although it is unlikely that anyone would be willing to pay a large amount of money for it. Why? A moment's reflection suggests the payoff profile is quite unattractive. For example, there is an almost 94% chance to obtain no more than 16 dollars, which illustrates how implausible it seems that anyone would be willing to spend a large amount of money to enter this gamble.
Bernoulli (1738) proposes to solve this puzzle by replacing outcomes with concave utilities of outcomes: concave utilities would dampen the occurrence of very positive returns. For example, a decision maker with log-utility would perceive an expected utility from this gamble equal to:
$$\sum_{n=1}^{\infty} (\ln 2^n) \, 2^{-n} = \ln 4.$$
Therefore, he would not be willing to pay more than 4 dollars to enter this gamble. (This amount of money, 4 dollars, is the certainty equivalent of the gamble, CE, say, i.e. $\ln \mathrm{CE} = E(\ln x)$, where $x$ is the random outcome from the gamble.) As anticipated in the previous section, gambles might be quite subtle. Not only does their value depend on preferences. Factors such as risk-aversion are critical to the very existence of such a game. In the St Petersburg paradox example, the game cannot exist were the bettors risk-seeking or risk-neutral.
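The two sums above behave very differently once truncated at a finite number of trials. The sketch below (function names are hypothetical) computes the truncated expected payoff, which grows without bound, the truncated expected log-utility, which converges to $\ln 4$, and the probability that the payoff does not exceed a given amount.

```python
import math


def truncated_expected_payoff(n_max):
    """Sum of 2**n * 2**(-n) over the first n_max trials: each term equals one."""
    return sum((2 ** n) * (0.5 ** n) for n in range(1, n_max + 1))


def truncated_expected_log_utility(n_max):
    """Sum of ln(2**n) * 2**(-n) over the first n_max trials."""
    return sum(math.log(2 ** n) * (0.5 ** n) for n in range(1, n_max + 1))


def prob_payoff_at_most(amount):
    """Probability that the payoff 2**n does not exceed the given amount."""
    prob, n = 0.0, 1
    while 2 ** n <= amount:
        prob += 0.5 ** n
        n += 1
    return prob


expected_log = truncated_expected_log_utility(200)
certainty_equivalent = math.exp(expected_log)

print(truncated_expected_payoff(10), truncated_expected_payoff(100))
print(expected_log, math.log(4), certainty_equivalent)
print(prob_payoff_at_most(16))
```

The truncated expected payoff simply equals the number of trials retained, which is why the risk-neutral value is unbounded, whereas the log-utility investor's certainty equivalent stays at 4 dollars.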
2.3.3.3 World Cup pricing, and arbitrage pricing
The risks underlying the gambles of the previous examples cannot be evaluated in a preference-free fashion. As for the coin-tossing game, we would need a large number of draws to conclude its fair value is 50% of a stake. As for the coin-tossing game leading to the St Petersburg paradox, we would actually need risk-aversion to keep the value of the gamble finite. We now develop a third example where, again, the value of the gamble cannot be determined without making reference to supply and demand, although we can then price some of the elementary risks that make up this gamble in a preference-free fashion, for the very simple reason that these risks are actually traded.
Consider, then, the odds posted by a bookmaker, displayed in Table 2.1, against four hypothetical teams competing for the World Cup. In bookmaking language, odds set at $a$-$b$ against a certain event mean that on a stake of $b$ dollars, the bettor receives $a$ dollars if the given event occurs, and loses his stake of $b$ dollars, otherwise. Therefore, odds of 5-1 against team B imply that the bookmaker stands ready to pay 500 dollars for a 100 dollar stake, should B win the World Cup. Let $\pi$ be the probability of the event that a given team wins, under which the bettor's expected gain is zero, $\pi: a \pi + (-b)(1 - \pi) = 0$, such that $a/b$ equals the odds ratio, $(1 - \pi)/\pi$, with $b$ being fixed to one in this example. Also shown in Table 2.1 are these implied break-even probabilities, calculated as $\pi_i = 1/(1 + a_i)$, where $a_i$ are the odds against team $i$, and $\pi_i$ is the corresponding break-even probability.
Table 2.1

Team    Odds against    Implied prob.
A       2-1             33.33%
B       5-1             16.67%
C       6-1             14.29%
D       9-1             10.00%
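The implied probabilities in Table 2.1 can be reproduced with a few lines. This is a minimal sketch: the function name `break_even_probability` and the dictionary layout are illustrative choices, and odds are written as \(a\)-\(b\) against, with the stake \(b\) fixed to one.

```python
def break_even_probability(a, b=1.0):
    """Probability p solving p*a - (1 - p)*b = 0 for odds a-b against an event."""
    return b / (a + b)

# Odds against each team, with the stake b fixed to one.
odds_against = {"A": 2.0, "B": 5.0, "C": 6.0, "D": 9.0}
implied = {team: break_even_probability(a) for team, a in odds_against.items()}
total = sum(implied.values())

for team, p in implied.items():
    print(f"team {team}: implied probability {100 * p:.2f}%")
print(f"sum over the four teams: {100 * total:.2f}%")
```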
As in the coins game of Section 2.3.3.1, the break-even probabilities do not necessarily lead
to the fair value of the gamble. Indeed, bookmakers typically make profits through what is
known as the overround: quoted probabilities that sum to more than 100%. Overrounds are obviously
not arbitrage opportunities, because a given World Cup is quite a unique event, with its own
weather conditions, team components and many other factors, such that the chance of any
team to win the competition has only a weak linkage to the objective probability of winning over
a hypothetically large number of matches. Once again, the price and, then, the quoted odds,
depend on both the bettors' risk appetite and the bookmakers' bargaining power.
While the single odds depend on supply and demand, we can price other gambles which cover
events relating to those the bookmakers are making a market for, and independently of supply
and demand. For example, what are the odds against either team A or team B winning the
World Cup? It is a derivative gamble, because its value depends on that of the constituent
events. Let us convert the odds into tickets: under \(a\)-\(b\) odds, a stake of $\(b\) returns $\(a\) + $\(b\)
if the event occurs and nothing otherwise, so that the stake is the cost of a ticket paying
$\((a+b)\), and the cost of a ticket paying $1 is $\(b/(a+b)\). So, the unit ticket for team
A winning is worth $0.33333, and that for team B is $0.16667. It is easy to see, then, that the
unit ticket value for the composite event needs to be $0.33333 + $0.16667 = $0.5. If it was
lower, say, $0.4, we could sell the two bets for A and B for $0.5, and use $0.4 to buy the
composite ticket, which ensures that we could honour the payoff should A or B win. If the unit
ticket value for the composite event was, instead, $0.6, we could sell the composite bet and,
then, bet for A and B separately, to honour the payoff relating to the composite event.
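The arbitrage argument can be sketched in code. The function `arbitrage` below is a hypothetical helper: given a quoted unit ticket price for the composite event "A or B wins", it trades the composite against the two single tickets and reports the upfront cash collected and the net payoff in each outcome. The odds (2-1 against A, 5-1 against B) are those of Table 2.1.

```python
# Unit ticket prices implied by odds of 2-1 against A and 5-1 against B.
ticket_a = 1.0 / (2.0 + 1.0)
ticket_b = 1.0 / (5.0 + 1.0)
fair_composite = ticket_a + ticket_b

def arbitrage(quoted_composite):
    """Trade the composite ticket against the single tickets for A and B.

    Returns (upfront cash collected, net payoff in each outcome).
    side = +1: buy the composite, sell A and B; side = -1: the reverse.
    """
    side = 1 if quoted_composite < fair_composite else -1
    upfront = side * (fair_composite - quoted_composite)
    payoffs = {}
    for winner in ("A", "B", "other"):
        composite = 1.0 if winner in ("A", "B") else 0.0
        singles = (1.0 if winner == "A" else 0.0) + (1.0 if winner == "B" else 0.0)
        payoffs[winner] = side * (composite - singles)
    return upfront, payoffs

cheap_upfront, cheap_payoffs = arbitrage(0.4)
dear_upfront, dear_payoffs = arbitrage(0.6)
print(cheap_upfront, cheap_payoffs)
print(dear_upfront, dear_payoffs)
```

In both cases the position collects cash today and owes nothing in any outcome, which is why the composite ticket must trade at the sum of the two single tickets.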
Pricing derivatives in a preference-free format as in this example relies on the fact that
markets are complete, in that we could replicate the composite risk by betting on each single
team. The next sections extend this notion to various situations of interest that include financial
security transactions. We begin by introducing the simplest securities, those that only pay
off in specific states of nature, similarly to the single bets of the World Cup of this section.
2.3.4 Arrow-Debreu securities
We now develop a very first evaluation model relying on the notion of a special type of quite
simple securities, those that pay off one unit of numeraire in state \(i\), if state \(i\) will prevail in the
future, and zero otherwise. We usually refer to this set of assets as Arrow-Debreu securities, in
honour of Kenneth Arrow and Gerard Debreu. Arrow-Debreu securities are perhaps the most
elementary we might imagine, and constitute the bricks of the entire asset evaluation framework
in Financial Economics, a bit like atoms in Physics, so to speak. Naturally, Arrow-Debreu
securities do not exist in reality, for the very simple reason that it is impossible to determine
states at such a pure level, in practice. States are simply part of a model's assumptions.
While Arrow-Debreu securities are model-dependent, they can be used to price assets:
after all, asset pricing is model-dependent by nature. This section develops a very intuitive link
between asset prices and Arrow-Debreu security prices, and explores how asset prices can, then,
be re-cast in terms of a certain probability under which they equate their expected payoffs, in
an actuarial sense. We conclude with a derivation of the price of these securities based on a
simple equilibrium model.
2.3.4.1 Pricing
Consider a two-period economy where at time \(t = 1\), three mutually exclusive states of the
world may occur. We consider three Arrow-Debreu securities: the \(i\)-th, AD\(_i\), pays off $1 in
state \(i\), and zero otherwise, as in the next figure.

[Figure: at \(t = 0\), the securities AD\(_1\), AD\(_2\) and AD\(_3\); at \(t = 1\), AD\(_i\) pays $1 in state \(i\), for \(i = 1, 2, 3\).]
Let \(\phi_i\) be the price of the asset AD\(_i\), and consider a portfolio that has one unit of each of
the three assets. This portfolio yields one, for sure, at time \(t = 1\). Therefore, its price must be
\[ \sum_{i=1}^{3} \phi_i = \frac{1}{1+r}, \qquad (2.9) \]
where \(r\) denotes the riskless interest rate. More generally, consider an asset A\(_0\), which pays off
\(x_i\) in state \(i\), as in the next figure. Naturally, the values of \(x_i\) are known at time zero, although of
course we do not know which \(x_i\) will be drawn at time \(t = 1\). Also shown in this figure are the
payoffs of three assets, A\(_i\), \(i = 1, 2, 3\), which are rescaled versions of Arrow-Debreu securities.
Let us explain.

[Figure: at \(t = 1\), A\(_0\) pays \(x_i\) in state \(i\), whereas A\(_i\) pays \(x_i\) in state \(i\) and zero in the other two states, for \(i = 1, 2, 3\).]
By definition, paying \(\phi_1\) yields 1 if state 1 occurs, and zero otherwise. Therefore, paying
\(\phi_1 x_1\) yields \(x_1\) if state 1 occurs, and zero otherwise. We can, then, slice the risks related
to asset A\(_0\) into three, by considering the three elementary assets A\(_1\), A\(_2\) and A\(_3\). Each of these
assets costs \(\phi_i x_i\) at \(t = 0\). Moreover, a portfolio that has one unit of each of these three assets
needs to have the same value as A\(_0\), say \(q_0\),
\[ q_0 = \sum_{i=1}^{3} \phi_i x_i. \qquad (2.10) \]
We can say more about \(q_0\). Consider Eq. (2.9), which we rewrite as \(\sum_{i=1}^{3} Q_i = 1\), where
\(Q_i = (1 + r)\phi_i\). Thus, \(Q_i\) is a probability distribution. Replacing \(Q_i\) into Eq. (2.10) reveals an
important property of \(q_0\),
\[ q_0 = \frac{1}{1+r} \sum_{i=1}^{3} Q_i x_i = \frac{1}{1+r} E^Q(x). \]
The price of A\(_0\) can be expressed as the expectation of the future payoff, \(x\), discounted.
It is as if we evaluated assets through actuarial methods. We shall refer to \(Q\) as risk-neutral
probability for this reason, and we shall discuss the properties of this benchmark probability in
detail in the next sections.
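A minimal numerical sketch of Eqs. (2.9) and (2.10) follows. The three Arrow-Debreu prices and the payoffs are hypothetical numbers chosen for illustration; the names `phi`, `Q` and `payoffs` are labels introduced here for the state prices, the risk-neutral probabilities and the state-contingent payoffs of A0.

```python
# Hypothetical Arrow-Debreu prices for the three states (illustrative numbers).
phi = [0.30, 0.40, 0.25]
# Hypothetical payoffs of asset A0 in the three states.
payoffs = [10.0, 20.0, 5.0]

# Eq. (2.9): the three securities together pay one for sure, so they cost 1/(1+r).
r = 1.0 / sum(phi) - 1.0

# Risk-neutral probabilities: rescaled state prices.
Q = [(1.0 + r) * p for p in phi]

# Eq. (2.10): price as the cost of the slices ...
price_from_state_prices = sum(p * x for p, x in zip(phi, payoffs))
# ... or, equivalently, as the discounted expectation under Q.
price_risk_neutral = sum(q * x for q, x in zip(Q, payoffs)) / (1.0 + r)

print(f"riskless rate r = {r:.6f}")
print(f"risk-neutral probabilities = {[round(q, 6) for q in Q]}")
print(f"price = {price_from_state_prices:.6f} = {price_risk_neutral:.6f}")
```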
To sum up, Arrow-Debreu securities would allow us to price any asset, through Eq. (2.10).
In Section 2.4, we shall actually show that there are no arbitrage opportunities if and only if
there exists a vector \(\phi\): \(q = \phi^\top V\), where \(q\) is an \(m\)-dimensional vector of security prices,
\(V\) is a \(d \times m\) matrix of security payoffs, and \(d\) is the number of states. In this general context,
the vector \(\phi\) carries the natural interpretation of one containing Arrow-Debreu security prices.
Note, also, that the security we are evaluating in this specific example relies on the assumption
that an appropriate set of Arrow-Debreu securities is available for trading. Alternatively, we
may assume that only a selected number of Arrow-Debreu securities are available for trading,
in which case markets become incomplete, in a sense to be made precise in Section 2.3.6. In
this case, it is unlikely the given security could be given a unique price as in Eq. (2.10), as we shall
explain.
Note, finally, an interesting property of Arrow-Debreu securities. Consider a general context
with a finite number of states, i.e. not necessarily equal to three, and define the random return
of the \(i\)-th asset as \(R_i = 1_i/\phi_i\), where \(1_i = 1\) in state \(i\), and zero otherwise. Given the objective
probability distribution of the states, \(P_i\), we have that each pair of asset returns is negatively
correlated, with,
\[ \mathrm{corr}(R_i, R_j) = \mathrm{corr}(1_i, 1_j) = -\sqrt{\frac{P_i P_j}{(1 - P_i)(1 - P_j)}}. \]
It is natural to expect these returns to be negatively correlated, as the securities pay off in mutually
exclusive states of the world! As we decrease the number of states, the correlation obviously
increases in absolute value, with the extreme case \(\mathrm{corr}(R_i, R_j) = -1\), arising when there are
only two states, such that \(P_j = 1 - P_i\). It is also natural that, as we increase the number of states,
there is a progressively higher number of states where any two Arrow-Debreu securities do
not pay off, which brings the correlation of their returns up. This correlation is still negative,
however.
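The correlation between two Arrow-Debreu payoffs can be verified by brute force. The sketch below computes it directly from the moments of the two state indicators and compares the result with the closed form \(-\sqrt{P_i P_j / ((1-P_i)(1-P_j))}\); the probability vectors are hypothetical. Dividing an indicator by its price only rescales it, so the correlation of the returns is the same as that of the indicators.

```python
import math

def indicator_correlation(P, i, j):
    """Correlation between the state indicators 1_i and 1_j under probabilities P."""
    states = range(len(P))
    mean_i = sum(P[s] * (s == i) for s in states)
    mean_j = sum(P[s] * (s == j) for s in states)
    cov = sum(P[s] * ((s == i) - mean_i) * ((s == j) - mean_j) for s in states)
    var_i = sum(P[s] * ((s == i) - mean_i) ** 2 for s in states)
    var_j = sum(P[s] * ((s == j) - mean_j) ** 2 for s in states)
    return cov / math.sqrt(var_i * var_j)

def closed_form(P, i, j):
    """Closed-form correlation of two Arrow-Debreu payoffs."""
    return -math.sqrt(P[i] * P[j] / ((1.0 - P[i]) * (1.0 - P[j])))

three_states = [0.2, 0.3, 0.5]   # hypothetical objective probabilities
two_states = [0.3, 0.7]

corr_three = indicator_correlation(three_states, 0, 1)
corr_two = indicator_correlation(two_states, 0, 1)
print(corr_three, closed_form(three_states, 0, 1))
print(corr_two, closed_form(two_states, 0, 1))
```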
2.3.4.2 Tying down Arrow-Debreu securities to consumption: an example
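Consider the previous two-period economy, where now there is a single agent, who maximizes,
\[ \max_{c_0,\, \theta_1, \theta_2, \theta_3} \left[ u(c_0) + \beta E(u(c_1)) \right] \quad \text{s.t. (i) } c_0 + \sum_{i=1}^{3} \phi_i \theta_i = w_0 \ \text{ and (ii) } c_{1i} = w_{1i} + \theta_i, \]
where \(c_{1i}\) and \(w_{1i}\) denote consumption and endowment in state \(i\), and \(\theta_i\) is the number of
units of the \(i\)-th Arrow-Debreu security. The first-order conditions lead to,
\[ \phi_i = \beta P_i \frac{u'(w_{1i} + \theta_i)}{u'(c_0)} = \beta P_i \frac{u'(c_{1i})}{u'(c_0)}, \qquad \frac{1}{1+r} = \beta \sum_{i=1}^{3} P_i \frac{u'(c_{1i})}{u'(c_0)}, \qquad (2.11) \]
where the second relation follows by Eq. (2.9). By concavity of the utility function,
\(u'(c_{1i})/u'(c_0)\) is high when \(c_{1i}\) is low. In other words, the Arrow-Debreu
prices for the bad states of nature are high or, risk-neutral probabilities assign high
values to bad states. We shall come back to the interpretation of this result many times in this,
and subsequent, chapters.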
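The claim that bad states carry high risk-neutral probabilities can be illustrated with a small sketch. Everything below is an assumption made for illustration: log utility, the discount factor `beta`, the objective probabilities `P` and the state-contingent consumption levels `c1`. State prices are computed as the discounted, probability-weighted marginal rates of substitution.

```python
beta = 0.95                  # hypothetical subjective discount factor
c0 = 1.0                     # consumption today
P = [0.25, 0.50, 0.25]       # hypothetical objective probabilities
c1 = [0.8, 1.0, 1.2]         # consumption tomorrow: bad, normal, good state

def marginal_utility(c):
    """Marginal utility under log utility (an assumption), u'(c) = 1/c."""
    return 1.0 / c

# State prices: beta * P_i * u'(c_1i) / u'(c_0).
phi = [beta * p * marginal_utility(c) / marginal_utility(c0) for p, c in zip(P, c1)]
r = 1.0 / sum(phi) - 1.0
# Risk-neutral probabilities: rescaled state prices.
Q = [(1.0 + r) * f for f in phi]

for p, q, c in zip(P, Q, c1):
    print(f"consumption {c:.1f}: objective prob. {p:.4f}, risk-neutral prob. {q:.4f}")
```

The bad state (low consumption) receives a risk-neutral probability above its objective probability, and the good state one below it.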
\[ \sum_{i=1}^{m} \theta_i q_i \left(1 + R_i(\mathrm{rain})\right) = c_{\mathrm{rain}}, \qquad \sum_{i=1}^{m} \theta_i q_i \left(1 + R_i(\mathrm{sun})\right) = c_{\mathrm{sun}}, \qquad (2.12) \]
where \(q_i\) is the price of the \(i\)-th asset, \(\theta_i\) is the number of assets to put in the portfolio, and
\(R_i(\mathrm{rain})\) and \(R_i(\mathrm{sun})\) are the net returns of asset \(i\) in the two states of nature, which are part
of the information set available to Mr Law. Finally, and remarkably, we are not making any
assumption regarding Mr Law's preferences. We only know that he needs \(c_s\) in state \(s\).
Eqs. (2.12) form a system of two equations with \(m\) unknowns \((\theta_1, \cdots, \theta_m)\). If \(m < 2\), no
perfect hedging strategies are available to Mr Law, that is, Eqs. (2.12) cannot be solved to
obtain the desired pair \((c_{\mathrm{rain}}, c_{\mathrm{sun}})\). We say markets are incomplete in this case. More generally,
consider an economy with \(d\) states of nature, and define markets to be complete if and
only if Mr Law has access to \(d\) assets, which allow him to achieve any consumption plan,
independently of his preferences. Preferences would only play a role, if any, when it comes to
picking a particular consumption plan amongst all possible.
Let us, then, define the following payoff matrix,
\[ V = \begin{bmatrix} q_1 (1 + R_1(s_1)) & \cdots & q_m (1 + R_m(s_1)) \\ \vdots & & \vdots \\ q_1 (1 + R_1(s_d)) & \cdots & q_m (1 + R_m(s_d)) \end{bmatrix}, \]
where \(R_i(s_j)\) is the return performed by the \(i\)-th asset in the state \(s_j\). To implement any state
contingent consumption plan \(c \in \mathbb{R}^d\), Mr Law needs to solve the following system,
\[ c = V\theta, \]
where \(\theta \in \mathbb{R}^m\), the portfolio. If rank\((V) = d = m\), the previous system has a unique solution,
given by \(\hat\theta = V^{-1} c\). Consider, for example, the previous case where \(d = 2\), and take \(m = 2\), for
any additional assets would be redundant here. Then, we have,
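\[ \hat\theta_1 = \frac{(1 + R_2(\mathrm{sun}))\, c_{\mathrm{rain}} - (1 + R_2(\mathrm{rain}))\, c_{\mathrm{sun}}}{q_1 \left[ (1 + R_1(\mathrm{rain}))(1 + R_2(\mathrm{sun})) - (1 + R_1(\mathrm{sun}))(1 + R_2(\mathrm{rain})) \right]}, \]
\[ \hat\theta_2 = \frac{(1 + R_1(\mathrm{rain}))\, c_{\mathrm{sun}} - (1 + R_1(\mathrm{sun}))\, c_{\mathrm{rain}}}{q_2 \left[ (1 + R_1(\mathrm{rain}))(1 + R_2(\mathrm{sun})) - (1 + R_1(\mathrm{sun}))(1 + R_2(\mathrm{rain})) \right]}, \]
where rain and sun label the two states of nature.
Finally, assume that the second asset is safe, or that it yields the same return in the two states
of nature: \(R_2(\mathrm{rain}) = R_2(\mathrm{sun}) \equiv r\). Let \(x_{\mathrm{rain}} = R_1(\mathrm{rain})\) and \(x_{\mathrm{sun}} = R_1(\mathrm{sun})\). Then, the pair \((\hat\theta_1, \hat\theta_2)\) can be
rewritten as,
\[ \hat\theta_1 = \frac{c_{\mathrm{rain}} - c_{\mathrm{sun}}}{q_1 (x_{\mathrm{rain}} - x_{\mathrm{sun}})}, \qquad \hat\theta_2 = \frac{(1 + x_{\mathrm{rain}})\, c_{\mathrm{sun}} - (1 + x_{\mathrm{sun}})\, c_{\mathrm{rain}}}{q_2 (1 + r)(x_{\mathrm{rain}} - x_{\mathrm{sun}})}. \]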
As is clear, we are dealing with an issue relating to the replication of random variables. Our
random variable is a state contingent consumption plan \((c_{\mathrm{rain}}, c_{\mathrm{sun}})\), where \(c_{\mathrm{rain}}\) and \(c_{\mathrm{sun}}\) are known,
which we want to replicate for hedging purposes: Mr Law will need to buy either a pair of
sunglasses or an umbrella, tomorrow.
In this example, any two-state variable can be generated by investing into two assets with
independent payoffs. Our next step is to understand whether there are implications for the
price of an asset, A say, which would exactly deliver the same random variable \((c_{\mathrm{rain}}, c_{\mathrm{sun}})\). Let \(H\) be the
current price of asset A. We claim that,
\[ H = \hat\theta_1 q_1 + \hat\theta_2 q_2 \qquad (2.13) \]
for the financial market to be free of arbitrage opportunities, to be defined in a moment. Indeed,
if \(H > \hat\theta_1 q_1 + \hat\theta_2 q_2\), we can buy \(\hat\theta\) and sell at the same time the third asset A. The result is a sure profit,
or an arbitrage opportunity, equal to \(H - (\hat\theta_1 q_1 + \hat\theta_2 q_2)\), for \(\hat\theta\) generates \(c_{\mathrm{rain}}\) if tomorrow it will rain, and
\(c_{\mathrm{sun}}\) if tomorrow it will not rain. In both cases, the portfolio generates the payments we need
to honour the sale of A. Likewise, we can show that the inequality \(H < \hat\theta_1 q_1 + \hat\theta_2 q_2\) would lead to an
arbitrage. Hence, Eq. (2.13) must hold true.
We are left with the calculation of the right hand side of Eq. (2.13). We have:
\[ H = \frac{1}{1+r} \left[ Q\, c_{\mathrm{rain}} + (1 - Q)\, c_{\mathrm{sun}} \right], \qquad Q \equiv \frac{r - x_{\mathrm{sun}}}{x_{\mathrm{rain}} - x_{\mathrm{sun}}}, \qquad (2.14) \]
where \(x_{\mathrm{rain}}\) and \(x_{\mathrm{sun}}\) are the returns on the first (risky) asset in the two states, and \(r\) is the
return on the safe asset.
Eq. (2.14) is an evaluation formula for the asset A, and says that the price, \(H\), can be
expressed as the present value of the expected payoffs promised by A under a new probability,
\(Q\), which for obvious reasons we term risk-neutral probability. Note that, remarkably, we are
able to price the asset A without any reference to agents' preferences. The reasons underlying
this preference-free result link to the fact that we can replicate the asset A, through \(\hat\theta\). Eq.
(2.14) does not obviously require that any agent is using this portfolio. For example, Mr Law
might be so poor that he could not even implement \(\hat\theta\). The point is, rather, that the portfolio
could be used to implement an arbitrage opportunity, arising as soon as Eq. (2.13) does not
hold. In this case, any penniless agent could implement the arbitrage described above.
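The replication argument can be checked with hypothetical numbers. In the sketch below, the first asset is risky with net return `x_rain` or `x_sun`, the second is safe with return `r`, and the target payoffs are `c_rain` and `c_sun`; all values and function names are illustrative. The cost of the replicating portfolio is compared with the discounted expectation under the risk-neutral probability \((r - x_{sun})/(x_{rain} - x_{sun})\), which lies in (0, 1) only when `r` is between the two risky returns.

```python
def replicating_portfolio(c_rain, c_sun, x_rain, x_sun, r, q1, q2):
    """Units of the risky asset (price q1) and the safe asset (price q2) that
    deliver c_rain if it rains and c_sun otherwise."""
    theta1 = (c_rain - c_sun) / (q1 * (x_rain - x_sun))
    theta2 = ((1 + x_rain) * c_sun - (1 + x_sun) * c_rain) / (
        q2 * (1 + r) * (x_rain - x_sun))
    return theta1, theta2

def risk_neutral_price(c_rain, c_sun, x_rain, x_sun, r):
    """Discounted expected payoff under the risk-neutral probability of rain."""
    Q = (r - x_sun) / (x_rain - x_sun)
    return (Q * c_rain + (1 - Q) * c_sun) / (1 + r)

# Hypothetical inputs.
c_rain, c_sun = 10.0, 4.0
x_rain, x_sun, r = 0.20, -0.10, 0.05
q1, q2 = 2.0, 1.0

theta1, theta2 = replicating_portfolio(c_rain, c_sun, x_rain, x_sun, r, q1, q2)
payoff_rain = theta1 * q1 * (1 + x_rain) + theta2 * q2 * (1 + r)
payoff_sun = theta1 * q1 * (1 + x_sun) + theta2 * q2 * (1 + r)
cost = theta1 * q1 + theta2 * q2
price = risk_neutral_price(c_rain, c_sun, x_rain, x_sun, r)
print(payoff_rain, payoff_sun, cost, price)
```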
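The next step is to extend Eq. (2.14) to a dynamic setting. Suppose an additional day is
available for trading, with the same description as before: the day after tomorrow, the asset A
payoffs are \(c_{\mathrm{sun},\mathrm{sun}}\) if it will be sunny, provided the previous day was sunny; \(c_{\mathrm{rain},\mathrm{sun}}\) if it will be sunny,
provided the previous day was raining, etc. By the same arguments leading to Eq. (2.14), we
have that:
\[ H = \frac{1}{(1+r)^2} \left[ Q^2 c_{\mathrm{rain},\mathrm{rain}} + Q(1-Q)\, c_{\mathrm{rain},\mathrm{sun}} + (1-Q) Q\, c_{\mathrm{sun},\mathrm{rain}} + (1-Q)^2 c_{\mathrm{sun},\mathrm{sun}} \right]. \]
Finally, by extending the same reasoning to \(T\) trading days,
\[ H = \frac{1}{(1+r)^T} E^Q(c_T), \qquad (2.15) \]
where \(E^Q\) denotes the expectation taken under the probability \(Q\).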
The key assumption underlying Eq. (2.15) is that markets are complete at each trading day.
True, at the beginning of the trading period, Mr Law faces \(2^T\) mutually exclusive states of
nature for \(T\). If he did not have access to markets in each period, he would need \(2^T\) securities to
replicate the asset A. Yet in this example, 2 assets and \(T\) trading periods for these assets are
needed, to replicate and, then, price A. To emphasize this dynamic feature, we say that the
structure of assets and transaction dates makes the markets dynamically complete. Dynamically
complete markets allow us to implement dynamic trading strategies that replicate the value of
the asset A, period by period. As a result, the asset A is priced, without any need to assume
anything about the agents' preferences. The next section further clarifies these issues, and the
existence of the risk-neutral probability, \(Q\).
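The multi-period formula can be illustrated by pricing a path-dependent payoff in two ways: as a discounted expectation over all \(2^T\) weather paths, and by backward induction, applying the one-period formula day by day. The sketch assumes a constant risk-neutral probability of rain `Q` and a constant rate `r`, and uses a hypothetical payoff of 10 per rainy day; the function names are illustrative.

```python
from itertools import product

def price_by_paths(payoff, Q, r, T):
    """Discounted risk-neutral expectation over all 2**T weather paths."""
    total = 0.0
    for path in product(("rain", "sun"), repeat=T):
        rainy_days = path.count("rain")
        probability = Q ** rainy_days * (1 - Q) ** (T - rainy_days)
        total += probability * payoff(path)
    return total / (1 + r) ** T

def price_by_backward_induction(payoff, Q, r, T):
    """Apply the one-period pricing formula recursively, one trading day at a time."""
    def value(path):
        if len(path) == T:
            return payoff(path)
        up = value(path + ("rain",))
        down = value(path + ("sun",))
        return (Q * up + (1 - Q) * down) / (1 + r)
    return value(())

def rainy_day_payoff(path):
    """Hypothetical payoff: 10 for each rainy day along the path."""
    return 10.0 * path.count("rain")

Q, r, T = 0.5, 0.05, 3
direct = price_by_paths(rainy_day_payoff, Q, r, T)
recursive = price_by_backward_induction(rainy_day_payoff, Q, r, T)
print(direct, recursive)
```

The backward induction only ever uses two assets per day, which is the sense in which trading over time substitutes for the \(2^T\) securities a static replication would need.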
2.3.6 Replication and pricing: the role of complete markets
What are the origins of the preference-free formula in Eq. (2.14)? Consider a general two-period
model where the assets deliver a payoff matrix, \(V\), and denote as usual with \(\phi\) the vector of
Arrow-Debreu prices. We know that the initial price vector is \(q = \phi^\top V\). In a setting with
complete markets, we can extract a unique vector of state prices from the previous relation,
\[ \phi^\top = q V^{-1}. \qquad (2.16) \]
Next, we want to replicate any asset (e.g., a derivative) with final payoff \(c\) at the second period,
through a portfolio \(\theta\) comprising the initial assets. Denote the initial value of this portfolio with
\(\Pi\) and the value in the second period with \(\Pi'\),
\[ \Pi = q\theta, \qquad \Pi' = V\theta. \qquad (2.17) \]
We want to use this portfolio to replicate the payoff of the derivative in the second period,
i.e.,
\[ c = \Pi' = V\theta \ \Longrightarrow \ \hat\theta = V^{-1} c. \]
We plug \(\hat\theta = V^{-1} c\) into the first of Eqs. (2.17) and obtain the initial value of the replicating
portfolio,
\[ \hat\Pi = q V^{-1} c = \phi^\top c, \]
where the last equality follows by Eq. (2.16). As usual, two portfolios that yield the same thing
must be worth the same, in the presence of frictionless markets. Therefore, the price of the
derivative, \(H\), is,
\[ H = \hat\Pi = \phi^\top c = \frac{1}{1+r} E^Q(c). \]
Done. We could price the given asset in this preference-free fashion because markets are
complete in that the asset payoff can be replicated through securities that are actually traded
(the complete market assumption). These traded securities convey information about Arrow-Debreu security prices, \(\phi\), to the derivative, so to speak. Chapters 10 through 12 explain how this
extract-and-plug-in procedure regarding Arrow-Debreu security prices can be used, in practice, to evaluate complex derivative instruments.
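The extract-and-plug-in procedure can be sketched for a hypothetical market with two states and two traded assets, a bond and a stock. The payoff matrix, the prices and the derivative payoff below are illustrative numbers, and `solve_2x2` is a small helper written for this example (Cramer's rule) rather than a library call.

```python
def solve_2x2(a, b):
    """Solve a @ x = b for a 2x2 matrix a (nested lists) by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("singular payoff matrix: markets are not complete")
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]

# Rows are states, columns are assets: a bond paying 1, a stock paying 2 or 0.5.
V = [[1.0, 2.0],
     [1.0, 0.5]]
q = [0.95, 1.0]                      # hypothetical prices of bond and stock

# Step 1 (extract): state prices solve V' phi = q.
V_transposed = [[V[0][0], V[1][0]],
                [V[0][1], V[1][1]]]
phi = solve_2x2(V_transposed, q)

# Step 2 (plug in): price a derivative paying 1 in state 1 and 0 in state 2.
derivative_payoff = [1.0, 0.0]
price_from_state_prices = sum(f * c for f, c in zip(phi, derivative_payoff))

# Cross-check: cost of the portfolio theta replicating the derivative, V theta = c.
theta = solve_2x2(V, derivative_payoff)
price_from_replication = sum(qi * ti for qi, ti in zip(q, theta))
print(phi, theta, price_from_state_prices, price_from_replication)
```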
In the next sections, we study Arrow-Debreu securities from a more theoretical perspective,
connecting them more deeply with the notion of absence of arbitrage, and relying on a general
equilibrium model without production, and with possibly incomplete markets.
in state
1
= 1
,
1)
and
1)
= 1
. Consider the
...
1
Let
( ), [ 1
],
[
The budget constraint of each agent is:
0
=[
=
Let
=1
= 1
=1
We define an arbitrage opportunity as a portfolio that has a negative value at the first period,
and a positive value in at least one state of the world in the second period, or a positive value in
all states of the world in the second period and a nonpositive value in the first period. Let us
introduce the following pieces of notation.
Notation. [In progress ...] Given a vector
R ,
0 means that at least one component
of is strictly positive while the other components of are nonnegative. 0 means that all
components of are strictly positive.
0 means [ ]
0, = 1 , with at least one
for which [ ]
0.
0 means that [ ]
0, = 1 , i.e. it allows for [ ] = 0,
=1 .
[Insert here further notes]
Definition 2.8. An arbitrage opportunity is a strategy
initial investment
0, or a strategy that produces
0.
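An arbitrage opportunity cannot exist in a competitive equilibrium, for the agents' programs
would not be well-defined in this case. Precisely, consider the \((d + 1) \times m\) matrix,
\[ W = \begin{pmatrix} -q \\ V \end{pmatrix}, \]
the vector subspace of \(\mathbb{R}^{d+1}\),
\[ \langle W \rangle = \left\{ z \in \mathbb{R}^{d+1} : z = W\theta, \ \theta \in \mathbb{R}^m \right\}, \]
and its orthogonal complement,
\[ \langle W \rangle^{\perp} = \left\{ x \in \mathbb{R}^{d+1} : xW = 0 \right\}. \]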
The interpretation of the vector subspace \(\langle W \rangle\) is that of the excess demand space in all the
states of nature, generated by the income transfers across states induced by the portfolio
choice.
The interpretation of Eq. (2.18) is that in the absence of arbitrage opportunities, there should
be no portfolios generating income transfers that are (i) non-negative and (ii) strictly positive in
at least one state, i.e. \(\nexists\, \theta : W\theta > 0\). Hence, \(\langle W \rangle\) and the positive orthant \(\mathbb{R}^{d+1}_{+}\) cannot intersect.
R++ :
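Theorem 2.9 provides foundations to the main architecture underlying asset evaluation in
frictionless markets. Pre-multiply the second constraint by \(\phi^\top\), obtaining,
\[ \phi^\top (c^1_j - w^1_j) = \phi^\top V \theta_j = q \theta_j = -(c_{0j} - w_{0j}), \]
where the second equality follows by Theorem 2.9, and the third equality is the first period
budget constraint. Critically, then, Theorem 2.9 shows that in the absence of arbitrage, each
agent faces the following budget constraint,
\[ 0 = c_{0j} - w_{0j} + \phi^\top (c^1_j - w^1_j) = c_{0j} - w_{0j} + \sum_{s=1}^{d} \phi_s (c_{sj} - w_{sj}), \quad \text{with } (c^1_j - w^1_j) \in \langle V \rangle, \qquad (2.19) \]
where \(\langle V \rangle\) is the subspace generated by the payoffs stemming from the portfolio choices,
\[ \langle V \rangle = \left\{ e \in \mathbb{R}^d : e = V\theta, \ \theta \in \mathbb{R}^m \right\}. \]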
Eq. (2.19) suggests interpreting \(\phi\) as the price vector of commodities available for consumption
over the various states of nature, with first-period consumption being the numeraire, in the spirit
of the Debreu (1959, Chapter 7) quote given in Section 2.3.1. Moreover, Theorem 2.9 tells us
that because \(q = \phi^\top V\), the vector \(\phi\) is, then, the Arrow-Debreu state price vector, generalizing
the heuristic notions introduced in Section 2.3.4. Finally, we can interpret the subspace \(\langle V \rangle\)
as follows. Let \(\mathbf{1}_d \equiv [1 \cdots 1]^\top\). Then, by Eq. (2.19),
h i
T +1
i
= {0}.
R
(2.20)
v2
v1
=2
= 1.
now generate any excess demand in \(\mathbb{R}^2\), just as in the Arrow-Debreu economy of Section 2.2.
To generate any excess demand, we multiply the payoff vector \(v_1\) by \(\theta_1\) and the payoff vector
\(v_2\) by \(\theta_2\). For example, suppose we wish to generate the payoff vector \(v_4\) in Figure
2.3. Then, we choose suitable portfolio weights \(\theta_1\) and \(\theta_2\). (The exact values of \(\theta_1\) and \(\theta_2\) are obtained by
solving a linear system.) In Figure 2.3, the payoff vector \(v_3\) is obtained with \(\theta_1 = \theta_2 = 1\).
We are in a position to state a fundamental result regarding the viability of the model.
Define the second period consumption \(c^1_j \equiv [c_{1j} \cdots c_{dj}]^\top\), where \(c_{sj}\)
is the second-period consumption in state \(s\), and let,
\[ (\hat c_{0j}, \hat c^1_j, \hat\theta_j) \in \arg\max_{c_{0j},\, c^1_j,\, \theta_j} \left[ u_j(c_{0j}) + \beta_j E(\nu_j(c^1_j)) \right] \quad \text{subject to } \ c_{0j} - w_{0j} = -q\theta_j, \ \ c^1_j - w^1_j = V\theta_j, \qquad [2.P3] \]
where \(u_j\) and \(\nu_j\) are utility functions, both satisfying Assumption 2.1. Naturally, we could use
more general formulations of utilities than that in [2.P3], and in fact we shall in more advanced
parts of these Lectures. For the sake of this introductory chapter, we only consider additive
utility.
We have:
Theorem 2.10. The program [2.P3] has a solution if and only if there are no arbitrage
opportunities.
[Figure 2.3: the payoff vectors \(v_1\), \(v_2\), \(v_3\) and \(v_4\).]
Proof. Let us suppose on the contrary that the program [2.P3] has a solution \(\hat c_{0j}, \hat c^1_j, \hat\theta_j\),
but that there exists a \(\theta\): \(W\theta > 0\). The program constraint is, with straightforward notation,
\(\hat c_j = w_j + W\hat\theta_j\). Then, we may define a portfolio \(\tilde\theta_j = \hat\theta_j + \theta\), such that \(\tilde c_j = w_j + W(\hat\theta_j + \theta) = \hat c_j + W\theta > \hat c_j\),
which contradicts the optimality of \(\hat c_j\). For the converse, note that the absence of
arbitrage opportunities implies that \(\exists\, \phi \in \mathbb{R}^d_{++}\): \(q = \phi^\top V\), which leads to the budget constraint
in (2.19), for a given \(\phi\). This budget constraint is clearly a closed subset of the compact budget
constraint \(B_j\) in [2.P1] (in fact, it is \(B_j\) restricted to \(\langle V \rangle\)). Therefore, it is a compact set and,
hence, the program [2.P3] has a solution, as a continuous function attains its maximum on a
compact set. \(\blacksquare\)
X
=1
(0
) ; and for
= 1
, 0=
=1
) and 0 =
X
=1
( = 1
We now express demand functions in terms of the stochastic discount factor, and then look for
an equilibrium by looking for the stochastic discount factor that clears the commodity markets.
By Walras' Law, clearing in the commodity market implies clearing in the financial market.
Indeed, by aggregating the agents' constraints in the second period,
X
=1
X
=1
0 and lim
( ) =
Lucas, Radner, Green. Every agent correctly anticipates the equilibrium price in each state of
nature.
[Consider for example the models with asymmetric information, to be dealt in in Chapter 9,
where we shall be concerned with inferences such as ( | () = ), where is the value of the
asset and () is the asset pricing pricing function depending on the state of nature . In these
markets,
( () ) + (1
) ( () ) + = 0, and we look for a solution () satisfying
this equation.]
[In progress]
2.5.2 Stochastic discount factors
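Theorem 2.10 states that in the absence of arbitrage opportunities,
\[ q_i = \phi^\top v_{\cdot, i} = \sum_{s=1}^{d} \phi_s v_{s,i}, \qquad i = 1, \cdots, m, \qquad (2.21) \]
where \(v_{s,i}\) is the payoff of the \(i\)-th asset in state \(s\). In particular, a pure discount bond pays
one unit of numeraire in every state and is worth \(\frac{1}{1+r}\), such that
\[ \frac{1}{1+r} = \sum_{s=1}^{d} \phi_s. \qquad (2.22) \]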
Eq. (2.22) confirms the economic interpretation of the state prices in Eq. (2.19). Because
the states of nature are exhaustive and mutually exclusive, \(\phi_s\) is the price to be paid today to
obtain one unit of numeraire, tomorrow, in state \(s\). It is actually the economic interpretation
of the budget constraint in (2.19), confirmed by Eq. (2.22), which says that the prices of all
these rights sum up to the price of a pure discount bond, i.e. the asset yielding one unit of
numeraire, tomorrow, for sure.
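Eq. (2.22) can be elaborated to provide us with a second interpretation of the state prices in
Theorem 2.10. Define,
\[ Q_s \equiv (1 + r)\phi_s, \]
which satisfies, by construction, \(\sum_{s=1}^{d} Q_s = 1\). Then, by Eq. (2.21),
\[ q_i = \frac{1}{1+r} \sum_{s=1}^{d} Q_s v_{s,i} = \frac{1}{1+r} E^Q(v_{\cdot, i}), \qquad i = 1, \cdots, m. \qquad (2.23) \]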
Eq. (2.23) confirms Eq. (2.14), obtained in the introductory example of Section 2.3.5. It says
that the price of any asset is the expectation of its future payoffs, taken under the probability \(Q\), discounted at the risk-free interest rate \(r\). For this reason, we usually refer to the
probability \(Q\) as the risk-neutral probability. Eq. (2.23) can be extended to a dynamic context, as we shall see in later chapters. Intuitively, consider an asset that distributes dividends
in every period, let \(q(t)\) be its price at time \(t\), and \(D(t)\) the dividend paid off at time \(t\).
Then, the payoff it promises for the next period is \(q(t+1) + D(t+1)\). By Eq. (2.23),
\(q(t) = (1 + r)^{-1} E^Q_t\left(q(t+1) + D(t+1)\right)\) or, by rearranging terms,
\[ E^Q_t\left[ \frac{q(t+1) + D(t+1) - q(t)}{q(t)} \right] = r. \qquad (2.24) \]
That is, the expected return on the asset under \(Q\) equals the safe interest rate, \(r\). In a dynamic
context, the risk-neutral probability is also referred to as the risk-neutral martingale measure,
or equivalent martingale measure, for the following reason. Define a money market account as
an asset with value evolving over time as \(M(t) \equiv (1 + r)^t\). Then, Eq. (2.24) can be rewritten
as \(q(t)/M(t) = E^Q_t\left[ (q(t+1) + D(t+1))/M(t+1) \right]\). This shows that if \(D(t+1) = 0\) for
some \(t\), then, the discounted process \(q(t)/M(t)\) is a martingale under \(Q\).
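Next, let us replace \(Q\) into the budget constraint in (2.19), to obtain, for \((c^1_j - w^1_j) \in \langle V \rangle\),
\[ 0 = c_{0j} - w_{0j} + \sum_{s=1}^{d} \phi_s (c_{sj} - w_{sj}) = c_{0j} - w_{0j} + \frac{1}{1+r} \sum_{s=1}^{d} Q_s (c_{sj} - w_{sj}) = c_{0j} - w_{0j} + \frac{1}{1+r} E^Q(c^1_j - w^1_j). \qquad (2.25) \]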
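For reasons developed below, it is also useful to derive an alternative representation of the
budget constraint, in terms of the objective probability \(P\), say. Accordingly, we introduce a
state-dependent ratio \(\eta_s\), which indicates how far \(Q_s\) and \(P_s\) are,
\[ \eta_s = \frac{Q_s}{P_s}, \qquad s = 1, \cdots, d, \]
and the stochastic discount factor,
\[ m_s \equiv \frac{1}{1+r}\, \eta_s. \qquad (2.26) \]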
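We have,
\[ \frac{1}{1+r} E^Q(c^1_j - w^1_j) = \frac{1}{1+r} \sum_{s=1}^{d} Q_s (c_{sj} - w_{sj}) = \sum_{s=1}^{d} \underbrace{\frac{1}{1+r}\, \eta_s}_{= m_s} P_s (c_{sj} - w_{sj}) = E\left( m \cdot (c^1_j - w^1_j) \right), \quad (c^1_j - w^1_j) \in \langle V \rangle. \qquad (2.27) \]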
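The three equivalent ways of valuing a state-contingent claim (state prices, risk-neutral probabilities, stochastic discount factor) can be compared numerically. The state prices, objective probabilities and payoffs below are hypothetical; `eta` is the ratio of risk-neutral to objective probabilities and `m` the stochastic discount factor obtained by discounting that ratio.

```python
P = [0.25, 0.50, 0.25]       # hypothetical objective probabilities
phi = [0.30, 0.40, 0.25]     # hypothetical state prices
x = [10.0, 20.0, 5.0]        # hypothetical state-contingent payoff

r = 1.0 / sum(phi) - 1.0
Q = [(1.0 + r) * f for f in phi]          # risk-neutral probabilities
eta = [q / p for q, p in zip(Q, P)]       # ratio of Q to P, state by state
m = [e / (1.0 + r) for e in eta]          # stochastic discount factor

value_state_prices = sum(f * xi for f, xi in zip(phi, x))
value_risk_neutral = sum(q * xi for q, xi in zip(Q, x)) / (1.0 + r)
value_sdf = sum(p * mi * xi for p, mi, xi in zip(P, m, x))   # E(m * x) under P
expected_sdf = sum(p * mi for p, mi in zip(P, m))            # E(m) = 1/(1+r)
print(value_state_prices, value_risk_neutral, value_sdf, expected_sdf)
```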
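We now derive optimality conditions and, then, solve for the equilibrium in this economy.
The agent's program is,
\[ \max_{(c_{0j},\, c^1_j)} \left[ u_j(c_{0j}) + \beta_j E(\nu_j(c^1_j)) \right] \quad \text{subject to } \ 0 = c_{0j} - w_{0j} + E\left( m \cdot (c^1_j - w^1_j) \right), \ \ (c^1_j - w^1_j) \in \langle V \rangle. \qquad [2.P4] \]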
This way to formalize the agent's problem and constraints is quite convenient, and it helps
understand the nature of incomplete markets, as in the cases illustrated through Figures 2.2
and 2.3. The present section will further illustrate how useful the representation of the program in [2.P4] is, while studying incomplete markets in general, through the so-called min-max
stochastic discounting approach. Magill and Quinzii (1996) contains an extensive analysis of how
this representation helps in studying general equilibrium with incomplete markets in quite general
models. Truth be told, the representation in [2.P4] is not the exclusive way to study decision
problems arising in incomplete markets. In some cases, such as those covered in Section 8.5
of Chapter 8, it seems easier to make reference to the initial program in [2.P3]. The choice of
which program to consider, [2.P3] or [2.P4], depends on the problem at hand.
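In the complete markets case, \(\langle V \rangle = \mathbb{R}^d\), so that the first order conditions to the program [2.P4]
are,
\[ u_j'(\hat c_{0j}) = \lambda_j, \qquad \beta_j \nu_j'(\hat c_{sj}) = \lambda_j m_s, \quad s = 1, \cdots, d, \qquad (2.28) \]
where \(\lambda_j\) is a Lagrange multiplier. So, really, the properties of this model are the same as those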
of the static model in Section 2.2. Formally, the complete markets economy in this section is the
same as the static economy in Section 2.2, once we set = , where is the dimension of the
commodity space, in Section 2.2, and
= , where
is the price of the -th commodity in
Section 2.2, with 1 = 1 (the numeraire), and
\(\phi_s\) is the Arrow-Debreu state price in the unified
budget constraint of Eq. (2.25).
These facts formalize the reasoning made at the beginning of Section 2.3 (see Sections 2.3.1
and 2.3.2): when markets are complete, an economy with uncertainty can be understood through
a static one. Complicated markets with heterogeneous agents, but with potentially interesting
asset pricing implications, and still, apparently, so hopelessly difficult to analyze, can be centralized through a dedicated design of Pareto's weights, as formalized in Theorem 2.7.
These properties are robust to dynamic extensions, as explained in more advanced parts of
these Lectures (see Chapters 4 and 8), provided markets are dynamically complete, a property
explained in the next two chapters. However, the assumption that agents can
trade Arrow-Debreu securities for all states of the world seems unrealistic: one reason financial innovation is in
practice so pervasive is that markets are incomplete. Yet as discussed in Chapter 8, centralizing
markets is a concept that has been extended to economies with incomplete markets,
relying on stochastic Pareto weights.
We now derive the equilibrium implications of the rst-order conditions in the simple case of
an economy with a single agent, where the following stochastic discount factor is:
0
=
The economic interpretation of
0(
)
0)
0(
)
0)
= (1 + )
= (1 + )
)
0)
0(
(1 + ) =
0(
0)
)]
0(
)]
The case with heterogeneous agents is similar provided markets are complete. By the optimality conditions applying to each agent (see Eq. (2.28)), and Eq. (2.26), the marginal rates of
substitution for each agent are:
\[ \beta_j \frac{\nu_j'(\hat c_{sj})}{u_j'(\hat c_{0j})} = m_s, \qquad s = 1, \cdots, d, \quad j = 1, \cdots, n. \qquad (2.29) \]
That is, in equilibrium agents do have the same marginal rate of substitution when
markets are complete. It is so because the vector of state prices is unique if and only if markets
are complete (see Theorem 2.9), implying \(m\) is unique. The marginal rates of substitution are,
then, independent of \(j\), and the equilibrium allocation is a Pareto optimum as a result, by the
discussion at the beginning of this section, and Theorem 2.5.
The fact that agents have the same marginal rates of substitution in each state of the world
is known as risk-sharing. It means that, given an initial endowment distribution, the market
mechanism is capable of shifting risks around through a system of complete security markets.
Risks are borne more by the agents most willing to take them.
Suppose for example that two agents have the same utility and discount rate but that the
first agent is less risk-averse than the second, i.e. the CRRAs satisfy: \(\gamma_1 < \gamma_2\). Then, \(\mathrm{Gr}_{s,1} =
(\mathrm{Gr}_{s,2})^{\gamma_2/\gamma_1}\), where \(\mathrm{Gr}_{s,j}\) is consumption growth for the \(j\)-th agent in state \(s\). In good times, when
\(\mathrm{Gr}_{s,2} > 1\), the more risk-averse agent experiences a lower consumption growth rate ex-post,
\(\mathrm{Gr}_{s,2} < \mathrm{Gr}_{s,1}\). However, in bad times, when \(\mathrm{Gr}_{s,2} < 1\), the more risk-averse agent experiences a
higher consumption growth rate ex-post, \(\mathrm{Gr}_{s,2} > \mathrm{Gr}_{s,1}\), as illustrated by Figure 2.4 in the case
where \(\gamma_2/\gamma_1 = 3\). In other words, capital markets, when complete, operate in such a way as to
have the more risk-averse agent face a less volatile consumption growth.
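A short sketch of this risk-sharing rule follows. It assumes two agents with CRRA utility and a common discount rate, so that equal marginal rates of substitution require the growth rate of agent 1 raised to the power \(-\gamma_1\) to equal that of agent 2 raised to \(-\gamma_2\). The risk-aversion coefficients and the grid of growth rates are hypothetical, with a ratio of 3 as in the text.

```python
gamma_1, gamma_2 = 1.0, 3.0   # hypothetical CRRA coefficients, ratio equal to 3

def less_risk_averse_growth(gr_2):
    """Consumption growth of agent 1 implied by equal marginal rates of substitution."""
    return gr_2 ** (gamma_2 / gamma_1)

growth_2 = [0.8, 0.9, 1.0, 1.1, 1.2]                 # growth of the more risk-averse agent
growth_1 = [less_risk_averse_growth(g) for g in growth_2]

for g2, g1 in zip(growth_2, growth_1):
    print(f"Gr_2 = {g2:.2f}  ->  Gr_1 = {g1:.4f}")

# The more risk-averse agent faces the narrower range of growth rates.
spread_1 = max(growth_1) - min(growth_1)
spread_2 = max(growth_2) - min(growth_2)
```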
FIGURE 2.4. Equilibrium consumption growth rates of two agents with different risk-aversion, plotted against \(\mathrm{Gr}_2\). The dashed line depicts the consumption growth rate of the more risk-averse
agent, and the solid line is the consumption growth rate of the less risk-averse agent. The ratio of
the two CRRA coefficients is 3.
Risk-sharing carries another meaning, that of mutuality (Wilson, 1968). Suppose that the
distribution of endowments across the population is heterogeneous, in that there are some
states of the world in which some agents are better off than others. A system of complete
markets would allow the agents to insure each other in a way that they would only bear the
macroeconomic risk, not the idiosyncratic risk. Let us illustrate. By Eq. (2.29),
\[ \frac{\nu_j'(\hat c_{sj})}{\nu_j'(\hat c_{s'j})} = \frac{\nu_k'(\hat c_{sk})}{\nu_k'(\hat c_{s'k})}, \qquad \text{for all agents } j, k, \text{ and states } s, s'. \qquad (2.30) \]
Suppose, then, that the aggregate endowment is higher in some state \(s\) than in some other
\(s'\). Then, we claim that each agent has a strictly higher consumption in state \(s\) than in \(s'\). In
particular, this means that in absence of aggregate risk (aggregate endowment being the same in
each state), the agents bear no risk. Indeed, consider two distinct states \(s\) and \(s'\) and assume that
the aggregate endowment satisfies \(w_s > w_{s'}\). Then, there must exist an individual \(j\)
such that \(\hat c_{sj} > \hat c_{s'j}\) and, hence, \(\nu_j'(\hat c_{sj}) < \nu_j'(\hat c_{s'j})\). By Eq. (2.30), then, \(\nu_k'(\hat c_{sk}) < \nu_k'(\hat c_{s'k})\) and, hence,
\(\hat c_{sk} > \hat c_{s'k}\) for any agent \(k\). Finally, note that because the utilities \(\nu_j(\cdot)\) are state-independent,
the equilibrium distribution of the aggregate endowment does not depend on \(s\).
The previous conclusions imply that the equilibrium allocations are state-independent functions of the aggregate endowment,
\[ \hat c_{sj} = f_j(w_s), \]
for some strictly increasing functions \(f_j(\cdot)\). Whence, the mutuality result mentioned above: the
equilibrium allocations do not depend on the state of the economy if the aggregate endowment
does not vary across states \(s\).\(^1\) The functions \(f_j(\cdot)\) are known as Pareto sharing rules.

\(^1\) To illustrate, consider an economy with two agents (\(a\) and \(b\)) and no aggregate risk. Pareto allocations can then be characterized
within an Edgeworth box where the axes indicate consumption in the two states \(s'\) and \(s\). Note that the 45° line of the box is

Huang and Litzenberger (1988, Chapter 5) explain in detail further properties of these functions. In
particular, () are linear if and only if the utility functions take the following form, ( ) =
1
( + ) , for some constants and .
2.5.3.2 Incomplete markets
Agents' marginal rates of substitution cannot be equal if markets are incomplete, except perhaps on a negligible set of endowment distributions. The best outcome in this case is a set
of equilibria known as constrained Pareto optima, i.e. constrained by ... the states of nature.
It might actually turn out that no constrained Pareto optima could even exist in multiperiod
economies with incomplete markets. [Elaborate, in progress]
When markets are incomplete, the state price vector is not unique. That is, suppose that
\(\phi^\top\) is an equilibrium state price. Then, all the elements of
\[ \Phi = \left\{ \phi' \in \mathbb{R}^d_{++} : (\phi' - \phi)^\top V = 0 \right\} \qquad (2.31) \]
are also equilibrium state prices. That is, there exist many equilibrium state prices consistent
with absence of arbitrage. Or, in other words, there are many equilibrium state prices consistent
with the same observable asset price vector \(q\), for \(\phi'^\top V = \phi^\top V = q\).
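The non-uniqueness of state prices can be made concrete with a hypothetical market with three states and only two traded assets, a bond and a stock. The two state price vectors below differ by a multiple of a direction that is orthogonal to both payoff columns, so they reproduce the same traded prices while disagreeing on a claim that is not traded. All numbers are illustrative.

```python
# Rows are states, columns are assets: a bond paying 1, a stock paying 2, 1 or 0.5.
V = [[1.0, 2.0],
     [1.0, 1.0],
     [1.0, 0.5]]

def asset_prices(phi):
    """Traded asset prices implied by a state price vector phi, i.e. phi' V."""
    return [sum(phi[s] * V[s][i] for s in range(3)) for i in range(2)]

phi = [0.30, 0.40, 0.25]
# The direction (1, -3, 2) is orthogonal to both columns of V.
null_direction = [1.0, -3.0, 2.0]
phi_alt = [f + 0.05 * n for f, n in zip(phi, null_direction)]

prices = asset_prices(phi)
prices_alt = asset_prices(phi_alt)

# A non-traded claim paying 1 in the first state only is valued differently.
claim = [1.0, 0.0, 0.0]
value = sum(f * c for f, c in zip(phi, claim))
value_alt = sum(f * c for f, c in zip(phi_alt, claim))
print(prices, prices_alt, value, value_alt)
```

Both vectors are strictly positive and consistent with the observed bond and stock prices, so absence of arbitrage alone cannot pin down the value of the non-traded claim.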
How do we proceed in this case? Introduce the following budget constraint:
\[ C = \left\{ (c_0, c^1) \in \mathbb{R}^{d+1}_{++} : 0 = c_0 - w_0 + \phi^\top (c^1 - w^1), \ (c^1 - w^1) \in \langle V \rangle, \ \forall \phi \in \mathbb{R}^d_{++} : q = \phi^\top V \right\}. \qquad (2.32) \]
Because the set in Eq. (2.31) has many elements as discussed, the budget constraint \(C\) does then
include many constraints to account for in a context with incomplete markets: the martingale
methods in the previous sections do not apply anymore.
Yet let Val ( ) be the value of the following program in the incomplete markets at hand:
( 0 )+
( ( 1 ))
[2.PI ]
max
C
+1
R++
:0= 0
0+
C =
for some given
>
R++ :
(
=
1
>
max
( 0 )+
( ( 1 ))
C
[2.P ]
Clearly, we have, Val ( ) Val ( ) for all , for the constraint in the incomplete markets
case, C, is more stringent than that in any complete market setting, C : the solution to the
program in the incomplete markets case [2.PI ], must satisfy the budget constraints in C, formed
using all of the possible Arrow-Debreu state prices (including the Arrow-Debreu state price
1
given in C ), as the constraint in Eq. (2.32) shows. Moreover, ( 1
) h i. These remarks
suggest defining the following min-max Arrow-Debreu state price:
= arg min Val (
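the set in which the two agents are perfectly insured across states. Moreover, the agents' indifference curves are tangent along the
contract curve (i.e. the set of Pareto allocations), meaning that the two agents' marginal rates of substitution between the two states
are equal. Because, for each agent, this marginal rate of substitution reduces to the ratio of the state probabilities whenever
consumption is the same in the two states, this tangency
condition is met only on the 45° line: Pareto allocations lead to mutuality.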
) = Val (
(2.33)
This is indeed the case, under regularity conditions. For the characterization of , suppose
there exists : Val ( ) = Val( ). Then, = . Indeed, suppose the contrary, i.e. there exists
0
: Val( 0 ) Val( ). Then, we would have,
Val (
Val(
Val(
= Val (
0 = ( )
=
where and
and 0 ,
and
the constraint,
0 = 0
(2.34)
(1
) =
( )
=1
( )
( )+
( )=
, lim
( ) = 0 and
+ (
) =
( )
where () denotes the inverse function of . By replacing back into Eqs. (2.34), we obtain:
0 =
1)
=
1)
0 + (
0 + (
To determine the general equilibrium, we need to pin down the stochastic discounting factor,
. We have unknowns ( , = 1 ), and + 1 equilibrium conditions (holding in the
+ 1 markets). By Walras law, only of these are independent. Consider the equilibrium
conditions in the markets at the second period:
X
X
1
; ( 0 ) 0 6=
1) =
= 1
0 + (
=1
=1
These conditions determine the kernel $(m_s)_{s=1}^d$, from which prices and equilibrium
allocations can be computed. Finally, once the optimal consumption plans are computed, the portfolio
that generated them can be inferred through $\theta = V^{-1}(c^1 - w^1)$.
2.6 Consumption-CAPM
We refer to a theory as consumption-based if the equilibrium expected returns, or the risk-premiums, are determined through optimal consumption choices. Under certain conditions,
studied in this section, one of its predictions is that these risk-premiums are high for securities
that pay high returns when consumption is high (i.e. when we don't need high returns) and low
returns when consumption is low (i.e. when we need high returns). In fact, while this statement
is quite often used to explain, in a nutshell, the quintessence of the Consumption-CAPM, it
might be misleading in economies where assets are in zero net supply, such as those we studied
in previous sections.
Consider, for example, Eq. (2.27). It is an asset pricing equation, which states that for
every asset delivering a gross return $\tilde R$,
$$1 = E(m\tilde R), \qquad (2.35)$$
where $m$ is some pricing kernel. Let us elaborate on Eq. (2.35), so as to obtain a representation
of the expected return on any asset. Naturally, for a riskless asset with gross return $R$, $1 = E(m)R$, which combined
with Eq. (2.35) leaves $E[m(\tilde R - R)] = 0$, and by rearranging terms,
$$E(\tilde R) - R = -\frac{cov(m, \tilde R)}{E(m)}. \qquad (2.36)$$
We also know from previous sections that in economies with a single agent, or in economies
with complete markets, this pricing kernel is given by:
$$m = \beta\,\frac{u'(c^+)}{u'(c_0)},$$
such that Eq. (2.36) can also be written as
$$E(\tilde R) - R = -\frac{cov(u'(c^+), \tilde R)}{E[u'(c^+)]}. \qquad (2.37)$$
That is, the pricing kernel directly links to optimal consumption choice: we are dealing with a
consumption-based CAPM. However, the risk-premium, $E(\tilde R) - R$, merely depends on
how the pricing kernel co-varies with the asset returns, and not directly on the cyclical properties
of dividends. For example, were endowments constant in the second period, the risk-premium
would be zero! The next section aims to clarify how we should set the Consumption-CAPM as
an appropriate context to think about the cyclical properties of asset returns.
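The covariance representation of the risk-premium can be checked numerically. The sketch below assumes a two-state economy, CRRA utility and a pricing kernel of the form m = beta * u'(c1) / u'(c0); all numbers, the function names and the utility specification are illustrative assumptions, not taken from the text.

```python
def crra_marginal_utility(c, eta):
    """u'(c) = c**(-eta) for CRRA utility with relative risk aversion eta."""
    return c ** (-eta)

def expectation(probs, xs):
    return sum(p * x for p, x in zip(probs, xs))

def covariance(probs, xs, ys):
    ex, ey = expectation(probs, xs), expectation(probs, ys)
    return sum(p * (x - ex) * (y - ey) for p, x, y in zip(probs, xs, ys))

def risk_premium(probs, c0, c1, payoff, beta=0.95, eta=3.0):
    """Return (E(R) - Rf, -cov(m, R) / E(m)) for an asset with the given payoffs."""
    m = [beta * crra_marginal_utility(c, eta) / crra_marginal_utility(c0, eta)
         for c in c1]
    price = expectation(probs, [mi * x for mi, x in zip(m, payoff)])  # q = E(m x)
    R = [x / price for x in payoff]        # gross return in each state
    Rf = 1.0 / expectation(probs, m)       # riskless gross return, 1 = E(m) Rf
    lhs = expectation(probs, R) - Rf
    rhs = -covariance(probs, m, R) / expectation(probs, m)
    return lhs, rhs

probs = [0.5, 0.5]
payoff = [0.8, 1.2]   # low payoff in state 1, high payoff in state 2

# Asset paying more when consumption is high: the kernel and the return
# co-vary negatively, so the premium is positive.
procyclical = risk_premium(probs, c0=1.0, c1=[0.9, 1.1], payoff=payoff)

# Constant second-period consumption: the kernel is constant, the premium is zero,
# even though the payoff itself is still risky.
flat = risk_premium(probs, c0=1.0, c1=[1.0, 1.0], payoff=payoff)

print("pro-cyclical asset:", procyclical)
print("constant consumption:", flat)
```

The second case is the point made in the text: what matters is the covariance between the kernel and the return, not the riskiness of the payoff as such.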
2.6.1 Risk-neutral pricing and macroeconomic risks
Up to now, we have considered assets in zero net supply. We now generalize previous findings to
the relevant case where some assets could be in positive supply, which is a case applying many
times in these Lectures. For simplicity, consider an economy which only has an asset in positive
supply equal to $\theta_0$, which is the endowment of a representative agent. The budget constraints
of the agent over his two periods of life are:
$$c_1 + q\theta = w + q\theta_0 \quad \text{and} \quad c_2 = \tilde w + \theta\tilde D, \qquad (2.38)$$
where $q$ and $\theta$ are the risky asset price and demand, $w$ is the initial endowment, $c_1$ is first-period
consumption, $\tilde w$ is the random endowment for the good in the second period, $\tilde D$ is the random
dividend promised by the asset in the second period, and $c_2$ is second-period consumption. We
assume that $\tilde D \ge 0$.
Note that this model is one with incomplete markets, because the agent can invest using only
one asset, and yet there are two sources of risk: the asset dividend and endowment. We shall
analyze this model in Chapter 8, to explain the extent to which incomplete markets might help
rationalize the equity premium we observe, in practice, within a consumption-based perspective.
In equilibrium, $\theta = \theta_0$, such that the asset price equation is,
$$q = \beta\,\frac{E[u'(\tilde w + \theta_0\tilde D)\,\tilde D]}{u'(w)}. \qquad (2.39)$$
The previous sections have studied the special case where the asset is in zero net supply,
$\theta_0 = 0$, as mentioned. Eq. (2.39) shows that in general, and assuming $\tilde D \ge 0$, the asset price
is decreasing with asset supply, $\theta_0$. It is not a mere supply-demand effect but, rather, a
risk-premium effect, as we now explain.
We know that the state price for state $s$ is, consistently with results in Section 2.6.3,
$$\phi_s = \beta\,\pi_s\,\frac{u'(w_s + \theta_0 D_s)}{u'(w)}, \qquad (2.40)$$
where $\pi_s$ denotes the probability of state $s$. It is, as usual, the present consumption we are willing to give up, today, to obtain additional
consumption tomorrow, in state $s$. Due to decreasing marginal utility, we have that consumption
demand in state $s$, $c_s$, is decreasing in $\phi_s$: $\phi_s/\pi_s = \beta u'(c_s)/u'(c_0)$, for $s = 1, \dots, d$, such that $c_s$ is
decreasing in $\phi_s$. Therefore, in equilibrium, $\phi_s$ is decreasing in the good supply in state $s$,
$w_s + \theta_0 D_s$.
The previous scarcity effect, by which a shrinkage in supply for state $s$ determines an
increase in the state price $\phi_s$, is quite simple yet powerful. As we know, we price assets through
Arrow-Debreu assets, by assigning high weights, i.e. by utilizing high Arrow-Debreu prices, to
the bad states of nature: a scarcity effect. Therefore, an increase in the asset supply $\theta_0$, being
uniform across all states of nature, mitigates the previous scarcity effect and, then, lowers the
entire set of Arrow-Debreu prices, thereby reducing the asset price, $q$. In other words, the asset
price decreases because consumption becomes cheaper in each state of the world, due to this
scarcity channel, which requires less demand for savings.
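A short numerical sketch may help visualize this scarcity channel. It assumes CRRA utility, two equally likely states, and state prices of the form phi_s = beta * pi_s * u'(w_s + theta0 * D_s) / u'(w); the parameter values and function names are hypothetical and only meant to illustrate the mechanism.

```python
def marginal_utility(c, eta):
    """u'(c) = c**(-eta): CRRA marginal utility (an assumed functional form)."""
    return c ** (-eta)

def state_prices(theta0, probs, w2, D, w1=1.0, beta=0.95, eta=2.0):
    """phi_s = beta * pi_s * u'(w2_s + theta0 * D_s) / u'(w1) for each state s."""
    return [beta * p * marginal_utility(w + theta0 * d, eta) / marginal_utility(w1, eta)
            for p, w, d in zip(probs, w2, D)]

def asset_price(theta0, probs, w2, D, **kwargs):
    """q = sum_s phi_s * D_s: the dividend valued with the state prices above."""
    phi = state_prices(theta0, probs, w2, D, **kwargs)
    return sum(f * d for f, d in zip(phi, D))

probs = [0.5, 0.5]               # state probabilities (hypothetical)
w2 = [0.8, 1.2]                  # second-period endowment in each state
D = [0.5, 1.5]                   # nonnegative dividend in each state
supplies = [0.0, 0.5, 1.0, 2.0]  # asset supply theta0

prices = [asset_price(t, probs, w2, D) for t in supplies]
for t, q in zip(supplies, prices):
    print(f"theta0 = {t:.1f}  ->  q = {q:.4f}")

# State prices with zero supply versus unit supply of the asset.
phi_zero = state_prices(0.0, probs, w2, D)
phi_unit = state_prices(1.0, probs, w2, D)
print("state prices, theta0 = 0:", phi_zero)
print("state prices, theta0 = 1:", phi_unit)
```

In this parametrization, every state price falls when the supply increases, because consumption becomes more abundant in each state, and the asset price falls with it.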
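We can approach this problem from a different angle. Note that we can rewrite Eq. (2.39) as
follows,
$$q = \frac{E(\tilde D)}{R(\theta_0)} + \beta\,\frac{cov[u'(\tilde w + \theta_0\tilde D), \tilde D]}{u'(w)}, \qquad R(\theta_0) \equiv \frac{u'(w)}{\beta E[u'(\tilde w + \theta_0\tilde D)]}. \qquad (2.41)$$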
The first term is an actuarial evaluation of the asset. The second term is a risk-premium, which
comes as a discount to the initial actuarial evaluation, given the assumptions of decreasing
marginal utility, and $cov(\tilde w, \tilde D) > 0$. The previous equation reveals that an increase in $\theta_0$ entails
a heavier discounting effect, as the interest rate, $R(\theta_0)$, increases with $\theta_0$, as a result of
a decreased demand for savings. The second term is negative, as explained: the asset pays
off exactly when it is not needed. However, it becomes thinner and thinner as $\theta_0$ increases,
reflecting the fact that as $\theta_0$ increases, and given the assumption that $\tilde D \ge 0$, the agent bears
less and less risk: even poor realizations of the states would guarantee handsome overall
returns, when the asset supply is large. As we know from the discussion of Eq. (2.39), however,
the discounting effect dominates, with asset prices falling as $\theta_0$ increases.
Note that the previous reasoning does not apply, once we assume the asset is in zero net
supply, $\theta_0 = \theta = 0$, because as Eq. (2.38) makes clear, capital markets do not affect equilibrium
consumption anyway. Instead, in this case, only a pure endowment scarcity channel leads to
high Arrow-Debreu prices for the bad states of the world: those states where endowments are
lower command higher Arrow-Debreu prices, at least provided that $cov(\tilde w, \tilde D) > 0$, such that
the second term in Eq. (2.41) is negative.
All in all, Arrow-Debreu prices are independent of the asset returns, when the asset is in
zero net supply. When, instead, $\theta_0 > 0$ and, still, $cov(\tilde w, \tilde D) > 0$, one additional scarcity channel
is activated, which consists in a drop in consumption generated by a drop in the dividend.
Suppose, for example, that we enter bad times, when $\tilde w$ falls as a result, for example, of a job
loss. Not only, then, does $\tilde w$ fall: the dividend also likely falls, and exactly when you would need
it to compensate for the fall in $\tilde w$. It is in this sense that we may say that asset investing
might make consumption even more volatile.
The second part of these Lectures deals with the literature on the equity premium puzzle,
that is, the challenges that consumption-based explanations face in being consistent
with the observed equity premium.
2.6.2 The beta relation
Suppose there is a such that
=
all
In this case,
( ) =
( )
[ 0 ( + )]
( )=
and
( )
[ 0 ( + )]
[ ( )
( )
( )
( )
( )
(2.42)
Using Eqs. (2.37) and (2.42),
( )
( )
)
)
(
(
[ ( )
[CCAPM]
( )
( )
: =
, then
and then
( )
[ ( )
[CAPM]
This is not the only way the CAPM obtains. As we shall explain in Chapter 6, the CAPM also
obtains through the so-called maximum correlation portfolio, which is the portfolio that is
the most highly correlated with the pricing kernel .
0)
(0) (0)
)=
(0)
(0) (0)
=1
= 1
or,
0( 0
0)
X
=1
(0)
=0
The previous relation holds in a two-period economy. In a multiperiod economy, in the second
period (as in the following periods) agents save indefinitely for the future. In the appendix,
we show that,
"
#
X
0=
(2.43)
0
=0
where 0 are the state prices. From the perspective of time 0, at time
of nature and, thus, possible prices.
there exist
states
( )
0=
=1
( )
0=
=1
( )
=1
where the previous functions are the results of optimal plans of the agents. This system has
( + 1) + equations and ( + 1) + unknowns, where
. Let us aggregate the
constraints of the agents,
=1
=1
=1
=1
0 0
=1
0 =
( ) ( )
0 0
=1
= 0. Then,
=1
=1
1 1
( )
( )
1 ( 1) 1 ( 1)
=1
=1
( )
1 (
( )
1 (
>
)
Therefore, there is one redundant equation for each state of nature, or + 1 redundant
equations, in total. As a result, the equilibrium has less independent equations ( ( + 1) 1)
than unknowns ( ( +1)+ ), i.e., an indeterminacy degree equal to +1. This result does not
rely on whether markets are complete or not. In a sense, it is not even an indeterminacy result
when markets are complete, as we may always assume agents would organize the exchanges
at the beginning. In this case, only the suitably normalized Arrow-Debreu state prices would
matter for agents.
The previous indeterminacy can be reduced to $d - 1$, as we may use two additional homogeneity relations. To pin down these relations, let us consider the budget constraint of each agent $j$,
0 0 =
1 1 =
The first-period constraint is still the same if we multiply the spot price vector $p_0$ and the
financial price vector $q$ by a positive constant, $\lambda$ (say). In other words, if $(p_0, p_1, q)$ is an equilibrium, then, $(\lambda p_0, p_1, \lambda q)$ is also an equilibrium, which delivers a first homogeneity relation.
To derive the second homogeneity relation, we multiply the spot prices of the second period by
a positive constant, and increase at the same time the first-period agents' purchasing power,
by dividing each asset price by the same constant, as follows: $(p_0, \lambda p_1, q/\lambda)$
is also an equilibrium.
The previous indeterminacy arises because financial contracts are nominal, i.e. the asset payoffs
are expressed in terms of some unité de compte that, among other things, we did not make
precise. Such an indeterminacy vanishes if we were to consider real contracts, i.e. contracts
with payoffs expressed in terms of the goods. To show this, note that in the presence of real
contracts, the agents' constraints are
0 0 =
) 1 ( ) = 1( )
= 1
1(
where
= [ 1
] is the
matrix of the real payo s. The previous constraint
now reveals how to recover + 1 homogeneity relations. For each strictly positive vector
= [ 0 1
], we have that if [0
) 1 ( )] is an equilibrium, then,
1( 1)
1(
[ 0 0 0
(
)
(
)
(
)]
is
also
an
equilibrium,
and
so
is
1
1
1
1
[0
) 1 ( )], for , = 1 .
1( 1)
1(
As is clear, the distinction between nominal and real assets has a precise meaning, when
one considers a multi-commodity economy. Even in this case, however, such a distinction is
not very interesting without a suitable introduction of a unité de compte. These considerations
led Magill and Quinzii (1992) to solve the indeterminacy while still remaining in a framework
with nominal assets. They simply propose to introduce money as a means of exchange. The
indeterminacy can then be resolved by fixing the prices via the $d + 1$ equations defining the
money market equilibrium in all states of nature:
=
= 0 1
=1
=0
is generically nonneutral.
2.9 Appendix 1
In this appendix we prove that the program [2.P1] has a unique maximum. Indeed, suppose on the
contrary that we have two maxima:
= (1 ) and = 1
P
P
with
. To check that this
These two maxima would satisfy () = (),P
=1 =
=1 =
. Then, the consumption bundle,
claim is correct, suppose on the contrary that
=1
=
would be preferred to , by Assumption 2.1, and, at the same time, it would hold that, for su ciently
small ,
X
X
= 1+
.
=1
=1
[Indeed, we have,
.
0: + 1
. E.g., 1 =
,
0. The
=1
condition is then:
0:
.] Hence, would be a solution to [2.P1], thereby contradicting
the optimality of . Therefore, the existence of two optima would imply a full use of resources. Next,
) ,
(0 1). By Assumption 2.1,
consider a point lying between and , viz = + (1
( )=
() = ( )
+ (1
)
Moreover,
X
=1
X
=1
+ (1
=1
=1
=1
Hence,
( ) and is also strictly preferred to and , which means that and
as initially conjectured. This establishes uniqueness of the solution to [2.P1].
=
are not optima,
>
subsets
of , or
= . Because
is closed, then, by the Minkowskis separating theorem,
there exists a
R and two distinct numbers 1 , 2 such that
>
>
preferred to , we have:
=1
>
>
=1
or, by replacing
=1
with
=1
X
=1
,
>
>
=1
(2A.1)
=1
P
, and partition = (1 ). Let us apply
Next we show that
0. Let =
=1 , = 1
. We have 1
0, or
the inequality in (2A.1) to
and, for
0, to = (1 + )
0. By reiterating the argument,
0 for all . Finally, we choose = + 1
= 2 ,
1
> 1 + >1
or,
0 in (2A.1), > 1
> 1
> 1
1
for su ciently small. This means that 1 ( 1 )
1 ( )
1
>
1
>
1
= . By symmetry, = arg max
arg max 1 1 ( ) s.t.
> 1
> 1 .
( ) s.t.
>
Proof of Theorem 2.9. The condition in (2.18) holds for any compact subset of R++1 , and
therefore it holds when it is restricted to the unit simplex in R++1 ,
h
S = {0} .
> ,
= 1
h i,
S . By
. On the other hand,
+1
0 h i, which reveals that 1 0, and R++
. Next we show that > = 0. Assume the contrary,
h i that satises at the same time > 6= 0. In this case, there would be a real number
i.e.
>
h i and
with sign( ) = sign( > ) such that
2 , a contradiction. Therefore, we have
>
>
>
>
= ( (
) ) = ( 0 + ( ) ) ,
R , where ( ) contains the last components
0=
The proof of the converse is immediate (hint: multiply by ): shown in further notes.
The proof of the second part is the following one. We have that each point of R +1 is equal to
each point of h i plus each point of h i , or dim h i + dim h i = + 1. Since dim h i =
rank( ), dim h i = + 1 dim h i, and since = > in the absence of arbitrage opportunities,
dim h i = dim h i = , whence:
dim h i =
+1
>
>
In other terms, before we showed that :
= 0, or
h i . Whence dim h i
1 in
the absence of arbitrage opportunities. The previous relation provides more information. Specically,
>
dim h i = 1 if and only if = . In this case, dim{ R++1 :
= 0} = 1, which means that
>
the relation
= 0 also holds truefor
=
for every positive scalar , but there are no
0 +
>
1
other possible candidates. Therefore,
= is such that = ( ), and then it is unique.
0
0
>
By a similar reasoning, dim{ R++1 :
= 0} =
+1
dim
R++ : = >
=
.
k
(2)( )
be the price at
= 2 in state
(2)( )
if the state in
at
= 3. Let
(1)( )
= 1 is
(2)
0
]. Let
be the quantity purchased at = 1 in state of Arrow-Debreu securi[ 0
0
ties promising 1 unit of numeraire if at = 2. Let 2 be the price of the good at = 2 in state if
the previous state at = 1 was . The budget constraint is
0( 0
1
0)
1
(0) (0)
(0)( )
(0)( ) (0)( )
=1
(1) (1)
(0)( )
(1)( ) (1)( )
= 1
=1
(1)( )
is the price to be paid at time 1 and in state , for an Arrow-Debreu security yielding 1
where
unit of numeraire in state at time 2. The previous two equations can be combined to leave,
i
P (0)( ) h 1 1
(1) (1)
1
+
0( 0
0) =
or,
0=
0( 0
0)
0( 0
0)
0( 0
0)
At time 2,
2
P
P
P
(1)( )
(0)( ) 1
(0)( ) 1
(0)( ) 1
(1)( )
(2) (2)
+
+
+
P
P
(0)( ) P
P P
X
=1
(2)( ) (2)( )
= 1
(2A.2)
(2A.3)
where
denotes the price vector to be paid at = 2 in state if the state at = 1 is , for the
Arrow-Debreu securities expiring at = 3, with remaining notation being straightforward.
Plugging Eq. (2A.3) into Eq. (2A.2) leaves:
i
P P (0)( ) (1)( ) h 2 2
P (0)( ) 1 1
(2) (2)
1
2
+
+
0= 0( 0
0) +
P P (0)( ) (1)( ) 2 2
P (0)( ) 1 1
1
2
= 0( 0
(
)
+
0) +
P P P (0)( ) (1)( ) (2)( ) (2)( )
(2A.4)
+
In the absence of arbitrage,
is 0 , such that:
+1
( )( )
0
+1
= 1
R+ and has all zeros except in the -th component which is 1. Next, we restate the
where
( )
previous relation in terms of the kernel +1 0 = ( +1 0 ) =1 and the probability distribution +1 0 =
(
( )
+1
=1
( )
+1
( )
+1
0:
is
= 1
(2A.5)
Eq. (2.43) in the main text follows by replacing Eq. (2A.5) into Eq. (2A.4), and by imposing the
transversality condition:
X XX X
1 =1
2 =1
3 =1
4 =1
=1
1)( )
1
(1)
where
1( 1)
1 2
2)
1(
2
= ( 1(
1)
2 1
1(
..
=
1(
)
2
2 1
1( 1)
1(
1
1)
is the payo s matrix. We can rewrite the second period constraint as 1 1 = , where 1 is
( 1 ( 1 ) 1 ( 1 ) 1 ( ) 1 ( ))0 . The budget constraints are
dened similarly as 0 , and 1 1
then,
0 0 =
1 1 =
Now suppose that markets are complete, i.e., = and can be inverted. The second constraint
1
=
= . We have
is then:
1
1 . Consider without loss of generality Arrow securities, or
= 1 1 , and by replacing into the rst constraint,
0 =
0 0
0 0
0 0
0 0
1 1
( 1( 1) 1 ( 1)
P
+
1( ) 1 ( )
1(
=1
P1
=1
P1
=1
( ) ( )
0
0
( ) ( )
0
0
=1
P P2
=1 =1
P2
=1
( )
1 (
( )
1 ( )
( )
( )
))0
( )
where 1 ( )
1 ( ). The price to be paid today for the obtention of a good in state is equal
( )
to the price of an Arrow asset written for state multiplied by the spot price 1 ( ) of this good in this
( )
state; here the Arrow-Debreu state price is 1 ( ). The general equilibrium can be analyzed by making
. Then we are left with
reference to such state prices. From now on, we simplify and set 1 = 2
(1)
( )
(1)
( )
determining ( + 1) equilibrium prices, i.e. 0 = ( 0 0 ), 1 ( 1 ) = (1 ( 1 ) 1 ( 1 )),
(1)
( )
, 1 ( ) = (1 ( ) 1 ( )). By exactly the same arguments of the previous chapter, there
exists one degree of indeterminacy. Therefore, there are only ( + 1) 1 relations that can determine
the ( + 1) prices. (Price normalization can be done by letting one of the rst period commodities
be the numeraire.) On the other hand, in the initial economy we have to determine ( + 1) + prices
( +1)
R++ which are the solution to the system:
( ) R++
P
=1
( ) = 0
( ) = 0
=1
( ) = 0
=1
where the previous functions are obtained as solutions to the agents programs. When we solve for
Arrow-Debreu prices, in a second step we have to determine ( + 1) + prices starting from the
knowledge of ( + 1) 1 relations dening the Arrow-Debreu prices, which implies a price indeterminacy of the initial economy equal to + 1. In fact, it is possible to show that the degree of
indeterminacy is only
1.
References
Arrow, K. J. (1953): Le rôle des valeurs boursières pour la répartition la meilleure des
risques. Econométrie 41-48. CNRS, Paris. Translated and reprinted in 1964: The Role
of Securities in the Optimal Allocation of Risk-Bearing. Review of Economic Studies 31,
91-96.
Bernoulli, D. (1738): Specimen Theoriae Novae de Mensura Sortis. Commentarii Academiae
Scientiarum Imperialis Petropolitanae V, 175-192. Reprinted in English in 1954: Exposition of a New Theory on the Measurement of Risk. Econometrica 22, 23-36.
Debreu, G. (1954): Valuation Equilibrium and Pareto Optimum. Proceedings of the National
Academy of Sciences 40, 588-592.
Debreu, G. (1959): Theory of Value: An Axiomatic Analysis of Economic Equilibrium. New
Haven: Yale University Press.
Duffie, D. (2001): Dynamic Asset Pricing Theory. Princeton: Princeton University Press.
Duffie, D. and W. Shafer (1985): Equilibrium in Incomplete Markets: I. A Basic Model of
Generic Existence. Journal of Mathematical Economics 13, 285-300.
Hart, O. (1974): On the Existence of Equilibrium in a Securities Model. Journal of Economic
Theory 9, 293-311.
He, H. and N. Pearson (1991): Consumption and Portfolio Policies with Incomplete Markets
and Short-Sales Constraints: The Infinite Dimensional Case. Journal of Economic Theory
54, 259-304.
Huang, C-f. and R.H. Litzenberger (1988): Foundations for Financial Economics. New York:
North-Holland.
Magill, M. and M. Quinzii (1996): Theory of Incomplete Markets. Cambridge: MIT Press.
Wilson, R. (1968): The Theory of Syndicates. Econometrica 36, 119-132.
3
Infinite horizon economies
3.1 Introduction
This chapter extends the analysis of the previous two. Consumption is still an important determinant of asset prices. At the same time, this chapter analyzes asset prices in multiperiod
economies. We consider simple economies, in which agents either live forever and have access to
a set of complete markets, or belong to overlapping generations. We consider models without
and with production, without and with money. We aim to develop fundamental tools applied
or extended in subsequent chapters while dealing with financial frictions, bubbles or sunspots.
[In progress]
0)
max
( )
+1
=0
=(
( )
[3.P1]
=0
+1
) =0 given
+1 )]
s.t.
+1
=(
(3.1)
+1
By replacing the wealth constraint into the maximand, it is easily checked that the first-order
condition for $c_t$ leads to $u'(c_t) = \beta V'(w_{t+1}) R_{t+1}$. Therefore, the consumption policy is a function
of both wealth and the interest rate, which for sake of simplicity we denote as $c(w_t)$. The value
function and the first-order condition, then, can be written as:
( ) = ( ( )) +
((
( ))
+1 )
( ( )) =
((
( ))
+1 )
+1
Differentiating the value function, and using the first-order condition, leaves the envelope
condition:
0
( )=
Therefore,
( ( )) 0 ( ) +
+1 )
( (
+1 ))
((
( ))
+1 ) (1
( ))
+1
( ( ))
(3.2)
( ( +1 ))
=
0 ( ( ))
(3.3)
+1
The economic intuition underlying Eq. (3.3) is the same as that we saw in the two-period
economy analyzed in Chapter 2. Eq. (3.3) says that along an optimal consumption path, the
present consumption I give up at $t$ to obtain additional consumption at $t + 1$ has to equal the
price at $t$ of a pure discount bond. That is, the bond price is the price of consumption
tomorrow relative to consumption today.
We can achieve the same conclusion relying on an alternative approach, based on Lagrange
multipliers. This approach is useful when dealing with more intricate issues relating to production economies or economies with financial frictions, as we shall see in this and further chapters.
So consider the constraint in program [3.P1]. Savings at time are sav
. Using this
denition, the constraint in [3.P1] is: +1 + sav +1 = +1 sav , with sav 1 = 0 , given. Let
be a sequence of Lagrange multipliers associated to these constraints. Consider the program,
L (sav 1 )
max
sav )
=0
( )
( + sav
sav
=0
1)
where $\lambda_t$ is a sequence of Lagrange multipliers. The first-order condition for consumption is
$u'(c_t) = \lambda_t$, and the first-order condition for savings $\text{sav}_t$ leads to: $\lambda_t = \beta\lambda_{t+1}R_{t+1}$. Putting all
together yields precisely Eq. (3.3). Finally, note that the same program can be cast, and solved,
in a recursive format,
L (sav
1)
= max [ ( )
( + sav
sav
sav
1)
+ L (sav )]
is constant,
)+
ln (
)) (1
).
( ( )) +
(
)+
+1 + 1
+1 ;
ln (
+1 ).
By
If
+1 ).
Consider the following thought experiment. At time $t$, I give up a small quantity of consumption equal to $\Delta c_t$. The reduction of utility at $t$ then equals $u'(c_t)\Delta c_t$. But by investing $\Delta c_t$ in a
safe asset, I can consume $\Delta c_{t+1} = R_{t+1}\Delta c_t$ more at $t + 1$. These additional consumption units
lead to an expected utility gain equal to $\beta E_t(u'(c_{t+1})\Delta c_{t+1})$, where $E_t$ denotes the expectation conditional on time-$t$ information. If $c_t$ and $c_{t+1}$ are part of an optimal consumption plan,
I should be left with no incentives to implement these intertemporal consumption transfers.
Therefore, along an optimal consumption plan, any reductions and gains in the welfare of the
type considered above need to be identical:
$$u'(c_t) = \beta E_t\left(u'(c_{t+1})R_{t+1}\right).$$
This relation generalizes Eq. (3.3). Next, suppose that at time $t$, $\Delta c_t$ can be invested in
a risky asset whose price is $q_t$. I can buy $\Delta c_t/q_t$ units of this asset. Come time $t + 1$, I
could sell the asset for $q_{t+1}$, pocket its dividend $D_{t+1}$ if any, and finance additional units of
consumption equal to $\Delta c_{t+1} = (\Delta c_t/q_t)(q_{t+1} + D_{t+1})$. The reduction in the current utility
is $u'(c_t)\Delta c_t$. The boost in the expected utility at time $t + 1$ is $\beta E_t(u'(c_{t+1})\Delta c_{t+1})$. Again, if
I am on an optimal consumption plan, there should not be incentives left to implement these
intertemporal transfers. Therefore, the celebrated Lucas asset pricing equation holds:
$$u'(c_t) = \beta E_t\left[u'(c_{t+1})\,\frac{q_{t+1} + D_{t+1}}{q_t}\right]. \qquad (3.4)$$
Section 3.2.4 derives Eq. (3.4) through dynamic programming methods, which are essential
once we wish to work through more complex models such as those including financial frictions.
The next section, instead, elaborates on the optimality condition in Eq. (3.3) and develops
a key concept in both financial economics and macroeconomics: the intertemporal elasticity of
substitution.
3.2.3 Intertemporal elasticity of substitution
The elasticity of substitution between two consumption goods,
ratio,
ES (
)=
and
, is dened as the
and
are the prices of the two goods. It measures the percentage change in the
where
relative consumption choice of two goods after a percentage change in their relative prices.
Similarly, dene EIS (
)
ES (
) as the elasticity of intertemporal substitution of
0
consumption and at two points in time and
. By the rst order conditions, 0 (( )) =
,
0
( )/ 0 ( )
( / )
( / )
)=
=
EIS (
0
0
( ( )/ ( ))
/
( / )
where
=
0
0
( )
( )
denotes the price of a zero-coupon bond; accordingly, denotes the gross interest rate from
to . Note that we are assuming no uncertainty in these basic derivations.
The elasticity EIS (
) is a measure of the percentage increase in the desired consumption
tomorrow relative to today, after a percentage decrease of the price of consumption tomorrow
relative to today. Intuitively, high values of EIS (
) describe a situation where the agent is
quite sensitive about consuming at and : even a small increase in the interest rate from
to and, hence, a small percentage drop in , can induce him to a substantial relative increase
of consumption in the future.
As
, EIS (
) collapses to the inverse of the elasticity of marginal utility with respect
to consumption or, simply, the relative risk-aversion
1
EIS ( )
lim
1
EIS (
=
=
lim
lim
00
0
/
0 ( )/
1+
( )
( )
(
0
( )
00 (
)
0( )
( )/ 0 ( ))
( / )
where the second equality follows by a rst-order Taylors expansion of the marginal utility
of consumption at time , 0 ( ) = 0 ( ) + 00 ( ) (
) + ((
)2 ). The expression,
EIS ( ), is called instantaneous elasticity of intertemporal substitution.
For example, in the CRRA case, and again in the deterministic case, we have that along an
optimal consumption path, $c_{t+1}/c_t = (\beta R)^{1/\eta}$, where $\eta$ is the CRRA: as $R$ increases, it becomes
more attractive to save and postpone consumption. In a stylized equilibrium with a representative agent, $\ln R = -\ln\beta + \eta g$, where $g$ denotes the growth rate of the economy. When $g$ is
high, more consumption will be available in the future, which mitigates the incentives to save,
driving the interest rate up.
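The CRRA example lends itself to a quick numerical check. The sketch below assumes u'(c) = c**(-eta), a deterministic gross interest rate R and a discount factor beta; the parameter values and function names are illustrative. It verifies that the growth rate (beta * R)**(1/eta) satisfies the optimality condition, and that the elasticity of consumption growth with respect to the interest rate equals 1/eta.

```python
import math

def consumption_growth(R, beta, eta):
    """Optimal gross consumption growth c_{t+1}/c_t = (beta*R)**(1/eta) under CRRA."""
    return (beta * R) ** (1.0 / eta)

def euler_residual(R, beta, eta, c_t=1.0):
    """u'(c_t) - beta * R * u'(c_{t+1}) along the path above; it should be ~0."""
    c_next = c_t * consumption_growth(R, beta, eta)
    return c_t ** (-eta) - beta * R * c_next ** (-eta)

def eis_numerical(R, beta, eta, h=1e-6):
    """d ln(c_{t+1}/c_t) / d ln(R), approximated by a central finite difference."""
    up = math.log(consumption_growth(R * math.exp(h), beta, eta))
    down = math.log(consumption_growth(R * math.exp(-h), beta, eta))
    return (up - down) / (2.0 * h)

def equilibrium_log_rate(g, beta, eta):
    """ln R = -ln(beta) + eta * g when consumption grows at the (log) rate g."""
    return -math.log(beta) + eta * g

beta, eta, R = 0.96, 4.0, 1.05   # hypothetical parameter values

print("Euler residual:", euler_residual(R, beta, eta))
print("numerical EIS :", eis_numerical(R, beta, eta), "vs 1/eta =", 1.0 / eta)
for g in (0.00, 0.02, 0.04):
    print(f"g = {g:.2f}  ->  ln R = {equilibrium_log_rate(g, beta, eta):.4f}")
```

A larger eta flattens the response of consumption growth to the interest rate and, in the stylized equilibrium, makes the interest rate more sensitive to the growth rate of the economy.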
An agent with a low EIS has a quite inelastic demand for bonds. Intuitively, when the price
of consumption in the future relative to today drops, desired consumption tomorrow relative
to today increases. But for an agent with a low EIS, the desired relative increase in future
consumption is quite limited, and so is his demand for bonds, the instruments that allow him
to allocate intertemporal consumption.
3.2.4 Lucas model
3.2.4.1 The optimality condition
(
)=
max
( + ) F
[3.P2]
( +
+ ) =0
=0
+
+ )
s.t.
+1 = (
where F denotes the information set as of time , +1 R is F -measurable, that is, +1 needs
to be chosen at time . We can solve the program [3.P2], using the same recursive approach in
Section 3.2.1, once due account is made of uncertainty. The Bellman equation is:
(
) = max
[ ( )+
+1
+1
+1 )| F
] s.t.
+1
=(
(3.5)
Similarly as we did for Eq. (3.1), let us replace the budget constraint into the maximand. The
following rst-order condition holds for :
0=
((
+1 )
+1
+1 )| F
(3.6)
where the subscript in the value function on the right hand side denotes a partial derivative:
)=
(
)
. The optimal policy, +1 is a function of the current state, (
),
1 (
say +1 = T (
). By di erentiating the value function with respect to , and using the
previous rst-order condition, leaves:
"
!
#
P
P
0
)=
( )
+
T1 (
) +
)
1 (
1 ( +1
+1 ) T1 (
=1
( )(
=1
+1 +
+1
0
0
(3.7)
( )=
( +1 )
It is easy to show to extend these conditions to the case where a representative agent can
also invest into a locally riskless asset, that is, an asset that expires over the next period. The
budget constraint in Eq. (3.5) is, in this case: +
+ ) + 0 , where
+1 + 0 0 +1 = (
denotes
the
amount
of
the
locally
riskless
asset,
,
and
is
the riskless interest
0
0
0
rate, and the Lucas equation for the would be:
=
[ ( +1 )].
3.2.4.2 Rational expectations equilibrium
The asset market clears when for each , P= 1 and 0 = 0. By the budget constraint,
. A rational expectation equilibrium
then, the market for goods also clears, =
=1
is a sequence of asset prices ( ) =0 such that the optimality condition in Eq. (3.7) holds, the
markets clear, = , and each asset price is a function of the state,
= ( ) say. All in
all,
Z
0
0
( ) ( )=
( +1 ) ( +1 ) +
(3.8)
( +1 | )
+1
This is a functional equation in
( +1 ).
+1 |
) =
IID shocks
( )
)=
+1 )
+1 )
+1
+1 )
(3.9)
Note that the right hand side of this equation is independent of $D_t$. Therefore, $u'(D_t)q_i(D_t)$
equals some constant $\kappa_i$ (say), which we can easily find by substituting it back into the previous
equation, leaving:
$$\kappa_i = \frac{\beta}{1 - \beta}\int u'(D_{t+1})\,D_{i,t+1}\,dP(D_{t+1}).$$
Therefore, the solution for $q_i(D_t)$ is:
$$q_i(D_t) = \frac{\kappa_i}{u'(D_t)}.$$
00
( )
, which collapses to relative riskNote, the elasticity of the price to dividend equals
0( )
aversion, once we assume only one tree exists. For example, if relative risk-aversion is constant
and equal to ,
Z
)=
( )
Figure 3.1 depicts the behavior of the asset price function ( ), under the assumption that
is not increasing in . The asset price collapses to the constant, (1
) 1 ( ), in the special
case where the representative agent is risk-neutral, = 0.
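The IID case with a single tree can be verified numerically. The sketch below assumes CRRA marginal utility u'(D) = D**(-eta) and a discrete IID distribution for the dividend; the grid, the probabilities and the function names are hypothetical. It builds the candidate price function q(D) = kappa / u'(D), with kappa = beta/(1 - beta) * E[u'(D) D], and checks that it satisfies the pricing equation in every state.

```python
def iid_price_function(dividends, probs, beta=0.95, eta=2.0):
    """Return q(D) on the grid: q(D) = kappa / u'(D), with u'(D) = D**(-eta)."""
    kappa = beta / (1.0 - beta) * sum(p * d ** (1.0 - eta)
                                      for p, d in zip(probs, dividends))
    return [kappa * d ** eta for d in dividends]

def euler_residuals(dividends, probs, prices, beta=0.95, eta=2.0):
    """u'(D) q(D) - beta * E[u'(D') (q(D') + D')] for each current state D."""
    rhs = beta * sum(p * d ** (-eta) * (q + d)
                     for p, d, q in zip(probs, dividends, prices))
    return [d ** (-eta) * q - rhs for d, q in zip(dividends, prices)]

dividends = [0.8, 1.0, 1.3]     # hypothetical dividend grid
probs = [0.3, 0.4, 0.3]         # IID probabilities of each dividend level

prices = iid_price_function(dividends, probs)
residuals = euler_residuals(dividends, probs, prices)
print("prices   :", prices)
print("residuals:", residuals)

# Risk-neutral benchmark (eta = 0): the price no longer depends on D.
flat_prices = iid_price_function(dividends, probs, eta=0.0)
mean_dividend = sum(p * d for p, d in zip(probs, dividends))
print("risk-neutral prices:", flat_prices, "vs", 0.95 / 0.05 * mean_dividend)
```

With eta > 0 the price is increasing in the current dividend through the term 1/u'(D), while with eta = 0 it collapses to the constant beta/(1 - beta) times the expected dividend, as stated in the text.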
Dependent shocks
0
Dene ( )
( ) ( ) and
functions, Eq. (3.8) is:
( )=
( )
( )+
+1 )
+1 )
+1
+1 |
+1 |
It is a functional equation, which can be shown to admit a unique solution under the
conditions contained in the celebrated Blackwell's theorem below:
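Theorem 3.1. Let $B(\mathbb{D})$ be the Banach space of continuous bounded real functions on
$\mathbb{D} \subseteq \mathbb{R}^n$, endowed with the norm $\|f\| = \sup_{\mathbb{D}}|f|$, $f \in B(\mathbb{D})$. Introduce an operator
$T : B(\mathbb{D}) \mapsto B(\mathbb{D})$ with the following properties:
(i) $T$ is monotone: $\forall x \in \mathbb{D}$ and $f_1, f_2 \in B(\mathbb{D})$, $f_1(x) \le f_2(x) \Rightarrow T[f_1](x) \le T[f_2](x)$;
(ii) $\forall x \in \mathbb{D}$ and $c \ge 0$, $\exists \beta \in (0, 1) : T[f + c](x) \le T[f](x) + \beta c$.
Then, $T$ is a $\beta$-contraction and, $\forall f_0 \in B(\mathbb{D})$, it has a unique fixed point $\lim_{\tau\to\infty} T^\tau[f_0] = f = T[f]$.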
| )
The existence of
and, hence, , relies on the existence of a xed point of
:
= [ ].
It is easily checked that conditions (i) and (ii) in Theorem 3.1 hold here. To establish that $T : B(X) \mapsto B(X)$ as well, it is sufficient to show that $h_i \in B(X)$. A sufficient condition given by Lucas (1978) is that $u$ is bounded, with upper bound given by a constant $\bar{u}$.² Note, a log-utility agent would not satisfy this condition, yet this case can be easily solved in the case of a single tree, as shown next.
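The contraction argument can be illustrated on a finite state space, where the integral becomes a matrix product. The sketch below is only an illustration under stated assumptions: a hypothetical two-state Markov chain `P` for the dividend, an arbitrary vector `h` standing in for the function $h_i$, and the operator $T[g] = h + \beta P g$. It iterates $T$ from $g_0 = 0$ and compares the limit with the exact solution of the linear system $(I - \beta P)\,g = h$.

```python
import numpy as np

def iterate_operator(h, P, beta, n_iter):
    """Iterate T[g] = h + beta * P @ g from g0 = 0; also return the sup-norm step sizes."""
    g = np.zeros_like(h, dtype=float)
    steps = []
    for _ in range(n_iter):
        g_new = h + beta * P @ g
        steps.append(np.max(np.abs(g_new - g)))  # sup-norm distance between iterates
        g = g_new
    return g, np.array(steps)

beta = 0.95
P = np.array([[0.9, 0.1],      # hypothetical transition probabilities
              [0.2, 0.8]])
h = np.array([1.0, 1.5])       # hypothetical values of h in the two states

g_iter, steps = iterate_operator(h, P, beta, n_iter=600)
g_exact = np.linalg.solve(np.eye(2) - beta * P, h)  # fixed point g = T[g]
```

Each step shrinks the sup-norm distance between successive iterates by at least the factor $\beta$, which is the contraction property the theorem delivers.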
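Suppose, then, that $u(D) = \ln D$, and that there is one single asset, such that Eq. (3.8) collapses to
$$\frac{q(D_t)}{D_t} = \beta\int\left[\frac{q(D_{t+1})}{D_{t+1}} + 1\right]dP(D_{t+1}\mid D_t),$$
which is solved by the constant price-dividend ratio $q(D)/D = \beta/(1-\beta)$. Note that this result does not depend on any distributional assumption on the dividend process.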
However, in the general CRRA case, little more can be said, not even in the single asset case. Indeed, by Eq. (3.8),
$$\frac{q(D_t)}{D_t} = \beta\int\left(\frac{D_{t+1}}{D_t}\right)^{1-\eta}\left[\frac{q(D_{t+1})}{D_{t+1}} + 1\right]dP(D_{t+1}\mid D_t),$$
such that the price-dividend ratio is constant whenever the distribution of the consumption endowment growth rate is independent of $D_t$. In Chapter 6 of Part II, we develop this case in more detail, assuming a log-normal distribution for $D_{t+1}/D_t$.
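To make the constant price-dividend ratio concrete, the sketch below assumes that gross dividend growth $G = D_{t+1}/D_t$ is IID log-normal, $\ln G \sim N(\mu, \sigma^2)$. Guessing $q(D) = vD$ in the CRRA pricing equation gives $v = m/(1-m)$ with $m = \beta E[G^{1-\eta}]$. The function name and the parameter values are hypothetical and only serve the illustration.

```python
import math

def price_dividend_ratio(beta, eta, mu, sigma):
    """Constant price-dividend ratio with CRRA utility and IID log-normal growth."""
    # m = beta * E[G^(1 - eta)], using the log-normal moment formula
    m = beta * math.exp((1.0 - eta) * mu + 0.5 * (1.0 - eta) ** 2 * sigma ** 2)
    if m >= 1.0:
        raise ValueError("beta * E[G^(1-eta)] must be below one for a finite price")
    return m / (1.0 - m)

pd_log = price_dividend_ratio(beta=0.96, eta=1.0, mu=0.02, sigma=0.10)
pd_crra = price_dividend_ratio(beta=0.96, eta=3.0, mu=0.02, sigma=0.10)
```

With $\eta = 1$ the ratio is $\beta/(1-\beta)$ whatever the values of $\mu$ and $\sigma$, in line with the log-utility result.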
3.2.4.3 Arrow-Debreu state prices
The model makes sharp predictions regarding Arrow-Debreu securities prices. In terms of the terminology of Chapter 2, we want to identify the stochastic discounting factor. By the asset pricing equation (3.9), it is
$$m_{t+1} = \beta\,\frac{u'(D_{t+1})}{u'(D_t)}.$$
2 In this case, concavity of $u$ implies that for each $D$, $0 = u(0) \le u(D) + u'(D)(0 - D) \le \bar{u} - D\,u'(D)$, which implies that $D\,u'(D) \le \bar{u}$ for each $D$, and, hence, $h_i(D) \le \beta\bar{u}$. Then, it is possible to show that the solution is in $B(X)$, which implies that $T : B(X) \mapsto B(X)$.
The Lucas model is extraordinarily complex as, in general, the price of any asset depends on the dividends paid by all the remaining assets, as Eq. (3.8) makes clear. The model can generate contagion, in that a shock in the fundamentals affecting some assets affects the valuation of all the other assets, even when the dividends are not correlated. It is an interesting property, due to the simple circumstance that there is a representative agent who is pricing the same assets (markets are not segmented), and a shock to the stochastic discounting factor, $m_{t+1}$, affects all the asset prices, $q_i$. We mention efforts made by the literature, discussed in deeper detail in Chapter 8 (Section 8.10): Menzly, Santos and Veronesi (2004), Cochrane, Longstaff and Santa-Clara (2008), Pavlova and Rigobon (2008), Martin (2011).
( )
= ( )
( )
The
consumers live forever. We assume each consumer offers inelastically one unit of labor,
and that, for now, that 0 = 1 and = 0. The resource constraint for the consumer is:
+
= 1 2
(3.10)
At each time
1, the consumer saves
1 units of capital, which he lends to the rm. At time
, the consumer receives the gross return on savings from the rm,
= 0 ( ),
1 , where
1(
0)
and lend
to the
Following the approach developed in Chapter 2, we can write down a single budget constraint,
obtained iterating Eq. (3.10):
0=
=1
+Q
=1
lim
0
=1
=0
(3.11)
=1
so as to have:
max
( )
=0
( )
s.t.
0+
=1
X
=1
[3.P3]
=1
The economic interpretation of the transversality condition (3.11) is the following. The first-order conditions of the program [3.P3] are:
$$\beta^t u'(c_t) = \frac{\ell}{\prod_{i=1}^{t} R_i} \tag{3.12}$$
where $\ell$ is a Lagrange multiplier. In equilibrium, current savings equal next period capital. Therefore, Eq. (3.11) is:
$$\lim_{T\to\infty}\beta^T u'(c_T)\,k_{T+1} = 0 \tag{3.13}$$
That is, the value of capital is capital weighted by discounted marginal utility, and is worthless, eventually. We shall derive Eq. (3.13) below.
The first-order condition (3.12) leads to the usual optimality condition in Eq. (3.3), where this time, $R_{t+1} = y'(k_{t+1})$. In this economy, an equilibrium is a sequence $((c_t, k_t))_{t=0}^{\infty}$ satisfying
$$k_{t+1} = y(k_t) - c_t, \qquad \beta\,\frac{u'(c_{t+1})}{u'(c_t)} = \frac{1}{y'(k_{t+1})} \tag{3.14}$$
and the transversality condition in Eq. (3.13). The first equation in this system is simply this: capital available for producing the next period, $k_{t+1}$, is equal to savings, $y(k_t) - c_t$.
3.3.2 The social planner solution
3.3.2.1 Recursive plan
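The market solution in (3.14) can be implemented by a social planner, who solves the following program:
$$V(k_0) = \max_{(c_t,\,k_{t+1})_{t=0}^{\infty}}\ \sum_{t=0}^{\infty}\beta^t u(c_t) \quad \text{s.t. } k_{t+1} = y(k_t) - c_t,\ \ k_0 \text{ given} \qquad \text{[3.P4]}$$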
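The Lagrangian of this program is:
$$\max_{(c_t,\,k_{t+1})_{t=0}^{\infty}}\ \sum_{t=0}^{\infty}\beta^t\big[u(c_t) + \lambda_t\,(y(k_t) - c_t - k_{t+1})\big].$$
The first-order condition with respect to consumption is $\lambda_t = u'(c_t)$, and the condition for capital is $\lambda_{t-1} = \beta\,\lambda_t\,y'(k_t)$. Putting these conditions together leads to the second equation in (3.14). The same argument can be made, following a recursive approach. We have: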
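$$V(k_t) = \max_{c_t}\,\big[u(c_t) + \beta V(k_{t+1})\big] \quad \text{s.t. } k_{t+1} = y(k_t) - c_t.$$
The first-order condition leads to $u'(c_t) = \beta V'(y(k_t) - c_t)$. Let us denote the policy with $c_t = c(k_t)$. In terms of the policy function, the value function and the first-order conditions are:
$$V(k) = u(c(k)) + \beta V(y(k) - c(k)), \qquad u'(c(k)) = \beta V'(y(k) - c(k)).$$
Differentiating the value function, and using the first-order condition,
$$V'(k) = u'(c(k))\,c'(k) + \beta V'(y(k) - c(k))\,\big(y'(k) - c'(k)\big) = u'(c(k))\,y'(k).$$
By replacing back into the first-order condition, we obtain the second equation in (3.14).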
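The recursive plan lends itself to a simple numerical treatment. The sketch below is a minimal value function iteration on a capital grid, under assumptions that are not in the text above: $u(c) = \ln c$, $y(k) = k^{\gamma}$, and illustrative values of $\beta$, $\gamma$ and the grid. For this particular specification the optimal policy is known in closed form, $k_{t+1} = \beta\gamma k_t^{\gamma}$, which gives a benchmark for the grid solution.

```python
import numpy as np

def value_function_iteration(beta=0.95, gamma=0.3, n_grid=200, tol=1e-8, max_iter=2000):
    """Solve V(k) = max_{k'} [ln(k**gamma - k') + beta V(k')] on a grid for k and k'."""
    k_ss = (gamma * beta) ** (1.0 / (1.0 - gamma))    # steady-state capital
    grid = np.linspace(0.2 * k_ss, 2.0 * k_ss, n_grid)
    cons = grid[:, None] ** gamma - grid[None, :]      # c implied by each (k, k') pair
    # Infeasible choices (c <= 0) get utility -inf so they are never selected
    util = np.where(cons > 0, np.log(np.where(cons > 0, cons, 1.0)), -np.inf)
    V = np.zeros(n_grid)
    for _ in range(max_iter):
        V_new = np.max(util + beta * V[None, :], axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    policy = grid[np.argmax(util + beta * V[None, :], axis=1)]  # k' chosen at each k
    return grid, V, policy

grid, V, policy = value_function_iteration()
closed_form = 0.95 * 0.3 * grid ** 0.3   # benchmark policy k' = beta * gamma * k**gamma
```

The grid policy can only take values on the grid, so it tracks the closed-form policy up to a discretization error of the order of the grid spacing.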
3.3.2.2 Transversality condition again
Another derivation of the transversality condition relies on the following arguments. Consider the following truncated program,
max
(
+1 ) =0
( )
( )+ )
+1
=0
s.t.
+1
=0
0
The first-order conditions are the usual ones. Moreover, by multiplying
( )=
by
+1
0
and utilizing the previous constraint leaves: $\beta^T u'(c_T)\,k_{T+1} = 0$. Taking the limit for large $T$ yields Eq. (3.13).
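Both the Euler equation in (3.14) and the transversality condition (3.13) can be checked along an explicit path. The sketch below assumes $u(c) = \ln c$ and $y(k) = k^{\gamma}$, a case with the known closed-form policy $k_{t+1} = \beta\gamma k_t^{\gamma}$; the parameter values, the initial capital and the function names are illustrative assumptions.

```python
import numpy as np

def simulate_planner(k0, beta, gamma, T):
    """Capital and consumption paths under the policy k' = beta * gamma * k**gamma."""
    k = np.empty(T + 1)
    c = np.empty(T)
    k[0] = k0
    for t in range(T):
        output = k[t] ** gamma
        k[t + 1] = beta * gamma * output
        c[t] = output - k[t + 1]          # resource constraint k' = y(k) - c
    return k, c

def euler_residuals(k, c, beta, gamma):
    """u'(c_t) - beta u'(c_{t+1}) y'(k_{t+1}) for t = 0, ..., T-2, with u = ln."""
    marginal_product = gamma * k[1:-1] ** (gamma - 1.0)
    return 1.0 / c[:-1] - beta * marginal_product / c[1:]

def transversality_terms(k, c, beta):
    """beta^t u'(c_t) k_{t+1}, which should vanish as t grows."""
    t = np.arange(len(c))
    return beta ** t * k[1:] / c

beta, gamma = 0.95, 0.3
k, c = simulate_planner(k0=0.05, beta=beta, gamma=gamma, T=400)
residuals = euler_residuals(k, c, beta, gamma)
tv = transversality_terms(k, c, beta)
```

In this example $\beta^t u'(c_t)k_{t+1} = \beta^t\,\beta\gamma/(1-\beta\gamma)$, so the terms decay geometrically at rate $\beta$.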
3.3.3 Dynamics
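We study the dynamics of the system in (3.14) in a small neighborhood of the stationary state, defined as the pair $(c, k)$, solution to:
$$c = y(k) - k, \qquad \beta = \frac{1}{y'(k)}.$$
[FIGURE 3.1. Phase diagram in the $(k_t, c_t)$ plane, showing the locus $c = y(k) - k$, the capital levels $k$ and $k^*$, and the line $c_0 = c + (v_{21}/v_{11})(k_0 - k)$.]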
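A first-order expansion of each equation in (3.14) around its stationary state yields the following linear system, where $\hat{k}_t \equiv k_t - k$ and $\hat{c}_t \equiv c_t - c$:
$$\begin{pmatrix}\hat{k}_{t+1}\\ \hat{c}_{t+1}\end{pmatrix} = A\begin{pmatrix}\hat{k}_t\\ \hat{c}_t\end{pmatrix}, \qquad A \equiv \begin{pmatrix} y'(k) & -1\\[4pt] -\dfrac{u'(c)}{u''(c)}\,y''(k) & 1 + \beta\,\dfrac{u'(c)}{u''(c)}\,y''(k)\end{pmatrix} \tag{3.15}$$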
The solution to this system is obtained with the tools reviewed in Appendix 1 of this chapter. It is:
$$\hat{k}_t = v_{11}\kappa_1\lambda_1^t + v_{12}\kappa_2\lambda_2^t, \qquad \hat{c}_t = v_{21}\kappa_1\lambda_1^t + v_{22}\kappa_2\lambda_2^t \tag{3.16}$$
where: $\kappa_i$ are constants that depend on the initial state, $\lambda_i$ are the eigenvalues of $A$, and $(v_{11}, v_{21})^\top$ and $(v_{12}, v_{22})^\top$ are the eigenvectors associated with $\lambda_i$. In Appendix 1, we show that $\lambda_1 \in (0,1)$ and $\lambda_2 > 1$. The proof we provide in the appendix is important, as it illustrates precisely how the neoclassical model reviewed in this section needs to be modified to induce indeterminacy in the dynamics of capital and consumption. A critical step in that proof relies on the assumption of diminishing returns, i.e. $y''(k) < 0$.
Let us return to the equations in (3.16). First, we need to rule out an explosive behavior of $k_t$ and $c_t$, for otherwise we would contradict (i) that $(c, k)$ is a stationary point, and (ii) the optimality of the trajectories. Since $\lambda_2 > 1$, the only possibility is to lock the initial state $(k_0, c_0)$ in such a way that $\kappa_2 = 0$, which yields the following set of initial conditions: $\hat{k}_0 = v_{11}\kappa_1$ and $\hat{c}_0 = v_{21}\kappa_1$, or $\hat{c}_0/\hat{k}_0 = v_{21}/v_{11}$.³ Therefore, the set of initial points that ensure a non-explosive path must lie on the line $c_0 = c + \frac{v_{21}}{v_{11}}(k_0 - k)$. Since $k$ is a predetermined variable, there exists one, and only one, value of $c_0$, which ensures a non-explosive path of the system around its steady state, as Figure 3.2 illustrates. In this figure, $k^*$ is defined as the solution of $1 = y'(k^*)$, that is, $k^* = (y')^{-1}[1]$, and $k = (y')^{-1}[\beta^{-1}]$.
The usual word of caution is in order. A linear approximation might turn out to be misleading. We develop one example where the dynamics of the system could be quite different from those analyzed here, when we start away from the stationary state. Let $y(k) = k^{\gamma}$, $u(c) = \ln c$. It is
3 In
0
0
101
21
11
= 0.
[FIGURE 3.2. Horizontal axis: $k_t$; the steady state is marked.]
Figure 3.3 depicts the nonlinear manifold associated with this system, and its linear approximation. For example, let $\beta = 0.99$ and $\gamma = 0.3$. Then, the (linear) saddlepath is, approximately, $c_t = c + 0.7101\,(k_t - k)$,
where =
1,
and
where:
= (1
1 (1
= 0 3.
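The slope of the linear saddlepath in this example can be reproduced numerically. The sketch below assumes $y(k) = k^{\gamma}$ and $u(c) = \ln c$ (so that $u'(c)/u''(c) = -c$), builds the Jacobian of the system (3.14) at the steady state in the form derived from linearizing $k_{t+1} = y(k_t) - c_t$ and the Euler equation, and reads the slope $v_{21}/v_{11}$ off the eigenvector of the stable root. Function and variable names are illustrative.

```python
import numpy as np

def linearized_saddlepath(beta, gamma):
    """Steady state, eigenvalues and saddlepath slope for y = k**gamma, u = ln c."""
    k = (gamma * beta) ** (1.0 / (1.0 - gamma))      # solves beta * y'(k) = 1
    c = k ** gamma - k                                # c = y(k) - k
    y1 = gamma * k ** (gamma - 1.0)                   # y'(k)
    y2 = gamma * (gamma - 1.0) * k ** (gamma - 2.0)   # y''(k) < 0
    ratio = -c                                        # u'(c) / u''(c) with log utility
    A = np.array([[y1, -1.0],
                  [-ratio * y2, 1.0 + beta * ratio * y2]])
    eigvals, eigvecs = np.linalg.eig(A)
    order = np.argsort(np.abs(eigvals))               # stable root first
    lam_stable, lam_unstable = eigvals[order].real
    v = eigvecs[:, order[0]].real                     # eigenvector of the stable root
    slope = v[1] / v[0]                               # v21 / v11
    return k, c, lam_stable, lam_unstable, slope

k, c, lam_stable, lam_unstable, slope = linearized_saddlepath(beta=0.99, gamma=0.3)
```

For $\beta = 0.99$ and $\gamma = 0.3$, the stable root equals $\gamma$ and the slope equals $(1-\beta\gamma)/\beta \approx 0.7101$, the coefficient quoted in the text.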
In its simplest version, real business cycle theory is an extension of the neoclassical model of Section 3.3.3, in which random productivity shocks are added. The engine of fluctuations, then, comes from the real sphere of the economy. This approach is in contrast with the Lucas approach of the 1970s, based on information and money, where fluctuations arise due to the information delays with which agents discover the nature of a shock (real or monetary). As further reviewed in Chapter 9, the Lucas information-theoretic approach has been, instead, more successful in inspiring work on the formation of asset prices, leading to the development of market microstructure theory and, more generally, to information-driven explanations of asset prices.
Despite the remarkable switch in the economic motivation, the paradigm underlying real business cycle theory is the same as the information-based approach of Lucas, as it relies on rational expectations: macroeconomic fluctuations and, then, as we shall explain, asset price fluctuations, stem from the optimal response of the agents vis-à-vis exogenous shocks: agents implement action plans that are state-contingent, i.e. they decide to consume, to work and to invest according to the history of shocks as well as the present shocks they observe.
We consider an economy with complete markets and no frictions, such that its equilibrium
allocations are Pareto-optimal. To characterize these allocations, we implement them through
the following program of a social planner:
"
#
X
( 0 0 ) = max
( )
(3.17)
( )
=0
=0
denote new
(3.18)
At time $t-1$, the available productive capital is $K_t$. At time $t$, a portion $\delta$ of this capital is lost, due to depreciation. Therefore, at time $t$, the productive system is left with $(1-\delta)K_t$ units of capital. The capital available at time $t$, $K_{t+1}$, equals the capital already in place, $(1-\delta)K_t$, plus new investments, which is exactly what Eq. (3.18) says.
Next, normalize population to one, such that $K_t = k_t$. The goods market clearing condition is:
$$\tilde{y}(k_t, s_t) = c_t + I_t,$$
where $\tilde{y}(k_t, s_t)$ is the production function, which is $\mathcal{F}_t$-measurable, and $s_t$ is the source of randomness, the engine for random fluctuations of the endogenous variables. By replacing Eq. (3.18) into the equilibrium condition,
$$k_{t+1} = \tilde{y}(k_t, s_t) - c_t + (1-\delta)\,k_t \tag{3.19}$$
So the planner maximizes the utility in Eq. (3.17), under the capital accumulation constraint
in Eq. (3.19).
We assume that (
)
( ), where is as in Section 3.2, and ( ) =0 is solution to:
+1
+1 ,
(3.20)
where
(0 1), and ( ) =0 is a IID sequence with support s.t.
0. In this economy, every
asset is priced as in the Lucas model of the previous section. Therefore, the gross return on
savings 0 ( ) satises:
0
( )=
( 0(
+1 ) (
+1
+1 )
+1
))
(3.21)
( )
1
1
0(
1
)+1
4 A stochastic equilibrium is the situation where there is a stationary measure (denition: (+) =
) =1 .
the transition measure) generating (
103
(+
( ), where
is
(3.22)
+1
, and = ( )>
where we have dened
=
1 ( ) = , and, nally,
=
)
00 ( )
)> ,
=(
( )
1
0(
), leaves:
00
( ) 1+
0(
)
00 ( )
00
0(
)
00 ( )
( )
( ( ) 00 ( ) +
( ))
00
( )
+
2
0 = det (
)=(
)
+1+
( )
00 ( )
1 (
),
0 0
1 0
0 1
A solution is $\lambda_1 = \rho$. By the same arguments applying to the deterministic case (see Section 3.3.3 and Appendix 1), one finds that $\lambda_2 \in (0,1)$ and $\lambda_3 > 1$.⁵ Similarly as in the deterministic case, we diagonalize the system by rewriting $A = P\Lambda P^{-1}$, where $\Lambda$ is a diagonal matrix that has the eigenvalues of $A$ on the diagonal, and $P$ is a matrix of the eigenvectors associated to the roots of $A$. Eq. (3.22) is then:
+1 =
where
and
(3.23)
+1
+1
3 3
(3.24)
3 +1
and 3 explodes unless 3 = 0 for all , which is only possible when 3 = 0 for all .6
The condition that 3
0 carries an interesting economic interpretation: it tells us that the
only sources of uncertainty in this system can stem from shocks to the fundamentals, or that
there cannot be extraneous sources of noise, or sunspots. The reasons for this are easy to
1
explain. Let =
. We have:
0 = 3 =
31
32
33
(3.25)
Eq. (3.25) tells us that the three state variables, $\hat{k}_t$, $\hat{c}_t$ and $\hat{s}_t$, are mutually linked in a two-dimensional plane, a saddlepoint, where they exhibit a stable behavior. This saddlepoint is formally defined as:
S=
R3 3 = 0
3 = ( 31
32
33 )
Furthermore, Eq. (3.25) implies that a linear relation exists between the two expectational
errors:
33
=
(no-sunspots).
(3.26)
For all ,
32
5 The
linearized model in this section has state variables expressed in growth rates. However, the model can be reformulated in
terms of rst di erences, by pre- and post- multiplying by appropriate normalizing matrices. For example, if i the 3 3 matrix
1
that has 1 1 and 1 on its diagonal, (3.22) can be written as: ( +1
)=
(
), where
=(
), and we would
arrive at the same conclusions. It can be easily shown that the model in this section collapses to that in Section 3.3.3, once we set
= 1, for each , and 0 = 1.
(
)
6 In other words, Eq. (3.24) implies that =
(3 + ), and for all . Because 3 1, this relation holds only when
3
3
3 = 0 for all .
Eq. (3.26) is a no-sunspots condition, as it says that the expectational error to consumption
cannot be independent of the expectational shock on the fundamentals of the economy, which in
this simple economy relates to technological shock. In other words, the source of uncertainty we
have assumed in this economy, relates to the technological shock. The remaining expectational
errors can only be perfectly correlated to the expectational shock in technology or, there are
no sunspots.
The manifold $S$ has the same meaning as the stable relation depicted in Figure 3.2 applying to the deterministic case. Mathematically, in this section, $S$ is the convergent subspace, with $\dim(S) = 2$, i.e., the number of roots with modulus less than one. In other words, in this economy with two predetermined variables, $k_0$ and $s_0$, there exists one, and only one, value of $c_0 \in S$ that ensures stability, given by $\hat{c}_0 = -\frac{\pi_{31}\hat{k}_0 + \pi_{33}\hat{s}_0}{\pi_{32}}$. This reasoning generalizes that made in the deterministic case (Section 3.3.3), and is generalized further in Appendix 1.
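The counting argument (number of roots with modulus less than one against number of predetermined variables) can be written as a small helper. This is a sketch under assumptions: the function name is hypothetical, and the matrices used below are arbitrary diagonal examples chosen only to have the stated eigenvalues, not the linearized system of this section.

```python
import numpy as np

def classify_equilibrium(A, n_predetermined, tol=1e-10):
    """Compare the number of stable roots of A with the number of predetermined variables."""
    moduli = np.abs(np.linalg.eigvals(np.asarray(A, dtype=float)))
    n_stable = int(np.sum(moduli < 1.0 - tol))   # dimension of the convergent subspace
    if n_stable == n_predetermined:
        label = "determinate"
    elif n_stable > n_predetermined:
        label = "indeterminate"
    else:
        label = "no non-explosive path for generic initial conditions"
    return n_stable, label

# Hypothetical examples: two stable roots and one explosive root, with two
# predetermined variables, versus three stable roots.
saddle_case = classify_equilibrium(np.diag([0.95, 0.30, 3.40]), n_predetermined=2)
sunspot_case = classify_equilibrium(np.diag([0.95, 0.30, 0.80]), n_predetermined=2)
```

In the first example the convergent subspace has dimension two, matching the two predetermined variables; in the second, the extra stable root leaves the initial value of the free variable unpinned.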
The solution to the linearized model can be determined by generalizing the reasoning in the
deterministic case. First, by Eq. (3.23) is:
=
1
X
0 +
=0
=(
3 ) =
3
X
=1
3
X
=1
3
X
=1
1
0 =
0
0 . The stability
To pin down the components of 0 , note that 0 = 0
(3)
condition then requires that the state variables be in S, or 0 = 0, which we now use to
implement the solution. We have:
1 1 10
2 2 20
3 3 30
1 1
2 2
3 3
Moreover,
P 1the term 3 3 30 + 3 3 needs to be zero, because 30 = 0. Finally, we have that
=
, and since 3 = 0, then, then 3 = 0 as well. Therefore, the solution for
3
=0 3 3
is:
= 1 1 10 + 2 2 20 + 1 1 + 2 2
3.3.4.2 Frictions, indeterminacy and sunspots
In the neoclassical model that we are analyzing, the equilibrium is determinate. As explained,
this property arises because the number of predetermined variables equals the dimension of the
convergent subspace of the economy. If we managed to increase the dimension of the converging
subspace, the equilibrium would be indeterminate, as further formalized in Appendix 1. As it
turns out, indeterminacy goes hand in hand with sunspots, the expectational shocks extraneous
to those in the economic fundamentals, as we discussed earlier, just after Eq. (3.26).
Introducing sunspots in macroeconomics has been an approach pursued in detail by Farmer
in a series of articles (see Farmer, 1998, for an introductory account of this approach). The
idea is quite interesting, as we know that the basic real business cycle model of this section
needs many extensions in order not to be rejected, empirically, as originally shown by Watson
(1993). In other words, the basic model in this section offers little room for a rich propagation mechanism, as it entirely relies on impulses, the productivity shocks, which we hardly read about in the Wall Street Journal, as provocatively put by King and Rebelo (1999). Sunspots offer an interesting route to enrich the propagation mechanism, although their asset pricing implications in terms of the model analyzed in this section have not been explored yet.
In a series of articles, David Cass showed that a Pareto-optimal economy cannot harbour
sunspots equilibria. On the other hand, any market imperfection has the potential to be a
source of sunspots. The typical example is the presence of incomplete markets. The neoclassical
model analyzed in this section cannot generate sunspots, as it relies on a system of perfectly
competitive markets and absence of any sort of frictions. To introduce sunspots in the economy
of this section, we need to think about some deviation from optimality. Two possibilities analyzed in the literature are the presence of imperfect competition and/or externality effects. We provide an example of these effects, by working out the deterministic economy in Section 3.3.3.
(Generalizations to the stochastic economy in this section are easy, although more cumbersome.)
How is it that a deterministic economy might generate stochastic outcomes, that is, outcomes driven by shocks entirely unrelated to the fundamentals of the economy? Let us imagine
this can be possible. Then, both optimal consumption and capital accumulation in Section
3.3.3 are necessarily random processes. The system in (3.15), then, must be rewritten in an
expectation format,
+1
=
+1
+1
+1
(0 1) and
+1 =
0
+1
1 (
(0
>
+1 )
Moreover, for 2 = 2
(2 + ) to hold for all , we need to have 2 = 0, for all . Therefore,
1
>
the second element of the vector
(0
+1 ) must be zero, or, for all ,
0=
22
0=
There is no room for expectational errors and, hence, sunspots, in this model. The fact that $\lambda_2 > 1$ implies the dimension of the saddlepoint is less than the number of predetermined variables. So a viable route to pursue here is to look for economies such that the saddlepoint has a dimension larger than one, i.e. such that $\lambda_2 < 1$. In these economies, indeterminacy and sunspots will be two facets of the same coin. As shown in the appendix, the reasons for which $\lambda_2 > 1$ relate to the classical assumptions about the shape of the utility function $u$ and the production function $y$. We now modify the production function, to see the effect on the eigenvalues of $A$.
[Economy with increasing returns]
[Asset pricing implications in further chapters]
( )
( )
( )) =
( )
( )
( ))
( )
We assume that in each period, the firm distributes all the profits it makes, and that for a given capital $K_0$, it maximizes its cum-dividend value,
"
!#
X
( 0 0) +
( 0 ) = max
(
)
0
(
1 ) =1
=1
)=
)) +
+1
+1 )]
) =
=
(
(
(
(
)) +
(
)) (1
)
))
(
( )+
( ))
+1
+1 ) ((1
)+
))]
Replacing this expression for the value function back into Eq. (3.29) leaves:
(
)) =
+1
+1
+1 ))
(1
+1
+1 )))]
(3.30)
Along the optimal capital accumulation path, the marginal cost of new installed capital at time $t$, which by Eq. (3.29) is the expected marginal return on the investment, equals the expected value of (i) the very same marginal cost at time $t+1$, corrected for capital depreciation, $(1-\delta)$, and (ii) capital productivity, net of adjustment costs. Analytically,
0
0
(
)= (
( )) + (
( )) ( )
( )
( ))
= (
0
(
( )) = +
We now introduce a fundamental concept in investment theory.
3.4.1.2 q theory
Tobin's marginal q is defined as the ratio of the expected marginal value of an additional unit of capital over its replacement cost:
Tobins marginal q
TQ
) = max [ (
+1 )],
+1
(1
+1
+1 )]
)+
+1
+1 ))]
(3.31)
+1
+1
=0
=0
+1
(1
))
(
) = ( +1 0 ( +1 )), where
is the expected marginal return on the
investment, that is, the shadow price of installed capital. Tobin's marginal q is, then, the ratio of the shadow price of installed capital to its replacement cost:
TQ =
=
(
[
+1
+1 )
+1
that is,
=
+ (1
+1 )]
(3.32)
(3.33)
The shadow price of installed capital, $q_t$, has to equal the marginal cost of new installed capital, and is larger than the price of uninstalled capital, $p_t$. This is natural: installing new capital requires some (marginal) adjustment costs, which add to the raw price of uninstalled capital, $p_t$. Therefore, in the presence of adjustment costs, Tobin's marginal q is larger than one.
Eq. (3.32) can be solved forward, leaving:
"
#
X
=
(1
) 1 0 +
( + +)
=1
The shadow price of installed capital is worth the sum of all its future marginal net productivity,
discounted at the depreciation rate. Moreover, Eq. (3.33) can be inverted for
, to deliver:
=
0 1
(3.34)
where $(\phi')^{-1}$ denotes the inverse of $\phi'$, and is increasing, since $\phi'$ is increasing. Given $K_t$, and the fact that $K_{t+1}$ is predetermined, the firm evaluates $q_t$ through Eq. (3.32), and then determines the level of new investments through Eq. (3.34). These investments are increasing in the difference between the shadow price of installed capital, $q_t$, and that of uninstalled capital, $p_t$, as originally assumed by Tobin (1969).
In the absence of adjustment costs, when $q_t = p_t$, Eq. (3.32) delivers the condition,
1=
+1
( (
+1
+1 ))
+ (1
))]
where we have set $p_t \equiv 1$ for all $t$, meaning that the firm's production is just the uninstalled capital. Empirically, however, the marginal productivity of capital is not volatile enough to rationalize asset returns, as explained in more detail in Chapter 8. Moreover, as we argue in a moment, Tobin's marginal q can be approximated by market-to-book ratios, which are typically time-varying. Therefore, adjustment costs are important for asset pricing.
A difficulty with Tobin's marginal q is that it is quite difficult to estimate. Yet in the special case we are analyzing in this section, where firms act competitively and have access to a homogeneous production function and adjustment costs, Tobin's marginal q can be proxied by the market-to-book ratio of a given firm. Consider the ex-dividend value of the firm, which is its stock market value, since it nets out the dividend it pays to its holder in the current period. It is:
( )
( )
(
( )) = [ +1 ( +1 )]
Tobin's average q is defined as the ratio of the stock market value of the firm over the replacement cost of the capital:
$$\text{Tobin's average q} = \frac{\text{Stock Mkt Value of the Firm}}{\text{Replacement Cost of Capital}}$$
)
+1
The next result was originally obtained by Hayashi (1982) in a continuous-time setting.
Theorem 3.2. Tobin's marginal q and average q coincide. That is, we have,
(
)=
+1
Proof. By the homogeneity properties of the production function and the adjustment costs,
)=
(1
=1
))
"
X
=1
+1
where the second line follows by Eq. (3.27). By Eq. (3.30), and the law of iterated expectations,
"
#
"
#
X
X
=
(
) (1
)
(
))
( 0 0) 1
(
)
0 (
0
+1
=1
Hence,
=1
0)
0)
1.
This result, in conjunction with that in Eq. (3.33), provides a simple rule of thumb for investment decisions. Consider, for example, the case of quadratic adjustment costs, where $\phi(x) = \frac{1}{2}\chi^{-1}x^2$, for some $\chi > 0$. Then, Eq. (3.34) is:
=(
+1
"
X
) =
=1
and, then:
=1
"
X
=1
"
X
0 1
0
1
1
=1
"
X
=1
"
X
+1
#
1
=0
"
X
=2
=2
where the third line follows by the properties of the discount factor,
1 .
Therefore, the program consumers solve is:
"
#
X
( )
s.t. Eq. (3.35).
max
( )
0
1
and
=1
We now have two optimality conditions, one intertemporal and another, intratemporal:
+1
(
1
+1
+1 )
(intertemporal);
(
1(
)
(intratemporal).
)
3.4.3 Equilibrium
For all ,
(
)=
(3.36)
+
,
It is easily seen that the condition = 1 in the nancial market, implies that =
which, upon substitution of the profits in Eq. (3.28), delivers the equilibrium condition in Eq. (3.36). Implicit in this reasoning is the idea that the adjustment costs are not paid to anyone. They represent, so to speak, capital losses incurred along the way of growth.
We initially assume the population is constant, and made up of one young and one old. The young agent maximizes his intertemporal utility subject to his budget constraint:
$$\max_{(c_{1t},\,c_{2,t+1})}\big[u(c_{1t}) + \beta\,u(c_{2,t+1})\big] \quad \text{subject to} \quad \begin{cases}\text{sav}_t + c_{1t} = w_{1t}\\ c_{2,t+1} = \text{sav}_t\,R_{t+1} + w_{2,t+1}\end{cases} \qquad \text{[3.P5]}$$
where $w_{1t}$ and $w_{2,t+1}$ are the endowments the agent receives at his young and old age.
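[FIGURE 3.3. Lifetime budget constraint $c_{2,t+1} = -R_{t+1}c_{1t} + R_{t+1}w_{1t} + w_{2,t+1}$ in the $(c_{1t}, c_{2,t+1})$ plane, with the endowments $w_{1t}$ and $w_{2,t+1}$ marked on the axes.]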
= sav +
(3.37)
(3.38)
P
= 2=1 , and for all .
and implies that the goods market is also in equilibrium, in that
Therefore, we can analyze the model, by just analyzing the autarkic equilibrium.
As Figure 3.4 illustrates, the first-order condition for the program [3.P5] requires that the slope of the indifference curve be equal to the slope of the lifetime budget constraint, $c_{2,t+1} = -R_{t+1}c_{1t} + R_{t+1}w_{1t} + w_{2,t+1}$, and leads to:
$$\beta\,\frac{u'(c_{2,t+1})}{u'(c_{1t})} = \frac{1}{R_{t+1}} \tag{3.39}$$
In the autarkic equilibrium, $c_{1t} = w_{1t}$ and $c_{2,t+1} = w_{2,t+1}$, so that:
$$b_t \equiv \frac{1}{R_{t+1}} = \beta\,\frac{u'(w_{2,t+1})}{u'(w_{1t})} \tag{3.40}$$
In this relation, $b_t$ is the shadow price of a bond issued at $t$, and promising one unit of numéraire at $t+1$: the sequence of prices, $b_t$, satisfying Eq. (3.40), is such that agents are happy with not being able to lend and borrow, intergenerationally.
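A small numerical illustration of the autarkic equilibrium follows, under the assumption of logarithmic utility (a special case chosen here for tractability). With $u = \ln$, the first-order condition $1/c_{1} = \beta R/c_{2}$ and the two budget constraints give savings $(\beta w_1 - w_2/R)/(1+\beta)$, and the shadow bond price is $b = \beta w_1/w_2$. The endowment values and function names are hypothetical.

```python
def log_utility_savings(w1, w2, R, beta):
    """Optimal savings of the young agent with u = ln, given the gross return R."""
    return (beta * w1 - w2 / R) / (1.0 + beta)

def autarkic_bond_price(w1, w2, beta):
    """Shadow bond price b = beta * u'(w2) / u'(w1), evaluated with u = ln."""
    return beta * w1 / w2

w1, w2, beta = 2.0, 1.0, 0.9
b = autarkic_bond_price(w1, w2, beta)
savings_at_autarky = log_utility_savings(w1, w2, 1.0 / b, beta)  # zero at R = 1/b

# At an arbitrary return, the implied consumption plan satisfies the first-order condition.
R = 1.05
sav = log_utility_savings(w1, w2, R, beta)
c1 = w1 - sav
c2 = sav * R + w2
foc_gap = beta * R * c1 - c2
```

At the return $R = 1/b$ the agent chooses zero savings, which is what the autarkic equilibrium requires.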
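The previous model is easy to extend to the case where agents are heterogeneous. The program each agent $j$ solves is, now:
$$\max_{(c_{1t,j},\,c_{2,t+1,j})}\big[u_j(c_{1t,j}) + \beta_j\,u_j(c_{2,t+1,j})\big] \quad \text{subject to} \quad \begin{cases}\text{sav}_{t,j} + c_{1t,j} = w_{1t,j}\\ c_{2,t+1,j} = \text{sav}_{t,j}\,R_{t+1} + w_{2,t+1,j}\end{cases}$$
with obvious notation. The first-order condition is, for all time $t$ and agent $j$,
$$\beta_j\,\frac{u_j'(c_{2,t+1,j})}{u_j'(c_{1t,j})} = \frac{1}{R_{t+1}}.$$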
sav
=0
(3.41)
=1
sav
+1 )
+1
2 +1
+(
+1 )
The first term in the numerator reflects an income effect, while the second is a substitution effect. The coefficient $1/\eta$ is the elasticity of intertemporal substitution, as explained in Section 3.2.3. Consider, for example, the logarithmic case, where $\eta = 1$, and:
1
1
=
1+
+1
+1
+1
1+
+1
+1 )
sav
=1
2
=1
1
=
1+
+1
1
+1
(3.42)
(3.43)
+1
+ 1 = 1
( ( 2 +1 )| F )] subject to
[3.P6]
max [ ( 1 ) +
( 1 2 +1 )
2 +1 = ( +1 +
+1 ) + 2 +1
where denotes the asset price and the units of the asset the agent chooses in his young age.
The agent born at time
1 faces the constraints
+
1
1 + 1 1 =
1 1 and
2 +(
) 1 = 2 . By combining the second period constraint of the agent born at time
1 with
the rst period constraint of the agent born at time ,
(
The clearing condition in the asset market, = 1, implies that the market for goods also clears,
for all :
+ 1 + 2 = 1 + 2 . A characterization of the solution to the program [3.P6]
can be obtained by eliminating from the constraint,
max [ (
)+
( ((
+1
+1 )
)| F )]
+1
+ +1 +
113
2 +1 ) (
+1
+1 )| F
1
1
+1 F
where sav
=
,
=1
sav
sav +1 + 2 +1
1
. We
(3.44)
In a deterministic setting,
1
1
sav
sav
1
+1 +
+1 ,
where sav = 0
(3.45)
2 +1
which leads to the equilibrium bond price in Eq. (3.43). Eqs. (3.44) and (3.45) are formally equivalent. Their fundamental difference is that in the tree economy, savings have to stay positive, as the tree must be held by the young agent, in equilibrium: $\text{sav}_t > 0$. In an economy without a tree, instead, the interest rate, $R_t$, has to be such that savings are zero for all $t$, $\text{sav}_t = 0$.
Eq. (3.44) can be solved explicitly for the price of the tree, $q_t$, once we assume $w_{2t} = 0$ for all $t$. In the absence of a tree, we cannot assume endowments are zero in the old age, since the autarkic economy in this case would be such that the old generation would not consume anything. In the presence of a tree, instead, this assumption is innocuous, conceptually, as the autarkic equilibrium in this case is such that the old generation could consume the fruits of the tree, as well as the proceeds arising from selling the tree to the young generation. Solving Eq. (3.44) for $q_t$ when $w_{2t} = 0$, then, leads to a price for the tree, equal to:
=
1+
( (
)).
3.5.3 Money
We consider a version of the previous model with endowment (not with capital), and assume
that agents can now transfer value through a piece of paper, interpreted as money. The young
agent, then, maximizes his intertemporal utility, subject to a new budget constraint:
+
(
max [ (
1
2 +1 )
1 )+
+1 )] subject to
2 +1
[3.P7]
2 +1
+1
(3.46)
+1
Then, the budget constraint for program [3.P7] is formally identical to that for program [3.P5].
The difference is that in the monetary economy of this section, the young agent may wish to transfer value over time, by saving money, earning a gross interest rate equal to the rate of deflation: the lower the price level the next period, the higher the purchasing power of the money he transfers from the young to the old age. Naturally, then, by aggregating the budget constraints of the young and the old generation, we obtain, formally, Eq. (3.37), where now,
sav and +1 are as in (3.46). However, in the setting of this section, sav is not necessarily
zero, as money can be transferred from a generation to another one. In equilibrium, sav = ,
where denotes money supply. Therefore, the real value of money is strictly positive, if the
equilibrium price
stays bounded over time, which might actually occur, as we shall study
below. As we see, the role of money as a medium for transferring value, is, in this context,
similar to that of a tree in the stochastic overlapping generations economy of Section 3.5.1.2.
Substituting the equilibrium savings sav = and +1 = +1 into Eq. (3.37), we obtain,
), which used again in Eq. (3.37), delivers,
1= + ( 1 + 2
sav
= sav
(3.47)
) sav
= sav
(3.48)
The last relation can be obtained even more simply, noting that by denition, (1 + ) 11 1 =
. The previous relation can be generalized when population grows. Suppose that at time $t$, $N_t$ individuals are born, and that $N_t = (1+n)N_{t-1}$, for some constant $n$. Let money supply be given by $M_t = (1+\mu_t)M_{t-1}$, and assume that for all $t$, $\mu_t = \mu$. Then, by a reasoning similar to that leading to Eq. (3.48),
$$\frac{1+\mu}{1+n}\,R_t\,\text{sav}(R_t) = \text{sav}(R_{t+1}) \tag{3.49}$$
where now, we have set the real savings equal to a function of the interest rate, $\text{sav}_{t-1} \equiv \text{sav}(R_t)$, as it should be, by the solution to the program [3.P7].
Next, suppose that
is independent of , and that lim
= , say, a constant. Eq.
(3.49) leads to two stationary equilibria:
(a)
= 1+
. This stationary equilibrium relates to the Golden Rule, once we set = 0,
1+
as we shall
6= 0, the price is, in this stationary equilibrium,
say in Section 3.6.2. For
1+
= 1+
=
= 0 00 , and (ii) +1 = 0 00 1+
. All in all, the
0 . Then, we have: (i)
1+
agents budget constraints are bounded and the real value of money is strictly positive.
In this stationary equilibrium, agents trust money.
(b)
=
= 1+
, and since lim
1+
+1
stationary equilibrium, agents do not trust money.
+1
+1
+1
. As for
0, then lim
, we have that
0. In this
+1
+1
7 In this section, we assume that money transfers are made to the young generation: the money the young generation has to
absorb is that from the old generation, 1 , and that created by the central bank,
1 . One might consider an alternative
model in which transfers are made to old.
(ii) $\text{sav}'(R) = 0$. Income and substitution effects compensate each other.
(iii) sav0 ( )
The introductory example of this section leads to an instance of gross substitutability (see Eq. (3.42)). Note that an equilibrium cannot exist in that economy, once we assume agents do not have endowments in the second period, $w_{2,t+1} = 0$, as in this case, savings would be strictly positive, such that the equilibrium condition in Eq. (3.41) would not hold. These
issues do not arise in the monetary setting of this section, where savings have to be positive
and equal to , in order to sustain a monetary equilibrium. Assume, for example, the CobbDouglas utility function, ( 1 2 +1 ) = 11 22 +1 , which leads to a real saving function equal
to sav(
+1 )
2
1
1+
reorganizing,
2 +1
+1
2
1
. If
2 +1
= 0, then, sav(
+1 )
1+ 2
2
and, by
an equation supporting the Quantity Theory of money. In this economy, the sequence of gross returns satisfies
$$R_{t+1} = \frac{p_t}{p_{t+1}} = \frac{(1+n)\,(1+g_{t+1})}{1+\mu_{t+1}}.$$
Gross inflation, $R_{t+1}^{-1}$, equals the monetary creation factor, corrected for the growth rate of the economy as measured by $g_{t+1}$, the young's endowments growth rate.
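The link between money growth, population growth, endowment growth and inflation can be checked with a short simulation. The sketch assumes that each young agent saves a constant share of his endowment (as with Cobb-Douglas utility and no old-age endowment), that money market clearing reads $M_t/p_t = N_t\,\text{sav}_t$, and that money, population and endowments grow at constant rates; the saving share, the growth rates and the function names are illustrative.

```python
import numpy as np

def price_level_path(M0, N0, w0, mu, n, g, saving_share, T):
    """Price levels p_t clearing the money market, M_t / p_t = N_t * sav_t."""
    t = np.arange(T + 1)
    money = M0 * (1.0 + mu) ** t          # money supply
    young = N0 * (1.0 + n) ** t           # size of the young generation
    endowment = w0 * (1.0 + g) ** t       # endowment of each young agent
    real_savings = saving_share * endowment
    return money / (young * real_savings)

def gross_inflation(mu, n, g):
    """Money creation factor corrected for population and endowment growth."""
    return (1.0 + mu) / ((1.0 + n) * (1.0 + g))

p = price_level_path(M0=1.0, N0=1.0, w0=1.0, mu=0.05, n=0.01, g=0.02,
                     saving_share=0.5, T=10)
simulated_inflation = p[1:] / p[:-1]
```

The ratio of successive price levels is constant and equal to `gross_inflation(mu, n, g)`, so the gross return on money is its reciprocal.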
(
( 1)
( 1)
As a nal example, consider the utility function ( 1 2 +1 ) =
+ (1
) 2 +1
1
which collapses to Cobb-Douglas once
1. We have:
+1
+1
2 +1
2 +1
+1
+1
1+
2 +1
1
+1
sav (
+1 )
+1
+1
2 +1
+1
1
where
. To simplify, set (i)
= 1, (ii) 2 = = = 0, and (iii) 1 = 1 +1 . It can
be shown that in this case, $\mathrm{sign}(\mathrm{sav}'(R)) = \mathrm{sign}(\eta - 1)$. Moreover, the dynamics of the gross
interest rate, $R$, are given by:
$$R_{t+1} = f(R_t) \equiv \left(R_t^{-1} + R_t^{-\eta} - 1\right)^{1/(1 - \eta)}. \tag{3.51}$$
The stationary equilibria are solutions to $R = f(R)$, and it is easily seen that one of them is
$R = 1$, which corresponds to the monetary steady state.
When $\eta > 1$, the slope in Eq. (3.50) always has the same sign, and the mapping in Eq.
(3.51) has two fixed points, $R^a = 0$ and $R^m = 1$, with $R^a$ being stable and $R^m$ being unstable, as
illustrated by Figure 3.5 when $\eta = 2$.
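The stability claims can be illustrated by iterating the map. The following is a minimal sketch which assumes that the map takes the form $f(R) = (R^{-1} + R^{-\eta} - 1)^{1/(1-\eta)}$; the function names, the starting value 0.9 and the number of iterations are illustrative choices.

```python
def gross_rate_map(R, eta):
    """Assumed form of the map R_{t+1} = f(R_t) for the simplified CES economy."""
    return (R ** -1 + R ** -eta - 1.0) ** (1.0 / (1.0 - eta))


def iterate(R0, eta, steps):
    """Iterate the map forward, returning the whole path."""
    path = [R0]
    for _ in range(steps):
        path.append(gross_rate_map(path[-1], eta))
    return path


eta = 2.0
path = iterate(0.9, eta, 8)   # start slightly below the monetary steady state

# Numerical slope of the map at R = 1 (central difference).
h = 1e-6
slope_at_one = (gross_rate_map(1.0 + h, eta) - gross_rate_map(1.0 - h, eta)) / (2.0 * h)
```

Starting just below $R = 1$, the path moves away from the monetary steady state and falls monotonically towards zero, consistent with a slope larger than one at $R = 1$.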
FIGURE 3.4. The map $f(R) = \left(R^{-1} + R^{-\eta} - 1\right)^{1/(1 - \eta)}$, plotted against $R$, with $\eta = 2$.
When $\eta < 1$, the situation is quite delicate. In this case, $R^a$ is not well-defined, and $R^m = 1$ is
not necessarily unstable. We may have sequences of gross interest rates, $R_t$, converging towards
$R^m$, or even the emergence of cycles. Mathematically, these properties can be understood by
examining the slope of the map in Eq. (3.50), $dR_{t+1}/dR_t$, evaluated at $R_{t+1} = R_t = 1$.
In the general case, Figure 3.6 depicts an hypothetical shape of the map
7
+1 , which
is that we might expect to arise in the
presence
of
gross
substituability
or,
in
fact,
even
in the
sav0 ( )
+1
case of complementary, provided sav( )
1 for all . In both cases, the slope,
0.
sav( )
+1
, is
= 1 + sav
0. The
Moreover, the slope at the monetary state, = 1+
0( )
1+
+1 =
+1
1, such that the monetary steady state
is stable. A condition for the map
=
+1 =
sav0 ( )
+1
7
to
bend
backward
is
that
1,
and
the
condition
for
+1
sav( )
1 to hold is that
neighborhood of
8 For
1+
1+
sav0 ( )
sav( )
+1 =
1
.
2
)= (
). Multiplying the two equations side by side leaves the result that
the proof, note that by Eq. (3.49), we have, we have that for a cycle of order 2, (i)
(
1+
1+
2
(
.
.8 Note that to
) = (
), and (ii)
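FIGURE 3.6. Hypothetical shape of the map $R_t \mapsto R_{t+1}$ (horizontal axis: $R_t$; vertical axis: $R_{t+1}$; marked points: $A$, $R^a$, $R^*$ and $R^{**}$).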
analyze the behavior of the gross interest rate, we need to refer to backward-looking dynamics, as forward-looking dynamics are indeterminate. Finally, there
might exist more complex situations where cycles of order 3 exist, giving rise to what is known
as a chaotic system. Note that these complex dynamics, including those in Figure 3.7, rely
on the assumption that $\mathrm{sav}'(R) < 0$, which might be somewhat unappealing.
3.5.4 Money in a model with real shocks
Lucas (1972) is the first attempt to address issues relating to the neutrality of money in contexts
with overlapping generations and uncertainty. This section is a simplified version of Lucas'
model as explained by Stokey and Lucas (1989) (p. 504). Every agent works when young, so as
to produce a consumption good, and consumes when he is old, and experiences a disutility of
work equal to $-v(n_t)$, where $n_t$ is his labor supply, and $v$ is assumed to satisfy $v', v'' > 0$.
Utility drawn from second period consumption is denoted with $u(c_{t+1})$, and has the standard
properties. The agent faces the following program:
$$\max_{\{n_t\}} \left[-v(n_t) + \beta E\left(u(c_{t+1}) \mid \mathcal{F}_t\right)\right] \quad \text{subject to} \quad \begin{cases} m = p_t y_t, \quad y_t = s_t n_t \\ p_{t+1} c_{t+1} = m \end{cases}$$
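where $\mathcal{F}_t$ denotes the information set as of time $t$, $m$ is money holdings; $y_t$ is the agent's production, obtained through his labor supply $n_t$, and $(s_t)_{t=0,1,\ldots}$ is a sequence of positive shocks
affecting his productivity. Finally, $p_t$ is the price of the consumption good as of time $t$.
Replacing the first constraint into the second leaves $c_{t+1} = R_{t+1} s_t n_t$, where $R_{t+1} \equiv \frac{p_t}{p_{t+1}}$. Therefore, the program the agent solves is $\max_{n_t} \left[-v(n_t) + \beta E\left(u(R_{t+1} s_t n_t) \mid \mathcal{F}_t\right)\right]$. The first-order
condition leads to,
$$v'(n_t) = \beta E\left[u'(R_{t+1} s_t n_t)\, R_{t+1} s_t \mid \mathcal{F}_t\right].$$
We have, $c_{t+1} = s_{t+1} n_{t+1} = R_{t+1} s_t n_t$, where the first equality follows by the equilibrium in the
good market. Replacing this relation into the previous equation leaves,
$$v'(n(s_t))\, n(s_t) = \beta \int_{\mathcal{E}} u'\left(s_{t+1} n(s_{t+1})\right) s_{t+1} n(s_{t+1})\, dP(s_{t+1} \mid s_t), \tag{3.52}$$
where $\mathcal{E}$ denotes the support of $s$. This equation simplifies as soon as productivity shocks are
IID, $P(s_{t+1} \mid s_t) = P(s_{t+1})$, in which case $v'(n(s))\, n(s)$ is independent of $s$, and $n(s)$ is a constant $\bar{n}$.9 This
is a result about the neutrality of money, at least provided such a constant exists. Precisely,
we have that $v'(\bar{n})\, \bar{n} = \beta \int_{\mathcal{E}} u'(s \bar{n})\, s \bar{n}\, dP(s)$. For example, consider $v(x) = \frac{1}{2} x^2$ and $u(c) = \ln c$, in
which case $\bar{n} = \sqrt{\beta}$, $y(s) = s \sqrt{\beta}$ and $p(s) = \frac{m}{s \sqrt{\beta}}$.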
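The closing example can be verified numerically. The sketch below assumes $v(x) = \frac{1}{2}x^2$, $u(c) = \ln c$ and IID shocks; the uniform shock distribution, the seed and all names are hypothetical choices made for illustration.

```python
import math
import random


def constant_labor_supply(beta):
    """Candidate solution n = sqrt(beta) for v(x) = x^2/2 and u(c) = ln c."""
    return math.sqrt(beta)


def euler_residual(n, beta, shocks):
    """v'(n) n - beta * E[u'(s n) s n], with the expectation taken over the sample."""
    lhs = n * n                                   # v'(n) n = n^2
    rhs = beta * sum((1.0 / (s * n)) * s * n for s in shocks) / len(shocks)
    return lhs - rhs


beta, money = 0.95, 10.0
rng = random.Random(0)
shocks = [rng.uniform(0.5, 1.5) for _ in range(1000)]   # assumed shock distribution

n_bar = constant_labor_supply(beta)
residual = euler_residual(n_bar, beta, shocks)

# Output and prices state by state: y(s) = s n, p(s) = m / (s n).
outputs = [s * n_bar for s in shocks]
prices = [money / y for y in outputs]
```

Labor supply does not respond to the shock, so output moves one-for-one with productivity, and nominal spending `p(s) * y(s)` equals the money stock in every state.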
$$k_{t+1} = \frac{1}{1 + n}\left(f(k_t) - c_t\right), \quad \text{for } k_0 \text{ given}, \tag{3.53}$$
and a steady state $k$ such that
$$f'(k) = 1 + n. \tag{3.54}$$
The steady state per-capita capital satisfying Eq. (3.54) is said to satisfy the Golden Rule. A
social planner would be able to increase per-capita consumption at the stationary state, provided
$f'(k) < 1 + n$. Indeed, because $f(k)$ is given, we can lower $k$ and have $\Delta c = -(1 + n)\Delta k > 0$,
immediately, and $\Delta c = (f'(k) - (1 + n))\Delta k > 0$, in the next periods. In fact, this outcome would
9 The proof that $n(s) = \bar{n}$ relies on the following argument. Suppose the contrary, i.e. there exists a point $s_0$ and a neighborhood of $s_0$ such that either (i) $n(s_0 + A) > n(s_0)$ or (ii) $n(s_0 + A) < n(s_0)$, for some strictly positive constant $A$. We
deal with the proof of (i) as the proof of (ii) is nearly identical. Since $v'(n(s))\, n(s)$ is constant, and $v'' > 0$, we have that
$v'(n(s_0 + A))\, n(s_0 + A) = v'(n(s_0))\, n(s_0) \le v'(n(s_0 + A))\, n(s_0)$. Therefore, $v'(n(s_0 + A)) \left(n(s_0 + A) - n(s_0)\right) \le 0$. Next, note
that $v' > 0$, such that $n(s_0 + A) \le n(s_0)$, contradicting that $n(s_0 + A) > n(s_0)$.
apply along the entire capital accumulation path of the economy, not only in steady state, as
we now illustrate. First, a definition. We say that a path $(k_t, c_t)_{t=0}^{\infty}$ is consumption-inefficient if
there exists another path $(\hat{k}_t, \hat{c}_t)_{t=0}^{\infty}$ satisfying Eq. (3.53), and such that $\hat{c}_t \ge c_t$ for all $t$, with at
least a strict inequality for one $t$. The following is a slightly less general version of Theorem 1
in Tirole (p. 161):
Theorem 3.3 (Cass-Malinvaud theory). A path $(k_t, c_t)_{t=0}^{\infty}$ is: (i) consumption efficient if
$\frac{f'(k_t)}{1 + n} \ge 1$ for all $t$, and (ii) consumption inefficient if $\frac{f'(k_t)}{1 + n} < 1$ for all $t$.
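Proof. (i) Let $\epsilon_t \equiv \hat{k}_t - k_t$ denote the deviation of an alternative path $(\hat{k}_t, \hat{c}_t)$, with $\hat{c}_t \ge c_t$, from the path $(k_t, c_t)$. By Eq. (3.53) and the concavity of $f$,
$$(1 + n)\, \epsilon_{t+1} = f(\hat{k}_t) - f(k_t) - (\hat{c}_t - c_t) \le f(k_t) + f'(k_t)\epsilon_t - f(k_t) - (\hat{c}_t - c_t),$$
or $\epsilon_{t+1} \le \frac{f'(k_t)}{1 + n}\epsilon_t - \frac{\hat{c}_t - c_t}{1 + n}$. Evaluating this inequality at $t = 0$, where the consumption improvement is taken to be strict, yields $\epsilon_1 < \frac{f'(k_0)}{1 + n}\epsilon_0$, and since $\epsilon_0 = 0$, one
has that $\epsilon_1 < 0$. Since $\frac{f'(k_t)}{1 + n} \ge 1$ for all $t$, then $\epsilon_t \to -\infty$ as $t \to \infty$, which contradicts that $k_t$ has
bounded trajectories. The proof of Part (ii) is nearly identical, except that, obviously, in this
case, $\liminf \epsilon_t > -\infty$. Note, in general, there are infinitely many sequences that allow for
efficiency improvements. ∎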
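The Golden Rule and the over-accumulation argument can be illustrated with a simple technology. This is a minimal sketch assuming $f(k) = k^{\alpha}$ and steady-state consumption $c = f(k) - (1 + n)k$, as implied by the accumulation equation above; the technology and the parameter values are hypothetical and do not come from the notes.

```python
def steady_state_consumption(k, alpha, n):
    """c = f(k) - (1 + n) k, with the assumed technology f(k) = k**alpha."""
    return k ** alpha - (1.0 + n) * k


def marginal_product(k, alpha):
    """f'(k) for f(k) = k**alpha."""
    return alpha * k ** (alpha - 1.0)


def golden_rule_capital(alpha, n):
    """Solves f'(k) = 1 + n for k."""
    return (alpha / (1.0 + n)) ** (1.0 / (1.0 - alpha))


alpha, n = 0.3, 0.02
k_gr = golden_rule_capital(alpha, n)
k_over = 2.0 * k_gr                      # an over-accumulated steady state

c_gr = steady_state_consumption(k_gr, alpha, n)
c_over = steady_state_consumption(k_over, alpha, n)

# Lowering capital by dk from the over-accumulated state frees (1 + n) dk at once
# and raises steady-state consumption thereafter.
dk = 0.01
immediate_gain = (1.0 + n) * dk
c_after = steady_state_consumption(k_over - dk, alpha, n)
```

At `k_over` the marginal product is below `1 + n`, so reducing capital raises consumption both immediately and in the subsequent steady state, which is the inefficiency described in Part (ii).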
The reasoning in this section holds independently of whether the economy has a finite number
of agents living forever, or overlapping generations. For example, in the case of overlapping
generations, Eq. (3.53) is the capital accumulation path for Diamond's model, once we set
$c_t = c_{1t} + \frac{c_{2t}}{1 + n}$. An important issue is to establish whether actual economies are dynamically
efficient. In a seminal contribution, Abel, Mankiw, Summers and Zeckhauser (1989) provide a
framework to address this issue, and conclude in favor of dynamic efficiency.
[In progress]
3.6.2 Over-accumulation of capital
Bubbles in Diamond's model
[In progress]
3.6.3 Money
We wish to find first-best optima, that is, equilibria that a social planner may choose by acting
directly on agents' consumption, without needing to force the agents to make use of money.10
Let us analyze, first, the stationary state, $R = \frac{1 + n}{1 + \mu}$. We show that this state corresponds to
the stationary state where consumptions and endowments are constants, and that the agents'
utility is maximized when $\mu = 0$. Indeed, since the social planner allocates resources without
10 In a second-best equilibrium, a social planner would let the market play first, by allowing the agents to use money and, then,
would parametrize such virtual equilibria by $\mu_t$. The indirect utility functions that arise as a result would then be expressed in
terms of these growth rates $\mu_t$. The social planner would then maximize an aggregator of these utilities with respect to $\mu_t$.
( 1 2) =
+ 1+2 =
2
2
1+
The first-order condition is $\frac{u_2}{u_1} = \frac{1}{1 + n}$. Instead, the first-order condition in the market equilibrium is $\frac{u_2}{u_1} = \frac{1}{R}$. Therefore, the Golden Rule is attained in the market equilibrium if and
only if $\mu = 0$. The social planner policy converges towards the Golden Rule. Indeed, the social
planner solves:
max
()
2 +1 )
subject to
=0
P
()
2
or max =0
1+
as of time , and the notation
1+
1+
2 +1
()
1)
2
( )
1
1+
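$$z_{t+1} = A z_t, \quad t = 0, 1, \ldots \tag{3A.1}$$
The solution to this system is:
$$z_t = v_1 \kappa_1 \lambda_1^t + \cdots + v_d \kappa_d \lambda_d^t, \tag{3A.2}$$
where $\lambda_i$ and $v_i$ are eigenvalues and eigenvectors of $A$, and $\kappa_i$ are constants, which will be determined
below. The standard proof of this result relies on the so-called diagonalization of Eq. (3A.1). Let us
consider the system of characteristic equations for $A$, $(A - \lambda_i I) v_i = 0_{d \times 1}$, where $\lambda_i$ is scalar and $v_i$ a
$d \times 1$ column vector, for $i = 1, \ldots, d$, or, in matrix form, $AP = P\Lambda$, where $P = (v_1, \ldots, v_d)$ and $\Lambda$ is
a diagonal matrix with $\lambda_i$ on its diagonal. We assume that $P^{\top} = P^{-1}$. Post-multiplying by $P^{-1}$
leaves the spectral decomposition of $A$:
$$A = P \Lambda P^{-1}. \tag{3A.3}$$
By replacing Eq. (3A.3) into Eq. (3A.1), and rearranging terms,
$$w_{t+1} = \Lambda w_t, \quad \text{where } w_t \equiv P^{-1} z_t.$$
The solution for $w$ is $w_{it} = \kappa_i \lambda_i^t$, and the solution for $z$ is:
$$z_t = P w_t = (v_1, \ldots, v_d)\, w_t = \sum_{i=1}^{d} v_i w_{it} = \sum_{i=1}^{d} v_i \kappa_i \lambda_i^t,$$
which is Eq. (3A.2).
To determine the vector of constants $\kappa = (\kappa_1, \ldots, \kappa_d)^{\top}$, we first evaluate the solution at $t = 0$,
$$z_0 = (v_1, \ldots, v_d)\, \kappa = P \kappa,$$
whence
$$\hat{\kappa} \equiv \kappa(P) = P^{-1} z_0. \tag{3A.4}$$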
0
1
11 1 + 12 2
= 0=
=
= ( 1 2)
0
2
21 1 + 22 2
where we have set
yields:
=(
)> . By replacing the second equation into the rst, and solving for
2
11 0
21 0
11 22
12 21
21
2,
11
For this system, the saddlepoint is a line with a slope equal to the ratio of the two components of the
eigenvector for $\lambda_1$, the stable root. Figure 3A.1 depicts the phase diagram for this system, with the
divergent line satisfying the equation $y_0 = \frac{v_{22}}{v_{12}} x_0$.
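FIGURE 3A.1. Phase diagram in the $(x, y)$ plane, with the initial condition $(x_0, y_0)$ and the saddlepoint path $y = (v_{21}/v_{11})\, x$.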
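The saddlepoint logic can be sketched numerically for a two-dimensional system $z_{t+1} = A z_t$ with $z = (x, y)^{\top}$. The matrix below is a hypothetical example chosen to have one root inside and one root outside the unit circle; it is not the matrix of the growth model, and the variable names are illustrative.

```python
import numpy as np

# Hypothetical transition matrix with eigenvalues of about 0.63 and 1.47.
A = np.array([[1.2, 0.5],
              [0.3, 0.9]])

eigvals, eigvecs = np.linalg.eig(A)
stable = int(np.argmin(np.abs(eigvals)))   # index of the root inside the unit circle
v1 = eigvecs[:, stable]                    # eigenvector of the stable root
slope = v1[1] / v1[0]                      # saddlepoint path: y = (v21 / v11) x


def simulate(z0, steps):
    """Iterate z_{t+1} = A z_t forward from z0."""
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        z = A @ z
    return z


x0 = 1.0                                   # predetermined variable
on_path = simulate([x0, slope * x0], 30)           # y jumps onto the saddlepoint path
off_path = simulate([x0, slope * x0 + 0.01], 30)   # small error in the jump
```

When the free variable starts exactly on the saddlepoint path, the system converges to the origin; a small deviation loads on the unstable root and eventually explodes.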
A saddlepoint path has the following economic interpretation. If $x$ is a predetermined variable, $y$ must
jump to the saddlepoint path, $y_0 = \frac{v_{21}}{v_{11}} x_0$, so as to ensure the system does not explode. Note, then, that
a conceptual difficulty arises should the system include two predetermined variables, as in this case,
there are no stable solutions, generically. However, this possibility is unusual in economics. Consider
the next example.
3A.2 Example. The system of Example 3A.1 is exactly the one for the neoclassical growth model, as
we now demonstrate. Section 3.3.3 shows that in a small neighborhood of the stationary values of capital and consumption,
the deviations of capital and consumption from these stationary values satisfy Eq. (3.15), which is reported
here for convenience:
0( )
+1
1
0( )
0( )
=
00 ( ) 1 +
00 ( )
+1
00 ( )
00 ( )
0(
1, and (ii) tr ( ) =
tr( )2
4 det( ) =
+1+
+1+
are:
1 2
0(
) 00
(
00 ( )
0(
)
00 ( )
00
tr( )
tr( )2 4 det( )
,
2
and ,
0(
)=
2
( )
2
+1
= 1
1 2
1
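It follows that $\lambda_2 = \frac{1}{2}\left(\mathrm{tr}(A) + \sqrt{\mathrm{tr}(A)^2 - 4\det(A)}\right) > \frac{1}{2}\mathrm{tr}(A) > 1$, since $\mathrm{tr}(A) > 1 + \det(A) > 2$. Finally, to show that
$\lambda_1 \in (0, 1)$, note that since $\det(A) > 0$, one has $2\lambda_1 = \mathrm{tr}(A) - \sqrt{\mathrm{tr}(A)^2 - 4\det(A)} > 0$; moreover,
$\lambda_1 < 1 \Leftrightarrow \mathrm{tr}(A) - \sqrt{\mathrm{tr}(A)^2 - 4\det(A)} < 2$, or $(\mathrm{tr}(A) - 2)^2 < \mathrm{tr}(A)^2 - 4\det(A)$, which is true, by
simple computations.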
=
(
:| |
1 for = 1
and
Proceeding similarly as in Example 3A.1, we aim to make sure the system stays trapped in the
convergent space and, accordingly, require that: +1 = = = 0, or,
+1
..
.
= 0(
0
(
)1
Let
+ , where is the number of free variables and is the number of predetermined variables.
and 0 in such a way to disentangle free from predetermined variables, as follows:
Partition
0(
)1
0
(
(1)
(
free
0
1
pre
0
1
(2)
)
(1)
=
(
free
0
1
(2)
+
(
pre
0
1
or,
(1)
(
free
0
1
pre
0
1
(2)
=
(
We shall refer to as the dimension of the convergent subspace, S say. The reason for this terminology is the following. Consider the solution for ,
=
For
1 1 1
+ +
+1 +1
+1
+ +
+1
= = = 0
= ( 1 1
in which case,
=
1
1 1 1
1
+ +
i.e.,
where
( 1 1
) and
hi
1
1
R :
>
>
1
R}
(i)
(ii)
(iii)
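$$x_{hk} - x_{h(k-1)} = b\, h\, x_{h(k-1)}, \quad k = 1, \ldots, n,$$
where $b$ is an instantaneous rate, and $n = \Delta t / h$ is the number of subperiods in the given time period $\Delta t$.
The solution is $x_{hn} = (1 + bh)^n x_0$, or $x_{\Delta t} = (1 + bh)^{\Delta t / h} x_0$. By taking limits leaves:
$$x_{\Delta t} = \lim_{h \downarrow 0} (1 + bh)^{\Delta t / h} x_0 = e^{b \Delta t} x_0.$$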
= 0
( +1) = 1
= 1
= . By iterating,
0 yields:
0+
=1
/
X
0+
(3.55)
or in di erential form:
=
(3A.5)
)=
(3A.6)
(3.56)
( +1)
+
+1 =
+1
(3A.7)
+1
= (1
+ , leaves
1
1
)
+
+1 = (1
{z
}
|
=
+1
That is, investment over the period from $t$ to $t + 1$ is given by the continuous flow of investments
during this period, discounted at the appropriate depreciation rates.
Dene
We have: =
= ( )
= ( )
This is the capital accumulation constraint used to solve the program in the next section.
( )
s.t. = ( )
[3A.P1]
where all variables are per-capita. We assume there is no capital depreciation. (Note that for the
discrete time model, we assumed, instead, a total capital depreciation.) The Hamiltonian is,
+
= ( )+
( )
where is a co-state variable. As explained below (see Appendix 3), the rst-order conditions for this
problem are:
= 0
=
= 0 ( )
=
++
0(
(3A.8)
00
( )
=
0( )
)
++
00 ( )
0(
( )
(3A.9)
The equilibrium is the solution of the system consisting of the constraint of the program [3A.P1],
and Eq. (3A.9). Similarly as in Section 3.3.3, we analyze the dynamics of the system in a small
neighborhood of the stationary state, dened as the solution ( ) of the constraint of the program
[3A.P1], and Eq. (3A.9), when ( ) = ( ) = 0,
+
= ( )
+ + = 0( )
A rst-order approximation of both sides of the constraint of the program [3A.P1], and Eq. (3A.9),
near ( ), yields:
0( )
00 ( ) (
)
=
00 ( )
=
(
) (
)
where we used the equality
system can be rewritten as:
++ =
0(
). By setting
0(
and
00 (
)
)
00 (
, the previous
where
(
)> . Warning! There must be some mistake somewhere.
We diagonalize this system by setting $A = P \Lambda P^{-1}$, where $P$ and $\Lambda$ are as in Appendix 1. We have:
=
1
where
We see that
2,
and
0=
1
2
0(
)
)
00 (
2
+4
0(
00
) 00
) (
00 (
( )
is:
=1 2
whence
=
where the
=
=
21 1
+
+
11 1
1 1
12 2
22 2
=0
2 2
2
2
1
2
21
11
As in the discrete time model, the saddlepoint path is located along a line whose slope is
the ratio of the components of the eigenvector associated with the negative root. We can explicitly
compute this ratio. By definition, $A v_1 = \lambda_1 v_1$
0(
)
)
11 +
00 (
i.e.,
21
11
1
0( )
00 (
00 ( )
and simultaneously,
21
11
00 (
) =
1 11
1 21
21
1
1
s.t.
[0
+ (
(3A.10)
is
where and are an instantaneous and a terminal payo , is a subjective instantaneous rate,
a standard Brownian Motion and, nally, (
) and (
) are given drift and di usions functions.
We interpret as a state variable and as a control. In general, the control depends on the realization
of although we assume that it cannot depend on future observations of it can only depend on
past values of . We conne attention to controls known as feedbacks, i.e., such that only depends
( )), for each sample path of the state variable
( ).
on the current value of , ( ) (
The function makes the control and, then, the state variable , Markovian.
We apply what is known as the stochastic programming principle. Heuristically, we maximize up to
an intermediate point in time, + , say, assuming the maximization for the remaining time period
[ +
] holds. We have:
Z
(
)
(
)
) = max
(
) +
( )
(
(
[0
max
)
[0
+
=
max
[0
(
Z
+
)
(
+
)
( (
))
where the last two lines follow by the law of iterated expectations. By rearranging terms,
1
(
)
Z +
1
1
(
)
( ( +
(
)
+
+
)
(
= max
(
[0
For small
+
(
)
(
)
(
)+
0 = max
) +
(
)
( ), where
) +
1
2
1
2
))
(3A.11)
2(
2.
)) + (
(
)
(
(
))
)) +
1
(
2
))
(3A.12)
)
2
The function is usually referred to as the optimized Hamiltonian, and Eq. (3A.12) is the Bellman
Equation for di usion processes.
By Itos lemma:
1 2 2
+
+
(3A.13)
+
=
2 2
Moreover, by di erentiating both sides of the Bellman Equation (3A.12) with respect to ,
2
1
2
(3A.14)
where the rst equality follows by the denition of , and the third equality holds by the denition of
the optimized Hamiltonian function, (
). Plugging Eq. (3A.14) into Eq. (3A.13) leaves:
=
+
which shows that:
1
2
(3A.15)
)+
)+
)2
with respect to the control . The result is the optimized Hamiltonian function, (
).
=0
(3A.16)
1 2 2
2 2
Note that in the innite horizon case (i.e., the problem in (3A.10) when
collapses to
)+
(
)
(
))
0 = max ( (
=
), Eq. (3A.11)
[ (
)] = 0
The deterministic model in Appendix 2 of this chapter is a special case of the current setup. Indeed,
the rst of Eqs. (3A.8) reveals that is not a function of such that Eqs. (3A.8) are a special case of
Eqs. (3A.16), namely for
0.
References
Abel, A.B., N.G. Mankiw, L.H. Summers and R.J. Zeckhauser (1989): "Assessing Dynamic Efficiency: Theory and Evidence." Review of Economic Studies 56, 1-20.
Cochrane, J.H., F.A. Longstaff and P. Santa-Clara (2008): "Two Trees." Review of Financial Studies 21, 347-385.
Farmer, R. (1998): The Macroeconomics of Self-Fulfilling Prophecies. Boston: MIT Press.
Hayashi, F. (1982): "Tobin's Marginal q and Average q: A Neoclassical Interpretation." Econometrica 50, 213-224.
Kamihigashi, T. (1996): "Real Business Cycles and Sunspot Fluctuations are Observationally Equivalent." Journal of Monetary Economics 37, 105-117.
King, R.G. and S.T. Rebelo (1999): "Resuscitating Real Business Cycles." In: J.B. Taylor and M. Woodford (Editors): Handbook of Macroeconomics, Elsevier.
Lucas, R.E. (1972): "Expectations and the Neutrality of Money." Journal of Economic Theory 4, 103-124.
Lucas, R.E. (1978): "Asset Prices in an Exchange Economy." Econometrica 46, 1429-1445.
Lucas, R.E. (1994): "Money and Macroeconomics." In: General Equilibrium 40th Anniversary Conference, CORE DP no. 9482, 184-187.
Martin, I. (2011): "The Lucas Orchard." Working Paper, Stanford University.
Menzly, L., T. Santos and P. Veronesi (2004): "Understanding Predictability." Journal of Political Economy 112, 1-47.
Pavlova, A. and R. Rigobon (2008): "The Role of Portfolio Constraints in the International Propagation of Shocks." Review of Economic Studies 75, 1215-1256.
Prescott, E. (1991): "Real Business Cycle Theory: What Have We Learned?" Revista de Analisis Economico 6, 3-19.
Stokey, N.L. and R.E. Lucas (with E.C. Prescott) (1989): Recursive Methods in Economic Dynamics. Harvard University Press.
Tirole, J. (1988): "Efficacité intertemporelle, transferts intergénérationnels et formation du prix des actifs: une introduction." Mélanges économiques. Essais en l'honneur de Edmond Malinvaud. Paris: Editions Economica & Editions EHESS, 157-185.
Tobin, J. (1969): "A General Equilibrium Approach to Monetary Policy." Journal of Money, Credit and Banking 1, 15-29.
Watson, M. (1993): "Measures of Fit for Calibrated Models." Journal of Political Economy 101, 1011-1041.
4
Continuous time models
4.1 Introduction
This chapter is an introduction to asset pricing models cast in continuous time. As such, it
does not introduce any new economic concept beyond what we have already learned in
previous chapters. Nevertheless, continuous time methods are powerful, as they allow us to deal
with issues arising in economies and markets more complex than those in the previous chapters.
Moreover, from an applied perspective, continuous time methods are extremely useful to evaluate
derivative instruments that draw value from complex events, such as those relating to baskets
of credit events, capital market volatility, or history-dependent developments in fixed income
security markets, to name just a few, as we shall see in Part III of these lectures. Continuous
time models pose challenges to econometricians: we only observe a discrete realization of an
idealized continuous time data generating process. The next chapter surveys tools, based on
simulations, which allow us to mitigate these challenges.
This chapter has two objectives. The first is to explain in detail how the principle of absence
of arbitrage works in continuous time: how do asset prices need to drift to ensure that there
is no arbitrage? How many possible drifts would we expect to see in arbitrage-free markets?
The second objective is to develop technical details about the properties of asset prices in
continuous time. For example, we shall see that asset prices, once restricted by absence of
arbitrage, satisfy partial differential equations under regularity conditions. Yet asset prices are
discounted expectations of their future payoffs, taken under the risk-neutral probability. How
are these properties tied together? We shall explain the link by introducing the celebrated
Feynman-Kac theorem, which provides a probabilistic
representation of the solution to a partial differential equation. Moreover, what is the relation
between the risk-neutral probability and the physical probability? How do we need to tilt the
physical probability to determine the risk-neutral one? How many risk-neutral probabilities exist,
in incomplete markets or in markets with frictions? Are there natural pricing probabilities
arising in complex contexts such as those in which interest rates are random? And how do these
pricing probabilities relate to the notion of numeraires? Girsanov's theorem is the starting point
that we need to deal with these fundamental questions.
The models we consider in this chapter are diffusion models (with some extensions that accommodate jumps), which are the workhorse in finance. Diffusion models are, so to speak,
those where the variations of a variable of interest are driven by a deterministic component
(the "drift") and a stochastic one (the "diffusion"). Heuristically, the diffusion component is
normally distributed over an infinitesimal amount of time, being proportional to the variations
of what is known as a Brownian motion. We typically assume that the fundamentals of the
economy follow diffusion processes, and that asset prices are rational, in that they are a function
of these fundamentals. Absence of arbitrage restricts the set of all possible pricing functions.
The fundamental tool with which we link asset prices to fundamentals is Itô's lemma, a device
we need to build new processes (in our case, the asset prices) from old ones (the fundamentals
of the economy). The complication in finance is that these new processes, albeit a function of
the fundamentals, are not given in advance; instead, they are the focus of research.
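The heuristic description of a diffusion, a drift plus a normally distributed shock over each small time interval, can be sketched with an Euler discretization. The function below is an illustrative assumption, not a method taken from these notes; the drift, diffusion and parameter values are hypothetical.

```python
import math
import random


def euler_maruyama(x0, drift, diffusion, horizon, n_steps, rng):
    """Simulate one path of dx = drift(x) dt + diffusion(x) dW on [0, horizon]."""
    dt = horizon / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment, N(0, dt)
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


mu, sigma = 0.05, 0.2
rng = random.Random(42)

# A geometric Brownian motion path: drift mu * x, diffusion sigma * x.
gbm_path = euler_maruyama(1.0, lambda x: mu * x, lambda x: sigma * x, 1.0, 250, rng)

# With the diffusion switched off, only the deterministic drift remains.
det_path = euler_maruyama(1.0, lambda x: mu * x, lambda x: 0.0, 1.0, 250, rng)
```

Setting the diffusion to zero recovers the deterministic compounding of the drift alone, which makes the separate roles of the two components visible.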
The chapter is organized as follows. The next section is an introduction to methods. It deals
with details leading to the birth of continuous time finance, the Black & Scholes formula
for the evaluation of European options; it also illustrates how continuous time models obtain as
limiting cases of the discrete time models in the previous chapter, and describes basic
properties of long-lived asset prices, such as (i) the fundamental relations that link expected
returns, volatilities (the "betas") and risk-premiums (the "lambdas"); and (ii) a representation
of the price-dividend ratio in terms of certain possibly varying discount rates, the risk-adjusted
discount rates. These derivations turn out to be useful while discussing the properties of equity
markets in Chapter 7.
Section 4.3 ... Finally, the Appendix provides technical details omitted from the main text,
including a self-contained appendix containing notions of stochastic calculus.
[In progress]
solution to a certain partial differential equation, which we can represent as a
conditional expectation taken under the risk-neutral probability. In Section 4.3.3, we provide
the link between this risk-neutral probability and the original probability.
4.2.2 The origins: Black & Scholes
4.2.2.1 Self-financed strategies
A self-financed portfolio leads to a situation where the change in value of the portfolio between
two instants $t$ and $t + dt$ is determined as a mark-to-market P&L: the change in the asset prices
times the quantities of the same assets held at time $t$. There is no injection or withdrawal of
funds between any two instants. For example, let $\theta_{1t}$ and $S_t$ be the number of shares and the
price of some risky asset, and $\theta_{2t}$ and $B_t$ be the number of units of some riskless asset and its price.
Then, the value of a self-financed portfolio, $V_t = \theta_{1t} S_t + \theta_{2t} B_t$, satisfies:
$$dV_t = \theta_{1t}\, dS_t + \theta_{2t}\, dB_t = \pi_t \frac{dS_t}{S_t} + (V_t - \pi_t) \frac{dB_t}{B_t},$$
where $\pi_t \equiv \theta_{1t} S_t$ and the second equality follows by simple calculations. If the portfolio strategy
involves risky assets distributing a dividend process, and consumption, the value of the self-financed portfolio satisfies (see Appendix 2 for details):
=
+
+
(4.1)
where
Why are partial differential equations so important in finance? Suppose that the price of a stock
follows a geometric Brownian motion:
$$\frac{dS_t}{S_t} = \mu\, dt + \sigma\, dW_t,$$
and that there exists a riskless accounting technology, or money market account (MMA, henceforth) making spare money evolve as:
$$\frac{dB_t}{B_t} = r\, dt,$$
where $r > 0$. Finally, suppose that there exists another asset, a call option, which gives rise
to a payoff equal to $(S_T - K)^+$ at some future date $T$, where $K$ is the strike, or exercise
price of the option. Let $(C_t)_{t \in [0, T]}$ be the option price process. We wish to figure out what this
price looks like while formulating as few assumptions as possible. We ignore dividend issues,
assume there are no transaction costs, and rule out any other frictions. We assume rational
expectations, that is, there exists a function $c$: $C_t = c(t, S_t)$, and assume that this function is
as differentiable as needed for an application of Itô's lemma, such that:
$$dC_t = (Lc)\, dt + \sigma S_t c_S\, dW_t,$$
where $L$ is the infinitesimal generator, defined as $Lc = c_t + \mu S c_S + \frac{1}{2} \sigma^2 S^2 c_{SS}$, with subscripts
denoting partial derivatives. Next, we create the following portfolio: $\theta_{1t}$ units of the risky asset
and $\theta_{2t}$ units of the MMA. Its value, $V_t = \theta_{1t} S_t + \theta_{2t} B_t$, satisfies the self-financing condition,
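$$dV_t = \theta_{1t}\, dS_t + \theta_{2t}\, dB_t = (\mu\, \theta_{1t} S_t + r\, \theta_{2t} B_t)\, dt + \sigma\, \theta_{1t} S_t\, dW_t.$$
Now, set $V_0 = C_0$. Moreover, let us actually conjecture that we could choose $\theta_{1t}$ and $\theta_{2t}$ such
that $V_t = C_t$ for all $t$, i.e. that the self-financed strategy can replicate the option price. We
obviously need to check this conjecture below. For now, note that if $V_t = C_t$ for all $t$, the drifts
and diffusion coefficients of both $V$ and $C$ have to be the same, by a result stated in Appendix
1, known as the unique decomposition property. Diffusion terms are the same when $\theta_{1t} = c_S$.
Replacing this into the dynamics of $V$ produces:
$$dV_t = (\mu S_t c_S + r\, \theta_{2t} B_t)\, dt + \sigma S_t c_S\, dW_t. \tag{4.2}$$
Next, take $V_t = C_t$ for all $t$:
$$\theta_{2t} B_t = V_t - \theta_{1t} S_t = c - S_t c_S. \tag{4.3}$$
Equating the drift of $V$ with that of $C$ then leaves:
$$c_t + r S c_S + \frac{1}{2} \sigma^2 S^2 c_{SS} - r c = 0, \quad \text{for all } (t, S) \in [0, T) \times \mathbb{R}_{++}. \tag{4.4}$$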
Suppose that there exists a traded asset and that its price satises,
=
1 2 2
=
+
+
2
). By Itos
1 2 2
=(
+
(4.5)
)
+
2
Therefore, and consistently with the Black-Scholes derivation in the previous subsection, a
necessary condition for $V_t$ to replicate $F(t, S_t)$ is that
$$\theta_{1t} = F_S(t, S_t) \quad \text{for all } t \in [0, T]. \tag{4.6}$$
Next, suppose $F$ is the price of a traded asset, one that delivers a payoff equal to $F(T, S_T) = \psi(S_T)$ at time $T$, for some function $\psi$, and that, at the same time, satisfies the partial differential
equation (4.7) below. We need to have that $F_0 = F^{\$}_0$, where $F^{\$}_0$ denotes the market price of the
asset. For suppose not and, e.g., $F_0 < F^{\$}_0$. We could sell the asset short for $F^{\$}_0$, and implement
a self-financing strategy with $\theta_{1t} = F_S$, such that $V_0 = F(0, S_0)$, where $F$ satisfies,
(
We claim that
)=
=
) for all
[0
=
(
= (
)=
(4.7)
1 2 2
+
2
1 2 2
+
)
2
(4.8)
where the rst equality follows by Eq. (4.6); the second by the self-nancing condition,
=
and again by (4.6); and the third by (4.7). Because 0 = ( 0 0), then (4.8) implies
= (
) for all , as claimed, and then
= (
) = ( ) too, which allows to
3
honour the short-sale of the asset. Note that these arguments do not require that a market for
this asset exists over the life of the asset.
The crucial assumption underlying the property that the strategy value replicates the risk
(
), = (
) for all , is that (
) is the price of a traded asset, i.e., Eq. (4.7) holds
true. We can show the converse. That is, suppose the strategy value replicates the risk through
Eq. (4.6). In this case, the L.H.S. of (4.5) is zero, and by Eq. (4.6),
+
1
2
= (
)= (
1 2 2 00
=
2
(4.9)
2
. Then, the
(4.10)
=
( ), which we could honour through
through through the self-nanced strategy , up to time , when it will deliver
the long static position in the asset initiated at time-0.
00
= (
)= (
However, this ordinary differential equation cannot hold, unless $r + \sigma^2 = 0$. That is, assume
that $V_t = F(S_t)$ for all $t$; then, the left-hand side of Eq. (4.10) cannot equal zero, contradicting
that $V_t = F(S_t)$ holds for all $t$.
All in all, we cannot replicate $F(S_t) = S_t^2$ through a self-financed strategy. Naturally, we
could replicate the payoff at maturity, $S_T^2$, provided a market exists for a claim to $S_T^2$ (see Eq.
(4.22) in the following section), not necessarily a market over the entire life of this derivative.
The price of this asset is obviously not $F(S_t) = S_t^2$ for each $t$, though. To determine the tracking
error of the portfolio $F(S_t) = S_t^2$, note that
=
=
=
1
2
00
1 2
)
2
2
2
2 +
0
00
(4.11)
= ( )
(4.12)
While Eq. (4.12) allows by construction to replicate , it also obviously generates an hedging
cost that satises:
=
. In the context of our example,
=
=
=
1
2
(
1
2
00
2
00
(4.13)
( +
2 )(
In Section 4.6, we shall return to the issues regarding replicability of claims in the context of
incomplete markets.
The option price predicted by the Black and Scholes model is independent of the drift of the
underlying asset. After reading this chapter, the reader will interpret this result as follows. Asset
prices (rescaled by the money market account) are martingales under the so-called risk-neutral
probability, say $Q$. Therefore, the value of an asset is equal to the discounted expectation of its payoff
under $Q$, that is, the probability under which the stock price drifts proportionally to $r$. In other
words, $\mu$ doesn't matter.
Let us analyze the details of this property in the context of the previous replicating arguments,
by relying on a very simple example. Assume that,
=
(4.14)
where is a constant,
and
are some function of calendar time and and , and
,
= 1 2, are two standard Brownian motions.
Consider the function
= (
), and assume it is solution to the following partial
di erential equation, generalizing (4.7),
=
)
(4.15)
where
(
) is the innitesimal generator of the di usion process (4.14), and is some
function, interpreted as the drift of under the risk-neutral probability. Note that once
=0
for each (
), Eq. (4.15) collapses to the Black-Scholes equation (4.7). We shall return to
this point soon.
Next, consider a self-financing strategy invested into the asset and the money market account,
just as in the previous section. We have, generalizing Eq. (4.5), that
=(
1
+
2
1
+
2
= (
)
)=
),
The Black-Scholes equation (4.4) is a typical (in fact the first) example of partial differential
equations in finance. It leads to an equation of the so-called parabolic type, as we shall explain
soon. More generally, let us be given,
$$a_0 + a_1 F_t + a_2 F_S + a_3 F_{SS} + a_4 F_{tt} + a_5 F_{St} = 0, \tag{4.16}$$
subject to some boundary condition. This partial differential equation is called: (i) elliptic,
if $a_5^2 - 4 a_3 a_4 < 0$; (ii) parabolic, if $a_5^2 - 4 a_3 a_4 = 0$; (iii) hyperbolic, if $a_5^2 - 4 a_3 a_4 > 0$. The
typical partial differential equations arising in finance are of the parabolic type. For example,
the Black-Scholes equation (4.4) is parabolic. The following section explains how to provide
a probabilistic representation to these parabolic partial differential equations.
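The classification rule can be written as a small helper. The sketch below assumes that $a_3$ and $a_4$ multiply the two pure second-order derivatives and $a_5$ the cross derivative; the function name and the tolerance are illustrative.

```python
def classify_pde(a3, a4, a5, tol=1e-12):
    """Classify a second-order linear PDE by the sign of a5^2 - 4 a3 a4."""
    discriminant = a5 ** 2 - 4.0 * a3 * a4
    if abs(discriminant) <= tol:
        return "parabolic"
    return "elliptic" if discriminant < 0 else "hyperbolic"


# Black-Scholes: a second derivative in the asset price only, no second
# derivative in time and no cross derivative.
sigma, S = 0.2, 100.0
black_scholes_type = classify_pde(0.5 * sigma ** 2 * S ** 2, 0.0, 0.0)

laplace_type = classify_pde(1.0, 1.0, 0.0)    # Laplace equation
wave_type = classify_pde(1.0, -1.0, 0.0)      # wave equation
```

The Black-Scholes equation is parabolic because it has no second derivative in time and no cross derivative, so the discriminant is zero whatever the volatility.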
4.2.3.2 Feynman-Kac solutions to partial differential equations
The typical situation that we encounter in finance is that the asset price is a function $F$ that
solves a parabolic partial differential equation, i.e. a special case of Eq. (4.16):
$$F_t(t, x) + b(t, x) F_x(t, x) + \frac{1}{2} a(t, x)^2 F_{xx}(t, x) - r F(t, x) = 0, \quad (t, x) \in [0, T) \times \mathbb{R}, \tag{4.17}$$
with the boundary condition, $F(T, x) = \psi(x)$ for all $x$, where the function $\psi$ is the final payoff.
Somewhat surprisingly, define, now, a stochastic differential equation, with drift and diffusion
$b$ and $a$ as in Eq. (4.17),
$$dx(s) = b(s, x(s))\, ds + a(s, x(s))\, dW(s), \quad x(t) = x, \tag{4.18}$$
where $W$ is a Brownian motion. Under regularity conditions on $b$, $a$ and $\psi$, the solution $F$ to Eq.
(4.17) is
$$F(t, x) = e^{-r(T - t)}\, E\left[\psi(x(T)) \mid x(t) = x\right], \tag{4.19}$$
where $x(\cdot)$ is solution to Eq. (4.18), and the expectation is taken with respect to the distribution
of $x(\cdot)$ in Eq. (4.18). Note that the existence of the Feynman-Kac representation does not ensure
per se the existence of a solution to a given partial differential equation.
Eq. (4.19) can be used to represent the solution to the Black & Scholes partial differential
equation (4.4), with auxiliary stochastic differential equation (4.18) collapsing to,
$$\frac{dS_t}{S_t} = r\, dt + \sigma\, d\tilde{W}_t,$$
where $\tilde{W}$ is a Brownian motion, which is defined under the risk-neutral probability, due to the
drift of $S$ being equal to the risk-free rate, $r$. That is, by Eq. (4.19), the price of an option in
a Black & Scholes market is the risk-neutral expectation of the final payoff, discounted at the
risk-free rate.
The Feynman-Kac representation of the solution to partial differential equations is quite
useful. First, computing expectations is generally both easier and more intuitive than finding
a solution to partial differential equations through trial and error. Second, except for specific
cases, the solution to asset prices is unknown, and a natural way to cope with this problem
is to go for Monte-Carlo methods: approximation of the expectation in Eq. (4.19) through
simulations and use of the law of large numbers. Finally, the Feynman-Kac representation
theorem is useful for some theoretical reasons we shall see later in this chapter.
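The Monte-Carlo idea can be sketched for a European call in a Black & Scholes market. The example below is a minimal illustration under stated assumptions: the stock is simulated directly under the risk-neutral probability using the exact lognormal solution, the closed-form benchmark is the standard Black-Scholes call formula, and the parameter values, the seed and the function names are hypothetical.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(S, K, r, sigma, T):
    """Closed-form Black-Scholes price of a European call."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def monte_carlo_call(S, K, r, sigma, T, n_paths, seed=0):
    """Discounted risk-neutral expectation of (S_T - K)^+, estimated by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        # Terminal price when the stock drifts at the risk-free rate r.
        S_T = S * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
        total += max(S_T - K, 0.0)
    return math.exp(-r * T) * total / n_paths


S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
exact_price = black_scholes_call(S0, K, r, sigma, T)
simulated_price = monte_carlo_call(S0, K, r, sigma, T, n_paths=200_000)
```

The simulated price approaches the closed-form one as the number of paths grows, by the law of large numbers; note that the drift $\mu$ of the stock under the physical probability never enters the computation.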
4.2.3.3 A few heuristic proofs
It is well beyond the purpose of this section to develop detailed proofs of the Feynman-Kac
representation theorem. In addition to Karatzas and Shreve (1991, p.366), an excellent source
of reference is still Friedman (1975), which relaxes many sufficient conditions given in Karatzas
and Shreve through suitable localizations of linear and growth conditions. The heuristic proof
provided below covers the slightly more general case in which
+
=0
(4.20)
where:
=
By Itô's lemma,
=
[(
|
+
{z
)
}
=0
]=
is a martingale, with
and
(0
). We have,
0)
Hence,
(0
0)
)=
) +
Consider the Black & Scholes partial differential equation (4.4). We now know that by the Feynman-Kac theorem, we can represent the price, $C$, as a discounted expectation of the terminal payoff, $(S_T - K)^+$,
$$C(S_t, t) = e^{-r(T-t)} E_t^{Q}\left[(S_T - K)^+\right] \tag{4.21}$$
where $E_t^{Q}$ denotes the expectation taken conditionally upon the information set at $t$, and with respect to a new probability $Q$ (say), under which $S$ is solution to,
$$\frac{dS_\tau}{S_\tau} = r\,d\tau + \sigma\,d\tilde{W}_\tau$$
where $\tilde{W}$ is a Brownian motion under $Q$. For obvious reasons, we refer to $Q$ as the risk-neutral probability.
Naturally, the methodology underlying Eq. (4.21) can be applied to evaluate derivatives other than Black-Scholes options. Consider, for example, a quadratic derivative, i.e. one that pays off the square of the asset price at time $T$, $S_T^2$. The arguments made in the previous sections, relating to the replication of this derivative, are still the same, with a portfolio process that still includes $\Delta = \partial C/\partial S$ units of the risky asset, to be calculated below. The price of this derivative, then, still satisfies Eq. (4.21), although the boundary condition is $C(S, T) = S^2$, such that the price can be expressed as,
$$C(S_t, t) = e^{-r(T-t)} E_t^{Q}\left(S_T^2\right) = e^{-r(T-t)} S_t^2 e^{(2r+\sigma^2)(T-t)} = S_t^2 e^{(r+\sigma^2)(T-t)} \tag{4.22}$$
This implies that the hedging portfolio is $\Delta = 2 S_t e^{(r+\sigma^2)(T-t)}$. Indeed, it is easy to check that the value of the replicating strategy, the stock position plus the money market position, coincides with Eq. (4.22), as the stock position is worth $\Delta S = 2C$ and, using Eq. (4.3) and Eq. (4.22), the money market position is worth $-C$. The hedging portfolio for Black-Scholes will be introduced, and discussed at length, in Chapter 10.
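As a numerical check on the quadratic derivative, the sketch below compares the closed-form price $S^2 e^{(r+\sigma^2)(T-t)}$ with a Monte Carlo estimate of the discounted risk-neutral expectation of $S_T^2$, and compares the hedge ratio with a finite-difference derivative. The function names and parameter values are assumptions made for illustration only.

```python
import math
import random


def quadratic_price(s, r, sigma, tau):
    """Price of the claim paying S_T**2, with tau = T - t."""
    return s ** 2 * math.exp((r + sigma ** 2) * tau)


def quadratic_delta(s, r, sigma, tau):
    """Units of the risky asset held in the replicating portfolio."""
    return 2.0 * s * math.exp((r + sigma ** 2) * tau)


def quadratic_price_mc(s, r, sigma, tau, n_paths=100_000, seed=1):
    """Discounted risk-neutral expectation of S_T**2 by simulation."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * tau
    vol = sigma * math.sqrt(tau)
    total = 0.0
    for _ in range(n_paths):
        s_T = s * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += s_T ** 2
    return math.exp(-r * tau) * total / n_paths


# Illustrative (assumed) parameters.
s, r, sigma, tau = 1.0, 0.05, 0.2, 1.0
price = quadratic_price(s, r, sigma, tau)
delta = quadratic_delta(s, r, sigma, tau)
price_mc = quadratic_price_mc(s, r, sigma, tau)

# Central finite difference of the price with respect to the stock price.
h = 1e-5
delta_fd = (quadratic_price(s + h, r, sigma, tau)
            - quadratic_price(s - h, r, sigma, tau)) / (2.0 * h)

# Replication: the stock position is worth 2C, the money market position -C.
stock_position = delta * s
money_market_position = price - stock_position
print(price, price_mc, delta, delta_fd, money_market_position)
```

The stock position exceeds the value of the claim, so the replicating strategy is financed by borrowing an amount equal to the claim's own value.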
How does
relate to
and
: heuristic details
is solution to:
=
where
(
)+
where
(
(
)
,
)
and
is solution to:
=1
1 2 2
+
=
+
+
(
)
2
Under the usual pathwise integrability conditions, this is a martingale when,
0=
1
2
(4.23)
1
2
(4.24)
Comparing Eq. (4.23) with Eq. (4.24) reveals that the representation in Eq. (4.21) is possible with $\lambda = (\mu - r)/\sigma$, as originally claimed.
The interpretation of $\lambda$ is that of a unit risk-premium for investing in the stock. We shall return to this important interpretation below.
The point of the previous computations is that it looks as if we could start from the
original probability space under which
=
(4.25)
R
Consider two probabilities and , linked through ( ) =
Radon-Nikodym derivative,
Z
Z
1
2
=
= exp
k k
2
0
( )
) for
(exp( 12
, and the
k k2
))
Z
Z
= (
=
( )=
=
)
)=
where
=
=1
Section 4.4 develops additional details regarding this change of probabilities. We now proceed to provide details regarding the risk-premium in a slightly more general context, which also includes evaluation of long-lived assets. We shall see that $\lambda$ relates to the stochastic discount factor through what is known as the pricing kernel.
4 This condition is needed to ensure that
preclude this equality
= 1 to hold.
integrates to one,
)=1
that would
We explain how asset prices link to a number of state variables by deriving a continuous time version of the APT (see Chapter 2), in which the asset expected excess returns form a linear combination of the asset exposures to factors, with weights equal to the unit risk-premiums the market requires to bear the risk arising from each of these factors. We begin with a heuristic derivation of the pricing kernel in continuous time as the limit of a discrete time model; then, we characterize the market expected returns in terms of the APT while relying on a diffusive model.
4.2.5.1 Prices and pricing kernels, from discrete to continuous time
Let $S_t$ be the price of a long-lived asset as of time $t$, and $D_{t+\Delta t}$ the dividend paid by this asset over a small trading period $\Delta t$. We know that in the absence of arbitrage, there exists a positive process $m$, known as the stochastic discount factor, such that the price of any asset is the expectation of its future payoff, weighted with $m_{t+\Delta t}$,
$$S_t = E_t\left[m_{t+\Delta t}\left(S_{t+\Delta t} + D_{t+\Delta t}\right)\right] \tag{4.26}$$
where $E_t$ is the conditional expectation given the information set at time $t$. For example, in an economy with a representative risk-neutral agent, we have that $m_{t+\Delta t} = e^{-r\Delta t}$, where $r$ is the risk-free rate per unit of time.
Given the stochastic discount factor, we define, as usual, the pricing kernel, or state-price process, $\xi$, as the process that grows by the stochastic discount factor:
+
[ (
1=
)] +
1
+
, which
(4.27)
R
(
)+
). Assuming that
(4.28)
,5 whence,
+(
))
(4.29)
In the presence of risk, and risk-averse agents, the innovations to the stochastic discount factor will drive fluctuations of the pricing kernel.
Many of the models in this chapter are cast within a diffusion setting, such that the pricing kernel satisfies the continuous time limit to Eq. (4.29),
$$\frac{d\xi_\tau}{\xi_\tau} = -r_\tau\,d\tau - \lambda_\tau \cdot dW_\tau \tag{4.30}$$
where $W$ is a vector Brownian motion, supposed to drive fluctuations of the asset prices, and $\lambda$ is the vector of unit risk-premiums.
5 Just set
The interpretation of $\xi$ in Eq. (4.30) is simple. Without risk, $\lambda = 0$, and the stochastic discount factor is simply the usual discount factor. In the presence of risk, the discount factor varies stochastically, driven by the same sources of variation affecting asset prices, $W$. Naturally, some components of $\lambda$ are zero if some of these sources of variation do not receive compensation.
4.2.5.2 Expected returns, lambdas and betas
)=
+
+
=
+
+
(4.31)
+
=
(4.32)
We know from previous sections that the expectations in Eq. (4.32) can be expressed in terms of partial derivatives, implying that the asset price solves a certain partial differential equation. We will develop this theme in detail in Chapters 7 and 8. Now, we wish to further our interpretation of $\lambda$ in Eq. (4.30) as a vector of unit risk-premiums.
Note that the asset price, $S$, is obviously driven by the same Brownian motions driving $r$ and $\xi$, such that $E\left(\frac{d\xi}{\xi}\frac{dS}{S}\right) = -\mathrm{Vol}\left(\frac{dS}{S}\right)\cdot\lambda\,d\tau$, where $\mathrm{Vol}\left(\frac{dS}{S}\right)$ denotes the instantaneous volatility of asset returns; note that this volatility could be a vector when there is more than one state variable driving the asset price, $S$. Substituting this result into the R.H.S. of Eq. (4.32) leaves:
$$\frac{E\left(\frac{dS}{S}\right)}{d\tau} + \frac{D}{S} = r + \underbrace{\mathrm{Vol}\left(\frac{dS}{S}\right)}_{\text{betas}} \cdot \underbrace{\lambda}_{\text{lambdas}} \tag{4.33}$$
Expected returns equal the short-term rate plus a risk-premium arising due to the randomness of the very same returns. This premium is the product of the instantaneous risks related to the asset price fluctuations (the betas, $\mathrm{Vol}\left(\frac{dS}{S}\right)$) times the unit risk-premiums that compensate for each individual source of these instantaneous risks (the lambdas, $\lambda$). Eq. (4.33) is an APT relation, the continuous time counterpart to those developed in Chapter 1 of the lectures: the only assumption underlying it is absence of arbitrage, i.e., a positive stochastic discount factor exists. We now proceed to a decomposition of these expected returns.
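The APT relation described above, expected returns as the short-term rate plus betas times lambdas, can be sketched in a few lines. The function name and the numbers are hypothetical and serve only to illustrate the inner product of exposures and unit risk-premiums.

```python
def expected_return(short_rate, betas, lambdas):
    """Short-term rate plus the sum of exposures times unit risk-premiums."""
    if len(betas) != len(lambdas):
        raise ValueError("betas and lambdas must have the same length")
    return short_rate + sum(b * l for b, l in zip(betas, lambdas))


# Illustrative (assumed) numbers: two sources of risk, the second unpriced.
r = 0.02
betas = [0.15, 0.05]    # instantaneous volatilities with respect to each factor
lambdas = [0.30, 0.00]  # unit risk-premiums; zero means no compensation
mu_total = expected_return(r, betas, lambdas)
print(f"expected total return: {mu_total:.4f}")
```

In this example the exposure to the second factor adds volatility but no expected return, because the corresponding lambda is zero.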
How should we discount future cash flows in a model with multiple sources of risk? It sounds like there might be an obvious answer to this question: we should use the APT, i.e., Eq. (4.33). It is actually a subtle point. Eq. (4.33) provides predictions of expected returns, but expected returns are not necessarily risk-adjusted discount rates. Naturally, were dividends and asset prices driven by one (and the same) factor, expected returns and risk-adjusted discount rates would be the same. However, the two notions deviate in a multifactor framework.
We illustrate these points while relying on a simplifying assumption, namely that the price-dividend ratio, $p$ say, is independent of the dividends $D$, and driven by a vector of state variables, $y$, such that:
$$S(y, D) = p(y)\,D \tag{4.34}$$
Such a scale-invariant property of asset prices arises in many economies (see Part II of these lectures for more detailed discussion). For example, it does if (i) the dividends are geometric Brownian motions and (ii) the state variables $y$ do not depend on $D$. In this case, the price in Eq. (4.34) satisfies:
= +
such that by Eq. (4.32),
=R
0
|{z}
cash-ow beta
(4.35)
CF
|{z}
cash-ow lambda
and $\sigma_0 \equiv \mathrm{Vol}\left(\frac{dD}{D}\right)$ (possibly a vector) and $\lambda_{CF}$ denote the cash-flow beta and the unit risk-premium required to compensate for the randomness in the dividend process. Note that Eq. (4.32) is a decomposition: we can always find the appropriate vector $\lambda_{CF}$ such that Eq. (4.32) holds true.
We refer to $\mathcal{R}$ as the risk-adjusted discount rates. They equal the safe interest rate, $r$, plus the premium, $\sigma_0 \cdot \lambda_{CF}$, arising to compensate for the stochastic fluctuations of dividends.
(4.36)
= Vol
|{z}
| {z } price lambdas
price betas
This term does indeed represent a wedge between the expected returns and the risk-adjusted discount rates (see Eq. (4.35)); therefore, it carries the potential to mitigate the equity premium puzzle. Note that Eq. (4.36) describes the most natural channel through which expected returns are inflated: the state variables fluctuate and lead the price-dividend ratio to fluctuate, thereby affecting the realized returns. If these fluctuations require compensation (meaning that their innovations affect the pricing kernel, $\lambda \neq 0$), they affect the expected returns. The price betas are the asset returns' exposures to factors arising through this channel, and the price lambdas are the corresponding unit premiums to bear these risks.
Chapter 7 explains that in addition to the equity premium puzzle, time variation in returns volatility is a pervasive empirical property of asset prices. This property can be rationalized by time variation in risk-adjusted discount rates. Heuristically, note that returns volatility relates to the price betas in Eq. (4.36). In a diffusion setting, the price beta relates to the semi-elasticity of the price-dividend ratio with respect to $y$, $\mathrm{Vol}\left(\frac{dp}{p}\right) = \frac{p'(y)}{p(y)}\mathrm{Vol}(y)$; that is, the critical ingredient of the price beta is the very same price-dividend ratio in Eq. (4.28). Note, also, that the price-dividend ratio can be re-expressed in terms of the risk-adjusted discount rates, as follows:
Z
1 2
R( )
( )=E
(4.37)
= ( 0 2 0 )( )+ 0 ( ( ) )
Consider the Lucas (1978) model with one tree and one perishable good taken as the numeraire, the continuous time version of the model in Chapter 3. We assume that the dividend is a geometric Brownian motion,
$$\frac{dD_\tau}{D_\tau} = \mu\,d\tau + \sigma_0\,dW_\tau \tag{4.38}$$
for two positive constants $\mu$ and $\sigma_0$. We assume no sunspots, and denote the rational pricing function with $S(D)$. By Itô's lemma,
$$\frac{dS}{S} = \mu_S\,d\tau + \sigma_S\,dW, \qquad \mu_S \equiv \frac{\mu D S'(D) + \frac{1}{2}\sigma_0^2 D^2 S''(D)}{S(D)}, \qquad \sigma_S \equiv \frac{\sigma_0 D S'(D)}{S(D)}$$
Below, we shall show that in the absence of arbitrage, there must be some process $\lambda$, the unit risk-premium, such that,
$$\mu_S + \frac{D}{S} = r + \lambda\sigma_S \tag{4.39}$$
Let us assume that the short-term rate, $r$, and the risk-premium, $\lambda$, are both constant. Below, we shall show that such an assumption is compatible with a general equilibrium economy. By the definition of $\mu_S$ and $\sigma_S$, Eq. (4.39) can be written as,
$$0 = \frac{1}{2}\sigma_0^2 D^2 S''(D) + (\mu - \lambda\sigma_0) D S'(D) - r S(D) + D \tag{4.40}$$
Eq. (4.40) is a second order differential equation. Its solution, provided it exists, is the rational price of the asset. To solve Eq. (4.40), we initially assume that the solution, $F$ say, takes the following simple form,
$$F(D) = K \cdot D \tag{4.41}$$
where $K$ is a constant to be determined. Next, we verify that this is indeed one solution to Eq. (4.40). Indeed, if Eq. (4.41) holds, then plugging this guess and its derivatives into Eq. (4.40) leaves $K = (r - \mu + \lambda\sigma_0)^{-1}$ and, hence,
$$F(D) = \frac{D}{r - \mu + \lambda\sigma_0} \tag{4.42}$$
This is a Gordon-type formula. It merely states that prices are risk-adjusted expectations of future expected dividends, where the risk-adjusted discount rate is given by $r + \lambda\sigma_0$. Hence, in a comparative statics sense, stock prices are inversely related to the risk-premium, a quite intuitive conclusion.
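Eq. (4.42) can be thought of as the Feynman-Kac representation of the solution to Eq. (4.40), viz
$$F(D_t) = E_t^{Q}\left[\int_t^\infty e^{-r(\tau - t)} D_\tau\,d\tau\right] \tag{4.43}$$
where $E_t^{Q}[\cdot]$ is the conditional expectation taken under the risk-neutral probability $Q$ (say), under which the dividend process follows,
$$\frac{dD_\tau}{D_\tau} = (\mu - \lambda\sigma_0)\,d\tau + \sigma_0\,d\tilde{W}_\tau$$
and $\tilde{W}_\tau = W_\tau + \lambda\tau$ is a Brownian motion under $Q$. The probability $Q$ is linked to the true probability, $P$, through the Radon-Nikodym derivative,
$$\left.\frac{dQ}{dP}\right|_{\mathcal{F}_T} = \exp\left(-\lambda W_T - \frac{1}{2}\lambda^2 T\right) \tag{4.44}$$
These pricing results relate to the assumption that $r$ and $\lambda$ are both constant. We didn't specify the exact economic conditions under which this is true. It is the reason we refer to the prices predicted by this model as a family of prices. The next section provides more structure, through a restriction on preferences that leads to the pricing results summarized so far.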
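The link between the Gordon-type formula and its Feynman-Kac representation can be checked numerically. The sketch below assumes constant $r$, $\lambda$, dividend growth $\mu$ and dividend volatility $\sigma_0$ (the symbols and values are illustrative assumptions), and compares $D/(r - \mu + \lambda\sigma_0)$ with a trapezoid-rule approximation of the integral of discounted risk-neutral expected dividends over a long, truncated horizon.

```python
import math


def gordon_price(dividend, r, mu, lam, sigma0):
    """Closed-form price D / (r - mu + lam * sigma0)."""
    discount = r - mu + lam * sigma0
    if discount <= 0.0:
        raise ValueError("the risk-adjusted discount rate must exceed growth")
    return dividend / discount


def discounted_expected_dividends(dividend, r, mu, lam, sigma0,
                                  horizon=400.0, n_steps=400_000):
    """Trapezoid rule for the integral of exp(-r t) * E^Q[D_t] over [0, horizon].

    Under the risk-neutral probability, E^Q[D_t] = D * exp((mu - lam*sigma0) t).
    """
    rate = r - (mu - lam * sigma0)
    dt = horizon / n_steps
    total = 0.5 * (1.0 + math.exp(-rate * horizon))
    for k in range(1, n_steps):
        total += math.exp(-rate * k * dt)
    return dividend * total * dt


# Illustrative (assumed) parameters.
closed_form = gordon_price(1.0, 0.03, 0.02, 0.25, 0.10)
numerical = discounted_expected_dividends(1.0, 0.03, 0.02, 0.25, 0.10)
print(f"closed form: {closed_form:.4f}  numerical integral: {numerical:.4f}")
```

A higher unit risk-premium raises the effective discount rate and lowers both numbers, in line with the comparative statics stated in the text.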
4.2.6.2 Equilibrium with CRRA
How precisely do preferences affect asset prices? In Eq. (4.42), the asset price relates to the interest rate, $r$, and the risk-premium, $\lambda$. But in equilibrium, agents' preferences affect $r$ and $\lambda$. However, such an impact can have a non-linear pattern. For example, when the risk-aversion is low, a small change of risk-aversion can make the interest rate and the risk-premium change in the same direction. If the risk-aversion is high, the effects may be different, as the interest rate reflects a variety of factors, including precautionary motives.
To illustrate these features within the simple case of CRRA preferences, consider, first, the dynamics of wealth under the risk-neutral probability, $Q$, such that by Eq. (4.1),
(4.45)
(4.46)
=(
(
)
=E
(4.47)
Note that Eqs. (4.43) and (4.47) imply that in equilibrium, i.e. $c = D$, we also have that $V = S$.
Next, consider a representative agent with instantaneous utility of consumption $u(c)$ and subjective discount rate $\rho$, who solves the following intertemporal optimization problem,
Z
Z
(
)
( )
s.t. =
[P1]
max
where the constraint follows by a change of probability in Eq. (4.47) using Eq. (4.44) and,
accordingly,
1 2
)
= ( + 2 )( ) (
Consider the Lagrangean
Z
L
( )
)=
+ 12
)(
) 0
( )=
. In
(4.48)
( )
=
0( )
00
( )
0( )
1
+
2
2
0
000
( )
0( )
00
( )
0( )
(4.49)
On the other hand, expanding the R.H.S. of Eq. (8A.21) leaves, by Itô's lemma again,
0
0
( )
=(
( )
(4.50)
But drifts and volatilities of Eq. (4.49) and Eq. (4.50) have to be the same, whence
$$r = \rho - \mu\frac{u''(D) D}{u'(D)} - \frac{1}{2}\sigma_0^2\frac{u'''(D) D^2}{u'(D)} \quad \text{and} \quad \lambda = -\sigma_0\frac{u''(D) D}{u'(D)}$$
Assume, for example, that $\lambda$ is constant. After integrating the second of the previous relations, we see that apart from an irrelevant integration constant, $u(D) = \frac{D^{1-\eta}}{1-\eta}$, where $\eta \equiv \lambda/\sigma_0$ is the CRRA. Hence, under CRRA preferences,
$$r = \rho + \eta\mu - \frac{1}{2}\eta(\eta + 1)\sigma_0^2$$
Finally, replacing these expressions for the short-term rate and the risk-premium into Eq. (4.42) leaves,
$$S(D) = \frac{D}{\rho - (1 - \eta)\left(\mu - \frac{1}{2}\eta\sigma_0^2\right)} \tag{4.51}$$
2
= lim
lim
lim
lim
)(
1
2
2
0
1
2
)(
+ 12
(1
)(
)+(
)(
)(
1
2
)(
2)
0
=0
(4.52)
4.2.6.3 Bubbles
constants.
(4.53)
Indeed, plugging Eq. (4.53) into Eq. (4.40) reveals that Eq. (4.53) holds if and only if the following conditions hold true:
$$0 = 1 - K(r - \mu + \lambda\sigma_0) \quad \text{and} \quad 0 = \delta(\mu - \lambda\sigma_0) + \frac{1}{2}\delta(\delta - 1)\sigma_0^2 - r \tag{4.54}$$
The first condition implies that $K$ equals the price-dividend ratio in Eq. (4.42), i.e. $K = F(D)/D$. The second condition leads to a quadratic equation in $\delta$, with the two solutions, $\delta_1 < 0$ and $\delta_2 > 0$. Therefore, the asset price function takes the following form:
$$S(D) = F(D) + A_1 D^{\delta_1} + A_2 D^{\delta_2}$$
lim
, if
lim
( ) = 0 if
( )+B( )
B( )
=0
The component, $F(D)$, is the fundamental value of the asset, as by Eq. (4.43), it is the risk-adjusted present value of the expected dividends. The second component, $B(D)$, is simply the difference between the market value of the asset, $S(D)$, and the fundamental value, $F(D)$. Hence, it is a bubble.
We seek conditions under which Eq. (4.55) satisfies the transversality condition in Eq. (4.46). We have,
$$\lim_{T\to\infty} E_t^{Q}\left[e^{-r(T-t)} S(D_T)\right] = \lim_{T\to\infty} E_t^{Q}\left[e^{-r(T-t)} F(D_T)\right] + \lim_{T\to\infty} E_t^{Q}\left[e^{-r(T-t)} B(D_T)\right]$$
By Eq. (4.52), the fundamental value of the asset satisfies the transversality condition, under the condition that the denominator in Eq. (4.51) is strictly positive. Regarding the bubble we have,
$$\lim_{T\to\infty} E_t^{Q}\left[e^{-r(T-t)} B(D_T)\right] = A_2 \lim_{T\to\infty} E_t^{Q}\left[e^{-r(T-t)} D_T^{\delta_2}\right] = A_2 D_t^{\delta_2} \lim_{T\to\infty} e^{\left(\delta_2(\mu - \lambda\sigma_0) + \frac{1}{2}\delta_2(\delta_2 - 1)\sigma_0^2 - r\right)(T-t)} = A_2 D_t^{\delta_2} \tag{4.56}$$
where the last equality holds as $\delta_2$ satisfies the second condition in Eq. (4.54). Therefore, the bubble cannot satisfy the transversality condition, except in the trivial case in which $A_2 = 0$. In other words, in this economy, the transversality condition in Eq. (4.46) holds if and only if there are no bubbles.
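The argument about the bubble can be illustrated numerically. The sketch below assumes the quadratic condition $0 = \delta(\mu - \lambda\sigma_0) + \frac{1}{2}\delta(\delta - 1)\sigma_0^2 - r$ for the exponent of a power-function bubble, solves for its two roots, and verifies that the discounted risk-neutral expectation of $D_T^{\delta}$ grows at a zero rate at each root, so that it neither vanishes nor explodes as the horizon increases. Names and parameter values are illustrative assumptions.

```python
import math


def bubble_exponents(r, mu, lam, sigma0):
    """Roots of 0.5*sigma0^2*d*(d-1) + (mu - lam*sigma0)*d - r = 0."""
    a = 0.5 * sigma0 ** 2
    b = (mu - lam * sigma0) - a
    c = -r
    root = math.sqrt(b * b - 4.0 * a * c)
    return (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)


def discounted_growth_rate(delta, r, mu, lam, sigma0):
    """Growth rate g in exp(-r T) * E^Q[D_T**delta] = D_0**delta * exp(g T)."""
    return delta * (mu - lam * sigma0) + 0.5 * delta * (delta - 1.0) * sigma0 ** 2 - r


# Illustrative (assumed) parameters.
r, mu, lam, sigma0 = 0.03, 0.02, 0.25, 0.10
delta_1, delta_2 = bubble_exponents(r, mu, lam, sigma0)
rate_1 = discounted_growth_rate(delta_1, r, mu, lam, sigma0)
rate_2 = discounted_growth_rate(delta_2, r, mu, lam, sigma0)
print(f"delta_1={delta_1:.4f}  delta_2={delta_2:.4f}")
print(f"discounted growth rates: {rate_1:.2e}, {rate_2:.2e}")
```

Because the growth rate is zero at the positive root, the discounted expected bubble stays at its current value for every horizon, which is why only a zero coefficient on the bubble term is compatible with the transversality condition.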
4.2.6.4 Reecting barriers and absence of arbitrage
(4.57)
where
is a continuous, non-negative process that increases only when the dividends
hit
. We may think of the rm as operating on an innitesimal intervention scale.
Eq. (4.57) thus generalizes Eq. (4.38) in that it allows for dividends growth to always ensure
dividends cannot decrease below a threshold. How does the price behave in this context? By
Itos lemma, and Eq. (4.57),
=
)
(
+
)
1
2
2
0
1
+
2
2
0
00
2
(
00
)
(
+
(4.58)
We claim that to ensure absence of arbitrage, the following smooth pasting condition must hold,
$$S'(\underline{D}) = 0 \tag{4.59}$$
Indeed, after hitting the barrier, $\underline{D}$, the dividend is reflected back for the part exceeding $\underline{D}$. Since the reflection takes place with probability one, the asset is locally riskless at the barrier $\underline{D}$. Therefore, absence of arbitrage requires that the price moves at $\underline{D}$ only by its predictable drift component in Eq. (4.58), which it does when $S'(D) = 0$ for $D = \underline{D}$. Note then that the last component in Eq. (4.58) equals zero for all $\tau$. By standard arguments (i.e. Eq. (4.39)), we then have that:
0
( )
0
( )
+
=
( )
If = then, by Eq. (4.59),
=
+
This relation tells us that holding the asset during the reflection guarantees a total return equal to the short-term rate. Once again, during the reflection, the asset is locally riskless and, hence, arbitrage is ruled out when holding the asset makes us earn no more than the safe interest rate, $r$. Indeed, by the previous relation, and using $S'(\underline{D}) = 0$, we have that the wealth in Eq. (4.1) satisfies,
=
+
+
+
( )
=(
)
This example illustrates how the relation in Eq. (4.39) works to preclude arbitrage opportunities.
To solve the model, note that while the dividends are above the barrier, $D > \underline{D}$, the price is still as in Eq. (4.53),
$$S(D) = \frac{D}{r - \mu + \lambda\sigma_0} + A_1 D^{\delta_1} + A_2 D^{\delta_2}$$
As in the previous section, we need to set $A_2 = 0$ to satisfy the transversality condition in Eq. (4.46) (see Eq. (4.56)). However, we now determine $A_1$ to pin down the price at the barrier $\underline{D}$, rather than set it equal to zero as in the previous section.
We have,
F ( )/ , and
0=
)=
( )=
( )=
where the second condition is the value matching condition, which needs to be imposed to ensure continuity of the pricing function with respect to $D$ and, hence, absence of arbitrage. The previous system can be solved to yield
=
and
(4.60)
2
= ( )=
( )
1 F
, we dene a Radon-
R
E
( )
( )=
=
( )
( )
where $P(t, T)$ is the time $t$ price of a zero coupon bond expiring at $T$.
The next section develops the leading example of change of probabilities, the consumption-based probabilities, as a benchmark. Section 4.4.2 reviews general notions, and examples, of pricing, based on the numeraires involved in the definition of a given asset pricing problem.
4.3.1 Leading example: consumption-based probabilities
Consider a basic consumption-based model, where an investor can invest at two dates only, at $t$ and at $T$, without any intermediate consumption, such that the price of an asset satisfies,
( ) 0
( )
(
)
=
=
E ( )
0( )
with the usual notation. In this model, we dene the risk-neutral probability
Radon-Nikodym derivative of against ,7
(
)
0
[ 0 ( )] ( )
( )
=
0( )
[ 0 ( )]
F
|
{z
}
through the
=1
7 This derivation relies on the assumption that the short-term rate is constant. Section 7.5.1 in Chapter 7 contains a derivation
of the more general case of stochastic interest rates in a Markov setting.
Assuming decreasing marginal utility as usual, we have that the risk-neutral probability distorts
the physical, by assigning more weight to the bad states of nature, i.e. those where consumption
is low, as explained in Chapter 2.
We derive implications for a continuous-time model where a representative agent has CRRA
equal to . In equilibrium,
= , such that if consumption is a geometric Brownian motion
with parameters (
), we have that the density process,
1 2
( +1) (
)
2
=
( )=
such that,
( )
=
( )
where
+(
{z
=
To reconcile prices in the risk-neutral world with prices in the true, we need the average path
of the stock price to be lower under the risk-neutral probability than under the physical. Since
2
the stock is traded, we also have that
= , as indicatedthe risk-premium is somehow
hidden in this example with complete markets. When markets are incomplete, risk-premiums
terms would, instead, show up, as in the Gamma process example of Heston (1993,b).
[Develop Gamma processes here.]
4.3.2 Numeraire pricing
4.3.2.1 Denition
The formulation of the fundamental theorems of asset pricing (FTAP) in the previous chapters relies on the risk-neutral probability, that is, the probability under which the asset prices discounted by the value of the money market account are martingales, to prevent arbitrage. That is, the money market account is the numeraire in this no-arbitrage context. We can formulate equivalent versions of this theorem, utilizing different numeraires, and different equivalent probabilities. Consider the following definition, which we shall state assuming no dividends for simplicity.
Definition 4.1 (Numeraires and equivalent probabilities). A numeraire is any asset with a price process $N_t > 0$, say. Given a numeraire, a probability $Q^N$ is an equivalent probability (or measure) if any asset price process $S_t$ normalized by $N$ satisfies,
$$\frac{S_t}{N_t} = E_t^{Q^N}\left(\frac{S_T}{N_T}\right)$$
A critical task in this and previous chapters is the formulation of the FTAP: markets are arbitrage-free if and only if there exists an equivalent martingale probability. We now provide examples of existence of numeraires and associated probabilities $Q^N$.
The risk-neutral probability corresponds to the money market account numeraire, as mentioned,
with value process given by,
R
N
Suppose, for example, that the short-term rate is constant. In terms of the consumption probabilities of Section 4.1, we have that,
N =
=E
(4.61)
N
N
Alternatively, and still assuming interest rates are constant, we may express Eq. (4.61) as,
(
)
N =
=E
= ( )
N
N
!
R
=E
where
against
( )E
( )
( )E
(4.62)
N
=
(4.63)
( )
F
Accordingly, in this context of random interest rates, we can define the zero coupon bond as a numeraire, such that, by Eq. (4.62),
N
N = ( )
=E
(4.64)
N
N
where the equality follows by the boundary condition for the price of a zero coupon bond,
N = 1.
In Chapter 12, we shall make reference to this probability to price interest rate derivatives without risk of default. The price of these derivatives is,
$$V_t = E_t^{Q}\left[e^{-\int_t^T r_s\,ds}\,\Pi\right] \tag{4.65}$$
where $\Pi(\cdot)$ is the payoff, possibly a function(al) of the entire path of the short-term rate over the life of the derivative. Note the main difficulty in Eq. (4.65). The discounting factor $e^{-\int_t^T r_s\,ds}$ obviously correlates with the payoff, and complicates the evaluation of this derivative. However, we can use the forward probability to express $V_t$, as follows,
$$V_t = P(t, T)\,E_t^{Q^N}\{\Pi\} \tag{4.66}$$
As the Radon-Nikodym derivative defined in Eq. (4.63) makes clear, the new probability, $Q^N$, distorts $Q$, in that it assigns a higher weight to events where interest rates paths are lower. The drift of $r$ under $Q^N$ is indeed lower, as formally shown in Appendix 3 of Chapter 12 (see footnote 8). In other words, upon using Eq. (4.66) instead of Eq. (4.65), we get rid of the randomness of the discounting factor $e^{-\int_t^T r_s\,ds}$, by multiplying the expected payoff by $P(t, T)$. At the same time, this expected payoff is calculated in a world ruled by a probability $Q^N$, where interest rates are on average lower than under $Q$, which compensates for the fact that the very same expected payoff receives a haircut of $P(t, T) \le 1$.
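The change from the risk-neutral to the forward probability can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that the short-term rate follows a Vasicek-type process under the risk-neutral probability, and that the derivative pays a hypothetical caplet-like amount depending on the terminal rate. It reweights each simulated path by the discount factor divided by the bond price, a sample analogue of the Radon-Nikodym derivative described in the text, and checks two facts: the two pricing formulas agree, and the average terminal rate is lower under the forward probability.

```python
import math
import random


def simulate_short_rate(r0, kappa, theta, sigma, maturity, n_steps, n_paths, seed=0):
    """Euler scheme for dr = kappa*(theta - r) dt + sigma dW (an assumed model).

    Returns a list of (discount factor, terminal rate) pairs, one per path.
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    paths = []
    for _ in range(n_paths):
        r, integral = r0, 0.0
        for _ in range(n_steps):
            integral += r * dt
            r += kappa * (theta - r) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        paths.append((math.exp(-integral), r))
    return paths


paths = simulate_short_rate(0.03, 0.5, 0.04, 0.02, 5.0, 50, 5000)
n = len(paths)
discounts = [d for d, _ in paths]
terminal_rates = [r_T for _, r_T in paths]
payoffs = [max(r_T - 0.03, 0.0) for r_T in terminal_rates]  # hypothetical payoff

# Risk-neutral pricing: average of discount factor times payoff.
bond_price = sum(discounts) / n
price_risk_neutral = sum(d * p for d, p in zip(discounts, payoffs)) / n

# Forward pricing: bond price times the reweighted average payoff.
weights = [d / bond_price for d in discounts]  # sample Radon-Nikodym weights
price_forward = bond_price * sum(w * p for w, p in zip(weights, payoffs)) / n

mean_rate_risk_neutral = sum(terminal_rates) / n
mean_rate_forward = sum(w * r_T for w, r_T in zip(weights, terminal_rates)) / n
print(bond_price, price_risk_neutral, price_forward)
print(mean_rate_risk_neutral, mean_rate_forward)
```

In this sample version the two prices coincide by construction; the practical advantage of the forward probability arises when the distribution of the payoff under that probability is known in closed form, so that the path-dependent discount factor need not be simulated.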
Note that the calculations underlying Eq. (4.64) can be used to check the internal consistency of the definition of the new numeraire and equivalent probabilities with absence of arbitrage. There are no arbitrage opportunities if there exists a probability $Q^N$ with Radon-Nikodym derivative against $Q$ equal to:
N
=N N
(4.67)
N N
F
N
=E
=E
N N
N N N
N
=N N
N N
F
8 Note that this statement certainly holds when the short-term rate is a Markov process. In more complex models such as those
with stochastic volatility, the bond price sensitivity to movements in the state variables may switch sign and render this statement
model-dependent.
such that, we can price all contracts in this market.9 First, the value at of a forward starting
interest rate swap, i.e. a swap agreed at time , say , is given by:
=E
=E
max {
N
0}
=E
, where:
)+
is
)
N
=E
( ), for
,
Accordingly, we can dene the forward swap rate as the process
which is obviously a martingale under N . These properties are extremely useful, when it
comes to pricing these important derivatives.
Defaultable probability
Finally, we consider the so-called survival-contingent probability, studied in Chapter 13, where the numeraire, $N$, is a defaultable annuity of one dollar, paid off over a certain tenor, which is interpreted as the period over which two counterparties exchange credit risk. The option to enter into this deal is called a credit default swaption, and the payoff is as in Eq. (4.68), with $N_T(\mathrm{cds}_T - K)^+$, where $\mathrm{cds}_T$ is the credit default risk premium as of time $T$, and $K$ is a constant.
4.3.2.3 Martingales and numeraires
The examples in the previous section can be gathered to provide a general insight. Consider a
forward starting agreement, originated at time , with payo equal to,
N (
is measurable at time
Suppose we are long an amount of a forward starting agreement at time and strike
such
that the value of this agreement at right after the trade is +
N (
) and the value
9 Note that under this annuity probability, the events that matter most, are those where interest rates paths are low, similarly
as with the forward probability in the previous section.
+1
=1
X
=1
N +1 (
+1
[N (
that clears
) + (N +1
+1
=1
N )(
)]
+1
By chopping the trading interval in small pieces we have that under standard regularity conditions,
Z
Z
0
= 1
(4.69)
R . We assume
= >(
1 )+
+ >
(4.70)
where 1 is a -dimensional vector of ones,
( 1
)> ,
>
( 1 + 11 +
) . The solution to the previous equation is
Z
Z
Z
>
>
(
1
)
=
+
+
0
= 1
(4.71)
to be strictly positive.
4.4.2 Viability
R
Let = 0 +
= 1
, where = 10
and
=
. Let us generalize the definition of the risk-neutral probability in Eq. (4.44), and introduce the set $\mathcal{Q}$ of risk-neutral, or equivalent martingale, probabilities, i.e. the set of probabilities $Q$ equivalent to $P$ under which the discounted price process is a $Q$-martingale. We show the equivalent of Theorem 2.8 in Chapter 2: $\mathcal{Q}$ is not empty if and only if the market is arbitrage-free. We rely on Girsanov's theorem of Section 4.3.3. Given an $\mathcal{F}$-adapted process $\lambda$, define,
Z
+
[ ]
(4.72)
0 =
is a standard Brownian motion under a probability , equivalent to
derivative equal to,
Z
Z
1
2
>
0
= exp
k k
2
F
such that
is a martingale under
. Under
=
If is a
+ =
(4.73)
((
= , or in vector notation,
(4.74)
= (
such that,
, with Radon-Nikodym
],
(4.75)
0
0
1
1
and
Pr
0
0.
0
0
is an arbitrage opportunity if
Theorem 4.3. There are no arbitrage opportunities if and only if Q is not empty.
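A proof of this theorem is in the Appendix. The "if" part follows easily, by Eq. (4.75). The "only if" part is more elaborate, but its basic structure can be understood as follows. By Girsanov's theorem, the statement "absence of arbitrage opportunities implies that $\mathcal{Q}$ is not empty" is equivalent to "absence of arbitrage opportunities implies that there exists a $\lambda$ satisfying Eq. (4.74)". If Eq. (4.74) didn't hold, we could implement an arbitrage, as follows. We could find a nonzero $\pi$ such that $\pi^\top\sigma = 0$ and $\pi^\top(b - \mathbf{1}_m r) \neq 0$. Then, we could use $\pi$ when $\pi^\top(b - \mathbf{1}_m r) > 0$ and $-\pi$ when $\pi^\top(b - \mathbf{1}_m r) < 0$, thereby obtaining an appreciation rate of $V$ larger than $r$ in spite of having zeroed uncertainty through $\pi^\top\sigma = 0$. If Eq. (4.74) holds, this arbitrage opportunity would never occur, as in this case for each $\pi$, $\pi^\top(b - \mathbf{1}_m r) = \pi^\top\sigma\lambda$. More precisely, define
$$\langle\sigma^\top\rangle^\perp \equiv \{x : \sigma^\top x = 0\} \quad \text{and} \quad \langle\sigma\rangle \equiv \{z : z = \sigma u \text{, for some } u\}$$
Then, we may formalize the previous reasoning as follows. The excess return vector, $b - \mathbf{1}_m r$, must be orthogonal to all vectors in $\langle\sigma^\top\rangle^\perp$, and since $\langle\sigma\rangle$ and $\langle\sigma^\top\rangle^\perp$ are orthogonal, $b - \mathbf{1}_m r \in \langle\sigma\rangle$, or there exists a $\lambda$ such that $b - \mathbf{1}_m r = \sigma\lambda$ (see footnote 10).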
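The orthogonality argument can be made concrete in a small example with three risky assets and two Brownian motions. The sketch below, in which all names and numbers are illustrative assumptions, builds a portfolio whose diffusion exposure is zero (the cross product of the two columns of the volatility matrix), and then checks whether that locally riskless portfolio earns an excess return. A nonzero excess return signals an arbitrage; a zero one is what the existence of a vector of unit risk-premiums implies.

```python
def cross_product(u, v):
    """Cross product of two 3-dimensional vectors."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def riskless_portfolio(sigma):
    """A nonzero pi with pi' sigma = 0, for a 3 x 2 volatility matrix sigma."""
    column_1 = [row[0] for row in sigma]
    column_2 = [row[1] for row in sigma]
    return cross_product(column_1, column_2)


def excess_return(pi, drifts, short_rate):
    """pi' (b - 1 r): drift of the portfolio in excess of the short-term rate."""
    return sum(p * (b - short_rate) for p, b in zip(pi, drifts))


# Illustrative (assumed) market: 3 assets, 2 sources of risk.
sigma = [[0.2, 0.0], [0.0, 0.3], [0.1, 0.1]]
short_rate = 0.02
lambdas = [0.4, 0.2]

# Drifts consistent with b - 1 r = sigma * lambda, hence no arbitrage.
consistent_drifts = [
    short_rate + sum(s * l for s, l in zip(row, lambdas)) for row in sigma
]
# Drifts violating the restriction for the third asset.
mispriced_drifts = consistent_drifts[:2] + [consistent_drifts[2] + 0.02]

pi = riskless_portfolio(sigma)
exposure = [sum(p * row[j] for p, row in zip(pi, sigma)) for j in range(2)]
print("diffusion exposure:", exposure)
print("excess return, consistent drifts:", excess_return(pi, consistent_drifts, short_rate))
print("excess return, mispriced drifts:", excess_return(pi, mispriced_drifts, short_rate))
```

With the mispriced drifts, the sign of the excess return tells us whether to hold the portfolio or its opposite, exactly as in the heuristic argument of the text.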
Definition 4.4 (Market completeness). Markets are dynamically complete if for each random variable $Y \in L^2(\Omega, \mathcal{F}_T, P)$, we can find a portfolio process $\pi$ such that the terminal wealth it generates satisfies $V_T = Y$ a.s.
The previous definition is the natural continuous-time counterpart to that we gave in the discrete-time case (see Chapter 2). Consistently with the conclusions in Chapter 2, we shall prove that in continuous time, markets are dynamically complete if and only if (i) $m = d$ and (ii) the price volatility matrix of the available assets (primitives and derivatives) is nonsingular. We shall provide a sketch of the proof for the sufficiency part of this statement (see, e.g., Karatzas (1997 pp. 8-9) for the converse), which relates to the existence of fully spanning dynamic strategies. So given a $Y \in L^2(\Omega, \mathcal{F}_T, P)$, let $m = d$ and suppose the volatility matrix $\sigma$ is nonsingular. Let us consider the $Q$-martingale:
1
F
(4.76)
0
>
=0
h i
>
>
=0
>
=0
>i
=0
) such that
>
>
0
0
and so, by identifying, the portfolio we are looking for is > = 0 > 1 . Set, then, =
.
0
1
1
0
Then,
= 0
, and in particular,
= 0
a.s. By comparing with Eq. (4.76),
0
= .
Armed with this result, we can now easily state:
Theorem 4.5. Q is a singleton if and only if markets are complete.
Proof. There exists a unique
Girsanov's theorem. ∥
2
0
):
=0
>
>
a.s.
Under the usual regularity conditions, can be interpreted as the process of unit risk-premia.
In fact, all processes belonging to the set:
o
n
Z=
: = +
h i
are bounded and, hence, can be interpreted as unit risk-premia processes. More precisely, define the Radon-Nikodym derivative of $Q$ with respect to $P$ on $\mathcal{F}_T$:
Z 2
Z
>
1
= exp
2
0
0
F
on (
Z
1
= exp
k k2
2 0
F),
>
[0
])
a strictly positive $P$-martingale. We have the following result, which follows, for example, from He and Pearson (1991, Proposition 1 p. 271) or Shreve (1991, Lemma 3.4 p. 429):
Proposition 4.6.
( )=
(1
),
F .
( ) = max
( )
( )+
( )
>
)+
1
+
2
00
( )
>
( )
>
(4.77)
( )=
( )
and
( )
00 ( )
0
>
(4.78)
To solve for the value function, replace Eqs. (4.78) into Eq. (4.77), which leaves an ordinary differential equation that needs to be solved by the value function. While an analytical solution for the value function is in general unavailable, it is easy to check that in the CRRA case, $u(c) = \frac{c^{1-\eta}}{1-\eta}$, we have that
1
( )=
, for two constants
,12 such that
1
=
>
)> (
>) 1(
).
Z
(
(
) = max
)+
( )
s.t. Eq. (4.71) holds.
[4.P1]
(
This optimization problem can be solved relying on the notion of Arrow-Debreu state prices, similarly as in Chapter 2. The first task is to derive a budget constraint that parallels that in Chapter 2, and applied to a two period economy. For reference, in Chapter 2, we explained that in the presence of complete markets, the budget constraint can be written compactly as:
1
0= 0
1
(4.79)
0+
(1 + )
= 0
=
(4.80)
As in the finite state case, we want to evaluate the present value of future consumption and the current. For a given state of the world, the value of future and present consumption is,
Z
( )
( )
( )+
( ) ( )
Z
+
=
(4.81)
where the first equality follows by Eq. (4.80), and the second by the budget constraint.
Note that the original problem, one with an infinite number of trajectory constraints, is now reduced to one with a single constraint, just as in the two period market model. In the Appendix, we also show that Eq. (4.81) is equivalent to one in which the expectation is taken under the risk-neutral probability, as follows:
Z
Z
=E
+
=
+
(4.82)
0
Eq. (4.82) is the finite horizon counterpart of Eq. (4.47) of Section 4.3.4.2, derived through a different route: by rewriting the constraint under the risk-neutral probability and integrating out. The approach in this section provides the economic intuition underlying that change of probability.
4.5.2.2 The optimization problem
) = max
(
)+
max
( (
)
) + ( )
+
(
where
Z
1
2
= exp
+ k k
2
>
(4.83)
(4.84)
)=
, and
)=
To determine the portfolio-consumption policy, note that for a generic F -measurable random
variable , the martingale,
Z
Z
1
1
E
=
+
0
0 +
By the predictable representation theorem, there exists a vector
Z
>
=
+
such that.
>
where
>
(4.85)
and
= .
4.5.2.3 Marginal utility of income
The Lagrange multiplier in Eq. (4.84) carries the interpretation of marginal utility of income, similarly as for some derivations in Chapter 2. We provide a proof of this quite intuitive envelope property by relying on an infinite horizon framework and instantaneous utility at $\tau$ equal to $e^{-\rho\tau}u(c_\tau)$, as this is a framework often utilized in Part II of the Lectures. We, then, consider the following program, a special case of program [4.P1],
Z
Z
max 0
( )
s.t. 0
= 0
1
[4.P2]
( 0)
0
( )
where
( ) =
(0 ) =
(4.86)
as usual. Furthermore, by assuming enough regularity conditions, we can differentiate the intertemporal utility in [4.P2],
Z
Z
0
0
( 0) = 0
( )
=
=
(4.87)
0
0
where the second equality holds by the first of the optimality conditions (4.86), and the third
by differentiating L.H.S. and R.H.S. of the budget constraint in [4.P2] with respect to 0 , as
originally claimed.
Next, we determine , thereby pinning down 0 in the second of the optimality conditions
(4.86). Rather than performing this task within a general framework, we assume the agent has
1
1
CRRA equal to , such that the optimality conditions (4.86) become =
(
) . By
replacing optimal consumption into the budget constraint of [4.P2], and rearranging terms,
Z
1
1
(4.88)
= 0
0
0
1
0
(4.89)
where the second equality follows by the expression of in Eq. (4.88). Assume, for instance,
log-utility, = 1, in which case 0 = 0 as in some of the models in Chapter 3.
This example shows what we would have to expect in general equilibrium. In equilibrium,
optimal consumption equals the dividends of the trees. Assuming one tree to illustrate, we have
that the initial endowment has to satisfy the following constraint,
= const
+1
, such that optimal consumption is = +1 1 ,
solution for the Lagrange multiplier is: =
and = +1 1 . As regards the portfolio process, one has that:
=
which shows that
(4.85),
We determine
= 0 into
=
+
+1
+1
and, hence,
=
+1
(
+1
>
(
+1
(
+1
+1
)
. The solution is:
1
1
4.5.2.5 Equilibrium
= 0,
for
(4.90)
=1
We now derive equilibrium allocations and Arrow-Debreu state price densities. First, note
that the dividend process, , satisfies:
=
+
=1
and
ln
=1
) = ln
= ln
1
+ k k2
2
>
(4.91)
where the first equality holds in equilibrium, the second equality follows by the first order
conditions in (4.84), and the third equality is true by the definition given in Eq. (4.83).
Finally, by Itô's lemma, ln (
) is solution to:
2 !!
1
ln =
+
+
+ 20 2
(4.92)
0
2
By identifying drifts and diffusion terms in Eqs. (4.91)-(4.92), we obtain, after a few simplifications, the expression for the equilibrium short term rate and the prices of risk:
(
)
(
) 1 2 2
(
)
=
+
+ 0
(
)
(
)
2
(
)
(
)
>
=
0
(
)
For example, consider the CRRA utility function, if ( ) = ( ) (
= 1. Then,
1
= +
= 0
( + 1) 20
2
Appendix 2 performs Walras's consistency tests regarding Eq. (4.90).
1) (1
), and
0
0
where the second equality holds by the same arguments leading to Eq. (4.82). Replacing the first
order condition in (4.84), and the equilibrium conditions in Eq. (??), we obtain the continuous
time version of the Consumption-CAPM,
0
Z
0
( )
( )
=
+
= 0 1
0(
0(
)
)
For example, the price of a pure discount bond, is,
0
( )
=
0(
)
where
is as in Eq. (4.83).
The value of the first gamble is less than that of the second, 0 1 . We may interpret the
second gamble as an American option, which we can exercise at any time before its expiration,
say after having seen any of the first two draws. The first gamble is, instead, interpreted as a
European option, with a payoff that only links to the third draw. The difference between the
values of the two gambles is a premium we need to pay for the optionality to exercise early,
such that in general, the value of the gamble after $n$ draws, when the last draw is $x$, is:
$$V_n(x) = \max\{x, E(V_{n+1})\}, \quad x \in \{1, \dots, 6\},$$
where we consider the unconditional expectation $E(\cdot)$ due to the fact that the draws of the dice
are independent and identically distributed.
Note that the exercise boundaries are $x \ge 5$ after the first draw and $x \ge 4$ after the second,
meaning that the region where it is optimal to stop widens as the game goes on: the
value to wait obviously decreases as the expiration approaches. Moreover, these boundaries
are obviously endogenous, in that we calculate them as a part of the decision process. Finally,
the boundaries are time-varying in this free-boundary problem, although in the next section,
we shall consider infinite horizon decision problems where the free-boundary to be selected is
constant.
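The backward induction behind these exercise boundaries can be sketched in a few lines. The snippet is only an illustration, assuming a fair six-sided die, three draws and a risk-neutral bettor; the function name `dice_stopping` and its arguments are hypothetical and do not come from the original notes.

```python
from fractions import Fraction

def dice_stopping(n_draws=3):
    """Backward induction for a gamble allowing up to n_draws rolls of a fair die.

    Returns the value of the gamble before the first roll and, for each draw
    before the last, the list of outcomes at which it is optimal to stop.
    """
    faces = range(1, 7)
    # Before the final draw the bettor must accept whatever comes up.
    continuation = sum(Fraction(x) for x in faces) / 6
    stop_sets = {}
    for draw in range(n_draws - 1, 0, -1):
        # Stop whenever the outcome is at least the value of drawing again.
        stop_sets[draw] = [x for x in faces if x >= continuation]
        # Value of the gamble one step earlier: the best of stopping or continuing.
        continuation = sum(max(Fraction(x), continuation) for x in faces) / 6
    return continuation, stop_sets

value, stop_sets = dice_stopping()
print("value with early stopping:", value)
print("stop after first draw at:", stop_sets[1])
print("stop after second draw at:", stop_sets[2])
```

Exact fractions are used so that the comparison between the outcome and the continuation value is not affected by rounding.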
4.7.2 Gambles and securities again
Chapter 2 explains that a fundamental distinction exists between gambles and securities. Securities
are traded while gambles are not, such that the value of gambles depends on supply and demand.
For example, the evaluation in the previous section relies on the assumption that the bettor
is risk-neutral. Alternatively, we could consider a risk-averse bettor, for example, one with
logarithmic utility and, hence, a value function at time $n$ and state $x$ equal to,
$$V_n(x) = \max\{\ln x, E(V_{n+1})\}.$$
The value function to be expected from a third draw is $E(V_3) = \frac{1}{6}\sum_{x=1}^{6} \ln x = 1.0965$, and because
$\ln 2 < E(V_3)$ and $\ln 3 > E(V_3)$, the decision after the second draw is to draw again only when the draw
delivers particularly poor outcomes, $x = 1$ or $2$. Likewise,
After the first draw:
draw       1           2           3           4      5      6
decision   draw again  draw again  draw again  stop   stop   stop

After the second draw:
draw       1           2           3      4      5      6
decision   draw again  draw again  stop   stop   stop   stop
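The decisions of the log-utility bettor can be reproduced with the same backward induction, applied to utilities rather than payoffs. This is a sketch under the assumptions of a fair die and three draws; `log_utility_policy` is a hypothetical helper name introduced for illustration.

```python
import math

def log_utility_policy(n_draws=3):
    """Stop/draw-again decisions for a bettor with logarithmic utility."""
    faces = range(1, 7)
    # Expected utility of the last draw, which must be accepted.
    continuation = sum(math.log(x) for x in faces) / 6
    last_draw_utility = continuation
    policy = {}
    for draw in range(n_draws - 1, 0, -1):
        policy[draw] = {
            x: "stop" if math.log(x) >= continuation else "draw again"
            for x in faces
        }
        # Expected utility one draw earlier, given optimal behaviour afterwards.
        continuation = sum(max(math.log(x), continuation) for x in faces) / 6
    return policy, last_draw_utility

policy, last_draw_utility = log_utility_policy()
for draw in (1, 2):
    print("after draw", draw, policy[draw])
```

Compared with the risk-neutral case, the risk-averse bettor stops earlier: at 4 rather than 5 after the first draw, and at 3 rather than 4 after the second.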
As explained, the reason the value of a gamble like this depends so heavily on factors such as risk-aversion is
that the asset we consider (throwing a die) is not traded and, as such, does not satisfy the
martingale restriction. The next section develops an evaluation framework for a traded asset
with a payoff relating to another traded asset, such that standard risk-neutral pricing applies.
4.7.3 Real options theory
We build upon the intuition developed through the example of the previous section, and derive
a continuous-time approximation to the solution of the resulting problem, by hinging upon a
series of heuristic arguments. We consider an American option, i.e. an option that we can
exercise at any time before the expiry date, . Once the option is exercised, it yields a payoff
equal to a function of the underlying asset price, say ( ). Let C be the price of an American
option as of time . In discrete time, we have:
C = max
( )
E [C + ]
We assume that the nature of the option, summarized by the payoff
( ), is such that there
are two regions, a stopping region and a continuation region, defined as follows:
E [C + ]
(4.93)
The expected return on the option under the risk-neutral probability is less than that on
a bank deposit, which further clarifies why it is optimal to exercise early. Naturally, the
fact that the option is yielding less than the safe interest rate is not an arbitrage. We could
simply not short the derivative, as no one else is willing to buy it, as it is not optimal to
do so.
(ii) Continuation region, where time-to-maturity and the price of the asset underlying the option are such that it is optimal to wait, C = max
( )
E [C + ] =
E [C + ],
or
1
E [C + ] C
C
0=
(4.94)
The expected return on the option under the risk-neutral probability is the same as that
on a bank deposit.
Note that the existence of these two regions is not guaranteed. For example, we shall see
that it is never optimal to exercise early American calls written on assets that do not distribute
dividends. When the two regions are, instead, well-defined, they define an exercise envelope, a
function of the asset price underlying the option and time-to-maturity. It is a free boundary
problem: we need to find a boundary that triggers some action, in this case, exercising the
option, and the boundary is free in that it is not given in advance as in the case of, say, the
barrier options of the following section.
This problem can be quite complex, but sometimes, simplifies for those derivatives with an
infinite expiry date, . This simplification arises as in this case, the option price and, hence,
the envelope, only depends on the underlying asset price, $S$. Under this assumption, and the
assumption that the price of the asset underlying the option is a geometric Brownian motion
with volatility parameter $\sigma$, we have that the option price satisfies, in the limit $\Delta t \to 0$:
Stopping region: $L[C] - rC \le 0$ and $C = \psi(S)$   (4.95)
Continuation region: $L[C] - rC = 0$   (4.96)
where $L[C] = \frac{1}{2}\sigma^2 S^2 C'' + rSC'$, $r$ is the short-term rate and $\psi$ is the payoff function. Eqs. (4.95)-(4.96) are the continuous time counterparts to Eqs.
(4.93)-(4.94). To these two equations, we need to add a number of conditions, discussed in the
two examples in the subsections below.
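The discrete-time recursion, in which the option is worth the larger of the exercise payoff and the discounted risk-neutral expectation of next period's value, can be illustrated on a binomial tree. The sketch below uses a Cox-Ross-Rubinstein tree for a finite-maturity put. The tree, the parameter values and the function name `binomial_put` are assumptions made for illustration and are not part of the original derivation.

```python
import math

def binomial_put(S0, K, r, sigma, T, n, american=True):
    """Price a put on a Cox-Ross-Rubinstein tree with n steps."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))       # up factor
    d = 1.0 / u                               # down factor
    q = (math.exp(r * dt) - d) / (u - d)      # risk-neutral up probability
    disc = math.exp(-r * dt)
    # Payoffs at expiry; node j has experienced j up moves.
    values = [max(K - S0 * u**j * d**(n - j), 0.0) for j in range(n + 1)]
    for step in range(n - 1, -1, -1):
        for j in range(step + 1):
            continuation = disc * (q * values[j + 1] + (1.0 - q) * values[j])
            if american:
                exercise = max(K - S0 * u**j * d**(step - j), 0.0)
                # Stopping region: exercise >= continuation; else wait.
                values[j] = max(exercise, continuation)
            else:
                values[j] = continuation
    return values[0]

american = binomial_put(100.0, 100.0, 0.05, 0.2, 1.0, 200)
european = binomial_put(100.0, 100.0, 0.05, 0.2, 1.0, 200, american=False)
print("American put:", american, "European put:", european)
```

The difference between the two prices is the early exercise premium. For a deep in-the-money put the recursion returns the intrinsic value, which signals that the starting node lies in the stopping region.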
We can view the evaluation of this asset from an equivalent angle, one cast in terms
of an optimal stopping time problem,
C = sup
( )
(4.97)
0
That is, we would like to find the time at which it is best to undertake an action, in this case,
exercising the option. This time links to the free-exercise boundary , say, as follows:
= inf { :
0
Stopping region (
Continuation region (
(4.98)
(4.99)
where is the strike price of the option. Eq. (4.98) is a value-matching condition. It ensures
that the pricing function is continuous as we move from the continuation region towards the
stopping region.
Second, we require the following boundary condition:
lim
( )=0
(4.100)
That is, as the asset price gets large, the value of the put option needs to approach zero, as the
probability the derivative is ever exercised becomes negligible.
Finally, the pricing function, ( ), satisfies the following smooth-pasting condition, obtained after taking the derivative in Eq. (4.98),
0
( )=
(4.101)
We conjecture that in the continuation region, the pricing function that solves Eq. (4.99) has
the form ( ) =
, for two constants and . Plugging this guess into Eq. (4.99) reveals
that actually, the pricing function satisfying it, has the following form:
( )=
(4.102)
where + and
are two constants, to be pinned down, + = 1 and
= 22 . To satisfy
the boundary condition in Eq. (4.100), we need that + = 0, which leaves ( ) =
.
Evaluating this function at , as in Eq. (4.98), and using the smooth pasting condition in Eq.
(4.101), yields:
( )=
=
(4.103)
1
0
( )=
= 1
The endogenous variables of this system are the two constants
=
and
=(
1
2
and
. We have:
(4.104)
, such that
( )=(
A few comments are in order. First, Eq. (4.104) shows that the value to wait increases with
. Second, when the short-term rate is zero,
= 0, meaning it is never optimal to exercise,
and the option is worthless. Intuitively, in the stopping region, the expected return on the
option under the risk-neutral probability is less than that on a bank deposit. When = 0, this
expected return is negative, which destroys the time-value of money argument underpinning
early exercise.
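The perpetual put solution can be checked numerically. The sketch below assumes the standard setting of this section, that is, a geometric Brownian motion with volatility `sigma`, a constant short-term rate `r` and no payouts, and it writes the exponent of the decaying solution as `beta = -2r/sigma^2`. The symbol names and the function `perpetual_put` are chosen for illustration and may differ from the notation of the original equations.

```python
def perpetual_put(S, K, r, sigma):
    """Value and free boundary of a perpetual American put under GBM."""
    beta = -2.0 * r / sigma**2            # negative root, decays as S grows
    S_star = beta * K / (beta - 1.0)      # free boundary: 2rK / (2r + sigma^2)
    if S <= S_star:
        return K - S, S_star              # stopping region: exercise now
    # Continuation region: value matching pins down the constant.
    return (K - S_star) * (S / S_star) ** beta, S_star

K, r, sigma = 100.0, 0.05, 0.2
value, S_star = perpetual_put(100.0, K, r, sigma)
print("free boundary:", S_star, "put value at S = 100:", value)

# Finite-difference check of 0.5*sigma^2*S^2*C'' + r*S*C' - r*C = 0 at S = 100.
h = 1e-2
C = lambda s: perpetual_put(s, K, r, sigma)[0]
C1 = (C(100.0 + h) - C(100.0 - h)) / (2.0 * h)
C2 = (C(100.0 + h) - 2.0 * C(100.0) + C(100.0 - h)) / h**2
print("ODE residual:", 0.5 * sigma**2 * 100.0**2 * C2 + r * 100.0 * C1 - r * C(100.0))
```

Setting `r` closer to zero pushes the free boundary towards zero, in line with the second comment above.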
2
): ( ) =
): L [ ]
=0
(4.105)
(4.106)
The solution to Eq. (4.106) has the same functional form as that in Eq. (4.102), with the
same values of
and + . However, due to the obvious conjectures about the location of the
stopping and continuation regions in Eqs. (4.105)-(4.106), it satisfies the boundary condition
lim 0 ( ) = 0, rather than lim
( ) = 0, as the put price does in Eq. (4.100). Therefore,
and because + = 1, we must have that ( ) = + , or
( )=
where the second equality follows by the value matching condition in Eq. (4.105). Solving for
yields,
=
1
If the underlying asset distributes any payouts over the life of the option, say due to storage
costs, or even dividends, the problem has, instead, a well-defined solution. Assume that under
the risk-neutral probability, the underlying asset price satisfies,
=(
Stopping region (
Continuation region (
where now, the infinitesimal generator is L [ ] =
solution to Eq. (4.108) is,
( )=
1
2
2 00
+(
(4.107)
(4.108)
=0
)
2+2 2 ,
2 + 2 2 , and
= 12
+ 12 2 . Clearly, we
+
where, + = 12
+
have that
0, and
0, such that the conjectures about the location of the stopping and
continuation regions in Eqs. (4.107)-(4.108) deliver the boundary condition lim 0 ( ) = 0,
and then,
( )=
(4.109)
The value matching condition and the smooth pasting conditions equivalent to the two Eqs.
(4.103) are, now
( )= + +=
1
0
( )= + + + =1
The solution to this system is,
=(
such that the call option price in Eq. (4.109) can be written as:
( )=(
)
[Figure: the triggering ratio, S*/K.]
The triggering ratio blows up as the payout ratio shrinks to zero. In general, the expected
optimal stopping time is inversely related to the payout ratio .
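The behaviour of the triggering ratio can be explored numerically. The sketch below assumes a risk-neutral drift equal to the short-term rate `r` minus a constant payout ratio `delta`, and uses the positive root of the resulting quadratic equation. The names `call_trigger` and `perpetual_call`, and the parameter values, are illustrative assumptions.

```python
import math

def call_trigger(K, r, delta, sigma):
    """Positive root beta_plus and free boundary S* of a perpetual American call."""
    a = 0.5 - (r - delta) / sigma**2
    beta_plus = a + math.sqrt(a * a + 2.0 * r / sigma**2)   # exceeds 1 if delta > 0
    S_star = K * beta_plus / (beta_plus - 1.0)
    return beta_plus, S_star

def perpetual_call(S, K, r, delta, sigma):
    """Perpetual American call value on an asset with payout ratio delta."""
    beta_plus, S_star = call_trigger(K, r, delta, sigma)
    if S >= S_star:
        return S - K                       # stopping region
    return (S_star - K) * (S / S_star) ** beta_plus

K, r, sigma = 100.0, 0.05, 0.2
for delta in (0.20, 0.30, 0.40, 0.50, 1e-6):
    _, S_star = call_trigger(K, r, delta, sigma)
    print("payout ratio", delta, "-> S*/K =", S_star / K)
```

As the payout ratio approaches zero, `beta_plus` approaches one and the ratio diverges, which is the numerical counterpart of the statement that early exercise is never optimal without payouts.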
4.7.5.2 American calls and incomplete markets
American call options might be worth even in the absence of dividends paid out by the underlying over their life, in the presence of incomplete markets. Consider the following example of a
perpetual American call option written on the instantaneous variance (not an average expected
volatility) of an asset return. Suppose this variance, say, is solution to,
= (
): ( ) =
): L [ ]
=0
1
+
2
2
2 2
2
(4.110)
( )=1
and can be satisfied, because the fact is not traded makes the drift of show up in the
infinitesimal generator of Eq. (4.110): the fact is not traded makes its drift under the risk-neutral probability play a role similar to dividends and leads to a positive economic value for
in this context.
( )
( )
( 0 ) = sup
0
That is, the value of the firm is formally the same as that of a perpetual American option in
(4.97). Accordingly, we conjecture that the optimal stopping time is determined as the first
time
crosses a free-boundary from below,
say. In the continuation region,
0=L ( )
where L
1
2
2 2
00
( ) for
(4.111)
. At the free-boundary,
( )=
(4.112)
The solution to Eq. (4.111) is, ( ) = 1 1 + 2 2 , where it is easy to show that 1 0 and
1 assuming as we are that
, and 1 and 2 are constants to be determined. Note
2
that it must be that 1 = 0, for otherwise the value function would increase disproportionately
as the output potential becomes arbitrarily small. Therefore, the economically viable solution
to Eq. (4.111) is the bounded value function,
( )=
for
(4.113)
The free-boundary
and the constant 2 can now be determined by requiring that the two
following conditions hold at the threshold : (i) the value matching condition, i.e. the value
function in Eq. (4.113) is the same as that in (4.112), and (ii) the smooth pasting condition,
the derivatives agree. The result is:
=
(
2
)
1
( )1
=
2(
That is, the firm should start producing immediately, unless it had to pay a cost, in which
case the threshold to start producing increases proportionally with that cost, as the previous expression
for
suggests.
(4.114)
) (
). Assume that were the exchange rate free to
, would follow an arithmetic Brownian motion
=
1 2 00
0
( )+
( )
(4.115)
( )= +
2
which is solved by,
F(
Note that exchange rate volatility does not play any role in this context. In fact, there are
many additional exchange rate functions on top of F ( ), which satisfy Eq. (4.115),
( )=
( )+
(4.116)
p
2 2+2
2 ), and
where 1 2 = 1 2 (
1 and
2 are two arbitrary constants. However,
F ( ) is the only solution that we can interpret as one of expected fundamentals consistent
with Eq. (4.114), in that,
Z
1
1(
)
( )=
=
+
F( )
How would exchange rates behave, once central banks are credible enough to modify the
fundamentals and make the exchange rates lie within a given band? Krugman (1991) considers
a model in which the exchange rate is maintained within a target zone, say
[ ] for
two given constants and . The model's main assumption is that central banks control the
fundamentals through infinitesimal interventions, by injecting or withdrawing monetary base
, as soon as the fundamentals become too weak or too strong for the current exchange
rate to be maintained within the band [ ]. The fundamentals are then solution to,
=
where
and
are continuous processes that only increase when the fundamentals hit
some critical values ( ) and ( ) that call for intervention, which will be determined below.
By arguments entirely analogous to those utilized while pricing assets with dividends that
have reflecting barriers (see Eq. (4.58) in Section 4.2.6.4), we now have that the exchange rate
is solution to,
1 2 00
0
0
( )=
( )+
( )
+ 0( )
+ 0( )
( )
2
Therefore, the exchange rate is still solution to the differential equation (4.115), and its
general solution is given by Eq. (4.116). We can now determine the two constants,
1 and
2 . First, note that analogous to the reflecting barrier case in Section 4.2.6.4 (see Eq.
(4.59)), to rule out arbitrage, we need to ensure that the following smooth pasting conditions
hold at the critical values and :
0
( ) = 0 and
() = 0
(4.117)
() =
(4.118)
and
Eqs. (4.117)-(4.118) are four independent equations with four unknowns: the two constants
of the exchange rate function 1 and 2 , and the two critical values of the fundamentals that
trigger intervention, and , which are all implicitly determined.
[In progress: discuss the literature on speculative attacks & policy credibility]
4.8.3 Liquidity constraints and optimal dividend policy
Jeanblanc-Picqué and Shiryaev (1995) consider a model in which a firm maximizes its value
through a dividend policy that accounts for periods of inaction. Inaction occurs because of
liquidity constraints. While it will be assumed that the firm's capital has positive growth, there
might be bad times in which it would not be optimal for the firm to distribute any dividends
as this might trigger bankruptcy. This issue would not arise if shareholders were able to inject
new funds or the firm could borrow from a bank or issue new securities.
Consider the simple frictionless case, in which the cash generated by the firm is,
=
(4.119)
Z
( 0 ) = sup
(4.122)
The evaluation in (4.121) differs markedly from that in Eq. (4.120), as we now show, depending as
it is on the dividend policy.
( )
with 0
( )
(4.123)
The limiting case, = , is considered below. By Eqs. (4.121)-(4.122), and (4.123), the
problem of the firm is then,
Z
( )
s.t.
=(
( )) +
(4.124)
( 0 ) = sup
0
sup
0
( )
( )+
( )]
( )
0
( )+
1
2
00
( )
( )+
sup
0
[ ( ) (1
( ))]
(4.125)
( )
0
: 0 ( ) 1 (inaction)
( )=
: 0 ( ) 1 (dividend distribution)
(4.126)
The structure of the problem suggests the following interpretation of the solution: the firm
should pay out as soon as the level of capital reaches a threshold level, at which the payout takes
place at the maximal speed, . This interpretation needs to be corroborated by verifying that
the marginal value of capital, 0 ( ), is indeed decreasing in . In this case, the interpretation
of the optimal policy in (4.126) is quite neat: the firm pays out whenever the marginal value of
capital is low (less than unity).
Substituting Eqs. (4.126) into Eq. (4.125), and conjecturing that
is concave (decreasing
returns to capital) over the inaction region, yields that the firm does not distribute any dividends
until reaches a threshold
(say),
0=
1
0=
1 2
2
0
( ) + 12
( )+
00
2
( )
00
( )
( )
(inaction region)
( ) + (1
(4.127)
0
( ))
(payout region)
1 +
2 +
( )= 1 1 + 2 2
and
( )=
1
2
for some coefficients
and
to be determined, and 1 0,
The bankruptcy condition stipulates that 0 = 0, such that
1
2
( )= 1
for
2
1
0,
0 and
2 , and then,
1
0.
(4.128)
must be positive
( )=
for
(4.129)
To summarize, the value of the firm is given by Eq. (4.128) and Eq. (4.129), with three
constants to be determined: the two coefficients,
. The
1,
1 , and the free-boundary,
0
constants are determined by imposing (i)
( ) =
( ) (value matching), (ii)
( ) =
0
( ) (smooth pasting), (iii) 00 ( ) = 00 ( ) (super-contact). The super-contact condition
makes the second order derivative continuous as required for an application of Itô's lemma to
a function of a variable that takes values over the real line, ( ). Finally, it can be verified
that the coefficients so determined imply that
is positive, increasing and concave for
all
, thereby validating the conjectures underlying the two equations in (4.127). The
difference between ( ) in Eqs. (4.128)-(4.129) and ( ) in Eq. (4.120) is a cost accounting
for the financial frictions faced by the firm.
4.8.3.2 Reflected Brownian motions
Next, we consider the limiting case in which = . In a subsection below, we argue that the
firm's capital would behave as a reflected Brownian motion in this case: when capital is below
the threshold , it is an arithmetic Brownian motion as in Eq. (4.119), but as soon as capital
reaches , it is then reflected back to values lower than it, with the capital in excess distributed as
a dividend. It is also assumed that if the firm had to begin with cash larger than the threshold,
the part exceeding . Heuristically, Eqs. (4.127) would hold in this case when,
0=
( )+
1
2
00
( )
( ) for
and
( )=1
(4.130)
= 0, and
( )=
( ) for
2
1
( ) = max 0
(4.131)
2
1
2
While details are provided below to give the reader additional insights regarding the nature
of reflected Brownian motions, a standard and rigorous source of details is in Karatzas and
Shreve (1991, pp. 210-212). Define the occupancy time of any given random process
in a set
up to time as,
Z
(
)=
I
0
where I denotes the indicator function. It represents how much time
spends in set up to
time for a given realization of
on [0 ]. Below, it will be argued that there exists a function
( ), called local time of
at a given point , such that,
(
)=
) = lim
0
1
2
((
+ ) )
(4.132)
We provide an informal derivation of (4.132) below. Note the following property. We have,
for any fixed , and using the definitions of and , and that of Dirac's delta function,
Z
)=
(
) ( )
(
Z
Z
1
) lim
I (
=
(
+ )
0 2
0
Z Z
=
) (
)
(
0
Z
(
)
(4.133)
=
0
)=
I
Z Z
0
=
=
(
Z
)I
(
where the last equality follows by Eq. (4.133). This is Eq. (4.132).
Next, consider the solution for in Eq. (4.126), which in the limiting case = , can be
conceived as ( ) = (
), where () is Dirac's delta. To simplify the presentation, set
= 0 and = 1, such that the firm's capital in Eq. (4.121) now becomes,
Z
= 0
(
) +
= 0
(
)+
(4.134)
0
where the second equality follows by the property of local time in (4.133).
We argue that Eq. (4.134) describes a Brownian motion reflected at : each time
hits
+
from below, it is reected back. Consider the function |
| (
) +(
) ,
where
is a Brownian motion, and apply heuristically Itô's lemma,
Z
Z
|
|= | 0
|
sign (
)
(
)
(4.135)
0
Eq. (4.135) is known as Tanaka's formula for Brownian local time (see, also, the Appendix).
It is possible to show that the second term on the R.H.S. of this equation is a Brownian motion
starting from zero (Karatzas and Shreve, 1991, p. 209), say . Moreover, the third term is local
time, by Eq. (4.133). Therefore, Eq. (4.135) is,
|
|=
|+
such that the two processes in (4.134) and (4.135) have the same distribution.
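A reflected path of this kind can be simulated on a time grid. The sketch below does not use local time directly. Instead it relies on the Skorokhod construction, a different but standard technique in which the cumulative amount pushed back at the barrier is the running maximum of the excess of the free path over the barrier. The zero drift, the omission of bankruptcy at zero and the name `reflected_path` are simplifying assumptions made for illustration.

```python
import math
import random

def reflected_path(k0, barrier, sigma, T, n, seed=0):
    """Simulate a driftless Brownian path reflected from above at `barrier`.

    Returns the reflected path and the cumulative amount paid out at the barrier.
    """
    rng = random.Random(seed)
    dt = T / n
    free = k0                                # unreflected arithmetic Brownian motion
    paid = max(0.0, free - barrier)          # initial lump payout if k0 exceeds the barrier
    path = [free - paid]
    for _ in range(n):
        free += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # The payout process only increases when the free path sets a new excess.
        paid = max(paid, free - barrier)
        path.append(free - paid)
    return path, paid

path, paid = reflected_path(k0=0.8, barrier=1.0, sigma=1.0, T=1.0, n=1000)
print("largest capital level:", max(path), "cumulative payout:", paid)
```

By construction the reflected path never exceeds the barrier, and the payout process is nondecreasing and moves only when capital sits at the barrier.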
>
(4.136)
R : ( )
(4.137)
which case
= {0} and = 0 on ; prohibition of short-selling:
= [0 ) , in which case
= and = 0 on , or: incomplete markets: = {
R : +1 = = = 0} (i.e. the
first
assets can only be traded), in which case = {
R : 1 = =
= 0} and = 0
defined as
={
R :
0
}, illustrated in Figure 4.1 in the case = 2. We
emphasize is the polar cone of
, not the barrier cone of
defined in Eq. (4.137).
[FIGURE 4.1. Axes: p1, p2. Labeled region: K̃.]
= [0
0
if
>
)=
( ) = sup (
otherwise
[0 )2
= [0
)2 . It is easy to
)2
R2 :
R2 :
R2 :
1 1
0
= 0}
1 1
1
2 2
1
R}
R and
= 0}
Panel C in Figure 4.2 illustrates this incomplete markets case. Panels A and B are examples
of less severe forms of constraints, where markets are still complete. Panels A, B and C impose
progressively tougher constraints, to the extent of the incomplete markets case in Panel C.
[FIGURE 4.2: Portfolio constraints and incomplete markets. Panels A, B and C; axes: p1, p2; labeled regions: K, K̃.]
(i.e.
0
if
( ) = sup ( 1 1 ) =
otherwise
1 R
= 0)
= 1
(4.138)
where:
+ ( )
+ + ( )
(4.139)
1 2
1 >
1
= 0 exp
(4.140)
2 0
0
F
where 0 is as in Eq. (4.73). As mentioned, He and Pearson (1991) is a special instance of this
setting, which obtains once we assume incomplete markets, in which case the support function
is = 0, as explained.
Under regularity conditions, we have that:
Val ( ;
) = inf (Val ( ))
(4.141)
and optimal consumption and portfolio choices for this unconstrained problem are exactly those
chosen by the investor constrained to have
. Appendix 4 provides a heuristic derivation
of Eq. (4.141).
In the context of log-utility functions, we have that,
1 2
= arg min 2 ( ) +
+
(4.142)
where = 1 (
1 ). Part II of these lectures will work out applications of this result.
[Eq. (4.142) follows by applying martingale methods in a market where the Brownian motion
is
. It's actually very simple: see Cvitanic and Karatzas (1991, p. 790 + 797) + a portfolio
> 1
=
( + 1 ).
For example, in the incomplete markets case, we have that = 0, and the solution is then
= 0 as originally explained by He and Pearson (1991). This is the case with only one asset; the
multi-asset case is more complex and dealt with by Cvitanic and Karatzas (1991, p. 797).]
[In progress]
4.10 Jumps
Brownian motions are well suited to model the price behavior of liquid assets or assets issued by
names or Governments not subject to default risk. There is, however, a fair amount of interest in
modeling discontinuous changes in asset prices. Fixed income instruments may undergo liquidity
dry-ups, or even default, causing price discontinuities that we wish to model. This section is
an introduction to Poisson models, a class of processes that is particularly useful in addressing
these issues.
4.10.1 Poisson jumps
Let ( ) be a given interval, and consider events in that interval which display the following
properties:
(i) The random numbers of event arrivals on any disjoint time intervals of ( ) are independent.
(ii) Given two arbitrary disjoint but equal time intervals in ( ), the probability of a given
random number of event arrivals is the same in each interval.
(iii) The probability that at least two events occur simultaneously in any time interval is zero.
Next, let (
) be the probability that events arrive during the time interval
. We
make use of the previous three properties to determine the functional form of (
). First,
(
) must satisfy:
) = 0(
) 0( )
(4.143)
0( +
and we impose
0 (0)
=1
(0) = 0 for
(4.144)
( +
) =
..
.
( +
) =
..
.
)+
)+
(4.145)
( +
For small ,
0
)=
1(
(
1(
)
)
1
)+
(
(
0
(
)
1
)+
) and 0 ( ) = 1
+
). By a similar reasoning,
)=
)+
)=
)
!
)
. Therefore,
4.10.2 Interpretation
A Poisson model is one of rare events. Moreover, by:
(event arrival in
)=
)=
For this reason, we usually refer to the parameter as the intensity of event arrivals.
To provide additional intuition about the mathematics of rare events, consider the expression
for the probability of arrivals in trials, predicted by a binomial distribution:
!
=
=
0 + =1
!(
)!
where is the probability of arrival for each trial. We want to model the probability as a
function of , with the feature that lim
( ) = 0, so as to make each arrival rare. One
possible choice is ( ) = , for some constant
0. Under this assumption, we have:
=
=
=
!
!(
!
!(
)!
!
!(
)!
!
leaving,
)!
( ) (1
( ))
)! !
1
{z
times
lim
+1
} !
!
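The limit of the binomial probabilities can be checked numerically. The sketch below assumes a per-trial arrival probability equal to a constant `a` divided by the number of trials `n`, and compares the binomial probability of `k` arrivals with the Poisson probability with mean `a`. The function names and the values of `a` and `k` are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k arrivals in n independent trials, each with probability p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(k, a):
    """Poisson probability of k arrivals when the expected number of arrivals is a."""
    return math.exp(-a) * a**k / math.factorial(k)

a, k = 2.0, 3
for n in (10, 100, 1000, 10000):
    gap = abs(binomial_pmf(k, n, a / n) - poisson_pmf(k, a))
    print("n =", n, "binomial:", binomial_pmf(k, n, a / n), "gap to Poisson:", gap)
```

The gap shrinks roughly in proportion to 1/n, which is the sense in which making each arrival rare while increasing the number of trials delivers the Poisson law.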
Next, we split the interval ( ) into subintervals of length , and then make the probability
of one arrival in each sub-interval proportional to each sub-interval length, as illustrated
in Figure 4.3,
( )=
The Poisson model in the previous section is thus the same as that we consider here, with
,
which is continuous-time, as each sub-interval in Figure 4.3 shrinks to . The probability there
is one arrival in
is
, which is also the expected number of events in
as shown below:
(# arrivals in )
= Pr (one arrival in
= Pr (one arrival in
=
) zero arrivals
The heuristic construction in this section opens the way to how we can simulate Poisson
processes. We can just simulate a Uniform random variable (0 1), with the continuous-time
process being approximated by
where
, where:
0 if 0
=
1 if 1
1
1
is a discretization interval.
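The uniform-draw approximation just described can be sketched as follows. In each sub-interval of length `dt`, an arrival is recorded when a Uniform(0, 1) draw falls in the top fraction `intensity * dt` of the unit interval. The function name `simulate_counts`, the seed and the parameter values are assumptions made for illustration.

```python
import random

def simulate_counts(intensity, T, dt, n_paths, seed=0):
    """Approximate Poisson counts on [0, T] with one uniform draw per sub-interval."""
    rng = random.Random(seed)
    n_steps = int(round(T / dt))
    counts = []
    for _path in range(n_paths):
        arrivals = 0
        for _step in range(n_steps):
            u = rng.random()
            # Increment is 1 if u lies in [1 - intensity*dt, 1), and 0 otherwise.
            if u >= 1.0 - intensity * dt:
                arrivals += 1
        counts.append(arrivals)
    return counts

counts = simulate_counts(intensity=3.0, T=1.0, dt=0.002, n_paths=1000)
mean = sum(counts) / len(counts)
variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print("sample mean:", mean, "sample variance:", variance)
```

For a Poisson count the mean and the variance both equal the intensity times the length of the interval, so the two sample moments should be close to each other when `dt` is small.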
[FIGURE 4.3: the time interval split into n subintervals.]
is a probability. We have,
X
=0
=0
=1
X
=0
=0
(4.146)
A related distribution is the exponential (or Erlang) distribution. Remember, the probability
(
)
of zero arrivals in
predicted by the Poisson model is 0 (
)=
, from which it
follows that:
(
)
(
) 1
)=1
0(
is the probability of at least one arrival in
. The function can be also interpreted as the
probability the rst arrival occurred before , starting from . The density function of is:
(
)=
)=
1
=
Variance =
Mean =
0
1 2
The expected time of the first arrival occurred before starting from equals 1 . More generally, 1 can be interpreted as the average time from one arrival to another.13
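The exponential law of the waiting time can be illustrated by sampling. The sketch below draws first-arrival times by inverting the exponential distribution function and compares the sample mean and variance with the reciprocal of the intensity and its square. The name `first_arrival_times` and the parameter values are hypothetical.

```python
import math
import random

def first_arrival_times(intensity, n, seed=0):
    """Sample n waiting times to the first arrival of a Poisson process."""
    rng = random.Random(seed)
    # Inverse transform: if U is Uniform(0, 1), -ln(1 - U) / intensity is exponential.
    return [-math.log(1.0 - rng.random()) / intensity for _ in range(n)]

intensity = 2.0
times = first_arrival_times(intensity, 20000)
mean = sum(times) / len(times)
variance = sum((t - mean) ** 2 for t in times) / (len(times) - 1)
share_before_one = sum(1 for t in times if t <= 1.0) / len(times)
print("mean:", mean, "variance:", variance)
print("share of first arrivals before 1:", share_before_one,
      "against", 1.0 - math.exp(-intensity))
```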
A more general distribution than the exponential is the Gamma distribution with density:
(
)=
)(
(
(
))
1)!
= 1.
13 Suppose arrivals are generated by Poisson processes, and consider the random variable time interval elapsing from one arrival
0 which will elapse from
to next one. Let 0 be the instant at which the last arrival occurred. Then, the probability the time
0 , there is at least one
the last arrival to the next is less than
is the same as the probability that during the time interval
arrival.
4.10.4 Asset pricing implications
This section is a short introduction to modeling asset prices as being driven by Brownian
motions and jump processes. We model jumps by interpreting the arrivals in the previous
sections as those events upon which a certain random variable experiences a jump of size S,
where S is another random variable with a fixed probability . A simple model is:
= (
+ (
)S
+ (
where
are given functions (with
0),
Poisson process with intensity equal to , i.e.
(4.147)
is a
(i) Pr ( ) = 0.
(ii)
= 1
(iii)
( 0 ) and
( )
1)
),
i.e.:
Pr (
= )=
)
!
(
)
= 1) Pr (
|
= 1) = (
)
Pr (
(
))
is a martingale.
More generally, the process (
Armed with these preliminary facts, we can provide a heuristic derivation of Itô's lemma for
jump-diffusion processes. Consider any function with enough regularity conditions, a rational
function of time and in Eq. (4.147), i.e. ( )
(
). Consider the following expansion of
:
)
( )=
(
+[ ( + (
+ (
)S )
) ( )
(
)]
The first two terms in are the usual Itô's lemma terms, with denoting the usual infinitesimal
generator for diffusions. The third term accounts for jumps. If there are no jumps from time
to time (where
=
), then
= 0. If there is a jump then
= 1, and in this
case , as a rational function, needs also instantaneously jump to ( + ( ) S ). The
jump will be exactly ( + ( ) S )
(
), where S is another random variable with
14 For
simplicity, we take
to be constant. If
Pr ( ( )
( )= )=
( )
!
exp
( )
=0 1
)
)
or
=
+
+
Z
[( ( + S
[( ( + S
supp(S)
[ ( + S
)
)
]
))
)) ]
(
(
)] ( S)
where supp (S) denotes the support of S. Therefore, the infinitesimal generator for jump-diffusions is, simply,
.
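The role of the jump term in the generator can be illustrated with a simple Euler scheme. The sketch below assumes constant coefficients and normally distributed jump sizes, neither of which is required by the text, and checks that the average terminal value matches the drift plus the compensator of the jumps, that is, intensity times expected jump size. All names and parameter values are hypothetical.

```python
import math
import random

def terminal_value(x0, drift, vol, jump_scale, intensity,
                   jump_mean, jump_std, T, n_steps, rng):
    """Euler scheme for dx = drift dt + vol dW + jump_scale * S dZ, constant coefficients."""
    dt = T / n_steps
    x = x0
    for _ in range(n_steps):
        dW = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # At most one jump per sub-interval, arriving with probability intensity*dt.
        if rng.random() < intensity * dt:
            x += jump_scale * rng.gauss(jump_mean, jump_std)
        x += drift * dt + vol * dW
    return x

rng = random.Random(0)
params = dict(x0=0.0, drift=0.1, vol=0.2, jump_scale=1.0, intensity=2.0,
              jump_mean=-0.1, jump_std=0.05, T=1.0, n_steps=100)
draws = [terminal_value(rng=rng, **params) for _ in range(2000)]
sample_mean = sum(draws) / len(draws)
# Expected change: (drift + jump_scale * intensity * jump_mean) * T.
expected = params["x0"] + (params["drift"]
           + params["jump_scale"] * params["intensity"] * params["jump_mean"]) * params["T"]
print("sample mean:", sample_mean, "expected:", expected)
```

Subtracting the compensator from the jump component leaves a martingale, which is why the sample mean depends on the jumps only through the product of intensity and expected jump size.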
4.10.5 An option pricing formula
Merton (1976, JFE), Bates (1988, working paper), Naik and Lee (1990, RFS) are the seminal
papers.
Obviously we cannot hedge. Make the argument. Use Itô's lemma and show that the strategy
cannot hedge the jump component.
where
( ; )
( )
A Brownian motion is actually nowhere differentiable. Therefore, Eq. (4A.2) should be only understood as a shorthand for,
Z
Z
( )= 0+
( ) +
( )
( )
0
( )
sup
1( ) =
=1
where the supremum is taken over all partitions of [0 ]. We shall state conditions on how bounded
the integrator and integrands in ( ) should be to make Riemann-Stieltjes theory go through. Unfortunately, these conditions are restrictive within the context of interest in finance. For example,
Riemann-Stieltjes theory works when one takes ( ) = 1, or ( ) = in Eq. (4A.1). However, this
theory doesn't hold in more general contexts.
4.12.1.2 Riemann
Given is 7
( ),
(0 1). Consider two standard definitions. First, a partition,
: 0 = 0
=
1
and
=
=
1
.
Second,
an
intermediate
partition,
:
1
1
1
any collection of values
satisfying
, = 1 . Then, for a given partition
and
1
intermediate partition , the Riemann sum is defined as:
(
( )
=1
4.12.1.3 Riemann-Stieltjes
and
R1
0
()
2,
R1
0
(
R
1 1(
()
)+
2 2(
R1
R1
1 0 1(
))
()
for every
R1
2 0 2(
) .
(0 1).
The main idea is to integrate one function with respect to another function . One standard
example relates to the computation of the expectation of a random variable with distribution function
. Heuristically, we have that:
Z
()
[ ( )
1 )]
c
by
A. Mele
More generally, let us be given two functions $f$ and $g$, and consider, again, the definitions of $\tau_n$ and $\sigma_n$ given earlier. Let $\Delta_i g = g(t_i) - g(t_{i-1})$, $i = 1, \dots, n$. The Riemann-Stieltjes sum is defined as:
$$S_n(\tau_n, \sigma_n) = \sum_{i=1}^{n} f(y_i)\,\Delta_i g.$$
Clearly the Riemann sum is a special case obtained with the identity function $g(t) = t$. Similarly as when proceeding with the definition of the Riemann sum, the Riemann-Stieltjes integral of $f$ with respect to $g$ on $(0,1)$ is the limit, $\lim_{n} S_n(\tau_n, \sigma_n)$, provided it exists and is independent of $\tau_n$ and $\sigma_n$. It is written:
$$\int_0^1 f(t)\,dg(t).$$
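A Riemann-Stieltjes sum is easy to compute numerically. The following is only a sketch under assumptions made here for illustration: an equally spaced partition of [0, 1], the integrand f(t) = t and the integrator g(t) = t^2, for which the integral equals 2/3. The function name and arguments are hypothetical.

```python
def riemann_stieltjes_sum(f, g, n, tag="left"):
    """Riemann-Stieltjes sum of f against g on [0, 1] with n equal cells.

    tag selects the intermediate point y_i in [t_{i-1}, t_i]:
    "left", "right", or anything else for the midpoint.
    """
    total = 0.0
    for i in range(1, n + 1):
        t_prev, t_next = (i - 1) / n, i / n
        if tag == "left":
            y = t_prev
        elif tag == "right":
            y = t_next
        else:
            y = 0.5 * (t_prev + t_next)
        total += f(y) * (g(t_next) - g(t_prev))
    return total


def identity(t):
    return t


def square(t):
    return t * t


# Integral of t d(t^2) over [0, 1]; the exact value is 2/3.
rs_left = riemann_stieltjes_sum(identity, square, 10_000, "left")
rs_right = riemann_stieltjes_sum(identity, square, 10_000, "right")
# With g(t) = t the sum collapses to an ordinary Riemann sum of t^2 (exact value 1/3).
riemann = riemann_stieltjes_sum(square, identity, 10_000, "mid")
print(rs_left, rs_right, riemann)
```

For a smooth integrator such as this one, the left and right sums agree in the limit, so the choice of intermediate points is immaterial.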
t0
t2
t1
( 0 )]
[ ( 2)
( 1 )] + [ ( )
( 2 )] =
()
( )
()
()
We can see that this first variation is a measure of the total amount of up and down motion of the path of the function $f$. We can formalize this reasoning as follows. Let $f$ be a function of a real variable. Its variation in an interval $[a,b]$ is defined as
$$V_f([a,b]) = \sup \sum_{i=1}^{n} \left| f(t_i^{(n)}) - f(t_{i-1}^{(n)}) \right|,$$
where the supremum is taken over all partitions $a = t_0^{(n)} < t_1^{(n)} < \cdots < t_n^{(n)} = b$. By the triangle inequality, $|f(c) - f(a)| = |f(c) - f(b) + f(b) - f(a)| \le |f(c) - f(b)| + |f(b) - f(a)|$, the sums in $V_f([a,b])$ can only increase as we add more and more points into the partition, such that,
$$V_f([a,b]) = \lim_{\mathrm{mesh} \to 0} \sum_{i=1}^{n} \left| f(t_i^{(n)}) - f(t_{i-1}^{(n)}) \right|.$$
sup
X
=1
| ( )
0, if
1 )|
and
R1
()
1, that is,
with 1 + 1
satisfy
1.
Now, it is well-known that almost every path of a Brownian motion has bounded $p$-variation for $p > 2$ and, as expected, unbounded $p$-variation for $p \le 2$, as further argued below. Consider, then, the integral,
$$\int_0^1 f(t)\,dW(t),$$
and suppose $f$ is differentiable with bounded derivatives. By the mean value theorem, there exists a $K > 0$ such that $|f(t) - f(s)| \le K\,(t - s)$ for $s < t$. Therefore, $\sup \sum_{i=1}^{n} |f(t_i) - f(t_{i-1})| \le K \sum_{i=1}^{n} (t_i - t_{i-1}) = K$. That is, $f$ has bounded $q$-variation, with $q = 1$. By Theorem 4A.2, we now have that for almost every path, the Riemann-Stieltjes integral of $f$ with respect to Brownian motions,
$$\int_0^1 f(t)\,dW(t)(\omega),$$
exists for every deterministic function $f$ which is differentiable with bounded first-order derivative. For example, $f(t) = 1$, or $f(t) = t$. We aren't done. Consider $f(t) = W(t)$ and, then:
$$\int_0^1 W(t)(\omega)\,dW(t)(\omega).$$
Let $p = 2 + \epsilon$, for some $\epsilon > 0$. Hence $p = q = 2 + \epsilon$, and so $\frac{1}{p} + \frac{1}{q} = \frac{2}{2+\epsilon} < 1$. The Riemann-Stieltjes theory doesn't work even with this simple example. This is where the theory of Itô's stochastic integrals comes in.
Why do Brownian motions display unbounded variation? Consider the Brownian tree below.
%1
&1
Time is
and space is
(4A.3)
as
0.
A more substantive proof is one for example of Corollary 2.5 p. 25 in Revuz and Yor (1999). A
sketch of this proof proceeds as follows. We have:
X
max
1
1
1
which
is impossible,
2 as we shall now argue. Indeed, in the next section, we shall establish that
P
, implying that p lim
= . Therefore, there exists a sequence
:
1
for all
. (Convergence in probability does not imply almost sure convergence, yet it implies that
there exists a suitable subsequence along which convergence holds a.s., which is what we just need here.)
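The contrast between first and quadratic variation can be seen on a single simulated path. The sketch below is an illustration under assumptions made here, not part of the original argument: a Brownian path on [0, 1] is simulated on a fine grid with the standard library, and its first and second variations are computed over nested partitions obtained by subsampling. All names are hypothetical.

```python
import math
import random


def brownian_path(n_steps, horizon=1.0, seed=0):
    """Simulate W on an equally spaced grid of n_steps cells over [0, horizon]."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def p_variation(path, stride, power):
    """Sum of |increment|**power over the partition keeping every stride-th point."""
    points = path[::stride]
    return sum(abs(b - a) ** power for a, b in zip(points, points[1:]))


path = brownian_path(2 ** 14)
strides = [1024, 256, 64, 16, 4, 1]   # coarse to fine, nested partitions
first_var = [p_variation(path, s, 1) for s in strides]
quad_var = [p_variation(path, s, 2) for s in strides]

for s, fv, qv in zip(strides, first_var, quad_var):
    print(f"cells={len(path) // s:6d}  first variation={fv:8.2f}  quadratic variation={qv:6.3f}")
```

Because the partitions are nested, the first variation can only grow as the mesh shrinks (triangle inequality), and on a Brownian path it keeps growing without settling, while the quadratic variation stays close to the length of the time interval.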
4.12.1.5 Itô
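Let us begin with a first example, which can help grasp the nature of the issues under study. Consider the Riemann-Stieltjes sums
$$S_n(\tau_n)(\omega) = \sum_{i=1}^{n} W_{t_{i-1}}(\omega)\,\Delta_i W(\omega), \qquad \Delta_i W \equiv W_{t_i} - W_{t_{i-1}},$$
for partitions $\tau_n: 0 = t_0 < t_1 < \cdots < t_n = t$, with $\Delta_i \equiv t_i - t_{i-1}$. Simple computations leave:
$$S_n = \tfrac{1}{2} W_t^2 - \tfrac{1}{2} \sum_{i=1}^{n} (\Delta_i W)^2 \equiv \tfrac{1}{2} W_t^2 - \tfrac{1}{2} Q_n(t).$$
Consider the quantity $Q_n(t)$. We have,
$$E[Q_n(t)] = \sum_{i=1}^{n} E[(\Delta_i W)^2] = \sum_{i=1}^{n} \Delta_i = t.$$
Moreover,
$$\mathrm{var}(Q_n(t)) = \sum_{i=1}^{n} \mathrm{var}[(\Delta_i W)^2] = \sum_{i=1}^{n} \Delta_i^2\,\mathrm{var}(W^2(1)) = 2 \sum_{i=1}^{n} \Delta_i^2.$$
Hence,
$$\mathrm{var}(Q_n(t)) = 2 \sum_{i=1}^{n} \Delta_i^2 \le 2\,\mathrm{Mesh}(\tau_n) \sum_{i=1}^{n} \Delta_i = 2\,t\,\mathrm{Mesh}(\tau_n).$$
Because $E[Q_n(t)] = t$, we have $\mathrm{var}(Q_n(t)) = E[(Q_n(t) - t)^2]$. Therefore, as $\mathrm{Mesh}(\tau_n) \to 0$, $Q_n(t) \to t$ in mean square and, hence, in probability, i.e. for all $\epsilon > 0$, $\Pr\{|Q_n(t) - t| > \epsilon\} \to 0$, though not $\omega$-pointwise.
Issues related to uniform convergence will be dealt with later.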
To sum up, $\int_0^t W(s)\,dW(s)$ doesn't exist as a Riemann-Stieltjes integral. Nevertheless, the previous facts suggest that a good definition of it could hinge upon the notion of a mean square limit, viz
$$\sum_{i=1}^{n} W_{t_{i-1}}\,\Delta_i W = \tfrac{1}{2} W_t^2 - \tfrac{1}{2} \sum_{i=1}^{n} (\Delta_i W)^2 \ \to\ \tfrac{1}{2} W_t^2 - \tfrac{1}{2}\,t = \int_0^t W\,dW \quad \text{in mean square},$$
where $\Delta_i W = W_{t_i} - W_{t_{i-1}}$ and $\int_0^t W\,dW$ has the Itô's sense.
Clearly, $\int_0^t W\,dW$ does not satisfy the usual Riemann-Stieltjes rule of integration. (For any smooth function $f$ such that $f(0) = 0$, the Riemann-Stieltjes integral $\int_0^t f(s)\,df(s) = \tfrac{1}{2} f^2(t)$.) This doesn't work here because we have yet to see what the chain-rule for functions of $W$ is. This will lead us to the celebrated Itô's lemma, which shall confirm that $\int_0^t W\,dW = \tfrac{1}{2} W_t^2 - \tfrac{1}{2}\,t$. This example vividly illustrates that standard integration methods fail. In fact, the timing of the integrands is quite critical. For example, in Riemann integration, the integrand can be evaluated at any point in the interval. If we apply this to the kind of integrals we are studying here we obtain, $\lim \sum_i W_{t_{i-1}} (W_{t_i} - W_{t_{i-1}})$ (for the left boundary) and $\lim \sum_i W_{t_i} (W_{t_i} - W_{t_{i-1}})$ (for the right boundary). But the two limits do not agree. The expectation of the first is zero (by the law of iterated expectations), while the expectation of the second is not necessarily zero. Finally, Riemann integration theory differs from the integration theory underlying the previous example because of the mode of convergence utilized in the two theories.
A short digression is in order. The so-called Stratonovich stochastic integral selects as points of the intermediate partition the central ones:
$$\sum_{i=1}^{n} W_{\frac{1}{2}(t_{i-1} + t_i)}\,\Delta_i W.$$
For the Stratonovich integral, the usual Riemann-Stieltjes rule applies, yet the Stratonovich stochastic integral isn't Riemann-Stieltjes.
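The role played by the choice of the intermediate points can be checked on a simulated path. The following is a sketch under assumptions made here: the Brownian path is simulated on a grid twice as fine as the partition, so that the midpoint of every cell is available, and the horizon, the seed and the variable names are hypothetical.

```python
import math
import random

rng = random.Random(42)
horizon, n_cells = 1.0, 5000
dt_half = horizon / (2 * n_cells)

# Path on the fine grid: even indices are partition points, odd ones are midpoints.
w = [0.0]
for _ in range(2 * n_cells):
    w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt_half)))

left = right = mid = quad_var = 0.0
for i in range(n_cells):
    a, m, b = w[2 * i], w[2 * i + 1], w[2 * i + 2]
    dw = b - a
    left += a * dw        # Ito choice: left boundary
    right += b * dw       # right boundary
    mid += m * dw         # Stratonovich choice: central point
    quad_var += dw * dw   # sum of squared increments

w_T = w[-1]
print("left  :", left, " vs 0.5*W_T^2 - 0.5*T =", 0.5 * w_T ** 2 - 0.5 * horizon)
print("right :", right, " vs 0.5*W_T^2 + 0.5*T =", 0.5 * w_T ** 2 + 0.5 * horizon)
print("mid   :", mid, " vs 0.5*W_T^2         =", 0.5 * w_T ** 2)
```

The left sum equals one half of the squared terminal value minus one half of the sum of squared increments exactly, path by path; the right sum exceeds the left one by exactly the sum of squared increments, which is close to the horizon; and the midpoint sum is close to the value delivered by the usual rule of integration.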
4.12.1.6 Itô's stochastic integral for simple processes
. Consider [0
if =
=
if
1
] and partitions
=(
[0
, = 1
= 1
:0=
]) is simple if
, s.t
2)
all
-adapted, = 1
2 ).
As an example, consider
we have:
, if = , and
, if
=1
X1
=1
P0
=1
. Next,
is,
=1
= 1
on [0
0.
( ) =
[0
Proof. Let us check that ( ) is a F -martingale. We have to check three conditions: (i) | ( )|
,
. Condition (i) follows by the
all
[0 ]; (ii) ( ) is F -adapted; (iii) [ ( )| F ] = ( ),
isometry property to be introduced below. Condition (ii) is trivial. To show (iii), suppose, initially,
. We have:
that
[ 1 ],
( )=
X1
=1
X1
=1
=
[
( )| F ] =
=
( )+
( )| F ] +
( )+
+
)
[(
)| F ]
)| F ] =
( )
[ 1 ],
is proven similarly. Finally, ( ) has zero expectation
The case
[ 1 ] and
( ( )) = 0 all . That is, ,
because it starts from the origin by the denition: 0 ( ) = 0
[ ( )] = [ 0 ( )] = 0 ( ) = 0.
Property 4A.P2 (Isometry).
, for all
[0
].
. We have:
=1
XX
=1 =1
2
1
=1
2
1
=1
=1
=1
X
=1
and
!2
F
1
2
F
1
1)
1)
( ) has continuous
-paths.
R
( )
simple processes ( ) s.t
0,
i.e.
(
2(
0
)
( )
is a Cauchy sequence in
Step 2: By step 1,
integral for simple processes
( ( ) ) 2
( ( ))
Therefore,
plete, and so
( )
2(
( )
( )
( 0)
2(
( )
, and is written as
0
( )
( )
) = lim
( ))
( ) in the
2(
) norm.
( )
2(
0, then
[0
],
-a.s.
L2 :
| |2
o
a.s.
R
where L denotes the set of all adapted processes. Let $C \in H^2$. The stochastic integral $I_t(C) = \int_0^t C_s\,dW_s$ satisfies the following properties: (i) Continuous sample paths, and $I_t(C)$ is a $\mathcal{F}_t$-martingale; (ii) Expectation equal to zero; (iii) Itô's isometry on $H^2$, i.e. $E[(\int_0^t C_s\,dW_s)^2] = E[\int_0^t C_s^2\,ds] = \int_0^t E(C_s^2)\,ds$, $t \in [0,T]$; (iv) Linearity and linearity on adjacent intervals.
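The zero-expectation and isometry properties can be checked by simulation. The sketch below is a Monte Carlo illustration under assumptions made here: the integrand is the Brownian motion itself, evaluated at the left boundary of each cell, so that the isometry holds exactly for the discretized (simple) integrand; the number of paths, steps and the seed are hypothetical choices.

```python
import math
import random


def ito_sum_w_dw(n_steps, horizon, rng):
    """Left-point sum of W dW along one simulated Brownian path."""
    dt = horizon / n_steps
    w, integral = 0.0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        integral += w * dw   # integrand evaluated before the increment is revealed
        w += dw
    return integral


rng = random.Random(12345)
n_paths, n_steps, horizon = 4000, 50, 1.0
draws = [ito_sum_w_dw(n_steps, horizon, rng) for _ in range(n_paths)]

sample_mean = sum(draws) / n_paths
second_moment = sum(x * x for x in draws) / n_paths

# Right-hand side of the isometry for the simple integrand: sum of E[W_{t_{i-1}}^2] * dt.
dt = horizon / n_steps
isometry_rhs = sum(i * dt * dt for i in range(n_steps))

print("mean of the integral      :", sample_mean)
print("second moment (simulated) :", second_moment)
print("isometry right-hand side  :", isometry_rhs)
```

With 50 steps the right-hand side equals 0.49, which approaches 1/2 as the partition is refined.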
]
)=
0) +
1
2
00
(4A.4)
=2
1
2
(4A.5)
0)
X1
+1
+1
=0
By Taylor,
where min
( )=
+1
+1
( )
for some
)=
max
( ):
Wti
1
2
00
( )
+1
( )
is continuous,
Wti
ti
ti
Therefore,
(
X1
0) =
=0
+1
1X
2
00
=0
+1
We have
X1
00
=0
Finally,
0(
00 (
+1
)(
X1
2
+1
+1
00
)(
+1
=0
0(
00 (
More technical details in order of descending difficulty can be found in Karatzas and Shreve (1991),
Arnold (1974), Steele (2001) and Mikosch (1998).
Let us reconsider the example in Eq. (4A.4). By the stochastic integral theorem, $\int_0^t W\,dW$ is a martingale.
This is confirmed by Eq. (4A.4). According to Eq. (4A.4),
Z
and
1
2
= all .
for some function . Randomness can be introduced via an additional noise term:
=
We already know that a -
+ (
(4A.6)
where the first integral is Riemann and the second integral is an Itô's stochastic integral.
We have the following definitions. First, we say that an Itô's process is,
( )=
( )
( )
+ (
It is known that an Itô's diffusion process is a Markov process. The previous equation is also called a
stochastic differential equation (SDE). In a SDE, the drift and diffusion coefficients depend on $\omega$ only through the state of the process. Finally, we
say that a time-homogeneous diffusion process is,
=
( )
+ ( )
There is a beautiful property that is used to price financial derivatives, using replication arguments,
as explained in the main text, called the unique decomposition property. Suppose we were given two
processes and with 0 = 0 , and that:
=
=
Then
R
( 0 |
and
and
=
=
+
almost everywhere, in the sense that
is F -adapted.
(ii) The integrals in Eq. (4A.6) are well-defined in the Riemann and Itô senses, and Eq. (4A.6)
holds
-almost surely
| |2
In other words, the definition of a strong solution requires that a Brownian motion be given in advance, and that the solution constructed from it be then $\mathcal{F}_t$-adapted.
Next, suppose, instead, that we were only given $x_0$ and two functions $b(x)$ and $a(x)$, and that we were asked to find a pair of processes $(\tilde{x}, \tilde{W})$ on some probability space $(\Omega, \tilde{\mathcal{F}}, P)$ such that Eq. (4A.6) holds with $\tilde{x}$ being $\tilde{\mathcal{F}}_t$-adapted on some space, not necessarily the one in Eq. (4A.6). (Clearly such a $\tilde{x}$ needs not to be $\mathcal{F}_t$-adapted.) In this case $(\tilde{x}, \tilde{W})$ is called a weak solution on $(\Omega, \tilde{\mathcal{F}}, P)$. In the case of a weak solution, we are given $x_0$, $b$ and $a$, and then we have to find two things: a Brownian motion $\tilde{W}$ and a $\tilde{\mathcal{F}}_t$-adapted process $\tilde{x}$ such that $\tilde{x}_t = x_0 + \int_0^t b(\tilde{x}_s)\,ds + \int_0^t a(\tilde{x}_s)\,d\tilde{W}_s$ holds $P$-almost surely. Clearly, a strong solution is also weak, but the converse is not true. Consider the following example.
satisfy:
= sign( )
=0
(4A.7)
(4A.8)
where is a Brownian motion. It can be shown that is G -measurable, where G is the -algebra
, where F
is the -algebra generated by . Therefore, the F
generated by | |. Clearly G
. Armed with this result, we can easily show that
algebra generated by is also strictly contained in F
there are no strong solutions to Eq. (4A.7). To show this, suppose the contrary. There is a theorem
would then be a Brownian motion. On the other hand, Eq. (4A.7) can also be written
saying that
= sign( )
or
=
=0
sign( )
By the same reasoning produced to show that the -algebra generated by is strictly contained in F
in Eq. (4A.8), we conclude that the -algebra generated by
is strictly contained in the -algebra
is a strong solution to Eq. (4A.7).
generated by . But this contradicts that
Clearly, one needs to be able to impose conditions that allow to distinguish between weak and strong solutions. However, the only focus of the following discussion is about the regularity conditions ensuring existence and uniqueness of strong solutions, the case of interest in continuous-time finance.
We need two types of restrictions on $b$ and $a$. Consider the following definition. For a given function $f$, we say that it satisfies a Lipschitz condition in $x$ if there exists a constant $L$, such that for all $(x, y) \in \mathbb{R}^d \times \mathbb{R}^d$,
$$\| f(x,t) - f(y,t) \| \le L\,\| x - y \| \quad \text{uniformly in } t,$$
where $\| A \| \equiv \sqrt{\mathrm{Tr}(A A^\top)}$. In other words, $f$ cannot change too widely. We also say $f$ satisfies a growth condition in $x$, if there exists a constant $G$ such that for all $(x, t) \in \mathbb{R}^d \times \mathbb{R}$,
$$\| f(x,t) \|^2 \le G\,(1 + \| x \|^2) \quad \text{uniformly in } t.$$
That is,
Next, we turn to the concepts of existence and uniqueness of a solution to a stochastic differential equation. We say that a solution is unique if, whenever $x^{(1)}(t)$ and $x^{(2)}(t)$ are both strong solutions to Eq. (4A.6), then $x^{(1)}(t) = x^{(2)}(t)$ $P$-a.s. We have:
Theorem 4A.7. Suppose that $b$, $a$ satisfy Lipschitz and growth conditions in $x$, then there exists a unique Itô's process satisfying Eq. (4A.6) which is continuous adapted Markov.
Consider the following stochastic di erential equation:
=
( )
=1
1
0
1
1
Yet it is impossible to find a global solution, i.e. one defined for all $t$. This is exactly the kind of pathology
ruled out by linear-growth conditions. More generally, linear-growth conditions ensure that $|x(t)|$ is
unique and doesn't explode in finite time. Naturally, Lipschitz and growth conditions are only sufficient
conditions to guarantee the previous conclusions.
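Explosion in finite time can be illustrated numerically. The sketch below rests on an assumption made here for illustration, since the equation displayed above is not legible in this copy: it uses the textbook case dx = x^2 dt with x(0) = 1, whose solution 1/(1 - t) ceases to exist at t = 1, and compares it with the linear case dx = x dt, which satisfies the growth condition. The explicit Euler scheme, the cap and all names are hypothetical choices.

```python
import math


def euler_path(drift, x0, horizon, n_steps, cap=1e12):
    """Explicit Euler for dx = drift(x) dt; stops as soon as |x| exceeds cap.

    Returns (time reached, value reached, whether the cap was hit).
    """
    dt = horizon / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        x += drift(x) * dt
        t += dt
        if abs(x) > cap:
            return t, x, True
    return t, x, False


def quadratic_drift(x):
    return x * x      # violates the linear-growth condition


def linear_drift(x):
    return x          # satisfies it


# Before the explosion time the scheme tracks the exact solution 1 / (1 - t).
t_half, x_half, _ = euler_path(quadratic_drift, 1.0, 0.5, 50_000)
# Asking for a solution on [0, 2] fails: the path blows up near t = 1.
t_blow, x_blow, exploded = euler_path(quadratic_drift, 1.0, 2.0, 200_000)
# The linear case is well behaved on the same interval.
_, x_lin, lin_exploded = euler_path(linear_drift, 1.0, 2.0, 200_000)
exact_linear = math.exp(2.0)

print("x(0.5) with quadratic drift:", x_half, "(exact 2)")
print("blow-up detected:", exploded, "at t close to", t_blow)
print("x(2) with linear drift:", x_lin, "(exact", exact_linear, ")")
```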
A final remark. The uniqueness concept used here refers to strong or pathwise uniqueness. There
are also definitions of weak uniqueness to mean that any two solutions (weak or strong) have the same
finite-dimensional distributions. For example, the Tanaka's equation introduced earlier has no strong
solution, yet it can be shown that it has a (weakly) unique weak solution.
4.12.2.3 Itô's lemma
Itô's lemma is a fundamental tool of analysis in continuous-time finance. It helps build up new
processes from old processes. Two examples might clarify.
(i) A share price is certainly a function of its dividend process. If the dividend process is solution
to some SDE, then the asset price is a solution to another SDE. Which SDE? Itô's lemma will
give us the answer.
(ii) Derivative products, reviewed in the third part of these lectures, are financial instruments, with
a value depending on some underlying factors, whence, the terminology, derivative. In other
words, derivative prices are functions of these factors. If factors are solutions to SDE, derivative
prices are also solutions to SDE. Once again, Itô's lemma will provide us with the right SDE.
Naturally, the functional form linking the dividend process (or the factors) to the asset prices is
unknown. But in situations of interest, no-arbitrage restrictions will help to pin down such a functional
form.
Let us proceed with a few preliminary heuristic considerations. A useful heuristic definition is that the increments of a Brownian motion, $dW_t$, can be thought of as being equal to $W_{t+\Delta} - W_t$ as $\Delta \to 0$. We may think of the increments $W_{t+\Delta} - W_t$ as being normally distributed, $N(0, \Delta)$. Heuristically, indeed, $dW_t \sim N(0, dt)$. But then, by the previous normality property of $dW_t$, $E(dW_t) = 0$ and $E(dW_t^2) = dt$, and $dW_t^2 / dt \sim \chi^2_1$, hence $\mathrm{var}(dW_t^2) = 2\,dt^2$, where the equality follows by the property of $\chi^2$ distributions.
The point of the previous computations is that for small $dt$, the variance of $dW_t^2$, which is proportional to $dt^2$, is negligible if compared to its expectation, which is $dt$. Heuristically, $dW_t^2$ can then be treated as if it were equal to $dt$. These heuristic considerations lead to the following, celebrated table below.

Itô's multiplication table
$(dt)^h = 0$ for $h > 1$
$dt \cdot dW = 0$
$(dW)^2 = dt$
$(dW)^h = 0$ for $h > 2$
$dW_1\,dW_2 = 0$ for two independent Brownian motions
)=
1
2
)(
+
(
1
2
)+
By the It
os multiplication table,
=
)=
)+
1
2
)2
+
2
( )2 +
|{z} +
1
) +
2
)2 + 2
By rearranging terms,
)2 + Remainder,
)2
1
2
1
+
2
) as many
| {z } + 2
| {z })
0
This is Itô's lemma.
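The second-order correction term in Itô's lemma can be seen at work numerically. The sketch below rests on an example chosen here and not taken from the text: a geometric Brownian motion dS = mu S dt + sigma S dW, for which Itô's lemma applied to ln S gives d ln S = (mu - sigma^2/2) dt + sigma dW. An Euler discretization of the SDE is compared, along the same simulated path, with the closed form implied by Itô's lemma and with the naive formula that ignores the correction. Parameter values and names are hypothetical.

```python
import math
import random


def simulate_gbm(mu, sigma, s0, horizon, n_steps, seed=7):
    """Return (Euler value, Ito closed form, naive closed form) at the horizon."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    s_euler, w = s0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        s_euler += mu * s_euler * dt + sigma * s_euler * dw
        w += dw
    # Solution implied by Ito's lemma applied to ln S.
    s_ito = s0 * math.exp((mu - 0.5 * sigma ** 2) * horizon + sigma * w)
    # What the ordinary chain rule would suggest (no second-order term).
    s_naive = s0 * math.exp(mu * horizon + sigma * w)
    return s_euler, s_ito, s_naive


s_euler, s_ito, s_naive = simulate_gbm(0.05, 0.2, 100.0, 1.0, 10_000)
print("Euler scheme       :", s_euler)
print("Ito closed form    :", s_ito)
print("naive chain rule   :", s_naive)
```

The Euler value tracks the Itô closed form, while the naive formula is off by the factor exp(sigma^2 T / 2).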
Naturally, Itô's lemma also holds when $x$ is a multidimensional process. A heuristic derivation of it
can be obtained through the Itô's multiplication table applied to the following expansion:
(
)=
1X
2
)=L (
or more formally,
(
)=
0) +
where
) and
) (
(4A.10)
1 >
) + Tr
( )
2
) are the gradient and Hessian of
with respect to .
L (
and
L (
)=
)+
1 +1
=(
2 +1
=
=
=(
1) 1
+(
1 1
1 2
1) 2
1 1
1 1
1)
+(
and
=(
)+(
and
. We have:
Assume that
=
The budget constraint can then be written as:
=(
=(
+ (
=(
1
1
+
+
|
= (0)
and = ( (0) ), where and are both -dimensional. Dene as usual wealth as of
. There are no dividends. A self-nancing strategy satises,
time as
+1 =
= 1
Therefore,
= +
= +
=
(because is self-nancing)
1
= 1
or,
=
=1
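The discrete-time self-financing condition says that the portfolio chosen at each date costs exactly the wealth brought into that date, so that wealth changes only through price changes. The sketch below checks this accounting identity on made-up numbers; the two-asset price table, the constant-mix rebalancing rule and all names are hypothetical and serve only as an illustration.

```python
def rebalance(wealth, prices, weights):
    """Holdings that put the given fractions of wealth into each asset."""
    return [w * wealth / p for w, p in zip(weights, prices)]


def value(holdings, prices):
    return sum(h * p for h, p in zip(holdings, prices))


# Each row is a date; columns are a money market account and a risky asset.
prices = [
    [1.00, 10.0],
    [1.01, 11.0],
    [1.02, 9.5],
    [1.03, 10.4],
]
weights = [0.4, 0.6]          # hypothetical constant-mix rule
wealth_path = [100.0]
gains = 0.0

for s_now, s_next in zip(prices, prices[1:]):
    theta = rebalance(wealth_path[-1], s_now, weights)
    # Self-financing: the new portfolio costs exactly current wealth.
    assert abs(value(theta, s_now) - wealth_path[-1]) < 1e-9
    # Gains from trade over the period: holdings times price changes.
    gains += sum(h * (p_next - p_now) for h, p_now, p_next in zip(theta, s_now, s_next))
    wealth_path.append(value(theta, s_next))

print("terminal wealth           :", wealth_path[-1])
print("initial wealth plus gains :", wealth_path[0] + gains)
```

Terminal wealth coincides with initial wealth plus accumulated gains from trade, which is the discrete-time counterpart of the stochastic integral representation of wealth.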
(0)
1
= 1
(0) (0)
(0) (0)
1
(0) (0)
1
=
=
+
(because is self-nancing)
1
1
and we obtain
= (1 + )
or,
=
=1
Such an equation can also be arrived at by noticing that current wealth is nothing but initial wealth
plus gains from trade accumulated up to now:
Z
= 0+
0
+
=
+
+
+
+
max
[ ( )| F 1 ]
For = 1
P :
= (1 + )
s.t.
1
1 +
Even if markets are incomplete, agents can solve the sequence of problems {P } =1 as time unfolds.
Each problem can be written as:
"
!
#
(
) F 1
max
1+
1
1 +
=1
leads to:
0(
) +1 F
( 0(
)| F )
= 0
where =
, i.e. the prices expressed in terms of the money market numeraire.
The previous relations suggest that we can define a martingale measure
for the price process
(expressed in terms of the money market numeraire), by defining
0(
)
=
0
( ( )| F )
F
>
0
0
0
1
[ 01
]. An arbitrage opportunity is
a.s., which comwhich implies, =
0
0
0
1
= 0
-a.s. (if a r.v. 0 and
() = 0, this
bined with the previous equality leaves:
0
1
means that = 0 a.s.) and, hence, -a.s. The last equality is in contradiction with Pr 0
0
0, as required by Definition 4.3.
Only if part. We combine portions of proofs in Karatzas (1997, thm. 0.2.4 pp. 6-7) and Øksendal
(1998, thm. 12.1.8b, pp. 256-257). We let:
( )={
={
n
=
( )
( )
( ):
( )>
h i}
( )> ( ( )
( ) = 0 and
o
( )) 6= 0
( ; ) =
( )> ( ( )
0
( ))
( )
for
for
( )
( )
Z
Z
> (
1
0
> (
1
0
I
)
>
0
+
0
=
=
+
0
+
0
+
0
Z
Z
)
0
where we used the fact that is adapted, the law of iterated expectations, the martingale property of
, and the definition of 0 .
That is,
0
>
=1
Plugging the solution
0
=
0
0 +
>
0
>
>
>
R
>
>1
>
0
0
1
0
>
>
>
>1
>
=
0
>
>
0
0=
=0
>
>1
(4A.11)
0
, and
].
,
0
a martingale starting at zero. We conclude by the same arguments used in the proof of the previous
part. k
Z
)=
(
)+
(
)
(
is:
(4A.12)
where
Our aim is to characterize this density in terms of partial differential equations. By the same reasoning
produced in Section 4.5, Eq. (4A.12) can be rewritten as:
Z
0
0 00
0( )
)=E
(
) (
)+
( ) (
)
(4A.13)
(
00
0( )
( (
)=E
(
) (
(
=
If
) and
(
Z
)+
) (
The function
),
( ( 0 )|
, and let
) be the risk-
) (
) (
) ( ( )| )
) (
) ( |
) are independent,
where:
) (
)=
) (
) (
) (
)=
We show the Green's function satisfies the same partial differential equation (PDE) satisfied by the
security price, but with a different boundary condition, and with the instantaneous dividend taken
out. We have:
Z
Z Z
)=
(
;
) (
)
+
(
;
) (
)
(4A.14)
(
Consider the scalar case. By Eq. (4A.13), and the Feynman-Kac connection between PDEs and conditional expectations reviewed in Section 4.2, we have that under regularity conditions, is solution
to:
1 2
+
+
(4A.15)
0= +
2
where is the risk-neutral drift of . Next, take the following partial derivatives of ( ) in Eq.
(4A.14):
=
=
=
(
Z
1
+
2
is solution to
0=
1
2
with lim
)= (
such that
Val ( )
(4A.16)
for all
[0 ]. Two small remarks on notation. We remind that
and such that leads to
we are dening as in Section 4.7, (i) Val ( ) as the value of the problem an investor faces in the
.
unconstrained market in Eqs. (4.138), and (ii) as the normalized portfolio process,
It's really this. As Cvitanić and Karatzas (1992) put it, we simply search for a member of a family
of unconstrained problems (those arising from the market in Eqs. (4.138)), one for which the optimal
portfolio actually satisfies the constraint (i.e. without imposing it a priori), and thereby solves the
original problem.
Note that because
contains the origin, the support function in Eq. (4.136) satises ( ) 0
. Moreover, by construction,
for each
(4A.17)
1
=
+
+
0 +
( )+
>
where = 1 ( 1 ), and 0 is the usual Brownian under the risk-neutral probability in a market
without frictions. If the price system is as in the artificial market of Eqs. (4.138), then, for any
unconstrained portfolio-consumption ( ), the dynamics of wealth,
say, are easily seen to be:
= > (
1 )+
+ >
>
>
=
+ ( )
+
+
(4A.18)
0
and given
where the second expression follows by a change in probability and the expressions for
in Eq. (4.139).
Therefore, for a given normalized portfolio-consumption ( ), we have that the wealth di erence,
, satises:
()
0
()=
|
+ ( )
{z
}
>
>
=0
()
>
0 = 0
(4A.19)
Because
0 by Eq. (4A.17), then, by a comparison theorem (e.g., Karatzas and Shreve (1991,
= 0, where the last equality follows because the solution to Eq. (4A.19) is
p. 291-295)),
= 0 , for some positive process . Therefore, we have,
with an equality if
( )+
>
= 0 for all
and
( ) +
>
=0
),
(4A.20)
such that
(4A.21)
(Meaning that it's a portfolio chosen without imposing the constraint, which then happens to satisfy
the constraint.)
By the inequality in (4A.20), we have that Val ( ; ) Val ( ) for all and, hence,
Val ( ;
Val ( )
and Val ( ;
inf (Val ( ))
(4A.22)
Moreover, we have,
Val ( ;
)=
= Val ( )
(4A.23)
where the second line follows, because the value of the original problem is, of course, larger than that
of any constrained and not-optimally chosen portfolio-consumption ( ). In other words, note that
as soon as the second equality in Eq. (4A.21) holds true, then, the dynamics of wealth in Eq. (4A.18)
collapse to those in the original market with = 0, where the agent is constrained in
(
), such
that the second line follows, because ( ) are at this stage not optimally chosen.
The third line of (4A.23) follows by Eq. (4A.21) and (4A.20). The fourth line is the definition of
Val ( ). Combining the first inequality in (4A.22) with Eq. (4A.23) leaves,
Val ( ) = Val ( ;
) = inf (Val ( ))
under
and to
(
1)
= (
1)
1)
under
As explained in Section 4.7, these are in fact densities of time intervals elapsing from one arrival to
the next one.
. The Radon-Nikodym derivative is the
Next, let
be the event of marks at time 1 2
likelihood ratio of the two probabilities and of :
1
( )
=
( )
( 1)
1
2
1
( 1)
( 2)
2
1
( 1)
3
2
( 2)
3
2
( 2)
where we have used the fact that given that at 0 = , there are no-jumps, the probability that
1
1
no-jumps would occur from to 1 is
under , and
under . Simple algebra
yields,
( )
=
( )
=
( 1)
Y
=1
"
Y
= exp ln
= exp
=1
ln
=1
= exp
( )
X
Z
( 2)
ln
2
1
1)
3
2
1)
1)
( )
( )
1)
1)
!#
where the last equality follows from the definition of the Stieltjes integral.
Consider, finally, the following definition. Let $M$ be a martingale. The unique solution to the equation:
$$L_t = 1 + \int_0^t L_{s-}\,dM_s$$
is named the Doléans-Dade exponential semimartingale and is denoted as $\mathcal{E}(M)$. We now turn to the
arbitrage restrictions arising whilst dealing with asset prices driven by jump-diffusion processes.
+ S
+ S(
=( + S )
Next, dene
=
Both and are
-martingales. We have:
=
+ S(
+ S
)+ S
+ S
The characterization of the equivalent martingale measure for the discounted price is given by the
following Radon-Nikodym density of with respect to :
=E
1 (
S (S)
= +
S (S)
Clearly, markets are incomplete here. It is possible to show that if S is deterministic, a representative
1
agent with utility function ( ) = 1 1 makes (S) = (1 + S) .
ln
1
+
ln
In terms of ,
is
= ( ) with ( ) =
or,
. We have:
+
+
+ jump
ln
1 (
The general case (with stochastic distribution) is covered in the following subsection.
+
-martingale,
)
=
)
=
where
nally:
)
(
)+ (
)
(
)
)
To generalize the steps made to deal with the standard diffusion case, let
=
Finally, by the
+
+ (1 +
= 1.
=1
be
(1 +
where
1+
-martingales. Let
)=
is taken with respect to the jump-size distribution, which is the same under
and
We wish to nd
such that is a
=1
-martingale, viz
= E ( )
i.e.,
E( ) =
i.e.,
is a
-martingale.
By Itô's lemma,
( ) =
=
=
=
Because ,
are
and
+ +
+
+ [|
+
-martingales,
Z
, 0=
But
=
and since (
)2 =
+
,
1+
]+
) =
{z
}+
i
)
a.s.
References
Arnold, L. (1974): Stochastic Differential Equations: Theory and Applications. New York: Wiley.
Black, F. and M. Scholes (1973): The Pricing of Options and Corporate Liabilities. Journal of Political Economy 81, 637-659.
Brémaud, P. (1981): Point Processes and Queues: Martingale Dynamics. Berlin: Springer Verlag.
Cvitanić, J. and I. Karatzas (1992): Convex Duality in Constrained Portfolio Optimization. Annals of Applied Probability 2, 767-818.
Föllmer, H. and M. Schweizer (1991): Hedging of Contingent Claims under Incomplete Information. In: Davis, M. and R. Elliott (Editors): Applied Stochastic Analysis. New York: Gordon & Breach, 389-414.
Friedman, A. (1975): Stochastic Differential Equations and Applications (Vol. I). New York: Academic Press.
Harrison, J.M. and S. Pliska (1983): A Stochastic Calculus Model of Continuous Trading: Complete Markets. Stochastic Processes and Their Applications 15, 313-316.
Harrison, J.M., R. Pitbladdo and S.M. Schaefer (1984): Continuous Price Processes in Frictionless Markets Have Infinite Variation. Journal of Business 57, 353-365.
He, H. and N. Pearson (1991): Consumption and Portfolio Policies with Incomplete Markets and Short-Sales Constraints: The Infinite Dimensional Case. Journal of Economic Theory 54, 259-304.
Jeanblanc-Picqué, M. and A.N. Shiryaev (1995): Optimization of the Flow of Dividends. Russian Mathematical Surveys 50, 257-277.
Karatzas, I. and S.E. Shreve (1991): Brownian Motion and Stochastic Calculus. New York: Springer Verlag.
Krugman, P. (1991): Target Zones and Exchange Rate Dynamics. Quarterly Journal of Economics 106, 669-682.
McDonald, R.L. and D.R. Siegel (1986): The Value of Waiting to Invest. Quarterly Journal of Economics 101, 707-727.
Mikosch, T. (1998): Elementary Stochastic Calculus with Finance in View. Singapore: World Scientific.
Revuz, D. and M. Yor (1999): Continuous Martingales and Brownian Motion. New York: Springer Verlag.
Shreve, S. (1991): A Control Theorist's View of Asset Pricing. In: Davis, M. and R. Elliott (Editors): Applied Stochastic Analysis. New York: Gordon & Breach, 415-445.
Steele, J.M. (2001): Stochastic Calculus and Financial Applications. New York: Springer Verlag.
5 Taking models to data
5.1 Introduction
This chapter surveys methods to estimate and test dynamic models of asset prices. It begins
with foundational issues on identification, specification and testing. Then, it surveys classical
estimation and testing methodologies such as the Method of Moments, where the number of
moment conditions equals the dimension of the parameter vector (Pearson, 1894); Maximum
Likelihood (ML) (Gauss, 1816; Fisher, 1912); the Generalized Method of Moments (GMM),
where the number of moment conditions exceeds the dimension of the parameter vector, leading
to the minimum chi-squared (Neyman and Pearson, 1928; Hansen, 1982); and, finally, the
relatively more recent developments relying on simulations, which aim to implement ML and
GMM estimation for models that are analytically quite complex, but that can be simulated.
The chapter concludes with an illustration of how joint estimation of fundamentals and asset
prices in arbitrage-free models can asymptotically lead to statistical efficiency.
While
to get
given
, with
where
=( 1
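), and $\ell_0$ denotes the conditional density of the data, the true law. Then, we have three basic definitions. First, we define a parametric model as a set of conditional laws for $y_t$, indexed by a parameter vector $\theta \in \Theta \subseteq \mathbb{R}^p$,
$$(M) = \{\, \ell(y_t | x_{t-1}; \theta),\ \theta \in \Theta \subseteq \mathbb{R}^p \,\}.$$
Second, we say that the model $(M)$ is well-specified if,
$$\exists\, \theta_0 \in \Theta : \ \ell(y_t | x_{t-1}; \theta_0) = \ell_0(y_t | x_{t-1}).$$
Third, we say that the model $(M)$ is identifiable if $\theta_0$ is unique. The main goal of this chapter is to review tools aimed at drawing inference about the true parameter $\theta_0$, given the observations.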
5.2.2 Restrictions on the DGP
The previous definition of DGP is too rich to be of practical relevance. This chapter deals
with estimation methods applying to DGPs satisfying a few restrictions. Two fundamental
restrictions are usually imposed on the DGP:
Restrictions on the heterogeneity of the stochastic process, which lead to stationary random processes.
Restrictions on the memory of the stochastic process, which pave the way to ergodic
processes.
5.2.2.1 Stationarity
Stationary processes describe phenomena leading to long run equilibria, in some statistical sense: as time unfolds, the probability generating the observations settles down to some long-run probability density, a time invariant probability. As Chapter 3 explains, in the early 1980s, theorists began to define a long-run equilibrium as a well-defined stationary probability distribution generating economic outcomes. We have two notions of stationarity: (i) Strong, or strict, stationarity. Definition: Homogeneity in law; (ii) Weak stationarity, or stationarity of order $p$. Definition: Homogeneity in moments.
Even with stationary DGP, there might be situations where the number of parameters to be estimated increases with the sample size. As an example, consider two stochastic processes: one, for which $\mathrm{cov}(y_t, y_{t+\tau}) = \tau^2$; and another, for which $\mathrm{cov}(y_t, y_{t+\tau}) = \exp(-|\tau|)$. In both cases, the DGP is stationary. Yet for the first process, the dependence increases with $\tau$, and for the second, the dependence decreases with $\tau$. As this simple example reveals, a stationary stochastic process may have long memory. Ergodicity further restricts DGP, so as to make this memory play a more limited role.
5.2.2.2 Ergodicity
We shall deal with DGPs where the dependence between 1 and 2 decreases with | 2
1 |.
To introduce some concepts and notation, say two events
and
are independent, when
(
) = ( ) ( ). A stochastic process is asymptotically independent if, for some function
,
| ( 1
( 1
) ( 1+
+ )
+ )|
1+
0. A stochastic process is -dependent if
,
6= 0.
we also have that lim
A stochastic p
process is asymptotically uncorrelated if there exists
such
that
for
all
,
P
(
( )
( + ), and that 0
1 with
. For example,
+ )
=0
= (1+ ) ,
0, in which case
0 as
.
Let B1 denote the -algebra generated by { 1
} and
B ,
B + , and dene:
= sup | (
( ) ( )|
( ) = sup | (
( )|
( )
of the model,
R}
; )
Naturally, any estimator does necessarily depend on the sample size, which we write as
( ). Of a given estimator , we say that it is:
Correct, or unbiased, if
( ) =
0.
0.
( )
The di erence
0.
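Finally, an estimator $\hat\theta_T^{(1)}$ is more efficient than another estimator $\hat\theta_T^{(2)}$ if, for any vector of constants $c$, we have that $c^\top \mathrm{var}(\hat\theta_T^{(1)})\,c < c^\top \mathrm{var}(\hat\theta_T^{(2)})\,c$.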
=
1
or,
1
1
Z
( ; )
=1
,
where 0 is a column vector of zeros in R . Moreover, for all
Z
( ; ) =
[ ln ( ; )]
0 =
Finally, we have,
0
=
=
Z
[
ln ( ; )] ( ; )
ln ( ; )] ( ; )
ln ( ; )] =
(5.1)
ln ( ; )|2 =
>
ln ( ; )|2 ( ; )
ln ( ; )]
J( )
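Let $\hat\theta(y)$ be some unbiased estimator of $\theta$, and set the dimension of the parameter space to $p = 1$. We have,
$$E[\hat\theta(y)] = \int \hat\theta(y)\,L(y;\theta)\,dy.$$
Under regularity conditions,
$$\nabla_\theta E[\hat\theta(y)] = \int \hat\theta(y)\,[\nabla_\theta \ln L(y;\theta)]\,L(y;\theta)\,dy = \mathrm{cov}\big(\hat\theta(y),\ \nabla_\theta \ln L(y;\theta)\big),$$
where the last equality holds because the score, $\nabla_\theta \ln L(y;\theta)$, has zero expectation. By Cauchy-Schwarz inequality, $[\mathrm{cov}(\hat\theta(y), \nabla_\theta \ln L(y;\theta))]^2 \le \mathrm{var}[\hat\theta(y)] \cdot \mathrm{var}[\nabla_\theta \ln L(y;\theta)]$. Therefore,
$$[\nabla_\theta E(\hat\theta(y))]^2 \le \mathrm{var}[\hat\theta(y)] \cdot \mathrm{var}[\nabla_\theta \ln L(y;\theta)] = \mathrm{var}[\hat\theta(y)] \cdot J(\theta).$$
But if $\hat\theta(y)$ is unbiased, or $E[\hat\theta(y)] = \theta$, then $\nabla_\theta E[\hat\theta(y)] = 1$ and
$$\mathrm{var}[\hat\theta(y)] \ge J(\theta)^{-1}.$$
This is the celebrated Cramér-Rao bound. The same results hold in the multidimensional
case, through a mere change in notation (see, e.g., Amemiya, 1985, p. 14-17).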
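The bound can be verified exactly in a simple case. The sketch below uses an example chosen here for illustration, not taken from the text: n independent Bernoulli draws with success probability p, for which the score and the Fisher information can be computed by direct summation over the two outcomes, and for which the sample mean is an unbiased estimator whose variance attains the bound. Function names and parameter values are hypothetical.

```python
def bernoulli_fisher_information(p):
    """Information in one Bernoulli(p) draw: expected squared score, by direct summation."""
    score_one = 1.0 / p             # derivative of ln p with respect to p
    score_zero = -1.0 / (1.0 - p)   # derivative of ln(1 - p) with respect to p
    return p * score_one ** 2 + (1.0 - p) * score_zero ** 2


def variance_of_sample_mean(p, n):
    """Exact variance of the average of n independent Bernoulli(p) draws."""
    return p * (1.0 - p) / n


p, n = 0.3, 50
# With independent draws, the information of the whole sample is n times that of one draw.
bound = 1.0 / (n * bernoulli_fisher_information(p))
var_mean = variance_of_sample_mean(p, n)
# The score has zero expectation, which is what the covariance step relies on.
expected_score = p * (1.0 / p) + (1.0 - p) * (-1.0 / (1.0 - p))

print("Cramer-Rao bound        :", bound)
print("variance of sample mean :", var_mean)
print("expected score          :", expected_score)
```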
The density of the data, $L_T(y_1^T; \theta)$, maps every possible sample and parameter values of $\theta$ on to positive numbers, the likelihood of occurrence of any given sample, given the parameter. We trace the joint density of the entire sample through a thought experiment, in which we change the sample $y_1^T$. So the sample is viewed as the realization of a random variable, a view opposite to the Bayesian perspective. We ask: Which value of $\theta$ makes the sample we observed the most likely to have occurred? We introduce the likelihood function, $L_T(\theta | y_1^T) \equiv L_T(y_1^T; \theta)$. It is the function $\theta \mapsto L_T(y; \theta)$ for $y_1^T$ given and equal to $\bar{y}$, say:
$$L_T(\theta | \bar{y}) \equiv L_T(\bar{y}; \theta).$$
Then, we maximize $L_T(\theta | y_1^T)$ with respect to $\theta$. That is, we look for the value of $\theta$, which maximizes the probability to observe the sample we have effectively observed. The resulting estimator is called maximum likelihood estimator (MLE). As we shall see, the MLE attains the Cramér-Rao lower bound, provided the model is not misspecified.
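A small numerical example may help fix ideas. The sketch below assumes, for illustration only, i.i.d. data drawn from an exponential density rate * exp(-rate * y); in that case the log-likelihood is maximized in closed form at the reciprocal of the sample mean, and the inverse Fisher information of the sample is rate^2 / n. The sample size, the seed and all names are hypothetical.

```python
import math
import random


def log_likelihood(rate, sample):
    """Log-likelihood of i.i.d. exponential data with density rate * exp(-rate * y)."""
    return len(sample) * math.log(rate) - rate * sum(sample)


rng = random.Random(2016)
true_rate = 2.0
sample = [rng.expovariate(true_rate) for _ in range(5000)]

# First-order condition n / rate - sum(y) = 0 gives the maximizer in closed form.
rate_mle = len(sample) / sum(sample)
# Asymptotic standard error from the inverse Fisher information, rate^2 / n.
std_error = rate_mle / math.sqrt(len(sample))

print("MLE of the rate :", rate_mle)
print("standard error  :", std_error)
print("log-likelihood at the MLE and nearby:",
      log_likelihood(rate_mle, sample),
      log_likelihood(0.95 * rate_mle, sample),
      log_likelihood(1.05 * rate_mle, sample))
```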
5.3.2 Factorizations
Consider a series of events $\{A_t\}_{t=1}^{T}$. In the Appendix, we show that,
$$\Pr\Big( \bigcap_{t=1}^{T} A_t \Big) = \prod_{t=1}^{T} \Pr\Big( A_t \,\Big|\, \bigcap_{i=1}^{t-1} A_i \Big). \qquad (5.2)$$
( ) = arg max
( )
ln
( )
ln
1
1
=1
and
ln
=1
1
1
ln ( ; )
=1
( )
(5.3)
=1
ln
( )|
ln
( )
ln
( ) =
ln
( 0) +
ln
0,
( 0 )(
0)
1
= arg max [ ( ( ))]
ln ( )
0 = arg max lim
and, nally,
(5.4)
0
is dened as the
( 0 )] = 0
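To show that this is indeed the solution, suppose $\theta_0$ is identified; that is, $\theta \ne \theta_0$ and $\theta \in \Theta$ imply $\ell(y|\theta) \ne \ell(y|\theta_0)$. Suppose, further, that for each $\theta \in \Theta$, $E_{\theta_0}[\ln \ell(y|\theta)] < \infty$. Then, we have that $\theta_0 = \arg\max_{\theta \in \Theta} E_{\theta_0}[\ln \ell(y|\theta)]$, and this value of $\theta$ is unique. The proof is, indeed, very simple. We have, by Jensen's inequality,
$$E_{\theta_0}\Big[ \ln \frac{\ell(y|\theta)}{\ell(y|\theta_0)} \Big] < \ln E_{\theta_0}\Big[ \frac{\ell(y|\theta)}{\ell(y|\theta_0)} \Big] = \ln \int \frac{\ell(y|\theta)}{\ell(y|\theta_0)}\,\ell(y|\theta_0)\,dy = \ln \int \ell(y|\theta)\,dy = 0.$$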
( ( ))|
( )
Next, consider again the asymptotic expansion in Eq. (5.4), which can be elaborated, so as to
have,
(
0) =
"
( 0)
ln
1X
( 0)
1 X
( 0)
=1
ln
( 0)
=1
By the law of large numbers reviewed in the Appendix (weak law no. 1),
1X
( 0)
J ( 0)
( 0 )] =
=1
Therefore, asymptotically,
(
0) = J ( 0)
1 X
( 0)
=1
We also have,
1 X
(0 J ( 0 ))
( 0)
=1
P
Indeed, let
( 0) = 1
( 0 ), and note that
=1
limit theorem reviewed in the Appendix:
P
1
p =1
( 0)
=
( 0 )]
( 0)
p
(
[
( 0 )]
( 0 )] = J ( 0 ).
where, for each ,
[
Finally, by the Slutzky's theorem reviewed in the Appendix,
(
0)
0 J ( 0) 1
( 0 ))
5.4 M-estimators
Consider a function of the unknown parameters . Given a function
function ( ) is the solution to,
max
, a M-estimator of the
; )
=1
where and are as in Section 5.2.1. We assume that a solution to this problem exists, that it
is interior and that it is unique. Let us denote the M-estimator with ( 1 1 ). Naturally, the
M-estimator satisfies the following first order conditions,
0=
1X
=1
; (
; )
=1
ZZ
; )
)=
( | )
; )
( )
[ (
; )]
where 0 is the expectation operator taken with respect to the true conditional law of given
and
is the expectation operator taken with respect to the true marginal law of . The
limit problem is,
=
( 0 ) = arg max
[ (
; )]
Under standard regularity conditions,2 there exists a sequence of M-estimators ( ) converging a.s. to
=
( 0 ). Under additional regularity conditions, the M-estimator is also
asymptotically normal:
>
Theorem 5.1: Let I
( ;
( 0 )) [
( ;
( 0 ))]
and assume that the
0
( ; )] exists and has an inverse. We have,
matrix J
0[
(
0 J
( 0 ))
IJ
Sketch of the proof. The M-estimator satisfies the following first order conditions,
0 =
1 X
; )
=1
1 X
=1
)+
"
1X
=1
) (
is compact; is continuous with respect to and integrable with respect to the true law, for each ;
; )] uniformly on ; the limit problem has a unique solution
=
( 0 ).
0[ (
227
1
=1
; )
)=
"
1X
; ))]
1 X
1 X
=1
=[
=J
=1
1 X
)=
=1
=1
0[
1 X
)] = 0. Then,
=1
]> =
(0 I)
( ; )]2
=1
. In this case,
; )=[
( ; )]2 .
(
0 J 1 IJ 1
0)
where
J =
ln ( |
1; 0)
ln ( |
i>
1; 0)
1X
1 X
and I =
; )
=1
; )
=1
; )>
5.6 GMM
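Economic theory often places restrictions on models that have the following format,
$$E[h(y_t; \theta_0)] = 0, \quad (5.5)$$
where $h$ is a vector of $q$ moment conditions and $\theta_0$ is the true value of the $p$-dimensional parameter vector. The GMM estimator is
$$\hat\theta_T = \arg\min_\theta\ \bar h(y_1^T; \theta)^\top W_T\, \bar h(y_1^T; \theta), \qquad \bar h(y_1^T; \theta) \equiv \frac{1}{T}\sum_{t=1}^{T} h(y_t; \theta), \quad (5.6)$$
where $\{W_T\}$ is a sequence of weighting matrices, with elements that may depend on the observations.
When $p = q$, we say the GMM is just-identified, and is, simply, the MM, satisfying:
$$\hat\theta_T : \bar h(y_1^T; \hat\theta_T) = 0.$$
When $q > p$, we say the GMM estimator imposes overidentifying restrictions.
We analyze the i.i.d. case only. Under regularity conditions, there exists a matrix $W_T$ that
minimizes the asymptotic variance of the GMM estimator, which satisfies asymptotically,
$$W = \left\{\lim_{T \to \infty} E\left[T\, \bar h(y_1^T; \theta_0)\, \bar h(y_1^T; \theta_0)^\top\right]\right\}^{-1} \equiv \Sigma_0^{-1}. \quad (5.7)$$
An estimator of $\Sigma_0$ can be:
$$\Sigma_T = \frac{1}{T}\sum_{t=1}^{T} h(y_t; \hat\theta_T)\, h(y_t; \hat\theta_T)^\top.$$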
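A two-step GMM estimator can be sketched in a few lines. The example below is illustrative only and assumes a setting that is not in the notes: i.i.d. exponential data with mean `theta0`, and two moment conditions for one parameter, E[x - theta] = 0 and E[x^2 - 2 theta^2] = 0, so that there is one overidentifying restriction. The first step uses the identity as weighting matrix; the second uses the inverse of the estimated covariance matrix of the moment conditions. All function names are hypothetical.

```python
import math
import random


def golden_min(f, lo, hi, tol=1e-9):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def hbar(theta, m1, m2):
    """Sample average of the two moment conditions, given the first two sample moments."""
    return (m1 - theta, m2 - 2.0 * theta ** 2)


def quad(h, W):
    """Quadratic form h' W h for a 2-vector h and a 2x2 matrix W."""
    return (h[0] * (W[0][0] * h[0] + W[0][1] * h[1])
            + h[1] * (W[1][0] * h[0] + W[1][1] * h[1]))


def inv2(S):
    """Inverse of a 2x2 matrix."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return [[S[1][1] / det, -S[0][1] / det],
            [-S[1][0] / det, S[0][0] / det]]


random.seed(2)
theta0 = 1.5  # hypothetical true mean
T = 5000
data = [random.expovariate(1.0 / theta0) for _ in range(T)]
m1 = sum(data) / T
m2 = sum(x * x for x in data) / T

# Step 1: identity weighting matrix.
W1 = [[1.0, 0.0], [0.0, 1.0]]
theta_1 = golden_min(lambda th: quad(hbar(th, m1, m2), W1), 0.05, 10.0)

# Step 2: weighting matrix equal to the inverse of the estimated covariance of h.
s11 = s12 = s22 = 0.0
for x in data:
    h1, h2 = x - theta_1, x * x - 2.0 * theta_1 ** 2
    s11 += h1 * h1
    s12 += h1 * h2
    s22 += h2 * h2
Sigma_T = [[s11 / T, s12 / T], [s12 / T, s22 / T]]
W2 = inv2(Sigma_T)
theta_2 = golden_min(lambda th: quad(hbar(th, m1, m2), W2), 0.05, 10.0)

# Overidentification statistic: T times the minimized second-step objective.
j_stat = T * quad(hbar(theta_2, m1, m2), W2)
print(theta_1, theta_2, j_stat)
```

Under the assumptions of this sketch, `j_stat` would be compared with a chi-square distribution with q - p = 1 degree of freedom.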
where
(
0
( ) 0
( )
( ; 0)
0)
Sketch of the proof: The assumption that
conditions. Moreover, the GMM satises,
(
0 =
; )
1;
1
(5.8)
>
( ; ) =
(
(1)
0) +
1
1; 0 +
1; 0
1
; )
; )
is
,
( 1 ; )
1; 0 +
; )
; )
>
0) +
(1)
The l.h.s. of this equality is zero by the first order conditions in Eq. (5.8). By rearranging
terms,
(
0)
We have: 1
> =
1 P
( ; )
=1
( ;
0 . Hence:
1 P
>
( ; )]>
=1
( )
=1
1
0
( )>
0)
( ( )
1 X
( )
1 X
; )
1X
0)
( ; )
=1
( ;
0)
=1
( ;
(0
( ) = 0, and
( )=
0)
=1
Therefore,
( )
1
0
( )>
0)
1
0
1
0
( )>
230
( )
1
0
( )>
>
( )
1
0
( )>
k
A widely used global specification test is that of the celebrated overidentifying restrictions.
Consider the following intuitive result:
$$\sqrt{T}\,\bar h(y_1^T; \theta_0)^\top\, \Sigma_0^{-1}\, \sqrt{T}\,\bar h(y_1^T; \theta_0) \xrightarrow{d} \chi^2(q).$$
Would we expect the same, if we were to replace the true parameter $\theta_0$ with the GMM
estimator $\hat\theta_T$, which is, anyway, a consistent estimator for $\theta_0$? The answer is no. Define:
(
C =
; )>
; )
We have,
(
=
=
= (I
; ) =
1; 0
1; 0
1
P)
and
;
h
( )>
1; 0
> h
( )
( )
(
1
0
( )>
1
0
( )>
1
0)
1
( )
( )
1
0
>
( )
( )
>
( )
( )
1
0
is the orthogonal projector in the space generated by the columns of ( ) by the inner product
1
0 . Thus, we have shown that,
>
(I
P )> 1 (I
P)
C =
1; 0
1; 0
But,
(0
(
0)
Hansen and Singleton (1982, 1983) started the literature on the estimation and testing of dynamic asset pricing models within a fully articulated rational expectations framework. Consider
the classical system of Euler equations arising in the Lucas tree,
$$E\left[\beta\,\frac{u'(c_{t+1})}{u'(c_t)}\,(1 + r_{i,t+1}) - 1 \,\Big|\, F_t\right] = 0, \quad i = 1, \dots, m,$$
where $u$ is the utility function of the representative agent, $r_i$ is the return on asset $i$, $\beta$ is the
time-discount factor, $F_t$ is the information set as of time $t$, and $m$ is the number of assets.
Consider the CRRA utility function, $u(c) = c^{1-\eta}/(1-\eta)$. If the model is well-specified, then,
there exist some $\beta_0$ and $\eta_0$ such that:
$$E\left[\beta_0 \left(\frac{c_{t+1}}{c_t}\right)^{-\eta_0} (1 + r_{i,t+1}) - 1 \,\Big|\, F_t\right] = 0, \quad i = 1, \dots, m.$$
To sum up, the dimension of the parameter vector is $p = 2$. To estimate the true parameter
vector $\theta_0 \equiv (\beta_0, \eta_0)$, we may build up a system of orthogonality conditions. This system can
be based on projecting observable variables predicted by the model onto other variables, some
instruments included in the information set $F_t$:
$$E[h(y_t; \theta_0)] = 0,$$
where, for some vector of $z$ instruments, say, $\mathrm{In}_t = [i_{1,t}, \dots, i_{z,t}]^\top$,
$$h(y_t; \theta) = \begin{bmatrix} \left(\beta\,(c_{t+1}/c_t)^{-\eta}\,(1 + r_{1,t+1}) - 1\right) \mathrm{In}_t \\ \vdots \\ \left(\beta\,(c_{t+1}/c_t)^{-\eta}\,(1 + r_{m,t+1}) - 1\right) \mathrm{In}_t \end{bmatrix}. \quad (5.9)$$
The instruments used to produce the orthogonality restrictions may include constants, past
values of consumption growth, $c_{t+1}/c_t$, or even past returns.
+1 ; 0 )
(5.10)
where
: R R 7 R , and
is a vector of i.i.d. disturbances in R . Assume the
econometrician knows the function . Let = (
. In many cases of
1
+1 ),
1X
=1
where,
=
[
|
(
( (
{z
(
))]
}
(5.11)
0)
The basic idea underlying simulation-based methods is quite simple. While the moment conditions are too complex to be evaluated analytically, the model in Eq. (5.10) can be simulated.
Accordingly, draw $\tilde\epsilon_t$ from its distribution, and save the simulated values $\tilde\epsilon_t$. Compute recursively,
$$\tilde y_{t+1}(\theta) = H(\tilde y_t(\theta), \tilde\epsilon_{t+1}; \theta),$$
and define the estimator,
$$\hat\theta_T = \arg\min_\theta\ G_T(\theta)^\top W_T\, G_T(\theta), \quad (5.12)$$
where
$$G_T(\theta) = \frac{1}{T}\sum_{t=1}^{T} f_t - \frac{1}{S(T)}\sum_{s=1}^{S(T)} \tilde f_s(\theta),$$
$f_t$ and $\tilde f_s(\theta)$ are the moment functions evaluated on the observed and on the simulated data,
and $S(T)$ is the simulated sample size, which we write as a function of the sample size $T$, for
the purpose of the asymptotic theory.
The estimator $\hat\theta_T$, also known as the Simulated Method of Moments (SMM) estimator, aims to
match the sample properties of the actual and simulated processes. It was introduced
in a series of works, by McFadden (1989), Pakes and Pollard (1989), Lee and Ingram (1991)
and Duffie and Singleton (1993). The simulated pseudo-maximum likelihood method of Laroque
and Salanie (1989, 1993, 1994) can also be interpreted as a SMM estimator.
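The mechanics of the SMM can be illustrated with a small sketch. The assumptions here are ours and not the notes': the data-generating process is a Gaussian AR(1), y_{t+1} = theta * y_t + eps_{t+1}; the moments to match are the sample second moment and the first-order autocovariance; the weighting matrix is the identity; and the minimization is a plain grid search. The same simulated shocks are reused for every candidate `theta`, so the objective is a smooth function of the parameter.

```python
import random


def simulate_ar1(theta, shocks):
    """Iterate y_{t+1} = theta * y_t + eps_{t+1}, starting from y_0 = 0."""
    y, path = 0.0, []
    for eps in shocks:
        y = theta * y + eps
        path.append(y)
    return path


def moment_vector(path):
    """Sample second moment and first-order autocovariance of a zero-mean series."""
    n = len(path)
    second = sum(v * v for v in path) / n
    autocov = sum(path[i] * path[i - 1] for i in range(1, n)) / (n - 1)
    return (second, autocov)


random.seed(3)
theta0 = 0.5  # hypothetical true parameter
T = 2000      # observed sample size
S = 3 * T     # simulated sample size S(T)

observed = simulate_ar1(theta0, [random.gauss(0.0, 1.0) for _ in range(T)])
data_moments = moment_vector(observed)

# Common random numbers: one set of shocks, reused for all candidate parameters.
sim_shocks = [random.gauss(0.0, 1.0) for _ in range(S)]


def smm_objective(theta):
    """G_T(theta)' W G_T(theta) with W equal to the identity matrix."""
    sim_moments = moment_vector(simulate_ar1(theta, sim_shocks))
    g = [d - s for d, s in zip(data_moments, sim_moments)]
    return g[0] ** 2 + g[1] ** 2


grid = [i / 50.0 for i in range(-45, 46)]  # candidate values in [-0.9, 0.9]
theta_hat = min(grid, key=smm_objective)
print(theta_hat)
```

A grid search is used only to keep the sketch short and dependency-free; in practice a numerical optimizer and an estimated optimal weighting matrix would replace it.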
A second simulation-based estimator relies on the indirect inference principle (IIP), and was
proposed by Gourieroux, Monfort and Renault (1993) and Smith (1993). Instead of minimizing
the distance of some moment conditions, the IIP relies on minimizing the distance between the parameters of an
auxiliary, possibly misspecified model estimated on the observed data and on data simulated from the model of interest. For example, consider the following auxiliary parameter
estimator,
$$\hat\beta_T = \arg\max_\beta\ \ln L(y_1^T; \beta), \quad (5.13)$$
where $L$ is the likelihood of some possibly misspecified model. Consider simulating $S$ times the
process in Eq. (5.10), and computing,
$$\tilde\beta_s(\theta) = \arg\max_\beta\ \ln L(\tilde y^s(\theta)_1^T; \beta), \quad s = 1, \dots, S.$$
The diagram in Figure 5.1 illustrates the main ideas underlying the IIP.
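[Figure 5.1: diagram. Upper branch: Model $y_t = H(y_{t-1}, \epsilon_t; \theta)$ -> Model-simulated data $\tilde y(\theta) = (\tilde y_1(\theta), \dots, \tilde y_T(\theta))$ -> Estimation of an auxiliary model on model-simulated data -> Auxiliary parameter estimates $\tilde\beta(\theta)$. Lower branch: Observed data $(y_1, \dots, y_T)$ -> Estimation of the same auxiliary model on observed data -> Auxiliary parameter estimates $\hat\beta_T$. Both branches feed the Indirect Inference Estimator, $\arg\min_\theta \|\tilde\beta(\theta) - \hat\beta_T\|$.]
FIGURE 5.1. The Indirect Inference principle. Given the true model $y_t = H(y_{t-1}, \epsilon_t; \theta)$, an estimator
of $\theta$ based on the indirect inference principle ($\hat\theta$ say) makes the parameters of some auxiliary model
$\tilde\beta(\theta)$ as close as possible to the parameters $\hat\beta_T$ of the same auxiliary model estimated on the
observations. That is, $\hat\theta = \arg\min_\theta \|\tilde\beta(\theta) - \hat\beta_T\|_\Omega$, for some norm $\Omega$.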
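The two branches of the diagram can be sketched as follows. Everything specific in this example is an assumption made for illustration and does not come from the notes: the "true" model is the nonlinear autoregression y_t = theta * tanh(y_{t-1}) + eps_t; the auxiliary model is a linear AR(1) without intercept, estimated by OLS (its Gaussian pseudo-ML estimator); and the indirect inference estimator is found by grid search, averaging the auxiliary estimates over S simulated paths of length T.

```python
import math
import random


def simulate(theta, shocks):
    """Iterate y_t = H(y_{t-1}, eps_t; theta) = theta * tanh(y_{t-1}) + eps_t."""
    y, path = 0.0, []
    for eps in shocks:
        y = theta * math.tanh(y) + eps
        path.append(y)
    return path


def auxiliary_estimate(path):
    """OLS slope of y_t on y_{t-1} (no intercept): the auxiliary AR(1) parameter."""
    num = sum(path[i] * path[i - 1] for i in range(1, len(path)))
    den = sum(path[i - 1] ** 2 for i in range(1, len(path)))
    return num / den


random.seed(5)
theta0 = 0.6  # hypothetical true parameter
T = 3000      # observed sample size
S = 3         # number of simulated paths

# Lower branch of the diagram: auxiliary model estimated on "observed" data.
observed = simulate(theta0, [random.gauss(0.0, 1.0) for _ in range(T)])
beta_hat = auxiliary_estimate(observed)

# Upper branch: auxiliary model estimated on model-simulated data,
# with the same simulated shocks reused for every candidate theta.
sim_shocks = [[random.gauss(0.0, 1.0) for _ in range(T)] for _ in range(S)]


def binding(theta):
    """Average auxiliary estimate over the S simulated paths."""
    return sum(auxiliary_estimate(simulate(theta, e)) for e in sim_shocks) / S


grid = [i / 50.0 for i in range(0, 48)]  # candidate values in [0, 0.94]
theta_hat = min(grid, key=lambda th: (binding(th) - beta_hat) ** 2)
print(beta_hat, theta_hat)
```

Note that the auxiliary estimate `beta_hat` is not itself a consistent estimate of `theta0`, because the auxiliary model is misspecified; the indirect inference step corrects for this by passing the simulated data through the same misspecified model.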
Finally, Gallant and Tauchen (1996) propose a simulation-based estimation method they
label efficient method of moments (EMM). Their estimator sets,
$$G_T(\theta) = \frac{1}{N}\sum_{n=1}^{N} \frac{\partial}{\partial\beta} \ln f(\tilde y_n(\theta) \mid \tilde y_{n-1}(\theta); \hat\beta_T),$$
where $\frac{\partial}{\partial\beta} \ln f(y \mid z; \beta)$ is the score of some auxiliary model $f$, also known as the score generator,
$\hat\beta_T$ is the Pseudo ML estimator of the auxiliary model, and $(\tilde y_n(\theta))_{n=1}^{N}$ is a long simulation (i.e.
$N$ is very large) of Eq. (5.10), with parameter vector set equal to $\theta$. Finally, the weighting matrix
in Eq. (5.12) is taken to be any matrix $I_T^{-1}$ converging in probability to:
(5.15)
ln ( 2 | 1 ; )
I=
ln ( |
1;
ln ( 2 |
234
1;
)=0
) =0
ln ( |
1;
), satises
Let,
0
h
(
( ))
> i
and
( )
1
(
)
0 (1 + )
0
>
0
1
0
,
(5.16)
0
where = lim
,
=
(
(
))
=
, and the notation
0
0
( )
is drawn from its stationary distribution.
Indeed, the first order conditions satisfied by the SMM in Eq. (5.12) are,
0 =[
)]>
)=[
)]>
( 0) +
means that
( 0) (
0 )]
(1)
That is,
(
0)
=
=
=
We have,
( 0) =
=
>
0
>
0
1
0
1X
=1
1 X
( 0)
)]>
>
0
>
0
))
0)
)]>
( 0)
)
( 0)
( )
1 P
( ) =1
=1
(0 (1 + )
(5.17)
( )
( )
X
1
( ) =1
0
where we have used the fact that ( ) =
. Using this result in Eq. (5.17) produces
the convergence in Eq. (5.16). If $\tau = \lim_{T \to \infty} T/S(T) = 0$ (i.e. if the number of simulations grows
faster than the sample size), the SMM estimator is as efficient as the GMM estimator. Finally,
and obviously, we need that $\tau = \lim_{T \to \infty} T/S(T) < \infty$: the number of simulations $S(T)$ cannot
grow more slowly than the sample size.
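The role of the ratio between sample size and simulated sample size can be made concrete with a short calculation. The sketch below assumes, as the factor (1 + tau) in Eq. (5.16) suggests, that simulation error scales the asymptotic variance of the GMM estimator by 1 + T/S(T); the function name is hypothetical.

```python
def smm_variance_inflation(sample_size, simulated_size):
    """Return 1 + tau, with tau = T / S(T): the factor scaling the asymptotic variance."""
    if sample_size <= 0 or simulated_size <= 0:
        raise ValueError("both sizes must be positive")
    return 1.0 + sample_size / simulated_size


T = 1000
for multiple in (1, 5, 10, 100):
    # S(T) = multiple * T simulated observations for T actual observations
    print(multiple, smm_variance_inflation(T, multiple * T))
```

With as many simulated as actual observations the variance doubles; with ten times as many it is only ten percent larger than in the GMM case.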
The IIP-based estimator works slightly differently. For this estimator, even if the number of
simulations $S$ is fixed, asymptotic normality obtains without requiring $S$ to go to infinity faster
than the sample size. Basically, what really matters here is that $ST$ goes to infinity.
By Eq. (5.17), and the discussion in Section 5.7.1, we know that asymptotically, the first
order conditions satisfied by the IIP-based estimator are,
(
0)
>
0
>
0
( 0)
is as in Eq. (5.14), 0 =
( ), and ( ) is solution to the limiting problem
where
corresponding to the estimator in Eq. (5.13), viz
1
( ) = arg max lim
ln
1;
1X
( 0 ))
=1
1X
[(
0)
( 0)
0 )]
=1
0)
1X
( 0)
0)
=1
where
1
Asy.Var
( 0)
0 1+
We have,
= arg min
)>
1 X
)=
ln
=1
)>
)>
(
(
1;
)
(
)+
)(
0 ))
or
(
0)
)>
( 0
236
)>
)=J
(
)
(0 I)
where J =
where,
=
=I
With
>
0)
>
(0
>
>
>
(5.18)
This section provides a heuristic discussion of the conditions under which the EMM achieves the
Cramer-Rao lower bound. Consider the following definition, which is similar to that in Tauchen
(1997). Of a given span of moment conditions , say that of the EMM, we say that it also
spans the true score if,
( | )=0
(5.19)
where denotes the true score. From Eq. (5.18), we know that the asymptotic variance of the
EMM, say
EMM , satises:
1
EMM
>
( )
( )
we have,
1
MLE
where
( )=
MLE
>
( )
( |
)=
)=
=
=
=
0)
( )
)> +
( | )
(5.20)
where (
ln ( ;
ln ( ;
ln ( ;
(5.21)
) (
(
)
)
0)
ln (
)>
(
0)
0)
is the true density. Next, replace Eq. (5.21) into Eq. (5.20),
1
MLE
>
( )
( |
)=
1
EMM
( |
Therefore, the EMM estimator achieves the Cramer-Rao lower bound under the spanning condition in Eq. (5.19).
( ( ); )
( )
(5.22)
where ( ) is a Brownian motion and and are two functions guaranteeing a strong solution
to Eq. (5.22). Except in special cases (e.g., the affine models reviewed in Chapter 12), the
likelihood function of the data generated by this process is unknown. We can then use one of
the three estimators we have presented in Section 5.7.1. Alternatively, we might use simulated
maximum likelihood, a method introduced in finance by Santa-Clara (1995) (see, also, Brandt
and Santa-Clara, 2002). We only provide the idea of the method, not the asymptotic theory.
Suppose, then, that we observe discretely sample data generated by Eq. (5.22): 0 , 1 , , ,
, , where is the sample size. We need to know the transition density, say ( +1 | ; ),
to implement maximum likelihood, which we assume we do not know. Consider, then, the Euler
approximation to Eq. (5.22),
=
( +1)
(5.23)
+1
where is a sequence of i.i.d. random variables with expectation zero and unit variance. This
stochastic process is dened at the dates , for integer. Let [ ] denote the integer part of
, and for = 1 [ ], set
(
+1
if
In other words, we are chopping the time interval between two observations, [ + 1], in
( )
pieces, and then take
to be large. We know that as
,
( ) as
,
where
denotes weak convergence, or convergence in distribution, meaning that all finite
( )
dimensional distributions of converge to those of ( ) as
. The idea underlying
simulated maximum likelihood, then, is to estimate the transition density, ( +1 | ; ), through
simulations of Eq. (5.23), performed using a large value of . Note, we cannot guarantee the
transition density is recovered by simulating Eq. (5.23), not even for a large value of . We can
only perform an imperfect simulation of Eq. (5.23).
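The idea of estimating a transition density through simulations of an Euler scheme can be sketched as follows. The example is illustrative and rests on assumptions of ours: the diffusion is an Ornstein-Uhlenbeck process with drift kappa * (mu - y) and constant volatility sigma, chosen because its exact transition density is Gaussian and can serve as a benchmark; the interval between two observations has unit length and is chopped in `n_sub` pieces; and the density is estimated by averaging, across simulated paths, the Gaussian density of the last Euler sub-step.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def drift(y, kappa, mu):
    """Drift of the (assumed) Ornstein-Uhlenbeck process."""
    return kappa * (mu - y)


def simulated_transition_density(y_next, y_now, kappa, mu, sigma,
                                 n_sub, n_paths, rng):
    """Estimate p(y_next | y_now) over one unit of time with Euler sub-steps."""
    h = 1.0 / n_sub
    total = 0.0
    for _ in range(n_paths):
        z = y_now
        # Simulate the Euler scheme up to one sub-step before the next observation.
        for _ in range(n_sub - 1):
            z += drift(z, kappa, mu) * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
        # Over the last sub-step, the Euler scheme implies a Gaussian density.
        total += normal_pdf(y_next, z + drift(z, kappa, mu) * h, sigma ** 2 * h)
    return total / n_paths


rng = random.Random(4)
kappa, mu, sigma = 0.5, 0.05, 0.02  # hypothetical parameter values
y_now, y_next = 0.04, 0.045

estimate = simulated_transition_density(y_next, y_now, kappa, mu, sigma,
                                        n_sub=10, n_paths=5000, rng=rng)

# Exact Gaussian transition density of the Ornstein-Uhlenbeck process.
exact_mean = mu + (y_now - mu) * math.exp(-kappa)
exact_var = sigma ** 2 * (1.0 - math.exp(-2.0 * kappa)) / (2.0 * kappa)
exact = normal_pdf(y_next, exact_mean, exact_var)

print(estimate, exact)
```

The estimate differs from the exact value for two reasons: the discretization bias of the Euler scheme, which shrinks as `n_sub` grows, and the Monte Carlo error, which shrinks as `n_paths` grows.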
The likelihood function is,
= ( 0; )
Y1
+1 |
=0
; )
( +1)
( +1)
;
238
is
where ( ; ; 2 ) denotes the Gaussian density with mean and variance 2 . Moreover, we
have, approximately,
Z
( +1 | ; ) =
( +1 | ; ) ( | ; )
Z
1 2
1
( ; )
( | ; )
=
+1 ; + ( ; ) ;
where we have set = +1
in a moment, and estimate
(
+1 |
; )
1X
( +1)
; +
=1
( | ; ), as explained
from
time to time + 1
where is obtained by iterating Eq. (5.23) from
1
1
. Under regularity
0 as and get
5.7.4 Advances
The three estimators examined in Sections 5.7.1-5.7.2 are general-purpose, but in general, they
do not lead to asymptotic efficiency, unless the true score belongs to the span of the moment conditions, as explained in Section 5.7.2.4. There exist other simulation-based methods,
which aim to approximate the likelihood function through simulations (e.g., Lee, 1995; Hajivassiliou and McFadden, 1998): for example, the simulated maximum likelihood estimator in
Section 5.7.2.3 can be used to estimate the parameters of stochastic differential equations. While
methods based on simulated likelihood lead to asymptotically efficient estimators, they address
specific estimation problems, just as the example of Section 5.7.2.3 illustrates.
There exist estimators that are both general purpose and that can lead to asymptotic efficiency. Fermanian and Salanie (2004) consider an estimator that relies on approximating the
likelihood function through kernel estimates obtained simulating the model of interest. Carrasco,
Chernov, Florens and Ghysels (2007) rely on a continuum of moment conditions matching
model-based (simulated) characteristic functions to data-based characteristic functions. Altissimo and Mele (2009) propose an estimator based on a continuum of moment conditions,
which minimizes a certain distance between conditional densities estimated with the true data
and conditional densities estimated with data simulated from the model, where both conditional
densities are estimated through kernel methods.
5.7.5 In practice? Latent factors and identification
The estimation theory of this section does not rule out the situation where some of the variables
in Eq. (5.10) are unobservable. The principle to follow is very simple: one applies any of the
methods we have discussed to those variables simulated out of Eq. (5.10) that correspond to
the observed ones. For example, we may want to estimate the following model of the short-term
rate $r(\tau)$, discussed at length in Chapter 12:
$$dr(\tau) = \kappa_r(\mu_r - r(\tau))\,d\tau + \sqrt{v(\tau)}\,dW_1(\tau)$$
$$dv(\tau) = \kappa_v(\mu_v - v(\tau))\,d\tau + \xi\sqrt{v(\tau)}\,dW_2(\tau) \quad (5.24)$$
where $v(\tau)$ is the short-term rate instantaneous, stochastic variance, $W_1$ and $W_2$ are two standard Brownian motions, and the parameter vector of interest is $\theta = [\kappa_r\ \mu_r\ \kappa_v\ \mu_v\ \xi]$. Let us consider
one of the methods discussed so far, say indirect inference. The logical steps to follow, then,
are (i) to simulate Eqs. (5.24), and (ii) to calibrate an auxiliary model to the short-term rate
data simulated out of Eqs. (5.24), so that it is as close as possible to the very same auxiliary model
fitted on true data. Note, in doing so, we just have to neglect the volatility data simulated out
of Eqs. (5.24), as these data are obviously unobservable.
The question arises, therefore, as to whether the auxiliary model one chooses is rich enough
to allow identifying the model's parameter vector $\theta$. There might be many combinations of
unobserved random processes $v(\tau)$ that are consistent with the likelihood of any given auxiliary model. So which auxiliary model should one fit, in practice? Gallant and Tauchen (1996) asked
this question a long time ago. There are no general answers to this question. Very simply, one requires the model to be identifiable, which is likely to happen once the
auxiliary model is rich enough. In an impressive series of applied work, Gallant and Tauchen
and their co-authors have proposed semi-nonparametric score generators, as a way to get as
close as possible to a rich model. Intuitively, by increasing the order of Hermite expansions,
semi-nonparametric scores might converge to the true ones. Alternatively, one might use a continuum of moment conditions, as explained in Section 5.7.4. For example, the nonparametric
density estimators of Altissimo and Mele (2009) converge to the true densities once the bandwidth parameters used to smooth out these kernel estimates get smaller and smaller. In the
next section, we provide a discussion of how asset prices might help convey information about
unobserved processes and lead to statistical efficiency.
( ( ); )
( )
(5.25)
where
is a multidimensional process and (
) satisfy some regularity conditions we single
out below. We analyze situations where the original partially observed system in Eq. (5.25)
can be estimated by augmenting it with a number of observable deterministic functions of the
state. In many situations, such deterministic functions are suggested by asset pricing theories
in a natural way. Typical examples include the price of derivatives or in general, any functional
of asset prices (such as asset returns, bond yields, implied volatilities).
The idea to use asset pricing predictions to improve the fit of models with unobservable
factors has been explored by, among others, Christensen (1992), Pastorello, Renault and Touzi
(2000), Chernov and Ghysels (2000), Singleton (2001), and Pastorello, Patilea and Renault
(2003).
We consider a standard Markov pricing setting. For xed
0, we let
be the expiration
date of a contingent claim with rational price process = { ( ( )
)} [ ) , and let
{ ( ( ))} [ ] and ( ) be the associated intermediate payo process and nal payo function,
4 This
respectively. Let $\partial/\partial\tau + L$ be the usual infinitesimal generator of the system in Eq. (5.25), taken
under the risk-neutral probability. Then, as we saw in Chapter 4, we have that in a frictionless,
arbitrage-free market, the price function is the solution to the following partial differential equation:
0=
+
(
)+ ( ) ( )
[
)
(5.26)
( 0) = ( )
where
( ) is the short-term rate. We call prediction function any continuous and twice
di erentiable function ( ;
) solution to the partial di erential equation and boundary
condition in (5.26). Examples of contingent claims with prices satisfying (5.26) are derivatives,
typically.
Next, we augment the system in Eq. (5.25) with
prediction functions, where denotes
the number of the observable variables in Eq. (5.25). Precisely, we let:
( )
( ( ( )
( ( )
))
1]
( ( )
( ( )))
1]
(5.27)
(y; 0, 0)
FIGURE 5.2. Asset pricing, the Markov property, and statistical e ciency. is the domain on which
)> takes values, is the domain on which
the partially observed primitive state process
(
( ))> takes values in Markovian economies, and ( ) is a contingent
the observed system
(
claim price process in R
. Let
= (
( 1) (
)), where { (
)} =1 forms an
intertemporal cohort of contingent claim prices, as in Denition 5.3. If the local restrictions of are
one-to-one and onto, statistical inference about and can be made, using information about the price
of derivative contracts, . If is also globally invertible, statistical inference can lead to rst-order
asymptotic e ciency, once conditioned upon .
in Eq. (5.25). First, the price of a given contingent claim is typically not available for a long
sample period. As an example, available option data often include option prices with a life span
smaller than the usual sample span of the underlying asset prices. By contrast, it is common
to observe long time series of option prices having the same maturity. Second, the price of a
single contingent claim depends on the time-to-maturity of the claim; therefore, it does not
satisfy the stationarity assumptions maintained in this paper. To address these issues, we deal
with data on assets having the same characteristics at each point in time. Precisely, consider
the data generated by the following random processes:
Definition 5.3. (Intertemporal (
)-cohort of contingent claim prices) Given a prediction
function ( ;
) and a -dimensional vector
( 1
) of xed time-to-maturity,
an intertemporal (
)-cohort of contingent claim prices is any collection of contingent claim
price processes ( )
( ( ( ) 1) ( ( )
)) (
0) generated by the pricing model
(5.27).
Consider for example a sample realization of three-month at-the-money option prices, or
a sample realization of six-month zero-coupon bond prices. Long sequences such as the ones
in these examples are commonly observed. If these sequences were generated by the pricing
model in Eq. (5.27), as in Definition 5.3, they would be deterministic functions of the state, and hence
stationary. We now develop conditions ensuring both feasibility and first-order efficiency of the
class of simulation-based estimators, as applied to this kind of data. Let denote the matrix
having the rst rows of , the di usion matrix in Eq. (5.25). Let
denote the Jacobian of
with respect to . We have:
Theorem 5.4. (Asset pricing and Cramer-Rao lower bound) Suppose we observe an intertemporal (
)-cohort of contingent claim prices ( ), and that there exist prediction functions
in R
with the property that for = 0 and = 0 ,
( ) ( )
( )
6= 0,
-a.s. all
242
+ 1],
(5.28)
where
satises the initial condition ( ) = ( )
( ( ( ) 1)
( ()
)). Let
= ( ( ) ( ( ) 1) ( ( )
)). Then, any simulation-based estimator applied to
is feasible. Moreover, assume
is also Markov. Then, any estimator with a span of moment
conditions for
that also spans the true score, attains the Cramer-Rao lower bound, with
respect to the fields generated by .
According to Theorem 5.4, any estimator is feasible, whenever is locally invertible for a
time span equal to the sampling interval. As Figure 5.2 illustrates, condition (5.28) is satisfied
whenever is locally one-to-one and onto.5 If is also globally invertible for the same time
span,
is Markov. The last part of this theorem says that in this case, any estimator is
asymptotically efficient. We emphasize that this conclusion is about first-order efficiency in the
joint estimation of and given the observations on .
Naturally, condition (5.28) does not ensure that is globally one-to-one and onto: might
have many locally invertible restrictions.6 In practice, might fail to be globally invertible
because monotonicity properties of may break down in multidimensional diffusion models.
For example, in models with stochastic volatility, option prices can be decreasing in the underlying asset price (see Bergman, Grundy and Wiener, 1996). In models of the yield curve with
stochastic volatility, to cite a second example, medium-long term bond prices can be increasing
in the short-term rate (see Mele, 2003). These cases might arise as there is no guarantee that
the solution to a stochastic differential system is nondecreasing in the initial condition of one
of its components, which is, instead, always true in the scalar case.
When all components of vector represent the prices of assets actively traded in frictionless
markets, (5.28) corresponds to a condition ensuring market completeness in the sense of Harrison
and Pliska (1983). As an example, condition (5.28) for Heston's (1993) model is
$\partial c/\partial\sigma \neq 0$ almost surely, where $\sigma$ denotes instantaneous volatility of the price process. This condition is
satisfied by Heston's model. In fact, Romano and Touzi (1997) showed that within a fairly
general class of stochastic volatility models, option prices are always strictly increasing in $\sigma$
whenever they are convex in the underlying asset price. Theorem 5.4 can be used to implement efficient estimators in
other complex multidimensional models. Consider for example a three-factor model of the yield
curve. Consider a state-vector (
), where is the short-term rate and
are additional
factors (such as, say, instantaneous short-term rate volatility and a central tendency factor). Let
()
= ( ( ) ( ) ( );
) be the time rational price of a pure discount bond expiring
at
= 1 2, and take 1
( (1) (2) ). Condition (5.28) for this model
2 . Let
is then,
(1) (2)
(1) (2)
6= 0,
-a.s.
[ + 1]
(5.29)
where subscripts denote partial derivatives. It is easily checked that this same condition must be
satised by models with correlated Brownian motions and by yet more general models. Classes
of models of the short-term rate for which condition (5.29) holds are more intricate to identify
than in the European option pricing case seen above (see Mele, 2003).
5 Local
3|
That is,
Pr
3
T
=1
= Pr (
2 ) = Pr (
2 ) Pr (
2)
3| ) =
3|
1)
2|
1 ).
T
T
T
Pr ( 3
Pr ( 3
)
2)
T1
=
Pr ( )
Pr ( 1
)
2
2)
244
= Pr (
1)
Pr (
2|
1 ) Pr (
3|
2)
2.
= , if
, a constant.
( )
)=1
!
S
Pr sup |
|
= lim
Pr
|
|
=0
lim
0
0
0
)
0
This is succinctly written as
} converges in quadratic
)2 ]
[(
2
The following two results are useful to the purpose of this chapter:
Slutzkys theorem. If
and
, then:
be a
>
>
>
0;
>
, then
)=
. We
1X
=1
1X
=1
1X
=1
We now state and provide a proof of the central limit theorem in a simple setting.
Central Limit Theorem. Let $\{x_t\}$ be an i.i.d. sequence, satisfying $E(x_t) = \mu$ and
$E(x_t - \mu)^2 = \sigma^2 < \infty$. Let $\bar x_T \equiv \frac{1}{T}\sum_{t=1}^{T} x_t$. We have,
$$\frac{\sqrt{T}\,(\bar x_T - \mu)}{\sigma} \xrightarrow{d} N(0, 1).$$
The multidimensional version of this theorem requires a mere change in notation. For the proof, the
classic method relies on the characteristic functions. Let:
Z
i
i
()
( )
i
1
=
We have
( ) =0 = i
( ) = (0) +
Next, let =
=1
( ),
where (
1
( )
+
2
=0
2
(2) 1 2
(
)
+ = 1 + i (1)
+
2
2
=0
1 X
=1
( )=(
( )) , where
( )=
Clearly, lim
( )=
1 2
2
( )=1
, which are
+ . Therefore,
1
2
1X
ln
uniformly in
( | )
=1
( ) (
( 0) +
0)
( ) and:
( )=
1X
( | )
=1
( ) = 0. Hence,
( 0) +
) (
0)
1X
( 0 )|
=1
( 0 )|
sup |
( 0 )|
where the supremum is taken over the set of all the observations. Since
0 . Moreover, by the law of large numbers,
1X
( 0) =
0|
=1
Since
is continuous in
Therefore, as
0|
)] =
0,
J ( 0)
(5A.1)
(5A.2)
J ( 0)
( 0)
=J
( 0) =
( 0)
) = 0, the score,
( 0)
(0
( (
)))
( 0)
P
=1
), is such that
where
( (
)) = J
P
( =1
( ), and that
( ))
. If
(0 1)
=1
we say that { } is weakly dependent. Of a process, we say it is nonergodic, when it exhibits such a
strong dependence that it does not even satisfy the law of large numbers.
Stationarity
Weak dependence
Ergodicity
Let
0, lim
I|
=0
1X
and
1P
=1
=1
(0 1)
( )=
ln
( )
( )
( ;
=1
ln
( )|
( )|
=1
whence
(
0)
"
We have:
0
0)
( )|
0)
=1
1X
( 0)
=1
+1 ( 0 )|
1 X
( 0)
=1
]=0
+1 ( 0 )|2 |
)=
249
+1 ( 0 )|
J ( 0)
(5A.3)
R , let:
>
Clearly,
( 0)
= >J ( 0)
+1
0
!
X
X
2
=
=1
[ ( |
]=
=1
>
(|
>
( 0 )|2 )
=1
=1
>
(|
[J
( 0 )|2 |
1 ( 0 )]
=1
>
"
X
1 )]
(J
1 ( 0 ))
"
=1
Next, dene:
1X
and
=1
1X
=1
>
1X
(J
1 ( 0 ))
=1
Under the conditions underlying the central limit theorem for weakly dependent processes provided
earlier, to be spelled out below,
(0 1)
1X
(J
=1
1 2
1 ( 0 ))
1 X
( 0)
(0 I )
=1
( 0)
1X
[J
1 ( 0 )]
=1
and plim
1X
=1
(
0 J ( 0)
0)
250
[J
1 ( 0 )]
J ( 0) .
( ( ( + 1) M
( () M
( + 1)1
)
)| ( ( ) M
( ( ))
( () ( ()
( ) full rank
( ()
))
-a.s., and It
os lemma, satises, for
( ) =
( ) + ( ) ( )
( ) + ( ) ( )
( ) =
+ 1],
( )
( )
and
are, respectively, -dimensional and (
)-dimensional measurable functions, and
where
-a.s. Under condition (5.28), is not degenerate. Furthermore, ( ( ); )
( ) ( ) ( ) 1
). That is, for all ( + ) R R , there exists a function
( ) is deterministic in
( 1
such that for any neighbourhood (+ ) of + , there exists another neighborhood ( (+ )) of (+ )
such that,
(+ )
( () M
1 )=
: ( ( + 1) M ( + 1)1 )
=
: ( ( + 1) ( ( + 1) 1
)) ( ( + 1)
))
( (+ ))
=
: ( ( + 1) ( ( + 1)
| ( () M
1
))
|( ( ) ( ( )
) = }
( ( + 1)
1
))
( ()
( (+ ))
)) = }
where the last equality follows by the denition of . In particular, the transition laws of
given
are
not
degenerate;
and
is
stationary.
The
feasibility
of
simulation
based
method
of
moments
1
estimation is proved. The e ciency claim follows by the Markov property of , and the usual score
martingale di erence argument.
References
Altissimo, F. and A. Mele (2009): Simulated Nonparametric Estimation of Dynamic Models.
Review of Economic Studies 76, 413-450.
Amemiya, T. (1985): Advanced Econometrics. Cambridge, Mass.: Harvard University Press.
Bergman, Y. Z., B. D. Grundy, and Z. Wiener (1996): General Properties of Option Prices.
Journal of Finance 51, 1573-1610.
Brandt, M. and P. Santa-Clara (2002): Simulated Likelihood Estimation of Diffusions with an
Application to Exchange Rate Dynamics in Incomplete Markets. Journal of Financial
Economics 63, 161-210.
Carrasco, M., M. Chernov, J.-P. Florens and E. Ghysels (2007): Efficient Estimation of General Dynamic Models with a Continuum of Moment Conditions. Journal of Econometrics
140, 529-573.
Chernov, M. and E. Ghysels (2000): A Study towards a Unified Approach to the Joint Estimation of Objective and Risk-Neutral Measures for the Purpose of Options Valuation.
Journal of Financial Economics 56, 407-458.
Christensen, B. J. (1992): Asset Prices and the Empirical Martingale Model. Working paper,
New York University.
Duffie, D. and K. J. Singleton (1993): Simulated Moments Estimation of Markov Models of
Asset Prices. Econometrica 61, 929-952.
Fermanian, J.-D. and B. Salanie (2004): A Nonparametric Simulated Maximum Likelihood
Estimation Method. Econometric Theory 20, 701-734.
Fisher, R. A. (1912): On an Absolute Criterion for Fitting Frequency Curves. Messenger of
Mathematics 41, 155-157.
Gallant, A. R. and G. Tauchen (1996): Which Moments to Match? Econometric Theory 12,
657-681.
Gauss, C. F. (1816): Bestimmung der Genauigkeit der Beobachtungen. Zeitschrift für
Astronomie und Verwandte Wissenschaften 1, 185-196.
Gourieroux, C., A. Monfort and E. Renault (1993): Indirect Inference. Journal of Applied
Econometrics 8, S85-S118.
Hajivassiliou, V. and D. McFadden (1998): The Method of Simulated Scores for the Estimation of Limited-Dependent Variable Models. Econometrica 66, 863-896.
Hansen, L. P. (1982): Large Sample Properties of Generalized Method of Moments Estimators. Econometrica 50, 1029-1054.
Hansen, L. P. and K. J. Singleton (1982): Generalized Instrumental Variables Estimation of
Nonlinear Rational Expectations Models. Econometrica 50, 1269-1286.
Hansen, L. P. and K. J. Singleton (1983): Stochastic Consumption, Risk Aversion, and the
Temporal Behavior of Asset Returns. Journal of Political Economy 91, 249-265.
Harrison, J. M. and S. R. Pliska (1983): A Stochastic Calculus Model of Continuous Trading:
Complete Markets. Stochastic Processes and their Applications 15, 313-316.
Heston, S. (1993): A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options. Review of Financial Studies 6, 327-343.
Laroque, G. and B. Salanie (1989): Estimation of Multimarket Fix-Price Models: An Application of Pseudo-Maximum Likelihood Methods. Econometrica 57, 831-860.
Laroque, G. and B. Salanie (1993): Simulation-Based Estimation of Models with Lagged
Latent Variables. Journal of Applied Econometrics 8, S119-S133.
Laroque, G. and B. Salanie (1994): Estimating the Canonical Disequilibrium Model: Asymptotic Theory and Finite Sample Properties. Journal of Econometrics 62, 165-210.
Lee, B-S. and B. F. Ingram (1991): Simulation Estimation of Time-Series Models. Journal
of Econometrics 47, 197-207.
Lee, L. F. (1995): Asymptotic Bias in Simulated Maximum Likelihood Estimation of Discrete
Choice Models. Econometric Theory 11, 437-483.
McFadden, D. (1989): A Method of Simulated Moments for Estimation of Discrete Response
Models without Numerical Integration. Econometrica 57, 995-1026.
Mele, A. (2003): Fundamental Properties of Bond Prices in Models of the Short-Term Rate.
Review of Financial Studies 16, 679-716.
Newey, W. K. and D. L. McFadden (1994): Large Sample Estimation and Hypothesis Testing. In: Engle, R. F. and D. L. McFadden (Editors): Handbook of Econometrics, Vol. 4,
Chapter 36, 2111-2245. Amsterdam: Elsevier.
Neyman, J. and E. S. Pearson (1928): On the Use and Interpretation of Certain Test Criteria
for Purposes of Statistical Inference. Biometrika 20A, 175-240, 263-294.
Pakes, A. and D. Pollard (1989): Simulation and the Asymptotics of Optimization Estimators. Econometrica 57, 1027-1057.
Pastorello, S., E. Renault and N. Touzi (2000): Statistical Inference for Random-Variance
Option Pricing. Journal of Business and Economic Statistics 18, 358-367.
Pastorello, S., V. Patilea, and E. Renault (2003): Iterative and Recursive Estimation in
Structural Non Adaptive Models. Journal of Business and Economic Statistics 21, 449-509.
Pearson, K. (1894): Contributions to the Mathematical Theory of Evolution. Philosophical
Transactions of the Royal Society of London, Series A 185, 71-78.
Romano, M. and N. Touzi (1997): Contingent Claims and Market Completeness in a Stochastic Volatility Model. Mathematical Finance 7, 399-412.
Part II
Applied asset pricing theory
6
Neo-classical kernels and puzzles
6.1 Introduction
Asset pricing models impose a number of restrictions on security returns, which can conveniently
be summarized by a few key properties of the pricing kernel, consistent with a data-reduction
principle. This chapter discusses methods of statistical inference based on this data reduction,
relying on restrictions on the moments of the pricing kernel based on the celebrated Hansen
and Jagannathan (1991) bounds: evidence against any model mounts when the volatility of
its pricing kernel is below a certain threshold. We illustrate how these bounds are useful
by revisiting a simplified version of the Lucas tree model introduced in Chapter 3. We shall
examine issues within this methodology arising in finite samples, and review the efforts needed
to tackle them.
The next section contains a simplified version of the Lucas model, which we take as a useful
benchmark in this chapter. Section 6.3 develops the central tool of analysis, a non-parametric
bound on the volatility of the pricing kernel, i.e. the risk-premium, consistent with a given level
of the short-term rate. Section 6.4 considers multifactor extensions, with closed-form solutions,
arising under a number of analytically convenient assumptions on the stochastic discounting
factor. One of the striking points of Section 6.4 is that in spite of these added dimensionalities,
the resulting models might spectacularly fail to explain the dynamics of asset prices: pumping up
volatility is not enough, if this added volatility is not accompanied by time-varying countercyclical
statistics. Section 6.5 develops a link between stochastic discount factors and Sharpe ratios,
and Section 6.6 develops dynamic versions of the core bounds at the heart of this chapter.
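We consider an economy with a single agent with CRRA equal to $\eta$, and a constant discount
factor $\beta$. We assume cum-dividends gross returns, $(S_t + D_t)/S_{t-1}$, are generated by:
$$
\begin{aligned}
\ln(S_t + D_t) &= \ln S_{t-1} + \mu_S - \tfrac{1}{2}\sigma_S^2 + \epsilon_{S,t} \\
\ln D_t &= \ln D_{t-1} + \mu_D - \tfrac{1}{2}\sigma_D^2 + \epsilon_{D,t}
\end{aligned}
\tag{6.1}
$$
where
$$
\begin{pmatrix} \epsilon_{S,t} \\ \epsilon_{D,t} \end{pmatrix} \sim NID\left( 0_2 ;
\begin{pmatrix} \sigma_S^2 & \sigma_{SD} \\ \sigma_{SD} & \sigma_D^2 \end{pmatrix} \right).
$$
The second equation in (6.1) is obviously given. The first is endogenous, so to speak. We want
to find parameter values such that Eqs. (6.1) hold in equilibrium.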
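6.2.1.2 Euler's restrictions
The Euler equation is,
$$
1 = E\left( e^{Z_{t+1} + Q_{t+1}} \,\middle|\, \mathcal{F}_t \right), \qquad
Z_{t+1} \equiv \ln\left( \beta \left( \frac{D_{t+1}}{D_t} \right)^{-\eta} \right), \qquad
Q_{t+1} \equiv \ln\left( \frac{S_{t+1} + D_{t+1}}{S_t} \right),
\tag{6.2}
$$
where $\mathcal{F}_t$ is the information set as of time $t$, and $m_{t+1} \equiv e^{Z_{t+1}}$ is the stochastic discounting factor.
Naturally, Eq. (6.2) holds for any asset. In particular, it holds for a one-period bond with price
$b_t$, $b_{t+1} \equiv 1$ and $D_{t+1} \equiv 0$. Define, then, $Q^b_{t+1} \equiv \ln(b_t^{-1}) \equiv \ln R_t$. By replacing $Q^b_{t+1}$ into
Eq. (6.2), one gets $R_t^{-1} = E(e^{Z_{t+1}} | \mathcal{F}_t)$, such that,
$$
\frac{1}{R_t} = E\left( e^{Z_{t+1}} \,\middle|\, \mathcal{F}_t \right), \qquad
1 = E\left( e^{Z_{t+1} + Q_{t+1}} \,\middle|\, \mathcal{F}_t \right).
\tag{6.3}
$$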
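The following result helps determine the expectations in Eqs. (6.3) in closed-form:
Lemma 6.1: Let $Z$ be conditionally normally distributed. Then, for any $\gamma \in \mathbb{R}$,
$$
E\left( e^{-\gamma Z_{t+1}} \,\middle|\, \mathcal{F}_t \right) = e^{-\gamma E(Z_{t+1}|\mathcal{F}_t) + \frac{1}{2}\gamma^2 var(Z_{t+1}|\mathcal{F}_t)},
$$
$$
\sqrt{var\left( e^{-\gamma Z_{t+1}} \,\middle|\, \mathcal{F}_t \right)} = e^{-\gamma E(Z_{t+1}|\mathcal{F}_t) + \frac{1}{2}\gamma^2 var(Z_{t+1}|\mathcal{F}_t)} \sqrt{e^{\gamma^2 var(Z_{t+1}|\mathcal{F}_t)} - 1}.
$$
By the definition of $Z$ in Eq. (6.2), Eqs. (6.1) and Lemma 6.1,
$$
\frac{1}{R_t} = E\left( e^{Z_{t+1}} \,\middle|\, \mathcal{F}_t \right)
= e^{E(Z_{t+1}|\mathcal{F}_t) + \frac{1}{2} var(Z_{t+1}|\mathcal{F}_t)}
= e^{\ln\beta - \eta\left( \mu_D - \frac{1}{2}\sigma_D^2 \right) + \frac{1}{2}\eta^2\sigma_D^2}.
\tag{6.4}
$$
Therefore, the equilibrium interest rate is constant; its expression is given in the second of
Eqs. (6.5) below.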
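Lemma 6.1 is the standard lognormal moment formula, and it can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the parameter values are illustrative assumptions, and the check compares the closed-form moments with a plain trapezoidal integration against the normal density.

```python
import math

def lognormal_moments(mean, var, gamma):
    """Closed-form mean and standard deviation of exp(-gamma * Z), Z ~ N(mean, var)."""
    m = math.exp(-gamma * mean + 0.5 * gamma**2 * var)
    s = m * math.sqrt(math.exp(gamma**2 * var) - 1.0)
    return m, s

def numerical_moments(mean, var, gamma, n=20001, width=12.0):
    """Same two moments by trapezoidal integration against the normal density."""
    sd = math.sqrt(var)
    lo, hi = mean - width * sd, mean + width * sd
    h = (hi - lo) / (n - 1)
    m1 = m2 = 0.0
    for i in range(n):
        z = lo + i * h
        w = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        dens = math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        x = math.exp(-gamma * z)
        m1 += w * x * dens * h
        m2 += w * x * x * dens * h
    return m1, math.sqrt(m2 - m1 * m1)

if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    print(lognormal_moments(0.02, 0.04, 2.0))
    print(numerical_moments(0.02, 0.04, 2.0))
```

The two printed pairs should agree to many decimal places, which is all the lemma asserts for this particular choice of mean, variance and $\gamma$.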
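The second of Eqs. (6.3) reads,
$$
1 = E\left( \exp(Z_{t+1} + Q_{t+1}) \,\middle|\, \mathcal{F}_t \right)
= e^{\ln\beta - \eta\left( \mu_D - \frac{1}{2}\sigma_D^2 \right) + \mu_S - \frac{1}{2}\sigma_S^2} \, E\left( e^{n_{t+1}} \,\middle|\, \mathcal{F}_t \right),
$$
where $n_{t+1} \equiv \epsilon_{S,t+1} - \eta\,\epsilon_{D,t+1} \sim N(0, \sigma_S^2 + \eta^2\sigma_D^2 - 2\eta\sigma_{SD})$. The expectation in the above equation
can be determined through Lemma 6.1, resulting in,
$$
\mu_S = r + \underbrace{\eta\,\sigma_{SD}}_{\text{risk premium}}
\qquad \text{and} \qquad
r \equiv \ln R = -\ln\beta + \eta\mu_D - \frac{1}{2}\eta(\eta + 1)\sigma_D^2,
\tag{6.5}
$$
where we have used Eq. (6.4) to calculate the interest rate, $r$. Note that the expected gross
return on the risky asset is,
$$
E\left[ \frac{S_{t+1} + D_{t+1}}{S_t} \,\middle|\, \mathcal{F}_t \right]
= e^{\mu_S - \frac{1}{2}\sigma_S^2} E\left[ e^{\epsilon_{S,t+1}} \,\middle|\, \mathcal{F}_t \right]
= e^{\mu_S} = e^{r + \eta\sigma_{SD}}.
$$
Therefore, if $\sigma_{SD} > 0$, then $E((S_{t+1} + D_{t+1})/S_t \,|\, \mathcal{F}_t) > e^{r} = R$, as expected.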
The expressions for the equity premium and the short-term rate are the discrete-time counterpart
to those derived in Chapter 4. Consider, for example, the interest rate. The second
term, $\eta\mu_D$, reflects intertemporal substitution effects: consumption endowment increases, on
average, as $\mu_D$ increases, which reduces the demand for bonds, thereby increasing the interest
rate. The last term, instead, relates to precautionary motives: an increase in the uncertainty
related to consumption endowment, $\sigma_D$, raises concerns with our representative agent, who
then increases his demand for bonds, thereby leading to a drop in the interest rate.
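6.2.1.3 Solution for the P/D ratio and absence of sunspots
We check the internal consistency of the model. The coefficients of the model satisfy some
restrictions. In particular, the asset price volatility must be determined endogenously. Let us
conjecture, and later verify, that the following no-sunspots condition holds, for each period $t$:
$$
\epsilon_{S,t} = \epsilon_{D,t}.
\tag{6.6}
$$
By Eqs. (6.5) and (6.6),
$$
\mu_S = r + \lambda\sigma_D, \qquad \lambda \equiv \eta\sigma_D,
\tag{6.7}
$$
and by the definition of $Z_{t+1}$ in Eq. (6.2),
$$
Z_{t+1} = -\left( r + \frac{1}{2}\lambda^2 \right) - \lambda u_{t+1}, \qquad u_{t+1} \equiv \frac{\epsilon_{D,t+1}}{\sigma_D},
$$
such that we can define the pricing kernel $\xi_t$, from the stochastic discounting factor, recursively,
as follows:
$$
\xi_{t+1} = m_{t+1}\,\xi_t, \qquad m_{t+1} = e^{Z_{t+1}}, \qquad \xi_0 = 1.
$$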
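Iterating Eq. (6.2) forward $n$ periods,
$$
\begin{aligned}
S_t &= E_t\left[ \left( \prod_{i=1}^{n} m_{t+i} \right) S_{t+n} \right]
+ \sum_{i=1}^{n} E_t\left[ \left( \prod_{j=1}^{i} m_{t+j} \right) D_{t+i} \right] \\
&= E_t\left( \frac{\xi_{t+n}}{\xi_t} S_{t+n} \right)
+ \sum_{i=1}^{n} E_t\left( \frac{\xi_{t+i}}{\xi_t} D_{t+i} \right).
\end{aligned}
$$
By letting $n \to \infty$ and assuming the first term in the previous equation goes to zero,
$$
S_t = \sum_{i=1}^{\infty} E_t\left( \frac{\xi_{t+i}}{\xi_t} D_{t+i} \right).
\tag{6.8}
$$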
Eq. (6.8) holds, as just mentioned, under a transversality condition, similar to that analyzed in
Chapter 4, Section 4.3.3, which always holds, under the inequalities given in (6.9) below.
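The expectations in Eq. (6.8) are, by Lemma 6.1,
$$
E_t\left( \frac{\xi_{t+i}}{\xi_t} D_{t+i} \right)
= D_t \, E_t\left( e^{\sum_{j=1}^{i} \left( Z_{t+j} + \ln\frac{D_{t+j}}{D_{t+j-1}} \right)} \right)
= D_t \, e^{-(r + \eta\sigma_D^2 - \mu_D)\, i}.
$$
Suppose the risk-adjusted discount rate $r + \eta\sigma_D^2$ exceeds the expected dividend growth rate, viz
$$
r + \eta\sigma_D^2 > \mu_D.
\tag{6.9}
$$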
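To see what condition (6.9) buys, note that under the lognormal dynamics in (6.1), the no-sunspots condition (6.6) and Lemma 6.1, each term of Eq. (6.8) equals $D_t e^{-k i}$, with $k \equiv r + \eta\sigma_D^2 - \mu_D$, so the sum is a geometric series. The sketch below is an illustration under these assumptions only: the function names and the parameter values are hypothetical, and the closed form $1/(e^{k} - 1)$ is simply the sum of that series.

```python
import math

def price_dividend_ratio(r, eta, mu_d, sigma_d):
    """Closed-form sum of the geometric series implied by Eq. (6.8)."""
    k = r + eta * sigma_d**2 - mu_d  # risk-adjusted discount rate minus growth
    if k <= 0:
        raise ValueError("condition (6.9) fails: the series diverges")
    return 1.0 / (math.exp(k) - 1.0)

def price_dividend_partial_sum(r, eta, mu_d, sigma_d, n_terms):
    """Truncated version of the same sum, added up term by term."""
    k = r + eta * sigma_d**2 - mu_d
    return sum(math.exp(-k * i) for i in range(1, n_terms + 1))

if __name__ == "__main__":
    # Hypothetical inputs: r = 5%, eta = 2, mu_D = 1.83%, sigma_D = 3.28%.
    args = (0.05, 2.0, 0.0183, 0.0328)
    print(price_dividend_ratio(*args))
    print(price_dividend_partial_sum(*args, n_terms=2000))
```

The ratio does not depend on time or on the dividend level, which is the sense in which the price is linear in the dividend in this model.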
Chapter 3 shows that within an IID environment, prices are convex in the dividend, $D$, if $\eta > 1$,
and concave if $\eta < 1$. Eq. (6.10) reveals this property may be lost in a dynamic environment.
The next chapter shows that in such an environment, the convexity of the price depends on
that of the dividend process, in the following sense: if the expected dividend growth under the
risk-neutral probability is convex (resp. concave) in $D$, the price is convex (resp. concave) in $D$.
In the model of this section, the expected dividend growth under the risk-neutral probability
is linear in $D$, which explains the linear property in Eq. (6.10). [In progress, give economic
intuition]
6.2.3 Equity premium and interest rate puzzles
Average excess returns on the US stock market [the equity premium] is too high to be
easily explained by standard asset pricing models. Mehra and Prescott (1985)
Mehra and Prescott (1985) noted the following difficulty with the Lucas model, which gave
rise to what is widely known as the equity premium puzzle. To be consistent with US data,
the equity premium in Eq. (6.7), $\mu_S - r = \eta\sigma_D^2$, must be approximately 6%, annualized, as
explained in the next chapter. If the asset we are trying to price is literally a consumption
claim, then $\sigma_D$ would be consumption volatility, which is very low, approximately 3.3%. For
the equity premium to be high, we would need, by reverse-engineering the two equations in
(6.7), a quite high value of the relative risk-aversion, say $\eta = 0.06 / 0.033^2 \approx 55$. Section 6.4 explains
this number, 55, can be improved to 35, once we also condition on the volatility of short-term
bonds.
One assumption underlying the previous calculations is that aggregate dividend equals aggregate
consumption, which is obviously not the case in the real world. Note, then, that dividend
growth volatility is around 6%, which implies that the implied $\eta$ is $0.06 / 0.06^2 \approx 17$, thereby mitigating the
premium puzzle. Still, the model would fail to deliver realistic predictions about return volatility,
as in this case, return volatility would be just 6%, by Eq. (6.11), which is less than half of what
we see in the data, as explained in the next chapter. Moreover, the model would fail to predict
countercyclical statistics, such as countercyclical expected returns or dividend yields. We shall
return to these topics in the next chapter.
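The two back-of-the-envelope numbers above follow from inverting $\mu_S - r = \eta\sigma_D^2$. A minimal sketch of that arithmetic, with a hypothetical function name, is:

```python
def implied_risk_aversion(equity_premium, volatility):
    """Risk aversion eta that solves equity_premium = eta * volatility**2."""
    return equity_premium / volatility**2

if __name__ == "__main__":
    # Consumption-claim case: 6% premium, 3.3% consumption volatility.
    print(round(implied_risk_aversion(0.06, 0.033)))  # 55
    # Dividend-claim case: 6% premium, 6% dividend growth volatility.
    print(round(implied_risk_aversion(0.06, 0.06)))   # 17
```

The implied coefficient falls with the square of the volatility, which is why replacing consumption volatility with dividend volatility lowers it so sharply.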
[Figure: the short-term rate, r, plotted against risk aversion (horizontal axis labelled "gamma" in the plot).]
FIGURE 6.1. The risk-free rate puzzle: the two curves depict the graph of
$\eta \mapsto r(\eta) = -\ln\beta + 0.0183\,\eta - \frac{1}{2}(0.0328)^2\,\eta(\eta + 1)$, with $\beta = 0.95$ (solid line) and
$\beta = 1.05$ (dashed line). Even if risk aversion were to be as high as $\eta = 30$, the equilibrium
short-term rate would behave counterfactually, reaching a level as high as 10%. In order for $r$
to be lower when $\eta$ is high, it might be required that $\beta > 1$.
The equity premium puzzle is not the only one. Even if we are willing to consider that a
CRRA as large as $\eta = 30$ is plausible, another puzzle arises: an interest rate puzzle. As the
expression for $r$ in Eqs. (6.5) shows, a large value of $\eta$ can lead the interest rate to take very
high values, as illustrated by Figure 6.1. Finally, related to the interest rate puzzle is an interest
rate volatility puzzle. In the model of this chapter, the safe rate is constant. However, in models
where both the equity premium and interest rates change over time, driven by state variables
related to, say, preference shocks or market imperfections, the short-term rate is too volatile.
For example, in the presence of time-varying expected dividend growth, $\mu_{D,t}$ say, the expression
for the short-term rate is the same as in Eqs. (6.5), but with $\mu_{D,t}$ replacing the constant $\mu_D$, as
explained in the next chapters. It is easily seen, then, that the interest rate is quite volatile for
high values of $\eta$.
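The interest rate puzzle can be reproduced from the expression for the short-term rate, $r = -\ln\beta + \eta\mu_D - \frac{1}{2}\eta(\eta+1)\sigma_D^2$, using the values quoted in the caption of Figure 6.1. The sketch below is illustrative: the function names are assumptions, and the split into three terms (impatience, intertemporal substitution, precautionary saving) simply labels the three summands.

```python
import math

MU_D, SIGMA_D = 0.0183, 0.0328  # values quoted in the caption of Figure 6.1

def short_rate_terms(eta, beta, mu_d=MU_D, sigma_d=SIGMA_D):
    """The three summands of the short-term rate."""
    impatience = -math.log(beta)
    substitution = eta * mu_d
    precaution = -0.5 * eta * (eta + 1.0) * sigma_d**2
    return impatience, substitution, precaution

def short_rate(eta, beta, mu_d=MU_D, sigma_d=SIGMA_D):
    """Continuously compounded short-term rate implied by the model."""
    return sum(short_rate_terms(eta, beta, mu_d, sigma_d))

if __name__ == "__main__":
    for beta in (0.95, 1.05):
        for eta in (1, 10, 30):
            print(beta, eta, round(short_rate(eta, beta), 4))
```

With $\beta = 0.95$ and $\eta = 30$ the rate comes out close to 10%, in line with the caption of Figure 6.1; because the rate is linear in $\mu_D$ with slope $\eta$, any variation in expected growth is amplified by the same factor.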
This interest rate volatility puzzle relates to the assumption of a representative agent. Chapter
3 (Section 3.2.3) explains that agents with low elasticity of intertemporal substitution (EIS)
have an inelastic demand for bonds. In the context of CRRA utility functions, a low EIS
corresponds to a high CRRA, as EIS $= 1/\eta$, as explained in Chapter 3. So now suppose there is an
economy-wide shock that shifts the demand for bonds, as in the following picture: for example,
a shock that makes $\mu_D$ change.
[Figure: bond market diagram, with the bond price on the vertical axis and a fixed bond supply.]
An economy with a representative agent is one where the supply of bonds is fixed. The combination
of a representative agent with a low EIS, then, implies a high volatility of the short-term
rate, which is counterfactual. To mitigate this issue, one may consider preferences that disentangle
the EIS from risk-aversion, such as those relying on non-expected utility (Epstein
and Zin, 1989, 1991; Weil, 1989), or a framework with multiple agents, where bond supply is
positively sloped, as in the limited participation model of Guvenen (2009). These models are
examined in Chapter 8.
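Let $R_{t+1}$ be an $n$-dimensional vector of asset returns, and $1_n$ a vector of $n$ ones. The Euler equations are,
$$
E\left[ m_{t+1} (1_n + R_{t+1}) \,\middle|\, \mathcal{F}_t \right] = 1_n.
$$
Assuming $(m_{t+1}, R_{t+1})$ is stationary, and taking the unconditional expectation of both sides of the
previous equation, leaves,
$$
1_n = E\left[ m (1_n + R) \right], \qquad R = (R_1, \ldots, R_n)^\top.
\tag{6.12}
$$
Next, let $\bar m \equiv E(m)$, and create a family of stochastic discount factors $m^*(\bar m)$, parametrized by
$\bar m$, by projecting $m$ on to the asset returns, as follows:
$$
\mathrm{Proj}(m \,|\, 1_n + R) \equiv m^*(\bar m) = \bar m + [R - E(R)]^\top \beta_{\bar m},
$$
where1
$$
\beta_{\bar m} = \Sigma^{-1} cov(m, 1_n + R) = \Sigma^{-1} \left[ 1_n - \bar m\, E(1_n + R) \right],
$$
and $\Sigma \equiv E\left[ (R - E(R))(R - E(R))^\top \right]$. The Appendix shows that $m^*(\bar m)$ is also a stochastic
discounting factor, i.e.,
$$
1_n = E\left[ m^*(\bar m)(1_n + R) \right].
\tag{6.13}
$$
We have,
$$
\sqrt{var(m^*(\bar m))} = \sqrt{\beta_{\bar m}^\top \Sigma\, \beta_{\bar m}}
= \sqrt{ \left( 1_n - \bar m\, E(1_n + R) \right)^\top \Sigma^{-1} \left( 1_n - \bar m\, E(1_n + R) \right) }.
\tag{6.14}
$$
Eq. (6.14) provides the expression for the celebrated Hansen-Jagannathan cup, after the
work of Hansen and Jagannathan (1991). It leads to an important tool of analysis, as the
following theorem shows.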
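Theorem 6.2: Among all stochastic discount factors with fixed expectation $\bar m$, $m^*(\bar m)$ is the
one with the smallest variance.
Proof: Consider another discount factor indexed by $\bar m$, i.e. $m(\bar m)$. Naturally, $m(\bar m)$ satisfies
$1_n = E[m(\bar m)(1_n + R)]$. Moreover, by Eq. (6.13),
$$
\begin{aligned}
0_n &= E\left[ (m(\bar m) - m^*(\bar m)) (1_n + R) \right] \\
&= E\left[ (m(\bar m) - m^*(\bar m)) \left( (1_n + E(R)) + (R - E(R)) \right) \right] \\
&= E\left[ (m(\bar m) - m^*(\bar m)) (R - E(R)) \right] \\
&= cov\left[ m(\bar m) - m^*(\bar m), R \right],
\end{aligned}
\tag{6.15}
$$
where the third line follows because $E[m(\bar m)] = E[m^*(\bar m)] = \bar m$, and the fourth line holds by
$E[(m(\bar m) - m^*(\bar m))] = 0$. But $m^*(\bar m)$ is a linear combination of $R$. By the previous equation,
then, $cov[m(\bar m) - m^*(\bar m), m^*(\bar m)] = 0$. Therefore:
$$
\begin{aligned}
var[m(\bar m)] &= var\left[ m^*(\bar m) + m(\bar m) - m^*(\bar m) \right] \\
&= var[m^*(\bar m)] + var[m(\bar m) - m^*(\bar m)] + 2\, cov\left[ m(\bar m) - m^*(\bar m), m^*(\bar m) \right] \\
&= var[m^*(\bar m)] + var[m(\bar m) - m^*(\bar m)] \\
&\geq var[m^*(\bar m)]. \qquad \|
\end{aligned}
$$
1 We have, $cov(m, 1_n + R) = E[m(1_n + R)] - E(m)\,E(1_n + R) = 1_n - \bar m\, E(1_n + R)$.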
Hansen and Jagannathan (1991) consider an extension of this result, in which the stochastic
discount factor satisfies the non-negativity constraint, $m \geq 0$.
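The bound in Eq. (6.14) is a quadratic form in the pricing errors of the constant kernel $\bar m$, and is easy to trace out numerically. The following is a minimal two-asset sketch. The means and standard deviations are those quoted in the caption of Figure 6.2 (stock market: 0.07 and 0.14; three-month bill: 0.01 and 0.02); the zero correlation between the two returns is an assumption made here only to keep the example self-contained, so the numbers will not reproduce the figure exactly. The function name is hypothetical.

```python
import math

MEANS = (0.07, 0.01)                         # average returns: stocks, bills
COV = ((0.14**2, 0.0), (0.0, 0.02**2))       # assumed: zero correlation

def hj_bound(m_bar, mean_returns, cov):
    """Minimum standard deviation of a kernel with mean m_bar, two-asset case of Eq. (6.14)."""
    # a = 1_n - m_bar * E(1_n + R)
    a = [1.0 - m_bar * (1.0 + mu) for mu in mean_returns]
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    # b = Sigma^{-1} a, using the explicit inverse of a 2x2 matrix
    b = [(s22 * a[0] - s12 * a[1]) / det, (-s21 * a[0] + s11 * a[1]) / det]
    return math.sqrt(a[0] * b[0] + a[1] * b[1])

if __name__ == "__main__":
    for i in range(8):
        m_bar = 0.80 + 0.05 * i
        print(round(m_bar, 2), round(hj_bound(m_bar, MEANS, COV), 3))
```

Plotting the second column against the first gives the cup: a candidate kernel with mean $\bar m$ is admissible only if its standard deviation lies on or above the corresponding value.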
Consider, then, the curve, $\bar m \mapsto \sqrt{var[m^*(\bar m)]}$, the cup. By Theorem 6.2, each pair
$(\bar m, \sqrt{var[m(\bar m)]})$ predicted by any candidate model satisfying Eq. (6.12) has to lie above the
cup for each possible $\bar m$, $(\bar m, \sqrt{var[m^*(\bar m)]})$. The idea is the following. Testing the validity of a
model is tantamount to assessing whether the Euler equations it predicts are satisfied and how
volatile its pricing kernel is. Consider, then, Eq. (6.12), and assume that a candidate pricing
kernel is parametrized by some vector $\theta$, such that $m(\bar m; \theta) \equiv m(\bar m)$ say.
We want to find values of $\theta$, say $\hat\theta$, such that $m(\bar m; \hat\theta)$ is a valid stochastic discount factor, i.e.
such that it prices all the assets, just as in Eq. (6.12), and is volatile enough. Requiring that it prices
all the assets is actually a condition needed for Theorem 6.2 to hold: in particular, the proof
shows that in this case Eq. (6.15) would hold. Naturally, the pricing kernel needs to be volatile
enough as well. The issue then is to ascertain whether the values of $\theta$ that put the pricing kernel
inside the cup are economically reasonable, so to speak.
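To illustrate, consider the Lucas model of Section 6.2. We calculate the Hansen-Jagannathan
bounds in Eq. (6.14) and ascertain whether there are parameter values of the model that would
allow the stochastic discount factor to be inside the bounds in the mean and standard deviation
space. The stochastic discount factor of the model is:
$$
m_{t+1} = \beta \left( \frac{D_{t+1}}{D_t} \right)^{-\eta} = \exp(Z_{t+1}), \qquad
Z_{t+1} = -\left( r + \frac{1}{2}\eta^2\sigma_D^2 \right) - \eta\,\epsilon_{D,t+1},
$$
where $r$ is the short-term rate in Eqs. (6.5). By Lemma 6.1, its first two moments are:
$$
\bar m = E(m_{t+1}) = e^{-r}, \qquad
\sqrt{var(m_{t+1})} = e^{-r} \sqrt{e^{\eta^2\sigma_D^2} - 1}.
\tag{6.16}
$$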
[Figure: standard deviation of the pricing kernel (vertical axis) against the expected value of the pricing kernel (horizontal axis); the plot shows the Hansen-Jagannathan bounds and the predictions of the Lucas model.]
FIGURE 6.2. The solid line depicts the Hansen-Jagannathan bounds, obtained through
Eq. (6.14), through aggregate stock market data and the short-term rate. The average
return and standard deviation of the stock market are taken to be 0.07 and 0.14. The
average short-term rate (three-month bill) and its volatility are, instead, 0.01 and 0.02.
These estimates relate to the sample period from January 1948 to December 2002. The
circles are predictions of the Lucas model in Eq. (6.16), with $\beta = 0.95$, $\mu_D = 0.0183$,
$\sigma_D = 0.0328$ and $\eta$ ranging from 1 to 35. The two circles inside the cup are the pairs
$(\bar m, \sqrt{var(m)})$ in Eq. (6.16) obtained with $\eta = 35$ and 33. Progressively lower values of $\eta$ lead
the pairs $(\bar m, \sqrt{var(m)})$ to lie outside the cup, nonlinearly.
The Lucas model predicts that the pricing kernel is only moderately volatile. The following
chapters discuss models with either heterogeneous agents or more general preferences, which can
help boost the volatility of the pricing kernel.
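The circles in Figure 6.2 can be regenerated from the model's two kernel moments, $E(m) = e^{-r}$ and $\sqrt{var(m)} = e^{-r}\sqrt{e^{\eta^2\sigma_D^2} - 1}$, which follow from Lemma 6.1 applied to $m_{t+1} = \beta (D_{t+1}/D_t)^{-\eta}$. The sketch below uses the parameter values quoted in the caption of Figure 6.2; the function name is an assumption introduced for illustration.

```python
import math

BETA, MU_D, SIGMA_D = 0.95, 0.0183, 0.0328  # values quoted in the caption of Figure 6.2

def lucas_kernel_moments(eta, beta=BETA, mu_d=MU_D, sigma_d=SIGMA_D):
    """Mean and standard deviation of the Lucas stochastic discount factor."""
    # Short-term rate: r = -ln(beta) + eta*mu_D - 0.5*eta*(eta+1)*sigma_D^2
    r = -math.log(beta) + eta * mu_d - 0.5 * eta * (eta + 1.0) * sigma_d**2
    mean_m = math.exp(-r)
    std_m = mean_m * math.sqrt(math.exp((eta * sigma_d) ** 2) - 1.0)
    return mean_m, std_m

if __name__ == "__main__":
    for eta in (1, 5, 10, 20, 30, 33, 35):
        mean_m, std_m = lucas_kernel_moments(eta)
        print(eta, round(mean_m, 4), round(std_m, 4))
```

For low risk aversion the standard deviation is of the order of $\eta\sigma_D$, a few percentage points, which is why only very high values of $\eta$ bring the pair anywhere near the cup.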
This section considers a variant of the Lucas model of Section 6.2, and determines expected
returns under a different assumption regarding the returns distributions. We still maintain the
hypothesis that the stochastic discount factor is exponential-Gaussian, viz
1 2
+
NID (0 1)
+1 = exp ( +1 )
+1 =
+1
+1
2
where
and
1=
+1
+1 ) =
+1 )
( +1 ) +
+1
+1
+1
+1 )
+1 )
+1 )
+1
+1
,
(6.17)
(| 0 ( )|)
and
Next, suppose the gross return, $\tilde R$, is normally distributed. This assumption is inconsistent with
those underlying the model in Section 6.2, in which $\tilde R$ is lognormally distributed, in equilibrium,
being equal to $\ln \tilde R_t = \mu_S - \frac{1}{2}\sigma_S^2 + \epsilon_{S,t}$, where $\epsilon_{S,t}$ is normal. However, let us explore the asset
pricing implications of this new assumption.2 Because $\tilde R_{t+1}$ and $Z_{t+1}$ are both normal, and
$m(Z_{t+1}) = \exp(Z_{t+1})$, we apply Lemma 6.3 and obtain,
+1 =
(
+1
+1 ) =
+1 )]
+1 ) =
+1
+1
+1
+1 )
+1
We now extend the previous observations to a more general setup. Consider a stochastic discount
factor as a function of $k$ factors, $m(y_1, \ldots, y_k)$ say. A particularly convenient analytical
assumption is that $m$ is exponential-affine and the factors $(y_i)_{i=1}^{k}$ normal, as in the following
definition:
Definition 6.4 (EAPK: Exponential Affine Pricing Kernel): Let,
0
X
=1
2 With
+1 )
, we have:
1
(
+1 )
(
(
+1 )
+1 )
In more general setups than the ones considered in this introductory example, both
+1 )
(
(
+1 )
+1 )
and
+1 )
should be time-varying.
( ) = exp( )
(6.18)
= 1
, the
+1
+1 ) =
[exp (
+1 )
+1 ] =
+1
+1 ) =
+1
+1 )
=1
By replacing this into Eq. (6.17) leaves the linear factor representation,
( +1 )
=1
+1
{z
+1 )
}
(6.19)
betas
linear factor
The APT representation in Eq. (6.19) is similar to a result in Cochrane (1996).3 Cochrane
(1996) assumes that $m$ is affine, i.e. $m(Z) = Z$, where $Z$ is as in Definition 6.4. This assumption
implies that
( +1 +1 ) =
(
).
By replacing this expression for the
+1
+1
=1
covariance,
( +1 +1 ), into Eq. (6.17), leaves
( +1 )
+1
+1 )
=1
where
1
1
=
( )
0
The NEAPK assumption, compared to Cochrane's, carries the obvious advantage of guaranteeing
that the stochastic discount factor is strictly positive, a theoretical condition we need to rule out
arbitrage opportunities.
6.4.2 Lognormal returns
Next, assume that is lognormally distributed, and that the NEAPK holds. We have,
+1
0
=1
1=
(6.20)
=
+1
+1
+1
1
2 2
2
0 =
1 +1 + +1
= ( +1 )+ 2 ( 1 + +2 1 )
3 To recall why Eq. (6.19) is indeed a APT equation, suppose that is a -(column) vector of returns and that = +
, where
is -(column) vector with zero mean and unit variance and
are some given vector and matrix with appropriate dimension.
Then clearly, =
( ). A portfolio delivers > = > + >
( ) . Arbitrage opportunity is:
: >
( ) = 0
and > 6= . To rule that out, we may show as in Part I of these Lectures that there must exist a -(column) vector
s.t.
=
( ) + . This implies = +
= +
( ) + . That is, ( ) = +
( ) .
1 2 2
2
( +1 ) =
+ +2 1 )
0+ ( 1
2
By applying the pricing equation (6.20) to a zero coupon bond,
ln +1
1 2 2
0 =
1 +1
= ln +1 + 2 1
which we can solve for
+1 :
ln
+1
1
0+
2
2 2
1
1 2
=
(6.21)
1
2
Eq. (6.21) shows the theory in Section 6.2 through a different angle. Apart from Jensen's
inequality effects ($\frac{1}{2}\sigma^2$), this is indeed the Lucas model of Section 6.2 once $\phi_1 = -\eta$. This model
has poor quantitative implications as discussed in Section 6.3, bound as it is to explain returns
with only one stochastic discount-factor parameter, i.e. $\phi_1$.
Next consider the general case. Assume as usual that dividends are as in (6.1). To find the
price function in terms of the state variable $y$, we may proceed as in Section 6.2. In the absence
of bubbles,
X
X
+
( + 0 + 12 =1 ( 2 +2 ))
=
+ =
(
)
(
+1 )
=1
ln
+1
=1
Thus, if
then,
1X
2 =1
+2
1
There are interesting features of the model to mention. The model is cast within a multi-factor
setting, and yet it predicts that the price-dividend ratio is constant, which is counterfactual, as
explained in the next chapter. However, note the following facts. The first two moments of the
stochastic discount factor of this model are easily determined, by relying on Lemma 6.1:
p
p
1
( )
( ) = ( )+ 2 ( )
( ) = ( )+ ( ) 1
Therefore, we can always calibrate the parameters of this model and make sure the first
two moments of the pricing kernel enter into the Hansen-Jagannathan cup. At the same time,
remember, the model still predicts price-dividend ratios to be constant: that is, the model
makes counterfactual predictions (constant price-dividend ratios) even when the variance of its
pricing kernel is arbitrarily large.
In other words, a model satisfying the Hansen-Jagannathan bounds is not necessarily a good one:
it still calls for further scrutiny. The next section illustrates a variant of the previous
model in which time-variation in the risk-premiums gives rise to time-varying statistics. The
next chapter takes additional steps forward, attempting to develop theoretical test conditions,
which ensure that these time-varying statistics have the same cyclical properties as in the data,
such as procyclicality of price-dividend ratios or countercyclical volatility.
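For any excess return $R^e_{t+1}$ priced by the stochastic discount factor $m_{t+1}$,
$$
0 = E_t(m_{t+1} R^e_{t+1}) = E_t(m_{t+1})\,E_t(R^e_{t+1}) + \rho_t \, Std_t(m_{t+1}) \, Std_t(R^e_{t+1}),
$$
where $Std_t(x_{t+1})$ denotes the standard deviation of a variable $x_{t+1}$, conditionally upon the
information available at time $t$, and $\rho_t \equiv corr_t(m_{t+1}, R^e_{t+1})$, a conditional correlation. Hence,
the Sharpe Ratio, $\mathcal{S} \equiv E_t(R^e_{t+1}) / Std_t(R^e_{t+1})$, satisfies:
$$
|\mathcal{S}| \leq \frac{Std_t(m_{t+1})}{E_t(m_{t+1})} = Std_t(m_{t+1}) \cdot R_t,
\tag{6.22}
$$
where $R_t = 1/E_t(m_{t+1})$ is the gross short-term rate.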
The highest possible Sharpe ratio is bounded. The equality holds for a hypothetical portfolio
$M$, say, yielding excess returns perfectly conditionally negatively correlated with the stochastic
discount factor, $\rho_{M,t} = -1$. We shall say of $M$ that it is a $\beta$-CAPM generating portfolio. Is
it also a market portfolio? After all, a feasible and attainable portfolio lying on the volatility
bounds for the stochastic discount factor is clearly mean-variance efficient. The answer is subtle.
As explained in the context of the static model of Chapter 1, the Sharpe ratio, $\mathcal{S}$, equals the
slope of the Capital Market Line, and bears the interpretation of unit market risk-premium. If
$\rho_{M,t} = -1$, then, by Eq. (6.22), the slope of the Capital Market Line reduces to $Std_t(m_{t+1}) / E_t(m_{t+1})$. For
example, with the Lucas model in Section 6.2,
$$
\frac{Std_t(m_{t+1})}{E_t(m_{t+1})} = \sqrt{e^{\eta^2\sigma_D^2} - 1} \approx \eta\sigma_D.
$$
In Section 6.2, we also explained that $(\mu_S - r)/\sigma_D = \eta\sigma_D$, which is only approximately true,
according to the previous relation. Not even a simple model with a single tree, such as that
in Section 6.2, would be capable of leading to a $\beta$-CAPM generating portfolio, or a market
portfolio! Indeed, for this model, we have that $E(\tilde R) = e^{\mu_S}$, $r = -\ln\beta + \eta(\mu_D - \frac{1}{2}\sigma_D^2) - \frac{1}{2}\eta^2\sigma_D^2$, and
$var(\tilde R) = e^{2\mu_S}(e^{\sigma_D^2} - 1)$, with a Sharpe ratio equal to:
$$
\mathcal{S} = \frac{e^{\mu_S} - e^{r}}{e^{\mu_S}\sqrt{e^{\sigma_D^2} - 1}} = \frac{1 - e^{-\eta\sigma_D^2}}{\sqrt{e^{\sigma_D^2} - 1}}.
$$
By simple computations, $\mathcal{S}$ is lower than the bound $\sqrt{e^{\eta^2\sigma_D^2} - 1}$, and is only approximately equal to
$\eta\sigma_D$.
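The gap between the Sharpe ratio of the single tree and the volatility bound can be quantified directly. The sketch below is illustrative: it assumes the lognormal dynamics of Section 6.2 together with the no-sunspots condition, so that the gross return has mean $e^{\mu_S}$, variance $e^{2\mu_S}(e^{\sigma_D^2} - 1)$ and premium $\mu_S - r = \eta\sigma_D^2$; the function names are hypothetical, and $\sigma_D = 0.0328$ is the value quoted in the caption of Figure 6.2.

```python
import math

def lucas_sharpe_ratio(eta, sigma_d):
    """Sharpe ratio of the tree: (E(R) - R_f) / Std(R), in gross-return levels."""
    return (1.0 - math.exp(-eta * sigma_d**2)) / math.sqrt(math.exp(sigma_d**2) - 1.0)

def hj_sharpe_bound(eta, sigma_d):
    """Upper bound Std(m) / E(m) for the lognormal Lucas kernel."""
    return math.sqrt(math.exp((eta * sigma_d) ** 2) - 1.0)

if __name__ == "__main__":
    sigma_d = 0.0328
    for eta in (1, 2, 5, 10, 30):
        print(eta,
              round(lucas_sharpe_ratio(eta, sigma_d), 4),
              round(hj_sharpe_bound(eta, sigma_d), 4),
              round(eta * sigma_d, 4))
```

For small $\eta\sigma_D$ the three columns are nearly identical; as $\eta$ grows the Sharpe ratio falls increasingly short of the bound, because the tree's return is not perfectly negatively correlated with the kernel.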
A further complication arises: a $\beta$-CAPM generating portfolio is not necessarily the tangency
portfolio. We can show that there is another portfolio leading to the very same $\beta$-pricing relation
predicted by the tangency portfolio. Such a portfolio is referred to as the maximum correlation
portfolio, for reasons developed below. Let = (1 ) . By the CCAPM in Chapter 2,
where $R^p$ is a portfolio return. Next, let $R^m \equiv m / E(m^2)$, which is clearly perfectly correlated
with the stochastic discount factor. By results in Chapter 2,
=
( )
This is not yet the $\beta$-representation of the CAPM, because we have yet to show that there
is a way to construct $R^m$ as a portfolio return. In fact, there is a natural choice: pick $m = m^*$,
where $m^*$ is the minimum-variance kernel leading to the Hansen-Jagannathan bounds. Since
$m^*$ is linear in all asset returns, $R^{m^*}$ can be thought of as a return that can be obtained by
investing in all assets. Furthermore, in the appendix we show that $R^{m^*}$ satisfies,
1=
Where is this portfolio located? The Appendix shows that there is no portfolio yielding the
same expected return with lower variance (that is, $R^{m^*}$ is mean-variance efficient), and that:
1=
1+
1+
1+
Mean-variance efficiency of $R^{m^*}$ and the previous inequality imply that this portfolio lies in
the lower branch of the mean-variance efficient portfolios. And this is so because this portfolio
is positively correlated with the true pricing kernel. Naturally, the fact that this portfolio is
$\beta$-CAPM generating doesn't necessarily imply that it is also perfectly correlated with the true
stochastic discount factor. As shown in the appendix, $R^{m^*}$ has only the maximum possible
correlation with all possible $m$. Perfect correlation occurs exactly in correspondence of the
stochastic discount factor $m = m^*$ (i.e. when the economy exhibits a stochastic discount factor
exactly equal to $m^*$).
Proof that
imply:
( )
or,
(
(
)
)
(
=
(
(
=
)
)
) and 1 =
By construction,
is perfectly correlated with
. Precisely,
=
( 2 ). Therefore,
)
(
=
=
=
(
)
(
)
(
)
k
),
/ (
Figure 6.2 depicts the typical situation the neoclassical asset pricing model has to face. Points
are those generated by the Lucas model for various values of $\eta$. The model has to be such
that points lie above the observed Sharpe ratio ($\sigma(m)/E(m) \geq$ greatest Sharpe ratio ever
observed in the data, i.e. the Sharpe ratio on the market portfolio) and inside the Hansen-Jagannathan
bounds. Typically, we need high values of $\eta$ to enter the Hansen-Jagannathan bounds.
There is an interesting connection between these facts and the classical mean-variance portfolio
frontier described in Chapter 1. As shown in Figure 6.3, every asset or portfolio must lie
inside the region bounded by two straight lines with slopes $\mp\,\sigma(m)/E(m)$. It must be so, as
for any asset (or portfolio) return $R^i$ priced by a stochastic discount factor $m$, we have that
$$
\left| E(R^i) - \frac{1}{E(m)} \right| \leq \frac{\sigma(m)}{E(m)}\, \sigma(R^i).
$$
As seen in the previous section, the equality is only achieved by asset (or portfolio) returns
that are perfectly correlated with $m$. A tangency portfolio such as T doesn't necessarily attain
the volatility bounds for the stochastic discount factor. Moreover, the market portfolio has no
reason to lie on the volatility bound for the stochastic discount factor. As an example, for
the simple Lucas model, the (only existing) asset has a Sharpe ratio which doesn't lie on the
volatility bounds for the stochastic discount factor. In a sense, the CCAPM does not need
to imply the CAPM: there are not necessarily assets performing, at the same time, as market
portfolios and $\beta$-CAPM generating, which are also priced consistently by the true stochastic
discount factor. These conditions would only simultaneously hold if the candidate market portfolio
were perfectly negatively correlated with the stochastic discount factor, which is a quite
specific circumstance, the only circumstance where we can really say the CAPM is a particular
case of the CCAPM. We still do not know conditions on general families of stochastic discount
factors, which are consistent with the previous properties.
[Figure: the Hansen-Jagannathan bounds and the Sharpe ratio, with E(m) on the horizontal axis and σ(m) on the vertical axis.]
However, we know that there exists another portfolio, the maximum correlation portfolio,
which is also $\beta$-CAPM generating. In other terms, if there exists a return $R^*$ such that
$R^* = -\gamma m$, for some positive
constant $\gamma$, then the $\beta$-CAPM representation holds, but this doesn't necessarily mean that
all
(6.23)
Therefore, we don't need an asset or portfolio return that is perfectly correlated with $m$ to
make the CCAPM collapse to the CAPM. All in all, the existence of an asset return that is
perfectly negatively correlated with the stochastic discount factor is a sufficient condition for
the CCAPM to collapse to the CAPM, not a necessary condition. The proof of Eq. (6.23) is
simple. By the CCAPM,
(
( )
(
( )
and
( )
(
( )
That is,
(
(
But if
)
)
(
(
)
)
(6.24)
is -CAPM generating,
(
(
)
)
(
(
)2
(
(
)
)
(6.25)
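[Figure: mean-variance frontier, with σ(R) on the horizontal axis and E(R) on the vertical axis, showing the tangency portfolio T, the level 1/E(m), and the maximum correlation portfolio.]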
A final thought. In many pieces of applied research, we often read that because we observe
time-varying Sharpe ratios on (proxies of) the market portfolio, we should also model the unit market
risk-premium, $\sqrt{var_t(m_{t+1})} / E_t(m_{t+1})$, as time-varying. While Chapter 7 explains that
the evidence for time-varying risk-premiums is overwhelming, a criticism of this motivation
is that this ratio is only an upper bound to the Sharpe ratio of the market portfolio. From a strictly
theoretical point of view, then, a time-varying ratio is neither a necessary nor a sufficient condition
to have time-varying Sharpe ratios, as Figure 6.3 illustrates.
6.8 Appendix
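Proof of Eq. (6.13). We have,
$$
\begin{aligned}
E\left[ m^*(\bar m)(1_n + R) \right]
&= E\left[ \left( \bar m + (R - E(R))^\top \beta_{\bar m} \right)(1_n + R) \right] \\
&= \bar m\, E(1_n + R) + E\left[ (1_n + R)(R - E(R))^\top \right] \beta_{\bar m} \\
&= \bar m\, E(1_n + R) + E\left[ \left( (1_n + E(R)) + (R - E(R)) \right)(R - E(R))^\top \right] \beta_{\bar m} \\
&= \bar m\, E(1_n + R) + E\left[ (R - E(R))(R - E(R))^\top \right] \beta_{\bar m} \\
&= \bar m\, E(1_n + R) + \Sigma\, \beta_{\bar m} \\
&= \bar m\, E(1_n + R) + 1_n - \bar m\, E(1_n + R) \\
&= 1_n,
\end{aligned}
$$
where $\beta_{\bar m} = \Sigma^{-1}\left[ 1_n - \bar m\, E(1_n + R) \right]$ and $\Sigma \equiv E\left[ (R - E(R))(R - E(R))^\top \right]$.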
. We have,
)
(
2
[( ) ]
)=
where
(
) = 2 +
= 2 +
2
= +
h
h
))>
(
)>
(1 +
( ) [1 +
h
= 2 + 1
h
= 2 + 1
= 2 +
(1 +
(1 +
(
>
(
>
))
))>
)]
i
i
(1 +
))>
[1
(1 +
))]
is mean-variance
efficient. Let = ( 0 1
)> the vector of
Proof that
+1
+1
=1
>
= 1 1 1
. We denote our benchmark portfolio
The returns we consider are
=
1. Next, we build up an arbitrary portfolio yielding the same expected return
return as
( ) and then we show that this has a variance greater than the variance of . Since this portfolio
is arbitrary, the proof will be complete. Let
(
>
)=
1+ > ]
( 2)
>
)=
2)
)=
). We have:
)]
such that
)]
[ (
(1
+1
2 )]2
))
1]
=0
The rst line follows by construction since
>
(1
+1
)] =
+1
=1
Given this, the claim follows directly from the fact that
(
)=
+(
)] =
(
1+
1+
1=
)+
1=
. We have,
1
)2 ]
[(
is:
1
[1
(1 + )]
We have,
[(
)2 ] = [ + ( )>
= 2 +
[(
= 2 +
[(
= +
[(
= 2 +
>
>
2
]
)> ]2
)> ( )> ]2
>
)( > > )]
= + [1
(1> +
= 2 + 1>
[1>
1 +
>
>
)]
1>
1
)2 ] =
1=
(
[(
)
)2 ]
[1
(1 + )]
1 + 1>
(1>
1>
1 +
11
2 ( + )+ 2 1+
1=
>
1 + 1>
and
+2 +
1>
>
)]
), this is:
+2 ( + ) 2 1+ +2 +
2 ( + ) + 2 (1 + + 2 + >
>
>
1
1
Next, recall the following two denitions:
=
In terms of
and
(
1
1+
=(
)>
>
)=
, we have,
)
( )
[( )2 ]
1=
(1 + )2 (1 + ) (1 + 2 + 2 ) + 1 + + 2 + >
(1 + )2 (1 + ) (2 + 2 ) + 1 + + 2 + >
=
=
1
1
1
1+
1+
1+
This is positive if
0, i.e. if
low (or su ciently high) values of .
>
(2 + 1) +
)=
[
(
(1
(1
+ ((1
) +
)+
( (1
) +
p
=
( )
((1
) +
)
(
p
=
( )
((1
) +
)
((1
= (1
)=
( )
(
)+
| {z }
((1
(
((1
)
)
)=
. The last
)]
=1
= 0
is a nonstochastic a ne translation of
) =
)]
)
}
{z
=1
{z
=1
)
}
). Therefore,
( )
(
p
)
(
References
Alvarez, F. and U.J. Jermann (2005): Using Asset Prices to Measure the Persistence of the
Marginal Utility of Wealth. Econometrica 73, 1977-2016.
Cecchetti, S., Lam, P-S. and N. C. Mark (1994): Testing Volatility Restrictions on Intertemporal Rates of Substitution Implied by Euler Equations and Asset Returns. Journal of
Finance 49, 123-152.
Cochrane, J. (1996): A Cross-Sectional Test of an Investment-Based Asset Pricing Model.
Journal of Political Economy 104, 572-621.
Epstein, L.G. and S.E. Zin (1989): Substitution, Risk-Aversion and the Temporal Behavior of
Consumption and Asset Returns: A Theoretical Framework. Econometrica 57, 937-969.
Epstein, L.G. and S.E. Zin (1991): Substitution, Risk-Aversion and the Temporal Behavior of
Consumption and Asset Returns: An Empirical Analysis. Journal of Political Economy
99, 263-286.
Ferson, W. E. and A. F. Siegel (2003): Stochastic Discount Factor Bounds with Conditioning
Information. Review of Financial Studies 16, 567-595.
Gallant, R. A., L. P. Hansen and G. Tauchen (1990): Using the Conditional Moments of
Asset Payoffs to Infer the Volatility of Intertemporal Marginal Rates of Substitution.
Journal of Econometrics 45, 141-179.
Gordon, M. (1962): The Investment, Financing, and Valuation of the Corporation. Homewood,
IL: Irwin.
Guvenen, F. (2009): A Parsimonious Macroeconomic Model for Asset Pricing. Econometrica
77, 1711-1740.
Hansen, L. P. and R. Jagannathan (1991): Implications of Security Market Data for Models
of Dynamic Economies. Journal of Political Economy 99, 225-262.
Mehra, R. and E. C. Prescott (1985): The Equity Premium: A Puzzle. Journal of Monetary
Economics 15, 145-161.
Weil, Ph. (1989): The Equity Premium Puzzle and the Risk-Free Rate Puzzle. Journal of
Monetary Economics 24, 401-421.
7
Aggregate fluctuations in equity markets
7.1 Introduction
This chapter discusses empirical regularities of the aggregate equity market, how these properties relate to the business cycle, and the extent to which the neo-classical model can account
for them. This chapter is, thus, a natural development of the previous. Its motivation is still
to explain how existing models can help rationalize the extant empirical evidence. However, its
focus now regards how aggregate stock market uctuations relate to macroeconomic developments. When are stock returns pro-cyclical? When is stock volatility countercyclical? Such are
the questions this chapter addresses.
Our analysis regards how to reverse-engineer the pricing kernel that is consistent with the
empirical properties of the aggregate stock market. We would like to identify properties of
the pricing kernels consistent with stock volatility, not only stock returns. We consider two
broad classes of models. In the first, agents evaluate assets relying on time-varying discount
rates. Thus, in these models, the cyclical properties of aggregate equity markets are explained
by the agents' optimal response to shocks on fundamentals. In the second class of models,
expected growth is time-varying. This variability translates to fluctuations in aggregate stock
market volatility. For example, time-varying expected growth may arise because agents have
incomplete knowledge of the state of the economy, and try to infer it from public signals. If the
agents estimate that the probability of living in the good state is time-varying, they will view
expected growth as being random. This randomness becomes a source of asset volatility.
The previous properties of the pricing kernels as well as the agents' inference processes form
the basis of more advanced discussions in Chapter 8. Note, finally, that the models we analyze
in this chapter do not necessarily lead to a resolution of the puzzles surveyed in the previous
chapter. To illustrate, even if a given model leads to interesting dynamics, further scrutiny
is required regarding the size of the equity premium. We may find that a model predicts
countercyclical volatility, as in the data, and yet this volatility can be orders of magnitude lower than
in the data.
A final remark is in order concerning the very nature of aggregate stock market fluctuations.
The empirical evidence reviewed in this chapter (see Section 7.2) suggests that a few key
market statistics are quite stationary in relation to their historical behavior vis-à-vis the business
cycle. This is interesting, especially in light of the fact that capital markets have undergone
significant changes over time, which mainly affected various aspects of their microstructure,
say transactions technology, the price discovery process, liquidity, or volumes, to mention a
few examples. How is it that the properties of the aggregate stock market reviewed in this
chapter do not appear to be affected by these changes? One simple possibility is that market
microstructure regards the very high frequency behavior of markets, whereas the properties
studied in this chapter relate to slow, low frequency movements, which would not be too affected
by market microstructure details. The models in this chapter aim to rationalize some of
these movements, as explained. Models addressing the previous market microstructure issues
are reviewed in Chapter 9.
In more detail, this chapter is organized as follows. The next section provides a succinct
overview of empirical regularities of aggregate equity markets at the business cycle frequency; we
explain that price-dividend ratios and stock returns are procyclical, whereas stock volatility and
risk-premiums are countercyclical. Section 7.3 analyzes in deeper detail the empirical behavior of
aggregate volatility, and provides intuitive explanations for it. Section 7.4 develops a framework
to think about our countercyclical statistics. Section 7.5 analyzes the two classes of economies
that illustrate the predictions of Section 7.4. The modeling approach in this chapter
is based on the price-dividend ratios. Section 7.6 develops an alternative approach based on
B/M ratios. [In progress]
and stock volatility and expected returns. Stock volatility is calculated as follows:
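$$\mathrm{Vol}_t \equiv \sqrt{6\pi}\cdot\frac{1}{12}\sum_{i=1}^{12}\left|R_{t+1-i}-R^f_{t+1-i}\right| \qquad (7.1)$$
where $R_t$ is the market return in month $t$ and $R^f_t$ is the risk-free rate, taken to be the one
month bill return.1 Expected returns are
calculated as explained below (see Eq. (7.2)). With the exception of the P/D ratio, all figures
are annualized percent.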
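As a rough illustration of the volatility measure in Eq. (7.1), the sketch below averages absolute monthly excess returns over the last twelve months and scales the result by $\sqrt{6\pi}$. The function name `rolling_vol` and the sample numbers are made up for illustration; only the 12-month window and the scaling constant come from the text.

```python
import math

def rolling_vol(returns, riskfree, window=12):
    """Annualized volatility proxy: sqrt(6*pi) times the average absolute
    monthly excess return over the last `window` months.

    `returns` and `riskfree` are monthly figures in percent (hypothetical inputs).
    The output is aligned with the inputs; the first window-1 entries are None.
    """
    scale = math.sqrt(6.0 * math.pi)  # equals sqrt(12) * sqrt(pi / 2)
    excess = [abs(r - rf) for r, rf in zip(returns, riskfree)]
    out = []
    for t in range(len(excess)):
        if t + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(scale * sum(excess[t + 1 - window:t + 1]) / window)
    return out

# Made-up data: the absolute excess return is 1% in every month.
monthly_returns = [1.3, -0.7] * 6
bill_rates = [0.3] * 12
vol = rolling_vol(monthly_returns, bill_rates)
```

With a constant absolute excess return of 1% per month, the last entry of `vol` equals $\sqrt{6\pi}$, roughly 4.34% annualized.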
We note the first main set of stylized facts:
Fact I. The P/D ratio and realized returns are procyclical, although variations in
the business cycle conditions do not seem to be their only driving forces.
For example, Figure 7.1 reveals that the price-dividend ratio on the S&P500 declines during
all of the economic slowdowns, as signaled by the recession indicator calculated by the National
Bureau of Economic Research (NBER), the NBER recessions. At the same time, during NBER
expansions, price-dividend ratios seem to be driven by additional factors not necessarily related
to the business cycle. For example, during the "roaring" 1960s, price-dividend ratios experienced
two major drops that display the same order of magnitude as the decline at the very beginning
of the "chaotic" 1970s. Realized returns follow approximately the same pattern, although they
are more volatile than price-dividend ratios (see Figure 7.2).
                                  total            NBER expansions     NBER recessions
                            average  std dev      average  std dev     average  std dev
P/D ratio                     31.99    15.88        33.21    15.79       26.20    14.89
ln(P/D_{t+1} / P/D_t)          2.01    12.13         3.95    10.81       -7.28    16.79
one year returns               8.59    15.86        12.41    13.04       -9.45    15.49
real risk-free rate            1.02     2.48         1.03     2.43        0.97     2.69
excess return volatility      14.55     4.68        14.05     4.47       16.91     4.91
expected returns               8.36     3.49         8.09     3.29        9.62     4.10
TABLE 7.1. Data are sampled monthly and cover the period from January 1948 through
December 2002. With the exception of the P/D ratio levels, all figures are annualized
percent.
Table 7.1 also reports estimates of a key component of asset evaluation, the expected market
return (i.e. the investors' expected return from investing in the stock market) at each point in time
of our sample. Appendix 1 describes a procedure to estimate such an unobserved variable. This
chapter relies on our reconstruction of yearly expected returns, defined as
$$\mathcal{E}_t \equiv \sum_{i=1}^{12} E_t\left(R_{t+i}\right) \qquad (7.2)$$
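1 The rationale behind this calculation is as follows. First, $\frac{1}{12}\sum_{i=1}^{12}|R_{t+1-i}-R^f_{t+1-i}|$ is an estimate
of the average volatility occurring over the last 12 months. We annualize by multiplying it by $\sqrt{12}$.
The term $\sqrt{6\pi}$ arises for the following reason. If we assume that a given return is $R = \sigma u$, where $u$
is a standard normal variable, then $E(|R|) = \sigma\sqrt{2/\pi}$. The constant $\sqrt{6\pi}$ follows by multiplying
$\sqrt{12}$ by $\sqrt{\pi/2}$.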
FIGURE 7.1
FIGURE 7.3. Stock volatility and business cycle conditions. The left panel ("Data") plots stock
volatility, $\mathrm{Vol}_t$, against yearly (deseasoned) industrial production average growth rates,
computed as $\mathrm{IP}_t \equiv \frac{1}{12}\sum_{i=1}^{12}\mathrm{Ind}_{t+1-i}$, where $\mathrm{Ind}_t$ is the real, seasonally adjusted
industrial production growth as of month $t$. The right panel ("Predictive regression") depicts the
prediction of the ordinary least squares regression:
$\mathrm{Vol}_t = 15.59 - 5.21\,\mathrm{IP}_t + 1.56\,\mathrm{IP}_t^2 + w_t$, where $w_t$ is a
residual term, and robust standard errors are (0.19), (0.39) and (0.41), respectively. The data span
the period from January 1948 to December 2002.
FIGURE 7.4. The left-hand side of this picture ("Data") plots estimates of the expected returns
(annualized, percent) ($\mathcal{E}_t$ say) against yearly (deseasoned) industrial production average
growth rates, computed as $\mathrm{IP}_t \equiv \frac{1}{12}\sum_{i=1}^{12}\mathrm{Ind}_{t+1-i}$, where $\mathrm{Ind}_t$ is the real, seasonally
adjusted industrial production growth as of month $t$. Expected returns are estimated
through Eq. (7.2). The right-hand side of this picture ("Predictive regression") depicts the
prediction of the ordinary least squares regression:
$\mathcal{E}_t = 7.74 - 2.19\,\mathrm{IP}_t + 0.43\,\mathrm{IP}_t^2 + w_t$, where $w_t$ is a residual
term, and robust standard errors are (0.09), (0.19) and (0.19), respectively. Data are sampled
monthly, and span the period from January 1948 to December 2002.
Fact I entails a quite intuitive consequence: price-dividend ratios might convey information
relating to future returns. After all, expansions are followed by recessions. Therefore, in good
times, the stock market predicts that future returns will be lower. Define the excess return
for the time period $[t, t+n]$ as $R^e_{t,t+n} \equiv R_{t,t+n} - R^f_{t,t+n}$, where $R_{t,t+n}$ is the asset return over
$[t, t+n]$, and $R^f_{t,t+n}$ is the sum of the one-month Treasury bill rate, taken over $[t, t+n]$. Consider
the following regressions,
$$R^e_{t,t+n} = a_n + b_n \times \mathrm{P/D}_t + u_{n,t} \qquad (7.3)$$
In turn, the previous regressions imply that $E[R^e_{t,t+n} \mid \mathrm{P/D}_t] = a_n + b_n \times \mathrm{P/D}_t$. Thus, they
suggest that price-dividend ratios are driven by expected excess returns. In this restrictive sense,
countercyclical expected returns (Fact II) and procyclical price-dividend ratios (Fact I) might
be two sides of the same coin. To investigate how this predictability links to business cycle
developments, consider the following regression, performed with monthly data from 1948:01 to
2002:12,
$$R^e_{t-12,t} = \underset{(1.04)}{14.64} - \underset{(1.37)}{9.09}\,\mathrm{IP}_{t-12} - \underset{(2.67)}{14.27}\,\mathrm{Infl}_{t-12} + u_t, \quad \text{with } R^2 = 11\% \qquad (7.4)$$
where robust standard errors are in parentheses, $u_t$ is a residual term, $R^e_{t-12,t}$ is the excess return
from $t-12$ to $t$, $\mathrm{IP}_t$ is the average industrial production growth over the previous twelve months,
as defined in Figure 7.3, and $\mathrm{Infl}_t$ is defined similarly as $\mathrm{IP}_t$.
The negative signs of the coefficients in Eq. (7.4) are to be expected. Economic activity
does display mean-reverting behavior: bad times are followed by good. But good times are those
where the stock market goes up. Therefore, a slowdown in economic activity is a predictor of
high returns in the future. To illustrate with a simple example, consider a case where the
aggregate stock market positively links to a single state variable tracking the business cycle
conditions, $y_t$ say, such that the log of the aggregate equity index is $\ln S_t = a_0 + a\,y_t$, for two
constants $a_0$ and $a$, and where $a > 0$. Assume, then, and critically, that $y_t$ is mean-reverting,
with unconditional expectation $\mu$, speed of adjustment $\kappa > 0$, and some volatility coefficient
$\sigma(y)$,
$$dy_t = \kappa(\mu - y_t)\,dt + \sigma(y_t)\,dW_t$$
where $W_t$ is a Brownian motion. Then, it is straightforward to show that
$$E_{t-12}\left(\ln\frac{S_t}{S_{t-12}}\right) = b_0 + b_1\,y_{t-12}$$
where $E_{t-12}$ denotes the expectation conditional on the information at $t-12$,
$b_0 \equiv a\mu\,(1 - e^{-12\kappa})$ and $b_1 \equiv -a\,(1 - e^{-12\kappa})$. That is, if $y_t$ is mean-reverting,
$\kappa > 0$, and the aggregate stock market is procyclical, $a > 0$, expected returns negatively link
to past values of $y_t$, i.e. $b_1 < 0$. This reasoning generalizes to a multivariate case, although the presence of
feedbacks between macroeconomic variables might then dilute the contribution of each variable
as a predictor of future expected returns.
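The mean-reversion argument can be checked numerically. The sketch below is only an illustration under added assumptions: volatility is taken to be constant, so that the state variable is an Ornstein-Uhlenbeck process with a Gaussian transition over 12 months, and the log index is $a_0 + a\,y$. The parameter values and the names (`a`, `kappa`, `mu`, `sigma`, `simulate_slope`) are hypothetical, not estimates from the text. The code regresses the 12-month log price change on the lagged state and compares the slope with $-a(1 - e^{-12\kappa})$.

```python
import math
import random

def simulate_slope(a=1.0, kappa=0.05, mu=0.0, sigma=0.1, horizon=12,
                   n_obs=20000, seed=0):
    """OLS slope of a*(y_t - y_{t-horizon}) on y_{t-horizon} for an OU state."""
    rng = random.Random(seed)
    decay = math.exp(-kappa * horizon)
    # Exact OU moments: conditional st. dev. over `horizon`, and stationary st. dev.
    cond_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * kappa))
    stat_sd = sigma / math.sqrt(2.0 * kappa)
    xs, ys = [], []
    for _ in range(n_obs):
        y_lag = rng.gauss(mu, stat_sd)                       # state at t - horizon
        y_now = mu + (y_lag - mu) * decay + cond_sd * rng.gauss(0.0, 1.0)
        xs.append(y_lag)
        ys.append(a * (y_now - y_lag))                       # log price change
    mean_x = sum(xs) / n_obs
    mean_y = sum(ys) / n_obs
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

theoretical_slope = -1.0 * (1.0 - math.exp(-0.05 * 12))  # -a * (1 - e^{-12 kappa})
estimated_slope = simulate_slope()
```

Under these assumptions the estimated slope is negative and close to the closed-form value: a high state today predicts low returns over the following year.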
The regression results in Eq. (7.4) have the same nature as that underlying Eq. (7.3): price-dividend
ratios and market returns are procyclical. Note a final relation that reveals this procyclicality
of market returns. At a contemporaneous level, the excess market returns are positively
related to industrial production and negatively related to inflation,
$$R^e_{t-12,t} = \underset{(1.07)}{10.47} + \underset{(1.19)}{7.27}\,\mathrm{IP}_t - \underset{(2.91)}{16.33}\,\mathrm{Infl}_t + u_t, \quad \text{with } R^2 = 14\% \qquad (7.5)$$
P D +
(ii) the role played by expected dividend growth is somewhat limited. In the next chapter (see
Section 8.11), we shall explain that this view can be challenged along several dimensions. First,
it seems that expected earning growth does help predict price-dividend ratios. Second, the
fact that expected dividend growth does not seem to affect price-dividend ratios can be a property
to be expected in equilibrium.
Naturally, because expected returns and stock volatility are both strongly countercyclical,
they then positively relate at the business cycle frequency considered in this chapter, as
illustrated by Figure 7.5 below.
FIGURE 7.5
Why does equity volatility relate to the business cycle? One of the first contributions to this
literature is Schwert (1989a,b). Schwert points out that low frequency fluctuations in equity
volatility are difficult to explain through those in the volatility of other macroeconomic variables.
For example, industrial production volatility does not correlate with stock volatility. Let us
calculate industrial production volatility as $\mathrm{Vol}_{G,t} \equiv \frac{1}{12}\sum_{i=1}^{12}|G_{t+1-i}|$, where $G_t$ is the real,
seasonally adjusted industrial production growth rate at month $t$ (similarly as in Eq. (7.1)).
Figure 7.6 plots stock volatility against $\mathrm{Vol}_{G,t}$; it does not reveal any obvious pattern between
these two variables. These results are in striking contrast with those available from Figure 7.3,
where, instead, stock volatility exhibits a quite clear countercyclical behavior. In more detail,
Table 7.1 reveals that stock market volatility is about 20% higher during NBER recessions
than during NBER expansions (16.91% against 14.05%).
In fact, Schwert also shows that stock volatility is countercyclical. The main focus of this
section is to provide a few explanations of this evidence, which supports the view that equity volatility
relates to the business cycle, although it is not precisely related to the volatility of other
macroeconomic variables.
A seemingly unrelated but well-known stylized fact is that risk-premiums are countercyclical,
as summarized by Fact II in the previous section. Particularly important is also Fact III:
expected returns decrease much less during expansions than they increase during recessions. With
post-war data, we find that compared to an average of 8.36%, the expected returns increase by
about 15% during recessions (to 9.62%) while they drop by a mere 3% during NBER expansions
(to 8.09%); see Table 7.1. A final stylized fact is that price-dividend ratios behave asymmetrically
over the business cycle. Table 7.1 reveals that price-dividend ratios are not only procyclical: their
downward changes during recessions are also more severe than their upward movements during expansions.
Table 7.1 suggests that price-dividend ratios fluctuate nearly two times more in recessions than
in expansions.
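The asymmetries just described can be recomputed directly from the averages in Table 7.1. The short sketch below does this arithmetic; the helper name `pct_change` is an illustrative choice, and the inputs are the table's averages of expected returns and of excess return volatility.

```python
# Averages taken from Table 7.1 (annualized percent).
expected_returns = {"total": 8.36, "expansions": 8.09, "recessions": 9.62}
volatility = {"total": 14.55, "expansions": 14.05, "recessions": 16.91}

def pct_change(value, base):
    """Percentage difference of `value` relative to `base`."""
    return 100.0 * (value / base - 1.0)

# Expected returns relative to the full-sample average.
er_rise_in_recessions = pct_change(expected_returns["recessions"], expected_returns["total"])
er_drop_in_expansions = pct_change(expected_returns["expansions"], expected_returns["total"])

# Recessions measured against expansions rather than against the average.
er_recessions_vs_expansions = pct_change(expected_returns["recessions"],
                                         expected_returns["expansions"])
vol_recessions_vs_expansions = pct_change(volatility["recessions"],
                                          volatility["expansions"])
```

Relative to the full-sample average, expected returns rise by about 15% in recessions and fall by about 3% in expansions; measured against expansions, they are about 19% higher in recessions, and volatility is about 20% higher.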
How can we rationalize these facts? A simple possibility is that the economy is frequently hit
by shocks that display the same qualitative behavior as return volatility, expected returns and
price-dividend ratios. However, Figure 7.6 suggests this channel is unlikely. Another possibility
is that the economy reacts to shocks, thanks to some mechanism endogenously related to the
investors' maximizing behavior, which then activates the previous phenomena.
Section 7.3.2 puts forward explanations for countercyclical stock volatility relying on endogenous
mechanisms. Section 7.3.3 provides, instead, additional empirical properties of equity
volatility. The motivation is simple: because stock volatility is countercyclical, it might contain
useful information about ongoing business cycle developments. The section, then, aims to
provide answers to the following questions: (i) Do macroeconomic factors help explain the dynamics
of stock market volatility? (ii) Conversely, what is the predictive content stock market
volatility brings about the business cycle?
FIGURE 7.6. Return volatility and industrial production volatility. The left panel ("Data") plots
stock volatility, $\mathrm{Vol}_t$, against industrial production volatility, $\mathrm{Vol}_{G,t}$. The right panel
("Predictive regression") depicts the prediction of the ordinary least squares regression:
$\mathrm{Vol}_t = 16.51 - 0.78\,\mathrm{Vol}_{G,t} + 0.05\,\mathrm{Vol}_{G,t}^2 + w_t$, where $w_t$ is a residual term, and
standard errors are (0.93), (0.47) and (0.05), respectively. The data span the period from
January 1948 to December 2002.
In frictionless markets, the price of a long-lived security is simply the risk-adjusted discounted
expectation of the future dividends stream. Heuristically, and other things being equal, this
price increases as the expected return from holding the asset and, hence, the risk-premium,
decreases. According to this mechanism, asset prices and price-dividend ratios are pro-cyclical
because risk-adjusted discount rates are countercyclical.
Would countercyclical risk-premiums also lead to countercyclical volatility? We now explain
that an additional property is required, asymmetry. Figure 7.7 depicts a situation in which
risk-premiums are countercyclical and, also, asymmetric, in that they decrease less in good
times than they increase in bad, consistently with the empirical evidence.
Suppose that the economy enters a boom, in which case risk-premiums decrease, and asset
prices increase, on average. During the boom, when the economy is hit by positive shocks on
the fundamentals, risk-premiums decrease and asset prices increase. However, risk-premiums
(and, hence, asset prices) do not change as much as they would during a recession: we are assuming
that risk-premiums behave asymmetrically over the business cycle. Eventually, the boom ends
and a recession begins. As the economy enters a recession, risk-premiums increase and asset
prices decrease. Yet now, the negative shocks on the fundamentals lead risk-premiums to increase
(and, hence, asset prices to decrease) more than they changed during the boom. All in all, volatility
increases on the downside. Once again, this asymmetric behavior occurs due to the assumption
that risk-premiums change asymmetrically over the business cycle.3
FIGURE 7.7. [Two panels: the price-dividend ratio, and risk-adjusted discount rates, each shown
across good times and bad times.]
3 Mele
(2007) develops a no-arbitrage framework to deal with these countercyclical issues, on which the next section is based.
FIGURE 7.8. Expected returns and business cycle conditions. The left panel ("Data") plots expected
excess returns, $\mathcal{E}_t$ in Eq. (7.2), against real (deseasoned) monthly industrial production
growth, $\mathrm{Ind}_t$. The right panel ("Predictive regression") depicts the prediction of the ordinary
least squares regression:
$\mathcal{E}_t = 7.225 - (0.782\,\mathbb{I}_{\mathrm{Rec}} + 0.121\,\mathbb{I}_{\mathrm{Exp}})\,\mathrm{Ind}_t + w_t$, where $\mathbb{I}_{\mathrm{Rec}}$ (resp., $\mathbb{I}_{\mathrm{Exp}}$)
is the indicator function that takes the value one if the economy is in a NBER-recession
(resp. expansion) episode and zero otherwise, $w_t$ is a residual term, and standard errors
are (0.099), (0.174) and (0.103), respectively.
To summarize, if risk-premiums are more volatile during recessions than booms, asset prices
and price-dividend ratios are more responsive to changes in economic conditions in bad times
than in good, thereby leading to countercyclical volatility. These effects are precisely those
we observe, as explained. The next section develops theoretical foundations to formalize these
links. A key result is that countercyclical volatility is likely to arise in many models, provided
the previous asymmetry in discounting is sufficiently strong. More precisely, if the asymmetry
in discounting is sufficiently strong in relation to some benchmark variable tracking the business
cycle conditions, then the price-dividend ratio is increasing and concave in this
variable. It is this concavity feature that makes stock volatility increase on the downside.
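The role of concavity can be seen in a small numerical sketch. It assumes a state variable $y$ with constant volatility, so that the volatility of the log price-dividend ratio $p(y)$ is proportional to $p'(y)/p(y)$. The functional form of `pd_ratio`, the parameter values and all names are hypothetical choices made only for illustration: any positive, increasing and concave $p$ gives the same qualitative result.

```python
import math

def pd_ratio(y):
    """Hypothetical price-dividend ratio: positive, increasing and concave in y."""
    return 10.0 + 25.0 * (1.0 - math.exp(-3.0 * y))

def pd_slope(y):
    """Derivative of pd_ratio with respect to y."""
    return 75.0 * math.exp(-3.0 * y)

def pd_volatility(y, state_vol=0.02):
    """Volatility of ln p(y) when y has constant volatility `state_vol`."""
    return state_vol * pd_slope(y) / pd_ratio(y)

# Evaluate from bad times (y = 0) to good times (y = 1).
grid = [0.1 * k for k in range(11)]
vols = [pd_volatility(y) for y in grid]
```

In this example volatility falls monotonically as the state improves: because $p$ is increasing and concave, $p'/p$ is decreasing, so the price reacts more to a given shock in bad times.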
Section 7.4 provides a more comprehensive explanation of these facts, by relying on a fairly
general continuous-time framework and tools relatively unusual in economics, although the
intuition is still the same as that of Figure 7.7. The scope of this section is to provide a
quantitative illustration of these results, based on a simple binomial tree model, which is solved in
closed-form, and shown to predict a few of the stylized features of the aggregate market,
surveyed in Section 7.2. Section 7.6 provides additional models that help understand the empirical
evidence.
Consider an infinite horizon economy, a single asset, and a representative agent. In equilibrium,
the agent consumes the dividends promised by the asset. We also assume a safe asset is infinitely
elastically supplied, such that the interest rate is some constant $r > 0$. In the initial state, the
dividend is equal to one (see Figure 7.9). In the second period, $D = e^{-\delta}$ ($\delta > 0$) with prob $p$
(the bad state) or $D = e^{\delta}$ with prob $1 - p$ (the good state).
In the initial state, the agent's CRRA is $\eta > 0$. In the good (resp., the bad) state, the agent's
RRA is $\eta_G$ (resp., $\eta_B$) $> 0$. In the third period, the agent receives the final payoffs of Figure
7.9, where $M_S$ is the price of a claim to all future dividends, discounted at a RRA $\eta_S$, with
$S \in \{G, GB, B\}$ and $\eta_{GB} = \eta$.
This model is thus one with constant expected dividend growth, but random risk-aversion.
Note that risk-aversion is a source of "long-run risk": once this risk is resolved, risk-aversion
remains fixed at its level forever.4
[Figure 7.9: dividend tree. Initial node: dividend 1. Good state: dividend $e^{\delta}$, followed by the
payoffs $e^{2\delta} + M_G$ or $1 + M_{GB}$. Bad state: dividend $e^{-\delta}$, followed by the payoffs $1 + M_{GB}$ or
$e^{-2\delta} + M_B$.]
FIGURE 7.9. A model of random risk-aversion and countercyclical volatility. The dividend
is normalized to one at the initial node of the tree. With prob $p$, the dividend then decreases to
$e^{-\delta}$ in the bad state. The risk-neutral probability of this state is denoted as $q$. The risk-neutral
probability of further dividend movements depends upon whether the economy is
in the good or bad state (i.e., $q_G$ or $q_B$). At time 3, the agent receives the dividends plus
the right to the stream of all future dividends. In the upper node, this right is worth $M_G$
(obtained through the risk-neutral probability $q_G$). In the central node, it is worth $M_{GB}$
(through the risk-neutral probability $q$). In the lower node, it is worth $M_B$ (through the
risk-neutral probability $q_B$).
Table 7.2 provides calibration results for this model, based on the statistics of Table 7.1
(see Appendix 2 for details). Note that the model is calibrated using data for the aggregate
dividend growth (not consumption), the volatility of which is about 6% annualized (almost
4 The literature on long-run risks is surveyed in Chapter 8. In this literature, the expected dividend growth is affected by some
unobservable and persistent factor, which generates countercyclical stock volatility, due to the assumption that dividend growth
and consumption volatility are countercyclical. The model in this section leads to countercyclical volatility without the assumption
that the volatility of the fundamentals is countercyclical.
twice that on consumption growth). In the Lucas model of the previous chapter (Section 6.2),
this figure would imply a RRA equal to $17 \approx 0.06/0.06^2$ to match an equity premium of 6%. However,
the Lucas model would predict that return volatility is simply dividend growth volatility, thus
being equal to 6%, which is less than a half of the average volatility in the data, 14.55%. Finally,
the Lucas model predicts the price-dividend ratio is constant, and therefore cannot lead to the
countercyclical statistics in Table 7.1.
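The back-of-the-envelope numbers in this paragraph can be reproduced as follows. The sketch assumes the textbook lognormal relation in which the equity premium equals relative risk aversion times the variance of dividend growth; the function name `lucas_implied_rra` is a hypothetical label.

```python
def lucas_implied_rra(equity_premium, dividend_vol):
    """Risk aversion eta such that eta * dividend_vol**2 equals the equity premium.

    Assumes the lognormal Lucas relation; inputs are decimals (0.06 means 6%).
    """
    return equity_premium / dividend_vol ** 2

rra = lucas_implied_rra(0.06, 0.06)   # 6% premium, 6% dividend growth volatility
vol_ratio = 6.0 / 14.55               # model return volatility over data volatility
```

The implied risk aversion is about 16.7, which rounds to 17, and the model's return volatility is about 41% of the 14.55% observed in the data.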
Data
                              expansions    average    recessions
P/D ratio                        33.21       31.99       26.20
excess return volatility         14.05       14.55       16.91

Model calibration
                              good state    average    bad state
P/D ratio                        32.50       31.81       28.15
excess return volatility          7.29        8.20       13.03
risk-adjusted rate                8.95        9.07        9.71
expected returns                 10.16       11.46       18.42
implied risk-aversion            13.69       13.89       14.96
TABLE 7.2. This table reports calibration results for the infinite horizon tree model in
Figure 7.9, by perfectly matching the model-implied time-3 P/D ratios to the data in the
first row. The model-implied statistics in the good, average and bad state are those of
time-2. The risk-adjusted rate is $r + \sigma_0\lambda_S$, where $r$ is the riskless rate; $\sigma_0$ is dividend
growth volatility, and $\lambda_S$ is the Sharpe ratio on gross returns in state $S$, determined as
$(q_S - p)/\sqrt{p(1-p)}$ for $S = G$ (the good state) and $S = B$ (the bad state).
Finally, $p$ is the probability of the bad state and $q_S$ is the state-dependent risk-adjusted
probability of the bad state (for $S \in \{G, B\}$). Implied risk-aversion is the coefficient RRA
$\eta_S$ in the good state ($S = G$) and in the bad ($S = B$) implied by the calibrated model.
The figures in the "average" column are the averages of the corresponding values in the
good and bad states, averages taken under $p = 0.158$.
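As a consistency check on the "average" column of Table 7.2, the sketch below recomputes each average as a probability-weighted mean of the good-state and bad-state values, with the bad state weighted by 0.158. The dictionary layout and variable names are illustrative; the numbers are those reported in the table.

```python
P_BAD = 0.158  # probability of the bad state used for the averages

# (good state, bad state) values from the model calibration panel of Table 7.2.
model = {
    "P/D ratio": (32.50, 28.15),
    "excess return volatility": (7.29, 13.03),
    "risk-adjusted rate": (8.95, 9.71),
    "expected returns": (10.16, 18.42),
    "implied risk-aversion": (13.69, 14.96),
}

# Averages as reported in the table.
reported = {
    "P/D ratio": 31.81,
    "excess return volatility": 8.20,
    "risk-adjusted rate": 9.07,
    "expected returns": 11.46,
    "implied risk-aversion": 13.89,
}

averages = {name: (1.0 - P_BAD) * good + P_BAD * bad
            for name, (good, bad) in model.items()}
max_gap = max(abs(averages[name] - reported[name]) for name in model)
```

Every recomputed average agrees with the reported figure to within rounding (a gap below 0.01).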
The model predicts that the average excess return volatility equals about 8%. Moreover,
the average implied RRA is now around 14, and the average expected excess returns are high.
Finally, the model of this section predicts stock volatility has swings that mimic those in the
data, with levels reaching 13% in the bad state. In the bad state, however, the model overstates
the expected returns (18.42%, against 9.62% in the data). Importantly, this calibration exercise illustrates
the asymmetric feature of expected returns and risk aversion. In this experiment, both expected
returns and risk-aversion increase more in bad times than they decrease in good.
7.3.2.3 Alternative channels
There are at least two broad mechanisms to explain aggregate stock market fluctuations, as
explained in the Introduction: (i) time-varying risk-premiums (as in this section), and (ii) time-varying
expected dividend growth. Section 7.5 surveys more elaborate models than the tree
of this section, aiming to rationalize countercyclical statistics based on the first channel. We
shall also examine models of learning, in which stock volatility is time varying due to the
agents' attempt to learn about the state of the economy: these models predict agents face time-varying
expected dividend growth. In the next chapter, we survey many models relying on both
channels, each of them focusing on particular economic mechanisms and specific predictions
(e.g., idiosyncratic risk, restricted stock market participation, heterogeneous beliefs, bubbles).
7.3.3 What to do with stock market volatility?
The historical behavior of equity volatility displays a pronounced business cycle pattern. Could
we exploit this pattern for the purpose of forecasting? This section considers two exercises.
In the first exercise, we forecast stock market volatility using past macroeconomic data. Note
that stock volatility is an input to many decision making processes, ranging from portfolio
selection to risk-management; understanding how it links to ongoing business cycle conditions
is therefore a natural exercise. The second exercise explores whether volatility helps predict
economic activity. This exercise might help decision makers (e.g., policy makers) take informed
decisions.
7.3.3.1 Macroeconomic constituents of stock market volatility
Table 7.3 reports results regarding the first forecasting exercise. How does volatility link to past
macroeconomic data? We use year-to-year industrial production growth and inflation as the two
macroeconomic factors that summarize the state of the economy at any given point in time.
Volatility is positively related to past growth in the medium term (say between one and two
years), a finding we can easily interpret. Economic activity is mean-reverting: good times are
followed by bad. Because stock market
volatility is countercyclical, high growth is followed by high volatility. These explanations are
similar to those put forward in Section 7.2 while elaborating on Eq. (7.4). Equity volatility is
also related to past inflation, but in a more complex manner.
Figure 7.10 (top panel) depicts equity volatility and its in-sample forecasts when the regression
model is fed with past macroeconomic data only. Naturally, the fit could be improved by
providing the model with information about both past volatility and past macroeconomic data.
Nevertheless, it is remarkable that the fit relying on past macro information is more than 60%
better than that relying on past volatility only, as witnessed by the R2s in Table 7.3.
Note that these results are not inconsistent with those reported by Schwert (1989a,b). Indeed,
this section relies on estimates regarding lower frequency scales than those investigated by
Schwert. More importantly, these estimates regard the linkages between stock market volatility
and the level of macroeconomic variables, not their volatility.
FIGURE 7.10. Stock market volatility predictions. The top panel depicts stock market
volatility (solid line) and its forecasts based on the sole use of past macroeconomic indicators (dashed), i.e. the model estimates in the second column of Table 7.3 (Past). The
bottom panel depicts stock market volatility and its prediction based on the realization
of future values of macroeconomic indicators, i.e. the model estimates in the third column
of Table 7.3 (Future). Shaded areas are NBER recession and expansion episodes.
The previous findings should not be interpreted as suggesting any causality link; they could be
best regarded as descriptive statistics. They do suggest, however, that stock market volatility
links to past macroeconomic developments. A natural question is how precisely it does. Once
again, the previous regressions capture mere statistical relations. Yet macroeconomic factors
are likely part of the evaluation leading to the very same volatility. In terms of our daily jargon,
macroeconomic factors could well be determinants of the pricing kernel. That is, stock volatility
links to how the price responds to shocks in the fundamentals and, hence, macroeconomic
conditions, but this linkage should be determined in the absence of arbitrage.
Corradi, Distaso and Mele (2013) pursue this topic in detail and build up a no-arbitrage model
that reproduces the previous predictability results. In their model, there is a no-arbitrage nexus
between equity volatility and macroeconomic factors. Christiansen, Schmeling and Schrimpf
(2012) and Paye (2012) provide evidence of Granger causality from past values of several
macroeconomic variables to stock volatility, in out of sample experiments. However, Paye notes that
we are still not able to exploit these linkages for forecasting purposes. It is an important result,
as it points to the possibility that in the future, alternative data sets could do a better job than
the data sets these authors are using.
The distinction between Granger causality and forecasting accuracy is indeed subtle. A set of
variables could well affect the probability distribution of stock volatility: this is the definition
of Granger causality. At the same time, estimating, say, a linear regression linking past
macroeconomic variables to stock volatility might not necessarily lead to forecasts that perform well.
Intuitively, this relation can be subject to parameter estimation error, which increases the
uncertainty surrounding the forecasts. This uncertainty might overwhelm bias reduction gains brought
by a correctly specified model, i.e. without omitted variables (macroeconomic variables). We
illustrate this point in more detail below (see Section 7.3.3.3).5
                         Past                                        Future
Const.           10.98     10.81      4.79        Const.             17.88
Growth_{t-12}               0.15      0.002       Growth_{t+12}       0.02
Growth_{t-24}               0.16      0.24        Growth_{t+24}       0.15
Growth_{t-36}               0.27      0.27        Growth_{t+36}       0.43
Growth_{t-48}               0.23      0.20        Growth_{t+48}       0.31
Infl_{t-12}                 0.76      0.63        Infl_{t+12}         1.12
Infl_{t-24}                 0.97      0.92        Infl_{t+24}         1.09
Infl_{t-36}                 0.45      0.62        Infl_{t+36}         1.03
Infl_{t-48}                 0.31      0.15        Infl_{t+48}         0.94
Vol_{t-12}        0.36                0.28
Vol_{t-24}        0.09                0.03
Vol_{t-36}        0.10                0.02
Vol_{t-48}        0.08                0.09
R2               12.50     21.91     27.24        R2                 24.30
TABLE 7.3. Forecasting stock market volatility with economic activity. The left part of
this table ("Past") reports OLS estimates in linear regressions of one year volatility (in %)
on past one year industrial production growth (in %), past one year inflation
(in %), and past stock volatility. Growth_{t-12} is one year industrial production growth at
time t-12, etc. Time units are months. The second part of the table ("Future") is similar,
but contains coefficient estimates in linear regressions of volatility on future industrial
production growth and future inflation. Starred figures are not statistically distinguishable
from zero at the 95% level. R2 is the percentage, adjusted R2.
7.3.3.2 Macroeconomic implications of stock market volatility
Does equity volatility also anticipate the business cycle? Table 7.3 suggests that stock volatility
does indeed link to future business cycle developments. The bottom panel of Figure 7.10 depicts
the predicting part of the regression in the third column of Table 7.3, a "back-casting" exercise.
Fornari and Mele (2013) have tackled this issue in great detail, concluding that stock
volatility does help predict the business cycle, on top of traditional indicators such as the
term spread and other financial variables, both in sample and out of sample.
To illustrate, note that stock volatility is not only countercyclical, i.e. a coincident business
cycle indicator. Figures 7.2 and 7.10 also seem to indicate that stock volatility tends to increase
before recessions, a typical attribute of a leading indicator. Consider the following regression:
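$$\mathrm{Vol}_t = \alpha + \sum_{i \in \{3, 12, 24, 36\}} \beta_i\,\mathrm{Vol}_{t-i} + \gamma_1\,\mathbb{I}_{O(\mathrm{NBER}_t = 1)} + \gamma_2\,\mathbb{I}_{\mathrm{NBER}_t = 1} + u_t \qquad (7.6)$$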
5 The literature on statistical tests for Granger causality and forecasting accuracy is large. See, e.g., Clark and West (2007) for
the former and Giacomini and White (2006) for the latter.
where $\mathrm{Vol}_t$ is stock volatility at month $t$; $\mathbb{I}_{O(\mathrm{NBER}_t = 1)}$ is the indicator function that equals one
in the twelve months preceding any NBER-dated recession, and zero otherwise; $\mathbb{I}_{\mathrm{NBER}_t = 1}$ is
the indicator function that equals one during any NBER-dated recession, and zero otherwise;
finally, $u_t$ is a residual term.
Table 7.4 reports estimates of the parameters of this model on a sample covering monthly data
from January 1957 to September 2008. The table reports estimates for the whole sample, and
two subsamples, one before the Great Moderation, i.e., up to 1982, and another, covering
the Great Moderation and ending in 2008. A value $\gamma_1 > 0$ for the coefficient on the
pre-recession indicator is indicative that stock volatility increases ahead of recessions, and a
value $\gamma_2 > 0$ for the coefficient on the recession indicator indicates that stock volatility also
increases during recessions.
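The two indicator variables in Eq. (7.6) are easy to construct from a monthly series of recession flags. The sketch below is one possible construction, not the authors' code: the function name `recession_indicators` and the toy data are hypothetical, and months that fall inside an earlier recession are assumed not to count as "preceding" the next one.

```python
def recession_indicators(nber, lead=12):
    """Build the pre-recession and in-recession indicators from 0/1 monthly flags.

    `nber[t]` is 1 if month t is in a recession. The first returned list equals 1
    in the `lead` months before each recession start (and 0 otherwise); the second
    is a copy of the recession flags themselves.
    """
    pre = [0] * len(nber)
    for t, flag in enumerate(nber):
        starts_now = flag == 1 and (t == 0 or nber[t - 1] == 0)
        if starts_now:
            for s in range(max(0, t - lead), t):
                if nber[s] == 0:  # assumption: skip months inside a prior recession
                    pre[s] = 1
    return pre, list(nber)

# Toy sample: 15 expansion months, a 3-month recession, then 2 expansion months.
nber_flags = [0] * 15 + [1] * 3 + [0] * 2
pre_recession, in_recession = recession_indicators(nber_flags)
```

In the toy sample, the pre-recession indicator switches on for months 3 through 14 (twelve months) and is zero elsewhere, including during the recession itself.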
3
1957-2008
1957-1982
1983-2008
3.11 0.94
3.60 0.98
2.88 0.94
12
0.15
0.24
0.09
24
36
0.01
0.02
0.05
0.01
0.04
0.01
0.48
0.34
1.01
1.51
1.87
1.22
TABLE 7.4. This table reports ordinary least squares estimates of the parameters in Eq.
(7.6) for the post-War data and two subsamples: (i) prior to and (ii) over the Great
Moderation. Starred figures are not statistically distinguishable from zero at the 95%
level.
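The regression in Eq. (7.6) can be sketched in a few lines. The following is a minimal illustration on simulated data, not the estimation behind Table 7.4: the function names, the simulated recession dates and all parameter values are hypothetical, and the sketch assumes that months which are themselves recession months are excluded from the pre-recession indicator.

```python
import numpy as np

def pre_recession_indicator(recession, lead=12):
    """1 in the `lead` months preceding a recession month, 0 otherwise.

    Assumption of this sketch: recession months themselves are set to 0,
    so the two indicators in the regression never overlap.
    """
    rec = np.asarray(recession, dtype=bool)
    pre = np.zeros(rec.size, dtype=bool)
    for k in range(1, lead + 1):
        pre[:-k] |= rec[k:]  # month t is flagged if month t+k is a recession
    return (pre & ~rec).astype(float)

def fit_volatility_regression(vol, recession, lags=(3, 12, 24, 36)):
    """OLS estimates ordered as [a, b_3, b_12, b_24, b_36, beta_1, beta_2]."""
    vol = np.asarray(vol, dtype=float)
    rec = np.asarray(recession, dtype=float)
    pre = pre_recession_indicator(recession)
    start, n = max(lags), vol.size
    columns = [np.ones(n - start)]
    columns += [vol[start - lag:n - lag] for lag in lags]  # lagged volatility
    columns += [pre[start:], rec[start:]]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, vol[start:], rcond=None)
    return coef

def simulate_volatility(recession, a=3.0, b3=0.5, b12=0.1,
                        beta1=1.0, beta2=1.5, noise=0.05, seed=0):
    """Hypothetical data generating process used only to exercise the code."""
    rng = np.random.default_rng(seed)
    rec = np.asarray(recession, dtype=float)
    pre = pre_recession_indicator(recession)
    vol = 7.5 + 0.5 * rng.standard_normal(rec.size)  # start-up values
    for t in range(36, rec.size):
        vol[t] = (a + b3 * vol[t - 3] + b12 * vol[t - 12]
                  + beta1 * pre[t] + beta2 * rec[t]
                  + noise * rng.standard_normal())
    return vol

# Three made-up recession episodes in a 600-month sample.
recession = np.zeros(600, dtype=int)
recession[100:111] = 1
recession[300:316] = 1
recession[500:509] = 1
estimates = fit_volatility_regression(simulate_volatility(recession), recession)
```

In this sketch, positive values of the last two entries of `estimates` correspond to volatility rising ahead of and during recessions, respectively.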
It appears that, especially during the Great Moderation, stock volatility does anticipate economic downturns. This issue is quite a delicate one. The fact that stock volatility is countercyclical does not necessarily imply that it anticipates real economic activity. And even if it did, it would remain to be established whether sustained stock market volatility could really create the premises for future economic slowdowns.
Post hoc ergo propter hoc? Does aggregate stock market volatility affect investment decisions in the real sphere of the economy? Or, rather, does volatility merely help predict the business cycle? The policy implications of these issues are clear. If volatility merely anticipates, without affecting, the business cycle, there is little policy makers can do about it, even if its forecasting power is interesting per se. These themes are still unexplored at the time of writing.
[However, survey the recent volatility paradox ideas]
7.3.3.3 Forecasting with the wrong model
The results in this section are in-sample. Real-time forecasts could turn out to be disappointing. One reason could be data-snooping: if we regress a variable of interest on thousands of candidate predictors, there is a considerable chance that at least one of them nicely links to the endogenous variable, and displays a spectacular fit (in-sample). However, precisely because this fit was obtained only by chance, and not due to an economic linkage between this variable and the endogenous one, the out-of-sample performance of the model will likely disappoint.
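The data-snooping mechanism is easy to reproduce by simulation. The sketch below is an illustration under made-up assumptions (pure-noise data, 2000 candidate predictors, 120 observations in and out of sample): it picks the predictor with the best in-sample fit and then evaluates it on fresh data. None of the names or numbers come from the text.

```python
import numpy as np

def best_spurious_predictor(n_obs=120, n_candidates=2000, seed=0):
    """Return (in-sample R2, out-of-sample R2) of the best of many noise predictors."""
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(2 * n_obs)                  # target, unrelated to X
    X = rng.standard_normal((2 * n_obs, n_candidates))  # candidate predictors
    y_in, y_out = y[:n_obs], y[n_obs:]
    X_in, X_out = X[:n_obs], X[n_obs:]

    # In-sample correlation of each candidate with the target.
    yc = y_in - y_in.mean()
    Xc = X_in - X_in.mean(axis=0)
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    best = int(np.argmax(np.abs(corr)))
    r2_in = corr[best] ** 2

    # Fit a univariate OLS on the winner, then forecast out of sample.
    slope = (Xc[:, best] @ yc) / (Xc[:, best] @ Xc[:, best])
    intercept = y_in.mean() - slope * X_in[:, best].mean()
    forecast = intercept + slope * X_out[:, best]
    mse_model = np.mean((y_out - forecast) ** 2)
    mse_benchmark = np.mean((y_out - y_in.mean()) ** 2)  # historical-mean forecast
    r2_out = 1.0 - mse_model / mse_benchmark
    return r2_in, r2_out

r2_in, r2_out = best_spurious_predictor()
```

The out-of-sample R2 is measured against the historical-mean forecast, so a negative value means the selected predictor does worse than ignoring it altogether.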
An opposite situation can also occur, in which a link between two variables really exists, but cannot be properly exploited for practical forecasting purposes. The intuitive reason for this difficulty is limited data. That is, we can only estimate a linkage between two variables by relying on a finite sample. Yet the finite-sample bias in the linkage estimates could turn out to be substantial, and lead to large forecasting errors. Consider the following example, a data generating process in which one variable, $x$, Granger causes a second one, $y$, as follows:
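$$y_t = \alpha + \beta x_{t-1} + u_t, \qquad u_t \sim NID(0, \sigma_u^2), \qquad x_t \sim NID(\mu, \sigma_x^2), \qquad (7.7)$$

for five constants $\alpha$, $\beta$, $\mu$, $\sigma_u^2$ and $\sigma_x^2$, the parameters of the model. We assume that $\sigma_u^2$ and $\sigma_x^2$ are known, and consider making predictions of the variable $y$ through two models. The first model is misspecified, in that we simply neglect that $x$ Granger causes $y$, i.e. $y_t = a + e_t$, for some constant $a$ and some residual term $e_t$. We estimate the constant of this misspecified model through ordinary least squares (OLS), obtaining the sample mean $\bar{y}$. The second model is the correctly specified one, with $\alpha$ and $\beta$ estimated through OLS. The prediction errors of the two models are:

$$e_{1,T+1} = y_{T+1} - \bar{y} = \beta\,(x_T - \bar{x}) + u_{T+1} - \bar{u}$$
$$e_{2,T+1} = y_{T+1} - \hat{\alpha} - \hat{\beta} x_T = -\frac{\widehat{Cov}(x, u)}{\widehat{Var}(x)}\,(x_T - \bar{x}) + u_{T+1} - \bar{u} \qquad (7.8)$$

where $\bar{y}$, $\bar{x}$ and $\bar{u}$ are sample means, and $\widehat{Cov}$ and $\widehat{Var}$ stand for the sample covariance and variance of their arguments. The misspecified model leads to an unbiased predictor, by the first line in Eq. (7.8). The correctly specified model does, naturally, lead to an unbiased predictor, in that $E(e_{2,T+1}) = 0$, by the second line in Eq. (7.8).
Therefore, the two models we consider (the misspecified and the correctly specified) both lead to unbiased predictors. However, the second predictor is plagued by parameter estimation error, and might actually lead to mean-squared prediction errors higher than those generated by the first predictor, especially when the sampling variance of $\hat{\beta}$ is large. In other words, for large samples, $\widehat{Cov}(x, u)/\widehat{Var}(x)$ is, of course, quite small, as $\hat{\beta}$ is consistent for $\beta$. In finite samples, however, this term can adversely affect the performance of the correctly specified model.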
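A small Monte Carlo experiment illustrates the point. The sketch below assumes a specific data generating process, $y_{t+1} = \alpha + \beta x_t + u_{t+1}$ with i.i.d. Gaussian $x$ and $u$, and compares the mean-squared prediction errors of the sample-mean forecast (the misspecified model) and of the OLS forecast (the correctly specified model). Sample size, parameter values and function names are hypothetical choices made for illustration.

```python
import numpy as np

def mspe_two_models(alpha, beta, mu, sigma_x, sigma_u, T,
                    n_sims=20000, seed=0):
    """Mean-squared prediction errors (misspecified, correctly specified)."""
    rng = np.random.default_rng(seed)
    x = mu + sigma_x * rng.standard_normal((n_sims, T + 1))  # x_0, ..., x_T
    u = sigma_u * rng.standard_normal((n_sims, T + 1))       # u_1, ..., u_{T+1}
    y = alpha + beta * x + u          # column t holds y_{t+1}, driven by x_t

    x_sample, y_sample = x[:, :T], y[:, :T]   # estimation sample
    x_last, y_next = x[:, T], y[:, T]         # predictor and target

    # Model 1 (misspecified): forecast with the sample mean of y.
    forecast_1 = y_sample.mean(axis=1)

    # Model 2 (correctly specified): forecast with OLS estimates.
    xc = x_sample - x_sample.mean(axis=1, keepdims=True)
    yc = y_sample - y_sample.mean(axis=1, keepdims=True)
    beta_hat = (xc * yc).sum(axis=1) / (xc ** 2).sum(axis=1)
    alpha_hat = y_sample.mean(axis=1) - beta_hat * x_sample.mean(axis=1)
    forecast_2 = alpha_hat + beta_hat * x_last

    return (np.mean((y_next - forecast_1) ** 2),
            np.mean((y_next - forecast_2) ** 2))

# Weak link, short sample: estimation error can dominate.
weak = mspe_two_models(alpha=0.0, beta=0.05, mu=0.0,
                       sigma_x=1.0, sigma_u=1.0, T=20)
# Strong link, same sample size: the correct model should win.
strong = mspe_two_models(alpha=0.0, beta=1.0, mu=0.0,
                         sigma_x=1.0, sigma_u=1.0, T=20)
```

With a weak link and few observations, the noise in the estimated slope outweighs the information it carries, which is why the misspecified forecast can have the lower mean-squared error in this setup.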
7.3.4 What did we learn?
Stock market volatility is higher in bad times than in good. Explaining this basic fact is challenging. We know very well how to model risk-premiums and how these premiums should relate to the business cycle. We are more embarrassed when it comes to explaining volatility. This section explains that countercyclical volatility could arise because risk-premiums undergo large swings as the economy moves away from good states, just as the data seem to suggest.
The focus in this section relates to the fluctuations of aggregate stock volatility and risk premiums, not their average levels. Not surprisingly, the question whether these fluctuations (and their average levels) can be consistent with the neo-classical model of rational evaluation is controversial, as for many topics at the intersection of financial economics and macroeconomics, as vividly illustrated in the early debates (see, e.g., Campbell, 2003; Mehra and Prescott, 2003). However, this section suggests that there is a potential for explaining the swings that aggregate stock volatility experiences across states of nature.
Do these theoretical insights have some additional empirical content? This section has discussed three empirical issues: (i) the market expected returns are strongly countercyclical and asymmetrically related to macroeconomic conditions; (ii) equity volatility links to the business cycle (in-sample), although it cannot necessarily be forecast through macroeconomic variables, out-of-sample; (iii) equity volatility contains information regarding business cycle developments.
We now turn to more theoretically-based explanations of the aggregate stock market fluctuations.
Asset returns depend on both payoffs and prices. Consider the following identity that holds for the gross returns, $R_{t+1} \equiv (P_{t+1} + D_{t+1})/P_t$,

$$\ln R_{t+1} \equiv g_{t+1} + \ln \frac{1 + p_{t+1}}{p_t}, \qquad (7.9)$$

where $g_t \equiv \ln (D_t / D_{t-1})$, the dividend growth, and $p_t \equiv P_t / D_t$, the price-dividend ratio. Thus, return volatility is countercyclical because the dividend growth and/or the price-dividend ratio changes have countercyclical volatility.
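The decomposition of log returns into dividend growth and price-dividend ratio changes is an accounting identity, which can be checked numerically. The sketch below uses arbitrary positive price and dividend series; the function name and the inputs are hypothetical.

```python
import numpy as np

def log_return_decomposition(prices, dividends):
    """Return (log gross return, dividend growth, price-dividend term)."""
    P = np.asarray(prices, dtype=float)
    D = np.asarray(dividends, dtype=float)
    log_return = np.log((P[1:] + D[1:]) / P[:-1])  # ln R_{t+1}
    growth = np.log(D[1:] / D[:-1])                # g_{t+1}
    pd_ratio = P / D                               # p_t
    pd_term = np.log((1.0 + pd_ratio[1:]) / pd_ratio[:-1])
    return log_return, growth, pd_term

rng = np.random.default_rng(1)
dividends = np.exp(np.cumsum(0.02 + 0.05 * rng.standard_normal(50)))
prices = dividends * (20.0 + 5.0 * rng.random(50))  # made-up price-dividend ratios
log_return, growth, pd_term = log_return_decomposition(prices, dividends)
```

Because `log_return` equals `growth + pd_term` observation by observation, any volatility in returns that dividend growth does not account for must come from the price-dividend term.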
The empirical evidence in Section 7.2 suggests that return volatility does not necessarily inherit the properties of the volatility of the fundamentals. Instead, the empirical evidence suggests at least two minimal predictions any model should make regarding the price-dividend ratio. First, it needs to be volatile, and second, it needs to be more volatile in bad times than in good. For example, in an economy driven by a state variable linked to the business cycle, such as habit formation (see Section 7.5), we would require that the price-dividend ratio be increasing and concave in the business cycle variable, as previously explained (see Section 7.3.2). Intuitively, this property ensures stock volatility increases on the downside, the very definition of countercyclical volatility. This section aims to provide conditions under which price-dividend ratios behave in this way.
7.4.1.2 Asymmetric behavior of the price-dividend ratio
Do price multiples behave asymmetrically over the business cycle? Empirically, they do, as
explained. We now rely on a simple continuous time model that leads to these asymmetric
0 CF (
where $r(\cdot)$ is the short-term rate and $\lambda_{CF}(\cdot)$ is the cash-flow lambda. In the next sections and
in the next chapter, we explain how agents' preferences and beliefs can lead to these discount
rates. We assume that
is solution to
=
( )
( )
( )
1 2
1 )
R( )
(7.10)
= ( 0 2 0 )( )+ 0 ( 1
( )=E
where $\hat{W}_1$ is a standard Brownian motion under the risk-neutral probability $Q$, and $E$ denotes the expectation under $Q$.
Eq. (7.10) is derived in Chapter 4 (Section 4.2.5). It suggests that the sensitivity of the price-dividend ratio $p$ with respect to the state variable $y$ is related to the sensitivity of the risk-adjusted discount rate $R$ with respect to $y$. Hence, whether volatility is countercyclical now depends on how the risk-adjusted discount rates change after shocks in $y$.
This section formalizes the previous intuition. It shows that if $R$ increases in bad times sufficiently more than it does in good times, the price-dividend ratio is concave in $y$, thereby being more volatile in bad times than in good. This property is desirable because it would be consistent with the empirical behavior of price-dividend ratios and volatility. Chapter 6 explains that state variables additional to dividends are needed to drive fluctuations in the price-dividend ratio. But Chapter 6 also explains that multifactor models are not necessarily satisfactory. Indeed, multifactor models exist, such that (i) the variance of the pricing kernel increases arbitrarily with the number of factors, and yet (ii) price-dividend ratios are constant. What we really need is discipline on how to increase the dimension of a model.
In the remainder of the chapter, we focus on two broad but key properties of models consistent with this search process: (i) monotonicity and (ii) convexity properties:
(i) Monotonicity. Consider the price-dividend ratio $p(y)$ in Eq. (7.10). By Itô's lemma, stock volatility is $\sigma_0 + \frac{p'(y)}{p(y)} \mathrm{Vol}(y)$, where $\mathrm{Vol}(y) \equiv \sqrt{v_1^2(y) + v_2^2(y)}$ is the volatility of $y$. Therefore, $y$ can help inflate volatility if $p$ increases with $y$. This monotonicity is important theoretically: it ensures that stock volatility is strictly positive, thereby guaranteeing that the agents' budget constraints are well-defined.
(ii.1) Negative convexity. Suppose as before that $y$ correlates with the business cycle. If $\mathrm{Vol}(y)$ is constant, stock volatility is countercyclical whenever $p$ in Eq. (7.10) is concave in $y$, as in the simple reasoning underlying Figure 7.8. We shall study this point in detail in Section 7.5.3.
(ii.2) Convexity. Alternatively, suppose that expected dividend growth, $g$ say, is stochastic (an assumption we explore in detail in Section 7.5). We shall explain that under certain conditions, the price-dividend ratio is a function of $g$, similarly as with Eq. (7.10). Now, suppose $p$ is increasing and convex in $g$. In this case, the price-dividend ratio would display "overreaction" to small changes in $g$ in good times, i.e. when $g$ is high. The empirical relevance of this point was first acknowledged by Barsky and De Long (1990, 1993), and formalized by Veronesi (1999) in a model with learning (see Section 7.5.4).
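Properties (i) and (ii.1) can be illustrated with a toy computation. The sketch below assumes a hypothetical increasing and concave price-dividend ratio, $p(y) = 20 \ln(1 + y)$, a constant $\mathrm{Vol}(y)$ and a constant dividend volatility; none of these choices come from the text. It evaluates the stock volatility expression, dividend volatility plus $p'(y)/p(y)$ times $\mathrm{Vol}(y)$, on a grid of states.

```python
import numpy as np

def stock_volatility(p, y, sigma0=0.03, vol_y=0.10, h=1e-5):
    """Dividend volatility plus p'(y)/p(y) times Vol(y).

    p'(y) is approximated with a central finite difference.
    """
    dp = (p(y + h) - p(y - h)) / (2.0 * h)
    return sigma0 + dp / p(y) * vol_y

def concave_price_dividend_ratio(y):
    # Hypothetical increasing and concave price-dividend ratio.
    return 20.0 * np.log1p(y)

y_grid = np.linspace(0.5, 3.0, 6)   # low y = bad times, high y = good times
vols = stock_volatility(concave_price_dividend_ratio, y_grid)
```

In this example `vols` falls as `y` rises: volatility exceeds dividend volatility everywhere because the ratio is increasing, and it is highest in bad times because the ratio is concave.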
We now introduce a framework to study these issues. We need to revisit the option pricing
literature on convexity of option prices and extend it to contexts with untraded risks. Chapter
10 contains additional explanations regarding these general properties of option prices that go
beyond those needed for the purpose of this chapter.
7.4.2 Asset prices as options
Consider a two-period market for a cash payoff to be paid in the second period. We assume interest rates are flat at zero, and the payoff equals $\psi(\tilde{x})$ for some random variable $\tilde{x}$. Let $E[\psi(\tilde{x})]$ be the premium.
The focus of standard textbooks is how the premium relates to the volatility of $\tilde{x}$ (see Appendix 5 for further details). Appendix 5 considers a dynamic extension of this problem, and develops conditions matching those in the static case. In this extension, $\tilde{x} = x_T$, for some future date $T$, where $x_\tau$ is a random process, with $x_0 = x$, such that the price of the claim is now,

$$c(x) = E(\psi(x_T) \mid x). \qquad (7.11)$$

Clearly, the two pricing problems, $E(\psi(x_T) \mid x)$ and $E(\psi(\tilde{x}))$, are not the same. They actually bear similarities if (i) $x$ is the price of a traded asset; and (ii) $x$ is a proportional process, one for which the risk-neutral distribution of $x_T / x$ is independent of $x$. If these assumptions hold, the usual tools of the static case still apply to this dynamic case. In particular, $c$ increases after a mean-preserving spread in $x$ whenever $\psi$ is convex.6 We now examine cases in which these assumptions do not necessarily hold.
7.4.2.1 Volatility, options and convexity
If $x$ is the price of a traded asset, which does not pay dividends, the drift of $x$ under the risk-neutral probability is proportional to $x$. We shall clarify soon that in this case, the price inherits the convexity properties from the final payoff only.
But there are risks that are not necessarily traded. In these markets, interesting nonlinearities arise. For example, Theorem 7.1 reveals that in this context, convexity of the payoff is neither a necessary nor a sufficient condition for convexity of the price. The drift of the underlying state variable plays a crucial role.7
6 This prediction is consistent with the celebrated Black and Scholes (1973) formula, as we further explain in Chapter 10, a point made by Jagannathan (1984, p. 429-430). As further explained in Chapter 10, Bajeux-Besnainou and Rochet (1996, Section 5), Bergman, Grundy and Wiener (1996), El Karoui, Jeanblanc-Picqué and Shreve (1998) and Romano and Touzi (1997) generalize these results to more general diffusion models, including those with stochastic volatility.
7 Kijima (2002) produces a counterexample where convexity of option prices might break down even if payoffs are convex in the underlying, and traded, assets. This counterexample relies on an extension of the Black-Scholes model where due to the presence of
Let us consider the following problem. It is the benchmark for a number of pricing problems
dealt with in the remainder of this chapter.
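Canonical pricing problem. Let $x$ be the solution to

$$dx_\tau = b(x_\tau)\,d\tau + a(x_\tau)\,dW_\tau, \qquad (7.12)$$

where $W$ is a Brownian motion, and define

$$c(x, T) \equiv E\left[\exp\left(-\int_0^T \rho(x_t)\,dt\right) \psi(x_T) \,\Big|\, x_0 = x\right] \qquad (7.13)$$

to be the price of an asset which promises to pay $\psi(x_T)$ at time $T$.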
( )
( )
+
+
( )
(7.14)
1
( )
where $W_1$ and $W_2$ are standard Brownian motions. This example generalizes that in Section 7.4.1.2, in that expected dividend growth can be stochastic, driven as it is by the state variable.
Note also that in this model, the distribution of dividend growth does not depend on the level of dividends. That is, dividends follow a proportional Samuelson-Merton process. Moreover, we assume that the short-term rate only depends on the state variable. These assumptions imply that the price-dividend ratio is only a function of the current state, similarly as in Section 7.4.1. However, due to stochastic expected dividend growth, the expression for the price-dividend ratio is slightly more general than that in Eq. (7.10). It is:
Z
( )=
( )
(7.15)
0
(
)
0
( ) E
h
1 2
i
( )
)
+ 0 1
0
0 R(
2 0
=E
( )
)
0
0 R(
=E
(7.16)
dividends, the drift of the underlying asset is concave in the asset price. Among other things, Theorem 7.1 below unveils the origins
of this counterexample.
P2
=
( )
( )
=1
( )
and
satisfy:8
1
+
( ) 1 +
( ) 2
where are two Brownian motions under . The third line of Eq. (7.16) is obtained with a
is the expectation taken under a conveniently changed probability
change of probability, and E
, dened by the Radon-Nikodym derivative,
= 12 20 + 0 1
(7.17)
where
denotes the information set as of time
have that under ,
= ( ) +
where ( )
( )
( ) 1 + 2 ( ) 2
P2
( ) ( )+ 0 1( )
=1
(7.18)
and are Brownian motions under .9 Note the trick we have used to arrive to a relatively
1 2
neat formula, by getting rid of the term, 2 0 + 0 1 , arising because consumption and the
state variable are correlated. The density of under is right-shifted with respect to the
same density under , due to the positive covariance between consumption growth and
.
Our canonical pricing problem allows us to analyze properties of prices relating to long-lived
assets, through those relating to in Eq. (7.16), with as in Eq. (7.18), once we set
( )
1;
( )
R( )
( );
( )= ( )
(7.19)
The next theorem characterizes slope and convexity properties of the price in the canonical pricing problem, where $\psi$ denotes the payoff function, $\rho$ the discount rate, $b$ the drift of the state variable $x$, and $c$ the price.
Theorem 7.1. We have:
(i) If $\psi' > 0$, then $c$ is increasing whenever $\rho' \le 0$. Furthermore, if $\psi' = 0$, then $c$ is decreasing (resp. increasing) whenever $\rho' > 0$ (resp. $< 0$).
(ii) If $\psi'' \le 0$ (resp. $\psi'' \ge 0$) and $c$ is increasing, then $c$ is concave (resp. convex) whenever $b'' < 2\rho'$ (resp. $b'' > 2\rho'$) and $\rho'' \ge 0$ (resp. $\rho'' \le 0$). Finally, if $b'' = 2\rho'$, $c$ is concave (resp. convex) whenever $\psi'' < 0$ (resp. $> 0$) and $\rho'' \ge 0$ (resp. $\le 0$).
Theorem 7.1-(i) generalizes previous results regarding monotonicity of option prices, obtained by Bergman, Grundy and Wiener (1996). By the so-called "no-crossing" property of a diffusion, $x$ is not decreasing in its initial condition. Therefore, $c$ inherits the same monotonicity features of $\psi$ if discounting does not operate adversely. This simple observation allows us to address monotonicity properties of long-lived asset prices, as we shall see in Section 7.5.
Theorem 7.1-(ii) generalizes a number of existing results on option price convexity. First, assume that $\rho$ is constant and that $x$ is the price of a traded asset, such that $\rho' = b'' = 0$.
8 See, for example, Huang and Pagès (1992, Theorem 3 p. 53) and Wang (1993, Lemma 1, p. 202), for regularity conditions underlying the Feynman-Kac theorem in infinite horizon settings; and Huang and Pagès (1992, Proposition 1, p. 41) for regularity conditions ensuring that Girsanov's theorem holds in infinite horizon settings.
9 Mele (2005, 2007) contains the first derivation of this representation of the price-dividend ratio.
The last part of Theorem 7.1-(ii) then says that the convexity of $\psi$ propagates to the convexity of $c$. This result reproduces the findings in the literature surveyed earlier. Theorem 7.1-(ii) characterizes convexity in a more general context. Suppose, for example, that $\psi'' = \rho' = 0$, and that $x$ is not a traded risk. Then, Theorem 7.1-(ii) suggests that $c$ inherits the convexity of the drift of $x$. As a final example, Theorem 7.1-(ii) extends a result in Mele (2003) relating to bond pricing: let $\psi(x) = 1$ and $\rho(x) = x$. Accordingly, $c$ is the price of a zero-coupon bond in a short-term rate model (see Chapter 12 for details). By Theorem 7.1-(ii), $c$ is convex whenever $b'' < 2$ (see Appendix 6 for further details and intuition on this bounding number).
Option prices rely on both discounting and nonlinearities affecting the drift of the state variables, when the underlying fundamentals are not traded (unlike stock prices). In Section 7.5, we rely on the predictions of Theorem 7.1 and analyze the price of long-lived assets. In the next section, we illustrate the gist of the proofs underlying Theorem 7.1, by developing one example.
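The bond pricing example can be checked in a model with a known closed-form price. The sketch below assumes the Vasicek short-term rate model, which is a choice made here for illustration and is not discussed in the text: the risk-neutral drift of the short-term rate is linear, so its second derivative is zero and the bound of 2 is satisfied. Parameter values and function names are hypothetical.

```python
import math

def vasicek_bond_price(r, T=5.0, kappa=0.5, theta=0.04, sigma=0.02):
    """Zero-coupon bond price when dr = kappa*(theta - r) dt + sigma dW
    under the risk-neutral probability (Vasicek closed form)."""
    B = (1.0 - math.exp(-kappa * T)) / kappa
    ln_A = ((theta - sigma ** 2 / (2.0 * kappa ** 2)) * (B - T)
            - sigma ** 2 * B ** 2 / (4.0 * kappa))
    return math.exp(ln_A - B * r)

def second_difference(f, x, h=1e-3):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

rates = [0.00, 0.02, 0.04, 0.06, 0.08]
prices = [vasicek_bond_price(r) for r in rates]
curvatures = [second_difference(vasicek_bond_price, r) for r in rates]
```

In this model the price is decreasing and convex in the short-term rate at every point of the grid, in line with the convexity condition stated above.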
7.4.2.2 A digression on a macro-asset option
( )
+ ( )
1
2
for all
and
[0
(7.20)
(
)
=
(
)=
=
(7.21)
1
2
(
for all
and
[0
)=
( ), all
( )
(7.22)
1
2
1
+ ( + ( 2 )0 )
2
)=
) = 0, all
) +
for all
and
[0
) (7.23)
0(
)+
1
2
+( +( 2 )0 )
1 2 00
( ) ) +
2
for all
and
[0
) (7.24)
where
(
)+
00
( ) (
(7.25)
0
(
)
)=
( ) . Thus is increasing in by
By Eq. (7.22), we have that (
the assumption that is increasing and convex, and the no-crossing property of a di usion,
by which
is increasing in the initial condition . Therefore,
0. Furthermore,
0.
00
Therefore, (
) 0 whenever ( ) 0. By the Feynman-Kac theorem, then, is convex
in whenever 00 ( ) 0.
The previous conclusions can hold even with a concave payo function, say ( ) = ln . In
(
)1
this case, Eq. (7.22) implies that (
)=
, such that the function in Eq. (7.25)
collapses to, (
) = 00 ( ) (
). That is, the price (
) is convex (resp. concave) in
whenever is convex (concave) in . Note, then, that the price is linear in whenever 00 = 0,
as it can easily be veried by replacing ( ) = ln into Eq. (7.21), leaving:
(
)=
ln +
These examples convey the gist of the arguments underlying the proof of Theorem 7.1. They
also illustrate how we shall proceed to develop properties regarding long-lived asset prices in
the context of the canonical pricing problem of Section 7.4.2.1.
FIGURE 7.11. [Diagram with the following labels: Pricing kernel; Dividends distribution; 1. Interest rates; 2. Risk-premium; 1. Expected returns; 2. Returns volatility.]
This section deals with this search process while relying on the methodology introduced in the previous section. We consider two economies. In the first economy, changes in the economic fundamentals determine cyclical variations in the discount rates (in Section 7.6.1). In the second, the economic fundamentals lead to time-varying expected dividend growth (in Section 7.6.2). We first provide preliminary results about pricing kernels, which we need while illustrating these two broad classes of economies. Finally, Section 7.5.4 is an introduction to a class of hopefully analytically convenient processes we can use to model long-lived asset prices.
7.5.2 Markov pricing kernels, asset returns and volatility
7.5.2.1 Pricing kernels
Motivated by the previous discussion, this section considers economies in which asset prices have high volatility due to volatile pricing kernels and stochastic expected dividend growth. We provide foundations relying on a representative agent economy. In this setting, interest rates and risk-premiums change randomly because the agent's utility depends on both consumption and other variables. While complete markets naturally fit in the analysis of this section (see Chapter 2), incomplete markets can sometimes be analyzed relying on this framework, as explained in Chapter 8.
We extend some foundational issues in Chapter 4. Consider the stochastic discount factor in
Chapter 4,
( )
, where the pricing kernel process
satises
(
)=
=1
(7.26)
= (
) + 0( )
1
(7.27)
= (
) + 1(
)
)
1 + 2(
2
The pricing kernel is solution to
=
(7.28)
1(
) =
) =
(
)
(
)
ln (
)
)
0( )
2
ln (
1(
ln (
where
is the infinitesimal generator operator (see Chapter 4) for (
) and
(
) denote
the diffusion coefficients of .
For example, consider an infinite horizon economy in which aggregate dividends are solution
to Eq. (7.27), with 2 0 and
1 , and a representative agent solves the following program:
Z
(
)
s.t
0
max
0 =
0
(
where
0, the instantaneous utility
tiable, and is solution to
=
(
1(
(
1(
12
)=
)
)
11
)=
(
1(
11
)
)
(
)
)
1
2
)
)
( )
1
2
(
1(
12
2
0
(
)
)
1(
)
122 (
)
)
1(
111
( )
(
)
)
in Eq. (7.26),
1 , and
(
( )
(
1(
112
)
)
(7.29)
(7.30)
We study the implications of these pricing kernels in terms of the asset expected returns and volatility. We base the derivations on the continuous time formulation of the APT model in Chapter 4 (see Section 4.2.5). These derivations rely on the assumption of a scale-invariant economy, and may prove useful as guidance for empirical work. It can be shown that a scale-invariant economy obtains once we assume that in Eq. (7.27), (i) the drift of dividends is proportional to the dividend level, (ii) the diffusion coefficient of dividends is proportional to the dividend level (for some constant of proportionality), and (iii) the drift and diffusion coefficients of the state variable are independent of the dividend level.
independent of . We still assume that is some monotonic function Y () of in Eq. (7.26);
accordingly, we express the price-dividend ratio in terms of .
Under these assumptions, we have that the asset expected returns are
E
( ) +
=
where
R =
0 1
( )
W =
( )
(
( )
( )
(7.31)
( )+
( )
( ))
Vol1 ( )
Vol2 ( )
"
( )
0(
2( ) (
0(
)
( )
)
)
We rely on these predictions while analyzing a number of models in the following sections as
well as in Chapter 8.
7.5.3 External habit formation
Time-varying risk-premiums are a plausible engine of asset price fluctuations. Intuitively, the very properties of asset prices must necessarily inherit those of the risk-premiums, as illustrated by Figure 7.11. Campbell and Cochrane's (1999) model of external habit formation is a well-known attempt at explaining some of the empirical features outlined in Section 7.2 by incorporating time-varying risk-premiums into an economy with i.i.d. dividend growth.
Consider an infinite horizon economy in which a representative agent has undiscounted instantaneous utility:

$$u(c, x) = \frac{(c - x)^{1-\eta} - 1}{1 - \eta}, \qquad (7.32)$$

where $c$ denotes consumption and $x$ is a time-varying habit, or exogenous "subsistence level." In this model, the habit process is defined in a residual way, through the surplus consumption ratio, as we now explain.
The total endowment process, $D_\tau$, satisfies,10

$$\frac{dD_\tau}{D_\tau} = g_0\,d\tau + \sigma_0\,dW_\tau. \qquad (7.33)$$

10 Campbell and Cochrane (1999) consider a discrete-time model in which the log-consumption growth is Gaussian. Eq. (7.33) is the diffusion limit of their model.
A measure of distance between consumption and the level of habit is the surplus consumption ratio, $s$, defined through

$$-\frac{u_c(c, x)}{c\,u_{cc}(c, x)} = \frac{1}{\eta}\frac{c - x}{c} = \frac{1}{\eta}\frac{D - x}{D} \equiv \frac{1}{\eta}\,s, \qquad (7.34)$$

where subscripts denote partial derivatives, the second equality is the equilibrium condition, $c = D$, and the third is the definition of the equilibrium surplus consumption ratio. By assumption, $\ln s$ is solution to:

$$d \ln s_\tau = (1 - \phi)(\bar{s} - \ln s_\tau)\,d\tau + \sigma_0\,l(s_\tau)\,dW_\tau, \qquad (7.35)$$

where $l$ is a positive function, defined below. That is, the surplus ratio is driven by output innovations (i.e., by $dW$): the higher the output growth innovations (which lead to higher consumption in equilibrium), the higher the surplus ratio.11
This model of habit formation differs from previous formulations such as that of Ryder and Heal (1973), or Sundaresan and Constantinides (1990), due to three properties: (i) it is an external theory, in that the habit is driven by aggregate consumption, not by consumption chosen by the individual, similarly as with Abel's (1990) "catching up with the Joneses" formulation, or Duesenberry's (1949) relative income model; (ii) habit responds to consumption smoothly, not to each period's past consumption, as in previous models of habit formation such as that of Ferson and Constantinides (1990); (iii) it guarantees marginal utility is always positive.
The second of the previous properties produces slow mean reversion in the price-dividend ratio, long-horizon predictability, and large predictable movements in stock volatility, three empirical features reviewed in Section 7.2.
Note that markets are complete as there is only one source of risk (the dividend in Eq. (7.33)). Therefore, we can determine the Sharpe ratio in this economy relying on results in Section 7.5.2 (see Eq. (7.30)):

$$\lambda(D, x) = \frac{\eta}{s}\left(\sigma_0 - \frac{1}{D}\,\sigma_x(D, x)\right), \qquad (7.36)$$

where $\sigma_x$ is the diffusion coefficient of equilibrium habit, $x = D(1 - s)$, and $s$ is solution to Eq. (7.35). By Itô's lemma, $\sigma_x(D, x) = (1 - s - s\,l(s))\,D\,\sigma_0$, which replaced into Eq. (7.36), leaves:

$$\lambda(s) = \eta\,\sigma_0\,(1 + l(s)). \qquad (7.37)$$

The real interest rate is, by Eq. (7.29),

$$r(s) = \delta + \eta\left(g_0 - \tfrac{1}{2}\sigma_0^2\right) + \eta\,(1 - \phi)(\bar{s} - \ln s) - \tfrac{1}{2}\,\eta^2 \sigma_0^2\,(1 + l(s))^2, \qquad (7.38)$$

where $\delta$ is the subjective discount rate. The third term reflects usual intertemporal substitution effects. Due to mean reversion, bad times (when $s$ is low) are those when agents expect $s$ to improve. Therefore, in bad times, agents expect their marginal utility to decrease in the future; to compensate for this fall, they try to decrease future consumption relative to today's, by trying to save less (or trying to borrow more), thereby pushing interest rates up. The last term is a precautionary savings term.
11 One could add an additional Brownian motion in Eq. (7.35) to lower the conditional correlation between output growth and surplus ratio.
Campbell and Cochrane (1999) choose the function $l$ so as to satisfy three conditions: (i) the short-term rate $r$ is constant; and habit is predetermined both (ii) at the steady state, and (iii) near the steady state. A constant $r$ is consistent with the empirical evidence surveyed in Section 7.2, that real interest rates are not very volatile, compared to stock returns. Making habit predetermined at and near the steady state formalizes the idea that it takes time for consumption shocks to affect habit, at least at the steady state. The Appendix shows that under these conditions, the function $l$ is:

$$l(s) = \frac{1}{\bar{S}}\sqrt{1 + 2(\bar{s} - \ln s)} - 1, \qquad (7.39)$$

where $\bar{S} = \sigma_0 \sqrt{\eta / (1 - \phi)} = \exp(\bar{s})$. In turn, this function implies that the short-term rate in Eq. (7.38) is: $r = \delta + \eta\left(g_0 - \tfrac{1}{2}\sigma_0^2\right) - \tfrac{1}{2}\eta\,(1 - \phi)$, with $\delta$ denoting the subjective discount rate.12
The next picture depicts the function $l$ in Eq. (7.39), obtained using the parameter values in Campbell and Cochrane, $\eta = 2$, $\sigma_0 = 0.0150$, $\phi = 0.870$. It is decreasing in $s$, and convex in $s$, over the empirically relevant range of variation of $s$.
[Figure: the function l(s), vertical axis from 0 to 50, plotted against s over the range 0.00 to 0.09.]
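The sensitivity function of Campbell and Cochrane can be evaluated directly. The sketch below assumes the functional form $l(s) = \bar{S}^{-1}\sqrt{1 + 2(\bar{s} - \ln s)} - 1$ with $\bar{S} = \sigma_0\sqrt{\eta/(1-\phi)}$ and the parameter values quoted above. The short-rate function follows the drift and precautionary-savings decomposition described in this section. The values of `delta` and `g0` in the usage lines are hypothetical placeholders, as are all function names.

```python
import numpy as np

ETA, SIGMA0, PHI = 2.0, 0.0150, 0.870             # values quoted in the text
S_BAR = SIGMA0 * np.sqrt(ETA / (1.0 - PHI))       # steady-state surplus ratio
LOG_S_BAR = np.log(S_BAR)

def sensitivity(s):
    """The function l(s); defined while the square-root argument is positive."""
    s = np.asarray(s, dtype=float)
    return np.sqrt(1.0 + 2.0 * (LOG_S_BAR - np.log(s))) / S_BAR - 1.0

def short_rate(s, delta, g0):
    """Short-term rate: discounting, growth, substitution and precautionary terms."""
    s = np.asarray(s, dtype=float)
    substitution = ETA * (1.0 - PHI) * (LOG_S_BAR - np.log(s))
    precaution = 0.5 * ETA ** 2 * SIGMA0 ** 2 * (1.0 + sensitivity(s)) ** 2
    return delta + ETA * (g0 - 0.5 * SIGMA0 ** 2) + substitution - precaution

s_grid = np.linspace(0.005, 0.09, 50)
l_values = sensitivity(s_grid)
rates = short_rate(s_grid, delta=0.10, g0=0.02)   # hypothetical delta and g0
```

On this grid `l_values` decreases with `s`, and `rates` is flat because the substitution and precautionary terms offset each other exactly under this functional form. A finite-difference check of convexity is best restricted to values of `s` below `S_BAR`, which is where the second derivative of this particular function is positive.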
These properties are inherited by the Sharpe ratio in Eq. (7.37). Precisely, the Sharpe ratio is countercyclical and moves asymmetrically over the business cycle because habit is predetermined near the steady state, and the short-term rate constant (or, at least, affine in $\ln s$, as in the Appendix).
The model makes a number of important predictions. Consider, first, the instantaneous utility in Eq. (7.32). By Eq. (7.34), relative risk aversion is equal to $\eta s^{-1}$. That is, risk aversion is countercyclical. Formally, the stochastic discount factor is $e^{-\delta\tau}\left(D_\tau / D_0\right)^{-\eta}\left(s_\tau / s_0\right)^{-\eta}$, $\tau \ge 0$. It is countercyclical because both $D$ and $s$ are procyclical; moreover, it is more volatile than the standard
12 The Appendix considers a slightly more general model, in which the short-term rate is affine in $\ln s$.
where the risk-adjusted discount rates, R ( ), and the wedge over them, W ( ), are given by:
R( ) = ( ) +
( )
W( )=
( )
( ) ( )
( )
(7.40)
and $v(\cdot)$ denotes the diffusion coefficient of $s$ in Eq. (7.35), $v(s) = \sigma_0\,s\,l(s)$. The mechanism is sensible. Intuitively, during economic downturns, the surplus consumption ratio decreases and agents become more risk-averse. As a result, prices decrease and expected returns increase; moreover, the model leads to realistic risk premiums. Note that a sufficient condition for these effects to occur is that the price-dividend ratio is concave, which also ensures a countercyclical wedge, $W'(s) < 0$. However, the economy is one with high risk-aversion, as the calibrated model produces an average value of $\eta s^{-1}$ of approximately 40.
By Eq. (7.35), the log of $s$ is a mean-reverting process. By taking logs, we are sure that $s$ remains positive. Moreover, $\ln s$ is also conditionally heteroskedastic since its instantaneous volatility is $\sigma_0 l$. Because $l$ is decreasing in $s$, and $s$ is clearly procyclical, the volatility of $\ln s$ is countercyclical. This feature is responsible for many interesting properties of the model, such as countercyclical returns volatility.
Finally, the Sharpe ratio in Eq. (7.37) is made up of two components. The first is $\eta\sigma_0$, which coincides with the Sharpe ratio predicted by the standard Gordon's (1962) model. The second is $\eta\sigma_0\,l(s)$, and arises as a compensation related to the stochastic fluctuations of the habit, $x = D(1 - s)$. Therefore, $\lambda$ is countercyclical due to the functional form of $l$. Combined with a high $\phi$, this assumption leads to slowly varying, countercyclical expected returns. Finally, the model suggests that the price-dividend ratio is concave in $s$.13
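The two components of the Sharpe ratio, and the implied local risk aversion, can be tabulated over a range of surplus consumption ratios. The sketch below assumes the same functional form for $l$ and the same parameter values as quoted earlier in this section. The grid and the function names are hypothetical.

```python
import numpy as np

ETA, SIGMA0, PHI = 2.0, 0.0150, 0.870          # values quoted in the text
S_BAR = SIGMA0 * np.sqrt(ETA / (1.0 - PHI))    # steady-state surplus ratio

def sensitivity(s):
    """The function l(s) under the assumed Campbell-Cochrane specification."""
    s = np.asarray(s, dtype=float)
    return np.sqrt(1.0 + 2.0 * (np.log(S_BAR) - np.log(s))) / S_BAR - 1.0

def sharpe_ratio(s):
    """Constant component eta*sigma0 plus habit-related component eta*sigma0*l(s)."""
    return ETA * SIGMA0 + ETA * SIGMA0 * sensitivity(s)

def relative_risk_aversion(s):
    """Local curvature of utility, eta / s."""
    return ETA / np.asarray(s, dtype=float)

s_grid = np.linspace(0.01, 0.09, 9)            # low s = bad times
sharpe = sharpe_ratio(s_grid)
risk_aversion = relative_risk_aversion(s_grid)
```

Both `sharpe` and `risk_aversion` decline as `s` increases on this grid, which is the sense in which they are countercyclical. At the steady state the Sharpe ratio reduces to `ETA * SIGMA0 / S_BAR`, and the constant component `ETA * SIGMA0` alone is only 0.03.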
We now explain the link between convexity of the risk-adjusted discount rates $R$ and concavity of the price-dividend ratio in this model. We rely on the canonical pricing problem of Section 7.4, and appeal to Theorem 7.1. What is the price-dividend ratio in this economy? Note that the short-term rate is constant in this model, as discussed. Yet for the sake of generality, assume it is state-dependent, although only a function of $s$ (as, e.g., in the Appendix). The price-dividend ratio is then as in Eqs. (7.15)-(7.16)-(7.18), with constant growth, i.e., $g(s) = g_0$:
Z
0 R( )
( )=
E
(7.41)
0=
0
+ ( )
where
( ) = ( ) +( 0
( ) = (1
( )) ( )
( ) = 0 ( )
1 2 2
) ( ln ) + 2 0 ( )
0. Then, the
(ii) Suppose that the price-dividend ratio is procyclical. Then, the price-dividend ratio is also a concave function of $s$ as soon as the risk-adjusted discount rates are convex in $s$, viz. $R''(s) > 0$, and the second derivative of the drift of $s$ in Eq. (7.41) is no larger than $2R'(s)$.
The previous statements impose joint restrictions on the primitives such that the price-dividend ratio is consistent with properties given in advance. The economic interpretation of the convexity of $R$ is similar to that anticipated in Section 7.3 (see Figure 7.7). In terms of the Campbell-Cochrane economy, countercyclical volatility arises because $R$ is decreasing and convex in the surplus consumption ratio $s$, such that the price-dividend ratio is concave in $s$.
The mechanism is the following. In bad times, consumption gets close to the subsistence level $x$, such that $s$ is very small. Risk aversion is high as a result, and the agent becomes more reluctant to invest in the stock market. That is, in bad times, risk-adjusted discount rates, $R$, increase sharply, thus making the price-dividend ratio quite responsive to changes in the economic conditions. Instead, in good times, and again due to convexity, $R$ changes relatively less, such that the price-dividend ratio changes relatively less in response to changes in the economic conditions.
In other words, the model is such that risk aversion becomes extremely large in bad times. Technically, $R$ is sufficiently convex in $s$, such that the price-dividend ratio is concave in $s$. Then, stock volatility increases on the downside, i.e., it is countercyclical, as illustrated by Figure 7.6. Note that these properties arise because the risk-adjusted rate, $R$, is sufficiently convex in the surplus consumption ratio. The Appendix does indeed provide an upper bound to convexity that triggers these properties.14
One difficulty with this model is that its predictions are driven by a single state variable, the surplus consumption ratio, $s$. One implication is that the conditional correlation between consumption growth and stock returns is one. In the data, this correlation is much lower. Naturally, the model predicts this correlation is unconditionally less than one, although still too large, once compared with that in the data.
Brunnermeier and Nagel (2007) find that US investors do not change the composition of their risky asset holdings in response to changes in wealth. The authors interpret this as evidence against external habit formation. Naturally, time-varying risk-premiums do not exclusively arise
through external habit formation. Barberis, Huang and Santos (2001) develop a theory distinct
from habit formation, which leads to time-varying risk-premiums. The next chapter explains
there are many instances of economies in which risk-premiums are time-varying as a result of
alternative mechanisms.
14 Alternatively, Mele (2007) shows that for any model in which the price-dividend ratio is driven by a diffusion state variable, there is a threshold such that the price-dividend ratio is concave for all values of the state variable below the threshold, whenever the risk-adjusted discount rate R diverges to infinity as the state variable approaches zero. Note indeed that the Campbell-Cochrane model fails to satisfy restrictions (i) and (ii) over the entire range of variation of the surplus consumption ratio, although it satisfies this limiting condition. There exist additional models with external habit formation that lead to countercyclical volatility (see, for example, Menzly, Santos and Veronesi, 2004; Mele, 2007).

7.5.4 Large price swings as a learning induced phenomenon
We now develop models in which expected dividend growth is unobserved. This leads to a natural question: How do agents process available data while they formulate their guesses regarding
the growth of their economy? Inevitably, these guesses lead the agents to face situations with
stochastic expected growth, such that in addition to consumption, expected growth becomes
a new state variable with the potential to introduce interesting price dynamics. Note that although the focus of this section regards models with unobserved expected growth, models with
observed expected dividend growth have always had an interest of their own (see, e.g., the
early survey of Campbell, 2003), as explained in more detail in Chapter 8.
7.5.4.1 The information channel
Time variation in stock volatility may also arise due to the agents' learning about the economic
fundamentals. In models along these lines, public signals about the fundamentals hit the market, and agents make inference about them, thereby creating new state variables driving price
fluctuations, which relate to the agents' own guesses about the (unknown) state of the economic
fundamentals. Timmermann (1993, 1996) provides models with exogenous discount rates and
learning about the fundamentals. The effects of learning increase stock volatility beyond the extent explained by a model with known fundamentals. Brennan and Xia (2001) generalize these
models to a stochastic general equilibrium. Veronesi (1999) provides a rational expectations
model with learning about the fundamentals, with nonlinear effects on the asset price.
This section provides details about the mechanisms through which learning affects asset prices
in general, and stock volatility in particular.
We shall assume that information about the fundamentals is incomplete, but symmetrically
distributed among agents. The assumption of symmetric information might appear strong. It
should not. The models in this section aim to capture the idea that markets function in a
context of incompressible uncertainty, where agents are all unaware of the crucial aggregate,
macroeconomic developments affecting asset prices. Chapter 9 reviews models with both differential and asymmetric information, which are more useful when thinking about the functioning
of markets for individual stocks. In these markets, it is plausible to assume that agents have
different information sets, and that they acquire information in dedicated information markets. By
contrast, it seems unrealistic to assume that one could acquire crucial information about ongoing business cycle developments and that agents are, then, asymmetrically informed about
it, such that uninformed agents can learn from the asset prices: the cost of acquiring such
information appears to be incommensurable.
Note that the assumption of symmetric information simplifies the analysis, as the agents do
not need to base their decisions upon the observation of the equilibrium price. In a context with
asymmetric information, agents can, instead, learn pieces of information other agents have, by
reading the equilibrium price, because agents with superior information impound part of their
information into the asset price, through trading, as explained in Chapter 9. This complication
does not arise in the model of this section. Agents, now, need only to condition upon the
realization of signals, which convey information about the fundamentals. There is no need for
any agent to condition on prices, because prices merely convey the same information any such
agent already has.
7.5.4.2 An introductory model of learning
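Consider a static setting in which an observable outcome, $D$, is generated by
$$D = \theta + w, \qquad (7.42)$$
where $\theta$ and $w$ are independently distributed, with $\Pr(\theta = A) = \pi \equiv 1 - \Pr(\theta = -A)$, and $\Pr(w = A) = \Pr(w = -A) = \frac{1}{2}$. Suppose that the state $\theta$ is unobserved.
How should we update our prior probability $\pi$ of the good state, $\theta = A$, after observing $D$? A
simple application of Bayes' Theorem yields the posterior probabilities $\Pr(\theta = A \mid D_i)$ in Table
7.4. Considered as a random variable defined over the observable states $D_i$, the posterior probability $\Pr(\theta = A \mid D_i)$ has expectation $E[\Pr(\theta = A \mid D_i)] = \pi$ and variance $\mathrm{var}[\Pr(\theta = A \mid D_i)] = \frac{1}{2}\pi(1-\pi)$. This variance is an inverse U-shaped function of $\pi$, and takes a value of zero exactly when the
prior on the state is degenerate (zero or one).

TABLE 7.4.
$D_i$ (observable state)       $D_1 = 2A$          $D_2 = 0$         $D_3 = -2A$
$\Pr(D_i)$                     $\frac{1}{2}\pi$    $\frac{1}{2}$     $\frac{1}{2}(1-\pi)$
$\Pr(\theta = A \mid D_i)$     $1$                 $\pi$             $0$

The entries of Table 7.4 follow from Bayes' rule,
$$\Pr(\theta \mid D_i) = \Pr(\theta)\,\frac{\Pr(D_i \mid \theta)}{\Pr(D_i)} = \Pr(\theta)\,\frac{\Pr(D_i \mid \theta)}{\sum_{\theta'} \Pr(D_i \mid \theta')\,\Pr(\theta')}. \qquad (7.43)$$
For example,
$$\Pr(\theta = A \mid D = D_1) = \Pr(\theta = A)\,\frac{\Pr(D = D_1 \mid \theta = A)}{\Pr(D = D_1)} = \pi\,\frac{\Pr(D = D_1 \mid \theta = A)}{\Pr(D = D_1)}.$$
But $\Pr(D = D_1 \mid \theta = A) = \Pr(w = D_1 - A) = \Pr(w = A) = \frac{1}{2}$. Moreover, we have that
$\Pr(D = D_1) = \frac{1}{2}\pi$. This leaves $\Pr(\theta = A \mid D = D_1) = 1$. It's trivial, but one proceeds similarly while determining the other probabilities.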
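The Bayes updating in this two-point example can be checked by brute-force enumeration. The following is a minimal sketch, assuming the state takes the values A or -A and the noise takes the values A or -A with equal probability; the function name `posterior_table` and the prior of 3/10 are hypothetical choices made for illustration only.

```python
from fractions import Fraction

def posterior_table(prior, A=1):
    """Return {D: (Pr(D), Pr(theta = A | D))} for D = theta + w.

    theta is A with probability `prior` and -A otherwise; the noise w is
    +A or -A with probability 1/2 each, independently of theta.
    """
    p_theta = {A: prior, -A: 1 - prior}
    p_w = {A: Fraction(1, 2), -A: Fraction(1, 2)}

    # Joint probabilities Pr(D = d, theta), obtained by enumerating all cases.
    joint = {}
    for theta, pt in p_theta.items():
        for w, pw in p_w.items():
            key = (theta + w, theta)
            joint[key] = joint.get(key, 0) + pt * pw

    table = {}
    for d in sorted({d for d, _ in joint}, reverse=True):
        p_d = sum(p for (dd, _), p in joint.items() if dd == d)
        if p_d == 0:
            continue  # observable state unreachable under a degenerate prior
        table[d] = (p_d, joint.get((d, A), 0) / p_d)  # Bayes' rule
    return table

prior = Fraction(3, 10)  # hypothetical prior, for illustration
table = posterior_table(prior)

# Moments of the posterior, viewed as a random variable over the observable states.
mean = sum(p_d * post for p_d, post in table.values())
var = sum(p_d * (post - mean) ** 2 for p_d, post in table.values())

for d, (p_d, post) in table.items():
    print(f"D = {d:+d}: Pr(D) = {p_d}, Pr(theta = A | D) = {post}")
print("mean:", mean, " variance:", var)
```

Exact rational arithmetic is used so that the mean and variance of the posterior can be compared with the prior and with one half of prior times (1 - prior) without rounding error.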
This simple example illustrates the main ideas underlying Bayesian learning. However, it leads
to a nonlinear filter for the posterior probability, $\pi$ say, which differs from those we usually encounter in the literature (see, e.g.,
Chapters 8 and 9 in Liptser and Shiryaev, 2001) (LS, in the sequel), where the instantaneous
variance of the posterior probability changes, $d\pi$ say, is proportional to $\pi^2(1-\pi)^2$, not to $\pi(1-\pi)$.
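This distinction arises due to technical reasons, notably because the noise $w$ is a discrete random
variable. Indeed, assume that $w$ has some arbitrary, but continuous density $\phi$, and zero mean
and unit variance. Let $\pi(D) \equiv \Pr(\theta = A \mid D \in dD)$. By the Bayes rule in Eq. (7.43),
$$\pi(D) = \Pr(\theta = A)\,\frac{\Pr(D \in dD \mid \theta = A)}{\Pr(D \in dD \mid \theta = A)\Pr(\theta = A) + \Pr(D \in dD \mid \theta = -A)\Pr(\theta = -A)}.$$
But $\Pr(D \in dD \mid \theta = A) = \Pr(w = D - A) = \phi(D - A)$ and, similarly, $\Pr(D \in dD \mid \theta = -A) = \Pr(w = D + A) = \phi(D + A)$. Therefore, simple calculations leave
$$\pi(D) - \pi = \pi(1-\pi)\,\frac{\phi(D - A) - \phi(D + A)}{\pi\,\phi(D - A) + (1-\pi)\,\phi(D + A)}. \qquad (7.44)$$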
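The revision of beliefs in Eq. (7.44) can be verified numerically against a direct application of Bayes' rule. The sketch below assumes, purely for illustration, that the noise density is standard normal (the text only requires a continuous density with zero mean and unit variance); the function names and the parameter values are hypothetical.

```python
import math

def normal_pdf(x):
    """Standard normal density, used here as one admissible choice of noise density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def posterior(d, prior, A):
    """Pr(theta = A | D = d) by Bayes' rule, when theta is A or -A."""
    good = prior * normal_pdf(d - A)         # prior times likelihood, good state
    bad = (1.0 - prior) * normal_pdf(d + A)  # prior times likelihood, bad state
    return good / (good + bad)

def revision(d, prior, A):
    """Right-hand side of Eq. (7.44): the posterior minus the prior."""
    num = normal_pdf(d - A) - normal_pdf(d + A)
    den = prior * normal_pdf(d - A) + (1.0 - prior) * normal_pdf(d + A)
    return prior * (1.0 - prior) * num / den

prior, A = 0.3, 1.0  # hypothetical values
for d in (-2.0, -0.3, 0.0, 0.7, 3.0):
    gap = posterior(d, prior, A) - prior - revision(d, prior, A)
    print(f"D = {d:+.1f}: posterior = {posterior(d, prior, A):.4f}, identity gap = {gap:.1e}")
```

With a symmetric density, an observation of zero is uninformative and leaves the prior unchanged, while the size of any revision is scaled by prior times (1 - prior), which vanishes for degenerate priors.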
(7.45)
=
|( )
(7.46)
+ 0
(1
0)
). In
(7.47)
1
|( )
).15
where
(the lter) and
0 (
It is possible to show that if = (resp. =
), then, lim
= 1 (resp. 0) a.s. It
is the Strong Law of Large Numbers for Brownian motions (e.g., Karatzas and Shreve, 1991):
lim
=
or
according to whether =
or =
. Intuitively, if = ,
the Brownian noise in Eq. (7.45) will be dominated in the long-run, such that the observable becomes
arbitrarily large, which leaves any agent confident that the good state prevails. In other words, in this model,
the agents are able to figure out the truth in the long-run. Below, we shall specify an alternative
model in which the agents can never completely learn.
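The long-run learning result can be illustrated with a discrete-time simulation. This is only a sketch under stated assumptions, not the continuous-time filter of Eqs. (7.45)-(7.47): it assumes an observable whose increments equal theta times dt plus Gaussian noise with volatility sigma0, with theta fixed at either A or -A, and it updates the log-odds of the good state with the exact Gaussian likelihood ratio. All names and parameter values are hypothetical.

```python
import math
import random

def simulate_posterior(theta, A=1.0, sigma0=1.0, prior=0.5,
                       horizon=50.0, steps=5000, seed=0):
    """Path of Pr(theta = A | data) when increments are theta*dt + sigma0*sqrt(dt)*eps."""
    rng = random.Random(seed)
    dt = horizon / steps
    log_odds = math.log(prior / (1.0 - prior))
    path = []
    for _ in range(steps):
        dx = theta * dt + sigma0 * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Log-likelihood ratio of N(A*dt, sigma0^2*dt) against N(-A*dt, sigma0^2*dt),
        # which simplifies to 2*A*dx / sigma0^2.
        log_odds += 2.0 * A * dx / sigma0 ** 2
        path.append(1.0 / (1.0 + math.exp(-log_odds)))
    return path

good = simulate_posterior(theta=+1.0)  # nature draws the good state
bad = simulate_posterior(theta=-1.0)   # nature draws the bad state
print("final posterior, good state:", good[-1])
print("final posterior, bad state: ", bad[-1])
```

Because the log-odds are linear in the cumulated increments, the posterior in this sketch depends on the data only through the current level of the observable, and the drift eventually dominates the noise.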
7.5.4.3 Pricing implications
0) + 0
=(
(7.48)
= ( ) + ( )
where ( ) (
) ( + ) 0 , is a Brownian motion under the risk-neutral probability,
+ 0
=(
0 )
(7.49)
= ( ) + ( )
and, again, is a risk-premium, assumed to be constant. The instantaneous volatility of the
expected dividend growth is inverse U-shaped in this example, too. In the presence of positive
compensation for risk, the risk-neutral drift of expected dividend growth is, then, a convex function
of expected growth itself. Our discussion of the canonical economy in Section 7.4 suggests that this property can
imply that the asset price is convex with respect to expected growth. Note that this convexity
arises only once the risk-premium is positive: only in this case would the risk-neutral drift of expected growth be convex.
15 This construction is heuristic, but it can be made rigorous (see LS, Theorem 8.1 p. 318 and Example 1 p. 371). In particular,
it can be shown (LS, Theorem 7.12 p. 273) that the innovation process driving the filter is a Brownian motion with respect to the agents' information set.
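The convexity claim can be checked in one line. The sketch below assumes, as the surviving fragments of Eqs. (7.48)-(7.49) suggest, that expected dividend growth $g$ has volatility $\sigma(g) = (A-g)(A+g)/\sigma_0$ on $(-A, A)$ and risk-neutral drift $-\lambda\sigma(g)$, with $\lambda$ the constant risk-premium and $\sigma_0 > 0$ the dividend volatility. These symbols are assumptions introduced for illustration.

```latex
\[
\frac{d^2}{dg^2}\bigl[-\lambda\,\sigma(g)\bigr]
= -\frac{\lambda}{\sigma_0}\,\frac{d^2}{dg^2}\bigl(A^2 - g^2\bigr)
= \frac{2\lambda}{\sigma_0} > 0
\quad\Longleftrightarrow\quad \lambda > 0 .
\]
```

Under these assumptions the risk-neutral drift is strictly convex for any positive risk-premium, and identically zero when the risk-premium is zero.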
The economic interpretation of the convex drift of expected growth is simple. A utility-maximizing representative agent requires compensation for the dividend risk, but also for the risk regarding
his estimated probability of living in the good state. In very good and in very bad times, this
probability is either one or zero, such that there is no risk to be compensated for regarding
this probability (i.e., its volatility is close to zero), whence the convexity of the risk-adjustment. In the
next subsection, we develop more intuition on these risk-adjustments, relying on the portfolio
choices of a representative agent.
The implication of this convexity is that prices undergo large swings in good times. The
economic mechanism is the following. In good times, that is, after a string of repeated positive
news on growth, the agents find it very likely that the true dividend drift is the high one. In particular,
the higher the estimate of expected growth, the more likely it is that the asset is good. In these states of the world, the
agents do not feel exposed to errors on this estimate, and therefore require little risk-aversion correction
regarding it (only regarding realized dividend growth).16 Note that in bad times, the
agents are not exposed to these errors either, such that asset prices fall, albeit moderately so.
In other words, a convex price might lead to fluctuations that an econometrician (say) could
interpret as being generated by a bubble even if no bubbles are present whatsoever.
Finally, it is instructive to provide the expression for expected returns and volatility predicted by this model, based on the analysis of Section 7.5.2. For example, assuming that the
representative agent has CRRA equal to , we have that = 0 , such that expected returns
are
0
E =
+ 20 + (( )) 0 ( )
(7.50)
| {z }
|
{z
}
R
and volatility is
Vol =
0+
( )
( )
( )
(7.51)
We now rely on Theorem 7.1 and analyze these convexity properties in two famous (and more
general) models of learning, one of them generalizing Eqs. (7.48).
7.5.4.4 With Bellman
The previous explanations regarding risk-aversion corrections can be illustrated while solving
the agent's dynamic programming problem. In particular, we shall show that it is the hedging
component of the agent's portfolio choice that determines how much the agent is willing to invest
in the asset in periods of uncertainty.
[In progress]
7.5.4.5 Convexity again, and two models of learning
The model in Eqs. (7.48) can be considered as a special case of that considered by Veronesi
(1999), in which an infinitely lived agent has positive constant absolute risk aversion,
and observes realizations of the dividend, generated by:
=
(7.52)
16 In terms of Eqs. (7.49), the drift of dividend growth is tilted to the left due to the risk-adjustment term. Moreover, expected dividend growth is also tilted to the left, due to its negative risk-neutral drift. The effects of this negative drift are small in very good and very poor times, because the volatility of expected growth is small in these cases. The effects of the risk-adjustment to dividend growth are, instead, independent of the state of the economy.
where the noise is driven by a Brownian motion, and the drift is the expected dividend change, supposed to follow a
two-state Markov chain. (See, also, David, 1997, for a related model.) In Eqs. (7.48), the
dividend drift is a type, in that once nature draws it, it is there forever. In Eqs. (7.52), instead,
the drift is allowed to change, according to a Markov chain, as explained.
The key aspect in this economy is that the expected dividend change, , is unobserved. As
a result of this lack of knowledge, the agent attempts to learn about the state in which he is
living, through Bayesian learning. The resulting economy is one generalizing that in the previous
section. In this economy, the price is the same as that in a full information economy in which
the dividend is solution to
=
+ 0
(7.53)
= (
) + ( )
2
=(
+ 0
0)
(7.54)
= ( (
)
+ ( )
0 ( ))
Veronesi (1999) also assumes the riskless asset is infinitely elastically supplied, and therefore
that the interest rate is a constant.
Let us analyze the properties of the equilibrium price predicted by this model. It is easy to
see that given Eq. (7.54), the asset price is:
Z
Z
=
(
)=E
(
)
where
)=
2
0
and
E(
| )
(7.55)
The conditional expectation, E ( | ) can be read as a special case of the canonical price
in Eq. (7.13), namely, for = 0 and ( ) = . By Theorem 7.1-(ii), E ( | ) is convex in
whenever the drift of in Eq. (7.54) is convex. This condition always holds true because
0.
That is, the conditional expectation of in Eq. (7.55), inherits the same second order properties
(convexity) of this drift function.
The economics behind these convexities is the same as that behind Eqs. (7.48) or (7.49). Prices
are convex in the expected dividend growth because of a kind of speculative enthusiasm: after
a sequence of high realized growth, investors believe that the likelihood is high they are living in
good times, and that the likelihood is low they are making mistakes in their assessment. Thus,
asset prices increase, fueled by low risk-premium discounts. Note that these low premiums regard
the uncertainty the agents face while assessing the world in which they live. Their speculative
enthusiasm is rational, so to speak, i.e., not determined by animal spirits.
These properties hold as we are assuming that the riskless asset is infinitely elastically supplied. When investors' demand for the safe asset affects the interest rate, the interest rate
becomes a function of the expected growth. Thus, when expected growth increases, the increase in the asset
price is mitigated by the fact that relatively less savings are now being made, reflecting the fact
that more resources are likely to be available in the future. The interest rate increases. Formally,
1 2 2
by Eq. (7.29), the short-term rate is ( ) = +
0 , such that the asset price is, now,
2
Z
Z
(
)
0
=
(
) E
(
)
where
)=(
)
2
0
)
+Z
( )
(
)=
)=
1
( )=
( ) = (
),
+
0
and
is the positive solution to 1 ( ) = 21 + 22 2 .17 By Eq. (7.30), the risk-premium
is constant, and equal to $\lambda = \eta\sigma_0$, where $\eta$ is the CRRA coefficient, and by Eq. (7.29), the
short-term rate is linear in expected dividend growth. Therefore, and by the same reasoning leading to Eq. (7.41), the
price-dividend ratio is independent of the dividend level, and is given by:
Z
(
( ))
0
0
( )=
(7.56)
E
where
=( ( )+(
) ( ))
+ ( )
(7.57)
and the driving noise is a Brownian motion under the risk-neutral probability.18 The two functions appearing in Eq. (7.57) are momentarily left
unspecified, as we wish to provide general results within this model.
Under regularity conditions, monotonicity and convexity properties are inherited by the inner
expectation in Eq. (7.56). Precisely, in the notation of the canonical pricing problem, we have
that ( )
+ ( ) + 0 and ( )
( )+( 0
) ( ). Therefore, by Theorem 7.1, we
have:
(i) If
( )
(ii) Suppose that the price-dividend ratio is increasing in . Then, it is also convex in
2
) ( ))
2 + 2 0 ( ).
whenever 00 ( ) 0 and 2 ( ( ) + ( 0
For example, assume that the short-term rate is constant (the safe asset could be in infinitely elastic
supply). Then, the price-dividend ratio is increasing and convex in the expected dividend growth
if:
2
( )+(
) ( ))
These conditions are satisfied by Brennan and Xia (2001). [...] Provide economic interpretation.
[in progress] Moreover, explain that this is due to the fact that the inner expectation in Eq. (7.56)
is indeed one for an affine model to be introduced and explained in full detail in Chapter 12.
Finally, expected returns and returns volatility have the same expression as in Eqs. (7.50) and
(7.51).
7.5.5 Linearity-generating processes
The focus of the previous sections is a search for pricing kernels that make asset pricing models qualitatively consistent with countercyclical statistics, a search relying on theoretical test
conditions, those emanating from Theorem 7.1 in Section 7.4.
17 Brennan and Xia (2001) actually consider a slightly more general model, where consumption and dividends differ. They derive
a model with a reduced-form identical to that in this example. In the calibrated model, Brennan and Xia found that the variance
of the filtered expected dividend growth is higher than the variance of the expected dividend growth in an economy with complete information. The results
in this example can be obtained through an application of Theorem 12.1 in Liptser and Shiryaev (2001) (Vol. II, p. 22). They
generalize results in Gennotte (1986) and are a special case of results in Detemple (1986). Neither Gennotte nor Detemple
emphasized the impact of learning on the pricing function.
18 By Girsanovs theorem,
=
=
+
are Brownian motions under and under . That is, =
0 and
+(
)
,
whence
Eq.
(7.57).
0
We can actually use Theorem 7.1 for another (somewhat surprising) purpose: a search for
asset pricing models that have a closed-form solution. The idea is simple. Theorem 7.1-(ii) provides conditions under which price-dividend ratios can be either concave or convex in the state
variables driving them. Specifically, consider the representation of the price-dividend ratio in
Eqs. (7.15)-(7.16)-(7.18), which fits the canonical pricing problem in Section 7.4, as observed,
once we identify the primitives of the models with Eqs. (7.19), reported here for convenience,
( )
where as usual, R ( )
( )+
( )
R( )
( )
0 CF
( )
( )= ( )
( )
( )
( )+
0 1
( )
=1
and ( ) denotes the drift of under the physical probability. Then, given that
we have that by Theorem 7.1-(i),
is:
If 00 = 2 (R0
concave in
if (R00
convex in
if (R00
00
= 0,
00
)
00
)
0 and
0 and
00
00
0
0
(7.58)
and R00
00
=0
(7.59)
Gabaix (2009) is the first to note that price-dividend ratios are affine in the state variables
driving them, should the drift of these state variables be quadratic. His remarks are consistent
with Theorem 7.1, and the two conditions in (7.59). In fact, Gabaix develops a unified theory of linearity-generating processes, which generalizes the single state variable framework
underlying Theorem 7.1 and the related conditions in (7.59).
To illustrate these facts, consider a model that fits the class of linearity-generating processes,
one of external habit formation by Menzly, Santos and Veronesi (2004) (MSV, henceforth). In
this model, a representative agent maximizes,
Z
=
ln (
)
(7.60)
0
where $x$ is external habit. Relative risk-aversion equals the inverse of the surplus consumption ratio, $1/s$, with $s \equiv (c - x)/c$, which in equilibrium equals $(D - x)/D$, where $D$ is the consumption
endowment.
MSV assume that the surplus consumption ratio is a continuous-time autoregressive process,
solution to,
1
1
1
1
1
=
(7.61)
0
1
1
2 2
0
2 2
0
1
2
+(
CF ( ))
Note that the drift function of the surplus ratio under the risk-neutral probability is quadratic, a property that
crucially leads to the first condition in (7.59) being satisfied.
)
By results in Section 7.5.1, the market Sharpe ratio is (
) = 1 ( 0 ( ) ), where (
is the instantaneous volatility of the habit level, = (1
) and, by Itos lemma, equals
(
) =
)
Vol ( ), such that, by Itos lemma again, Vol ( ) =
1
.
0 (1
0
Therefore, the market Sharpe ratio equals,
CF
( )
)=
1+
which is countercyclical. Finally, again by results in Section 7.5.1, we can also infer that:
( )= +
2
0
+
1
2
0
R0 ( ) =
00
( )=
( )=0
As a result, the two conditions in (7.59) are satisfied, and the price-dividend ratio is affine
in the surplus consumption ratio. We can check this property through a direct computation.
Denote the instantaneous utility with $u(c, x) \equiv \ln(c - x)$, with consumption equal to the endowment in equilibrium. By the usual asset price
representation, we have that the price-dividend ratio is,
Z
( 0) =
=
=
=
1
+
Z0
Z0
(
)
(
)
0 0
1
1
+
( + )
1
0
Figure 7.12 depicts the price-dividend ratio as a function of the current surplus consumption
ratio, using the following parameter values, = 0 04, = 0 15, and = 0 03.
FIGURE 7.12. Price-dividend ratio (P/D, vertical axis) for the aggregate consumption claim predicted by the
Menzly, Santos and Veronesi (2004) model of external habit formation, plotted against the surplus consumption ratio (horizontal axis).
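The affine shape of the price-dividend ratio can be reproduced with a short numerical check. The sketch below rests on assumptions that are not confirmed by the text: it takes the inverse surplus consumption ratio to mean-revert at speed `k` towards `1/s_bar` under the physical probability, uses the closed form reported by Menzly, Santos and Veronesi (2004) for a log-utility agent with discount rate `rho`, and maps the three parameter values quoted in the text to `rho = 0.04`, `k = 0.15` and `s_bar = 0.03`. The names and this mapping are hypothetical.

```python
import math

def pd_closed_form(s, rho, k, s_bar):
    """Affine price-dividend ratio: (1 + k * (1/s_bar) * s / rho) / (rho + k)."""
    y_bar = 1.0 / s_bar
    return (1.0 + k * y_bar * s / rho) / (rho + k)

def pd_numerical(s, rho, k, s_bar, horizon=600.0, steps=120_000):
    """Integrate exp(-rho*tau) * E[y(tau)] / y(0) over tau, with y = 1/s.

    With log utility and habit, marginal utility times the dividend is
    proportional to y, and E[y(tau)] = y_bar + (y - y_bar) * exp(-k * tau).
    """
    y, y_bar = 1.0 / s, 1.0 / s_bar
    dt = horizon / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * dt  # midpoint rule
        expected_y = y_bar + (y - y_bar) * math.exp(-k * tau)
        total += math.exp(-rho * tau) * expected_y / y * dt
    return total

rho, k, s_bar = 0.04, 0.15, 0.03  # assumed mapping of the quoted values
for s in (0.01, 0.03, 0.05):
    print(s, pd_closed_form(s, rho, k, s_bar), pd_numerical(s, rho, k, s_bar))
```

The numerical integral and the closed form should agree closely, and equal increments in the surplus consumption ratio should produce equal increments in the price-dividend ratio, which is what affinity means here.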
, is solution to,
2
such that,
( )=
R( ) = +
( )=
+(
It is straightforward to see that the two conditions in (7.59) hold true, so the price-dividend ratio
is affine in the state variable. Indeed, assuming that the price-dividend ratio is independent of the dividend level, we have that
it satisfies the following differential equation,
2
0 1 2 00
+ 0
+
+1=(
)
(7.62)
2
Let us conjecture, now, that the price-dividend ratio is affine in the state variable, i.e. there are two constants
such that
( )= 0+ 1
Replacing the previous expression into Eq. (7.62) allows us to pin down the two expressions for
0 and 1 such that the solution for the price-dividend ratio is,
+
( )= 2
0
+ (
1) +
>
(7A.1)
= + (1
) 1
(7A.2)
= + 1+ > 1
where is the lag operator,
1 , such that yearly expected returns could be dened just as in
Eq. (7.2) in the main text, with ( + ) denoting the projection of +1 based on model (7A.2).
Calculating the projection terms in Eq. (7.2) requires specifying the dynamics of the regressor
in (7A.2). We proceed, alternatively, as follows. First, we estimate Eq. (7A.1). Second, we reconstruct
a time series of monthly expected returns, in the second of Eqs. (7A.2), given the parameter estimates.
Third, we fit an AR(1) model to the reconstructed series, and calculate projections on this
series to obtain the expected returns in Eq. (7.2).
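The third step, fitting an AR(1) model and iterating it forward to obtain projections, can be sketched as follows. This is a minimal illustration using ordinary least squares on a hypothetical noise-free series; it does not reproduce the estimates of Eqs. (7A.1)-(7A.2), and the function names are assumptions.

```python
def fit_ar1(series):
    """OLS estimates (c, phi) in x[t+1] = c + phi * x[t] + error."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    return my - phi * mx, phi

def project(last, c, phi, horizon):
    """Projections of x[t+h] given x[t] = last, for h = 1..horizon."""
    out, value = [], last
    for _ in range(horizon):
        value = c + phi * value  # iterate the fitted AR(1) one step ahead
        out.append(value)
    return out

# Hypothetical series generated by x[t+1] = 0.5 + 0.8 * x[t], without noise,
# so that the fit can be checked against known coefficients.
series = [1.0]
for _ in range(59):
    series.append(0.5 + 0.8 * series[-1])

c_hat, phi_hat = fit_ar1(series)
forecasts = project(series[-1], c_hat, phi_hat, 12)  # twelve monthly projections
print(c_hat, phi_hat, forecasts[-1])
```

With monthly data, the twelve projections can then be aggregated into a yearly expected return in whatever way the definition being applied requires.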
2
2
( ) =
(1
), to their sample counterparts = 1 0594 and 0 = 0 0602
0
obtained with US aggregate dividend data. The result is ( ) = (0 158 0 082). Given these calibrated
and .
values of ( ), we x = 1 0%, and proceed to calibrate the probabilities
), we need an explicit expression for all the payo s at each node. We rely on
To calibrate (
, which we obtain in closed-form, as follows. For each state
{
},
the price of the claim,
is solution to,
(7A.3)
+ (1
+ (1
)
)
(7A.4)
We calibrate (
= ) to make the average price-dividend (P/D henceforth) ratio
,
the good P/D ratio 2 and the bad P/D ratio 2 in Eq. (7A.4) perfectly match the average
P/D ratio, the average P/D ratio during NBER expansion periods, and the average P/D ratio during
), we
NBER recession periods (i.e. 31.99, 33.21 and 26.20, from Table 7.1). Given (
=
compute
P/D ratios
in states
and . For example, the price of the asset in state
is,
the
2 +
[
) (1 +
)]. Given
, we compute the log-return in the bad state as
+ (1
ln(
), where
with probability
with probability 1
1+
Then, we compute the return volatility in state . The P/D ratios, the expected log-return and
return volatility in state are computed similarly. (Please notice that volatilities under and under
{ } {
} are not the same.)
Next, we recover the risk-aversion parameter
in the three states
{
} implied by the
and =
. As we shall show below, the relevant formula
previously calibrated probabilities ,
to use is,
=
+ (1
(7A.5)
The values for the implied risk-aversion parameter in Table 7.2 are obtained by inverting Eq. (7A.5)
).
for , given the calibrated values of (
is the Sharpe ratio,
Finally, we determine the risk-adjusted discount rate as + 0 , where
which we shall show below to equal,
=p
(1
(7A.6)
Proof of Eq. (7A.5). We only provide the derivation of the risk-neutral probability
, since
the proofs for the expressions of the risk-neutral probabilities
and =
are nearly identical. In
equilibrium, the Euler equation for the stock price at the bad node is,
#
"
i
0 ( )
+
+
=
{
}
(7A.7)
=
0 (
)
where: (i) is the discount
is state dependent and
rate; (ii) the utility function for consumption
(1
( )= 1
); (iii) () is the expectation taken under the physical probability
equal to,
2 and
; and (iv) the dividend and the gross dividend growth rate are either =
2
=
=
with probability , or
= 1 and
= 1 = with probability 1
.
In the model, the safe asset is elastically supplied or, equivalently, there exists a storage technology with
a fixed rate of return equal to 1%. Let us derive the agent's private evaluation of this asset. The
Euler equation for the safe asset is,
P
=
(
)=
(7A.8)
, is state dependent,
and
(1
=1
. Therefore,
(7A.9)
+ (1
+ (1
(7A.10)
Std
(1
(1
)= +
(1
from which Eq. (7A.6) follows immediately. Note, also, that in terms of this definition of the Sharpe
ratio, the risk-neutral expectation of the dividend growth is, E ( ) = ( )
0.
( )
+ ( )
+
..
.
+
1
1
1
1
= |{z} |{z}
where
..
.
(7A.11)
+(
1)
is solution to,
(
2 2
RR
(7A.12)
Under regularity conditions, the Feynman-Kac representation of the solution to Eq. (7A.12) is,
Z
(
)
(
)=
0
where,
(
)=E
( )
and
is the stochastic discount factor:
= , 0 = 1.
0
Alternatively, we represent the price under the physical probability. Given the previous assumptions, we have that necessarily satises,
=
( )
1(
2(
(7A.13)
Next, define the undiscounted Arrow-Debreu adjusted asset price process as:
(
where
) (
) is as in Eq. (7.26),
(
0 (
)=
By results in Section 7.4.2, we know that the following price representation holds true:
Z
=
0
Under regularity conditions, the previous equation can then be understood as the unique Feynman-Kac
stochastic representation of the solution to the following partial differential equation
(
where
)+ (
)= (
) (
RR
[ ]
=0
sign (
) = constant on
remains bounded on (
sign ( ) =
(7A.14)
sign ( )
(7A.15)
Figure 7.A.1 illustrates the intuitive reasons leading to Eq. (7A.15). Consider the following heuristic
arguments. Note that
Z
Z
0=
= +
=
=
such that Eq. (7A.15) holds.
still satises Eq. (7A.14), but that at the same time,
Next, suppose that
and time,
a state variable
=
where
is some function of
satises:
=
(7A.16)
Because
= (
(7A.17)
)=
(7A.18)
Therefore, and extending Eq. (7A.15), we have the following result. Suppose that
(
). Then, by Eq. (7A.18):
and that sign ( ) = constant on
sign ( (
)) =
= (
) = 0,
sign ( )
These results can be extended to stochastic differential equations. Consider the more elaborate
operator-theoretic format version of Eq. (7A.17), the one that arises in typical asset pricing models
with Brownian motions:
0=
(7A.19)
FIGURE 7A.1. Illustration of the maximum principle for ordinary differential equations (axes: time t and the solution x(t), with terminal value x(T); one panel for each sign case, < 0 and > 0).
Let
+
+
+
=
where
is a local martingale, and the last equality holds due to Eq. (7A.19). But if
Z
+
= = ( )=
is a martingale,
where
is the expectation taken with respect to the information at time . Therefore, conclude as
= 0 and that sign ( ) = for each . Then sign ( ) = for each .
follows. Assume that
(7A.20)
Theorem 7.A.1. (Dynamic Stochastic Dominance) Consider two economies A and B with two
fundamental volatilities
and
and let ( )
( ) ( ) and ( ) ( =
) the corresponding
, the price
in economy A is lower than the price price
risk-premium and discount rate. If
in economy B whenever for all (
) R [0 ],
(
( )
( ))
( )
( ))
)+
1
2
( )
( )
(7A.21)
0 = 2(
( 0) = ( )
)+
) R [0
R
(7A.22)
(
) = 12 2 ( )
(
)+ ( ) (
) and subscripts denote partial derivatives. Clearly,
where
and
are both solutions to Eq. (7A.22) but with different coefficients. Let ( )
( ).
0( )
(
)
(
) is solution to
For each ( ) R [0 ), the price difference
(
)
0=
2(
)+
1
2
( )
)+
( )
( )
)+ (
with
( 0) = 0 for all
R, and
is as in Eq. (7A.21) of the theorem. The result follows by
results given in Appendix 5 to this chapter.
1 2
1 2
(1)
(1)
0
(1)
)+
( ) (
) + ( ) + ( ( ))
(
0= 2 (
2
2
0
0
( ) (1) (
)
( ) (
)
( )
with
(1) (
0) =
0=
( )
R, and (
(2)
2 (
( )
0
2 ( )
)+
2 0( )
00
( )
R [0
1 2
( ) (2) (
2
1 2
00
( ( ))
2
(1)
R++ [0
),
),
)+
(2)
( ) + ( 2 ( ))0
00
( ) (
(2)
with (2) ( 0) = 00 ( )
R. By results reviewed in Appendix 5 to this chapter, (1) (
) 0
R. This
(resp. 0) ( ) R [0 ) whenever 0 ( ) 0 (resp. 0) and 0 ( ) 0 (resp. 0)
completes the proof of Part (i) of the theorem. The proof of Part (ii) is obtained similarly.
Remark 7.A.2. (Alternative proof) An alternative proof can directly rely on the convexity of the
payoff function, and a result due to Hajek (1985). This result says that if $\psi$ is increasing and convex,
and $x_1$ and $x_2$ are two diffusion processes, both starting off from the same origin, with integrable
drifts $b_1$ and $b_2$ and volatilities $a_1$ and $a_2$, then $E[\psi(x_1(\tau))] \le E[\psi(x_2(\tau))]$, whenever $b_1(\tau) \le b_2(\tau)$ and
$a_1(\tau) \le a_2(\tau)$ for all $\tau \in (0, \infty)$. This result generalizes classic comparison theorems (e.g., Karatzas and
Shreve, 1991, p. 291-295), where $\psi$ is an increasing function and $a_1 \equiv a_2$.
Remark 7.A.3. (Bounds on convexity) An inspection of the proof of Theorem 7.1 reveals that
concavity of prices holds under less restrictive conditions than those given in Part (ii) of the theorem.
We would only need that,
0
00
00
2 ( )
( ) (1) (
)
( ) (
) 0
or that,
00
( )
(1)
0
2 ( )
00
( )
(1) (
)
)
(7A.23)
Assuming that 0
0, such that
0, the inequality (7A.23) imposes a theoretical upper bound
to convexity of discount rates such that prices are concave.
) (
1
2
ln )
2 2
0 (1 +
( ))2 =
const + (
ln )
(7A.24)
for some , and where const is to be determined. The working paper version of Campbell and Cochrane
(1999) considers exactly this case.
Define the log of the surplus ratio as
ln 1
(7A.25)
where
ln ,
steady state
ln 1
where
ln and
(
+1
),
+ 1
and
1
(
)+ ( )
+1
where
( ). Replacing Eq. (7A.26), evaluated at
previous approximation, and rearranging terms, leaves:
+1
+1
(7A.26)
( )
)+
and
1
2
2
0
(7A.27)
+1
1
2
2
0
(7A.28)
The function in Eq. (7.39) is found by imposing the following three conditions, where the first is
a slight generalization of that mentioned in the main text, and the remaining two are the last two
conditions in the main text:
First, the short-term rate in Eq. (7.38) is affine in the log of the surplus ratio, i.e. Eq. (7A.24) holds, such that:
s
1
2
)
) ( ln ) + 2 2 const 1
(7A.29)
( ) = 2 2 2 ( (1
0
+1
(7A.30)
Evaluating Eq. (7A.29) at the steady state , and using the previous condition delivers,
=
1
,
2
2
2 2
0
const
1
2 2
0
( (1
) (
ln ) +
1
2
(7A.31)
=0
(7A.32)
of
+1 . By the denition
(
+1 )
the log-surplus consumption ratio in Eq. (7A.25), we have that
+ +1 ,
+1 = ln 1
where ( +1 )
+1 and
+1 is as in Eq. (7A.27), such that, using Eq. (7A.27):
+1
+1
( )
=1
+1 ,
expressed as a function of
1
(
+1 )
+1 )
( ) =
=1
)
1
+1
( )
= 0, which leaves,
By taking the derivative in Eq. (7A.31), and replacing into the left hand side of the previous
equation, and solving for , yields,
s
2
(1
Finally, note that, now, the expression of the short-term rate can be found after simple computations:
1 2
1
( (1
)
) + ( ln )
( )= +
0
0
2
2
(
(
0)
say, satises:
0
0
1+
0+
1+
11
..
.
..
.
..
.
=1
= (
..
.
..
.
( ),
..
.
= Pr ( | )
min +
max
}=
, let
. Let
}=
, let
. Let
min
max
,
+
The previous algorithm avoids interpolations, and ensures that during the simulations, the price-dividend ratio is computed
in correspondence of exactly the state that is drawn. Precisely, once a state is drawn, we proceed to
the following two steps: (i) create the corresponding grid, running from the minimum to the maximum grid value,
according to the previous rules; and (ii) compute the solution from Eq. (7A.33). In this way, one has
at hand the simulated P/D ratio when that state is drawn.
)=
(
) + (1
) (
+ (1
2
0
where the first equality follows by a straightforward generalization of the arguments leading to Eq.
(7.44), and the second holds by the assumption that $\sigma_0 W_t$ is Gaussian with mean zero and variance
$\sigma_0^2 t$. Hence, we have that the posterior probability is a function of the dividend
only. To simplify notation, let ( )
0
). We have,
(
2
2
1
1
2
2
2
0
2
00
0
0
0
2 ( )
( )=2 ( )
( )= ( ) 2
1
2
0
Note that,
2
( )
(1
=
( )
2
0
and
( )=
|(
(2 (
1)
such that,
0
( )=
( ) (1
2
0
)=
)
(
(
), where
1
2
1
+
2
+
00
( ))
) (1
00
00
333
2
2
0
(7A.34)
(
(
( )
)
)
))
2
0
(
(
(
0 0)
3)
6)
and
= ( 1 2 3 )> denotes a vector standard Brownian motion, with being a two-state ( )
are constants. Let 3 5 6= 2 6 . Then, there are no CRRA representative agent
Markov chain, and
equilibria in which price-dividend ratios are convex in expected dividend growth. To demonstrate this
claim, we apply the filtering results of Liptser and Shiryaev (2001) (Vol. I), and find that the previous
economy is isomorphic to one in which,
/
/
=
(
+
)
0
(
, and let (
)=E
when
As pointed out in Section 7.6, Theorem 7.1-(ii) implies that in scalar diffusion models of the short-term rate, such as those dealt with in Chapter 12, bond prices are convex in the current short-term rate whenever the second derivative of the risk-neutralized drift of the short-term rate is less than 2. This result, obtained by Mele (2003), can be proved through the Feynman-Kac representation of the second derivative of the bond price with respect to the short-term rate, and a similar proof can be used to show Theorem 7.1-(ii). This
appendix provides a more intuitive derivation under a set of simplifying assumptions. By Eq. (6) p.
685 in Mele (2003),
"Z
! R
#
2 Z
2
11 ( 0
Hence
11 ( 0
)=E
0 whenever
2
2
0
= ( )
is a constant. We have,
Z
1
0
= exp
( )
2
0
0
(7A.35)
is solution to:
where
2
0
2
0
and
2
0
=
0
00
( )
0
(7A.36)
2
Therefore, if 00 0, then 2
0, and by the inequality in (12.56), 11 0.
0
This result can be improved. Suppose that 00 2, instead of 00 0. By the second of Eqs. (7A.36),
Z
2
2
2
0
and consequently,
Z
2
2
0
References
Abel, A.B. (1988): Stock Prices under Time-Varying Dividend Risk: An Exact Solution in
an Infinite-Horizon General Equilibrium Model. Journal of Monetary Economics 22,
375-393.
Abel, A.B. (1990): Asset Prices under Habit Formation and Catching Up with the Joneses.
American Economic Review Papers and Proceedings 80, 38-42.
Andersen, T. G., T. Bollerslev and F.X. Diebold (2002): Parametric and Nonparametric
Volatility Measurement. Forthcoming in Aït-Sahalia, Y. and L. P. Hansen (Eds.): Handbook of Financial Econometrics.
Bajeux-Besnainou, I. and J.-C. Rochet (1996): Dynamic Spanning: Are Options an Appropriate Instrument? Mathematical Finance 6, 1-16.
Barberis, N., M. Huang and T. Santos (2001): Prospect Theory and Asset Prices. Quarterly
Journal of Economics 116, 1-53.
Barsky, R.B. (1989): Why Don't the Prices of Stocks and Bonds Move Together? American
Economic Review 79, 1132-1145.
Barsky, R.B. and J.B. De Long (1990): Bull and Bear Markets in the Twentieth Century.
Journal of Economic History 50, 265-281.
Barsky, R.B. and J.B. De Long (1993): Why Does the Stock Market Fluctuate? Quarterly
Journal of Economics 108, 291-311.
Bergman, Y.Z., B.D. Grundy, and Z. Wiener (1996): General Properties of Option Prices.
Journal of Finance 51, 1573-1610.
Black, F. and M. Scholes (1973): The Pricing of Options and Corporate Liabilities. Journal
of Political Economy 81, 637-659.
Brennan, M.J. and Y. Xia (2001): Stock Price Volatility and Equity Premium. Journal of
Monetary Economics 47, 249-283.
Brunnermeier, M.K. and S. Nagel (2007): Do Wealth Fluctuations Generate Time-Varying
Risk Aversion? Micro-Evidence on Individuals' Asset Allocation. Forthcoming in American Economic Review.
Campbell, J.Y. (2003): Consumption-Based Asset Pricing. In: Constantinides, G.M., M.
Harris and R. M. Stulz (Editors): Handbook of the Economics of Finance (Volume 1B:
Chapter 13), 803-887.
Campbell, J.Y., and J.H. Cochrane (1999): By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior. Journal of Political Economy 107, 205-251.
Christiansen, C., M. Schmeling and A. Schrimpf (2012): A Comprehensive Look at Financial
Volatility Prediction by Economic Variables. Journal of Applied Econometrics 27, 956-977.
Clark, T.E. and K.D. West (2007): Approximately Normal Tests for Equal Predictive Accuracy in Nested Models. Journal of Econometrics 138, 291-311.
Constantinides, G.M. (1990): Habit Formation: A Resolution of the Equity Premium Puzzle.
Journal of Political Economy 98, 519-543.
Corradi, V., W. Distaso and A. Mele (2013): Macroeconomic Determinants of Stock Volatility
and Volatility Premiums. Journal of Monetary Economics 60, 203-220.
David, A. (1997): Fluctuating Confidence in Stock Markets: Implications for Returns and
Volatility. Journal of Financial and Quantitative Analysis 32, 427-462.
Detemple, J.B. (1986): Asset Pricing in a Production Economy with Incomplete Information.
Journal of Finance 41, 383-391.
Duesenberry, J.S. (1949): Income, Saving, and the Theory of Consumer Behavior. Cambridge,
Mass.: Harvard University Press.
El Karoui, N., M. Jeanblanc-Picque and S.E. Shreve (1998): Robustness of the Black and
Scholes Formula. Mathematical Finance 8, 93-126.
Fama, E.F. and K.R. French (1989): Business Conditions and Expected Returns on Stocks
and Bonds. Journal of Financial Economics 25, 23-49.
Fama, E.F. (2014): Nobel Lecture: Two Pillars of Asset Pricing. American Economic Review
104, 1467-1485.
Ferson, W.E. and C.R. Harvey (1991): The Variation of Economic Risk Premiums. Journal
of Political Economy 99, 385-415.
Fornari, F. and A. Mele (2013): Financial Volatility and Real Economic Activity. Journal
of Financial Management, Markets and Institutions 1, 155-198.
Gabaix, X. (2009): Linearity-Generating Processes: A Modelling Tool Yielding Closed Forms
for Asset Prices. Working paper New York University.
Gennotte, G. (1986): Optimal Portfolio Choice Under Incomplete Information. Journal of
Finance 41, 733-746.
Giacomini, R. and H. White (2006): Tests of Conditional Predictive Ability. Econometrica
74, 1545-1578.
Glosten, L., R. Jagannathan and D. Runkle (1993): On the Relation between the Expected
Value and the Volatility of the Nominal Excess Return on Stocks. Journal of Finance
48, 1779-1801.
Gordon, M. (1962): The Investment, Financing, and Valuation of the Corporation. Homewood,
IL: Irwin.
Hajek, B. (1985): Mean Stochastic Comparison of Diffusions. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 68, 315-329.
Huang, C.-F. and Pagès, H. (1992): Optimal Consumption and Portfolio Policies with an
Infinite Horizon: Existence and Convergence. Annals of Applied Probability 2, 36-64.
Jagannathan, R. (1984): Call Options and the Risk of Underlying Securities. Journal of
Financial Economics 13, 425-434.
Karatzas, I. and S.E. Shreve (1991): Brownian Motion and Stochastic Calculus. Berlin: Springer-Verlag.
Kijima, M. (2002): Monotonicity and Convexity of Option Prices Revisited. Mathematical
Finance 12, 411-426.
Liptser, R. S. and A. N. Shiryaev (2001): Statistics of Random Processes. Berlin: Springer-Verlag. [2001a: Vol. I (General Theory). 2001b: Vol. II (Applications).]
Ljungqvist, L. and H. Uhlig (2000): Tax Policy and Aggregate Demand Management under
Catching Up with the Joneses. American Economic Review 90, 356-366.
Malkiel, B. (1979): The Capital Formation Problem in the United States. Journal of Finance
34, 291-306.
Mehra, R. and E.C. Prescott (2003): The Equity Premium in Retrospect. In Constantinides,
G.M., M. Harris and R. M. Stulz (Editors): Handbook of the Economics of Finance (Volume 1B, chapter 14), 889-938.
Mele, A. (2003): Fundamental Properties of Bond Prices in Models of the Short-Term Rate.
Review of Financial Studies 16, 679-716.
Mele, A. (2005): Rational Stock Market Fluctuations. WP FMG-LSE.
Mele, A. (2007): Asymmetric Stock Market Volatility and the Cyclical Behavior of Expected
Returns. Journal of Financial Economics 86, 446-478.
Menzly, L., T. Santos and P. Veronesi (2004): Understanding Predictability. Journal of
Political Economy 111, 1, 1-47.
Pindyck, R. (1984): Risk, Inflation and the Stock Market. American Economic Review 74,
335-351.
Paye, B.S. (2012): Déjà Vol: Predictive Regressions for Aggregate Stock Market Volatility
Using Macroeconomic Variables. Journal of Financial Economics 106, 527-546.
Poterba, J. and L. Summers (1985): The Persistence of Volatility and Stock Market Fluctuations. American Economic Review 75, 1142-1151.
Romano, M. and N. Touzi (1997): Contingent Claims and Market Completeness in a Stochastic Volatility Model. Mathematical Finance 7, 399-412.
Rothschild, M. and J. Stiglitz (1970): Increasing Risk: I. A Definition. Journal of Economic
Theory 2, 225-243.
Rothschild, M. and J. Stiglitz (1971): Increasing Risk: II. Its Economic Consequences. Journal of Economic Theory 5, 66-84.
Ryder, H.E. and G.M. Heal (1973): Optimal Growth with Intertemporally Dependent Preferences. Review of Economic Studies 40, 1-33.
Schwert, G.W. (1989a): Why Does Stock Market Volatility Change Over Time? Journal of
Finance 44, 1115-1153.
Schwert, G.W. (1989b): Business Cycles, Financial Crises and Stock Volatility. Carnegie-Rochester Conference Series on Public Policy 31, 83-125.
Shiller, R.J. (2014): Nobel Lecture: Speculative Asset Prices. American Economic Review
104, 1486-1517.
Sundaresan, S.M. (1989): Intertemporally Dependent Preferences and the Volatility of Consumption and Wealth. Review of Financial Studies 2, 73-89.
Timmermann, A. (1993): How Learning in Financial Markets Generates Excess Volatility
and Predictability in Stock Prices. Quarterly Journal of Economics 108, 1135-1145.
Timmermann, A. (1996): Excess Volatility and Return Predictability of Stock Returns in
Autoregressive Dividend Models with Learning. Review of Economic Studies 63, 523-577.
Veronesi, P. (1999): Stock Market Overreaction to Bad News in Good Times: A Rational
Expectations Equilibrium Model. Review of Financial Studies 12, 975-1007.
Veronesi, P. (2000): How Does Information Quality Affect Stock Returns? Journal of Finance
55, 807-837.
Wang, S. (1993): The Integrability Problem of Asset Prices. Journal of Economic Theory
59, 199-213.
8
Macrofinance
8.1 Introduction
This chapter discusses models that aim to address the empirical puzzles surveyed in the previous two chapters. The most prominent are the equity premium puzzle, its pattern over the
business cycle as well as that of aggregate stock volatility. Because these puzzles mainly relate
to the behavior of aggregate variables, it is natural to attempt to provide explanations hinging
upon macroeconomic variables such as consumption or output, whence the title of this chapter.
The equity premium puzzle is the difficulty the neoclassical model has in predicting expected
returns that are quantitatively consistent with those in the data. In particular, Chapter 6
explains that an implausibly high level of risk-aversion is needed to reconcile models with data.
Moreover, a high risk-aversion implies a low elasticity of intertemporal substitution and, hence,
an implausibly high volatility of the interest rates, which gives rise to the interest rate puzzle.
In the early attempts to address the equity premium puzzle, the assumption of a representative agent with CRRA preferences was replaced with that of a representative agent with
non-expected utility. In the non-expected utility framework, risk-aversion can be understood
independently of the elasticity of intertemporal substitution. This approach, described in Section 8.2, does not by itself resolve the equity premium puzzle. We shall explain that
within the non-expected utility model, a possible resolution of the puzzle requires that the
price-dividend ratio is affected by a number of state variables. One instance of state variables
identified in the literature is a long-run risk, defined as a mechanism capable of turning even
a small shock into economic damage that lasts for years. For example, it has been argued that
expected consumption growth is highly persistent, such that shocks to expected growth have
long-lasting implications. While the model has the potential to address the equity premium
puzzle, we shall explain that there are nuances regarding the realism of the precise mechanism
underlying this model.
Section 8.3 explores channels that address the equity premium puzzle based on a variant
of habit formation. The external habit formation model reviewed in Chapter 7 relies on the
existence of a representative agent with high risk-aversion. Alternatively, one can consider
economies with heterogeneous agents, in which each agent has a consumption reference that he
wants to benchmark to ("catching up with the Joneses"). The economy is heterogeneous in that
each agent displays a different curvature of his utility function. In bad times, when markets
are down, the less risk-averse agents are those who suffer the most, reflecting their previous
relatively more aggressive investment decisions. Thus, their relative wealth and, hence, social
weight decreases. That is, the aggregate risk-aversion increases in bad times, leading to the
countercyclical mechanisms explained in the previous chapter. This model has the potential to
explain the equity premium, but is also subject to a number of caveats.
Section 8.4 describes economies in which agents can be hit by idiosyncratic shocks, a risk
that they cannot hedge against through security trading. How is it that idiosyncratic shocks
could affect asset prices? The key assumption of the models we survey is that although agents
are ex-ante the same, they will be affected by shocks that have different amplitudes. A job loss
is one important instance that illustrates how these models work. Recessions do not necessarily
affect agents in the same way. Some agents can be hurt more than others. The possibility of
a job loss can, then, induce agents to act prudently while investing in the stock market. That
is, the risk premium is countercyclical. This explanation can lead to a potential resolution of the
equity premium puzzle, theoretically at least.
While idiosyncratic risk is a clear explanation, its empirical implications do not seem to
be entirely exhaustive. Note that a natural hedge against idiosyncratic risk is self-insurance,
that is, the ability to save in good times to cope with adversities possibly occurring in bad times.
Agents might actually eliminate a large portion of idiosyncratic risk by insuring themselves
while having access to capital markets, thereby making idiosyncratic risk practically irrelevant
to the explanation of the equity premium. For idiosyncratic risk to really matter, we would
need to observe a large and persistent idiosyncratic risk, or capital market transactions
so expensive as to prevent agents from implementing self-insurance plans. But empirically,
idiosyncratic risks do not appear to be as large and persistent, and market transaction costs
are not as large, as required by our standard models with idiosyncratic risk. Naturally, there are
historical instances in which idiosyncratic risk may have a better appeal: the Great Recession
that followed the 2007 crisis is one that teaches us idiosyncratic risks could be quite persistent.
Section 8.5 considers an alternative channel, market incompleteness. Economies with incomplete markets are not necessarily consistent with a sizeable equity premium. If agents have
comparable access to equity markets, they share the same level of risk, such that the premium
they require is similar to that in economies with complete markets. But market incompleteness
has the potential to resolve the puzzle, when a large fraction of the agents do not participate.
The mechanism is simple. If the markets are being shut down to a large proportion of agents,
the agents who participate are concerned for the aggregate macroeconomic risk they bear. This
concern leads them to require a sizeable premium.
Sections 8.6 and 8.7 deal with economies where agents have differences in beliefs and also
uncertainty (as opposed to risk) regarding the fundamentals. [In progress]
Section 8.8 deals with issues arising within production-based economies. In these economies,
consumption is endogenous, and an increase in the agents' risk aversion might actually lead
to a decreased consumption volatility. One additional important difficulty in these economies
is that capital supply is infinitely elastic, such that the price of capital is quite smooth. To
increase capital price volatility, we need hindrances in the capital formation process, such as
the presence of adjustment costs, or an added volatility of the demand for capital, obtained
for example when agents have habit formation over their consumption plans. Both rigidities in
the capital formation process and volatility in the demand of capital are needed, in order to
explain the equity premium.
Section 8.9: In progress.
Section 8.10 presents a simple model to assess a very old hypothesis in financial economics:
the extent to which equity volatility can be explained by firms' leverage.
Section 8.11 reviews recent models capable of explaining the cross-section of asset returns,
relying on multiple trees.
Section 8.12 surveys predictions that the previous models make about the yield curve. Chapter
12 contains many more models and explanations of the yield curve and its relation to macroeconomic developments.
Section 8.13 deals with an intriguing topic: what do financial economists and macroeconomists
really have in common? Granted, in many of the models surveyed in this chapter, we aim to understand asset prices in the context of the business cycle. However, these models are built upon a
revamp of many of the assumptions underlying the neo-classical paradigm. Yet macroeconomists
do not necessarily seem to acknowledge our asset pricing lessons. Are macroeconomists mistaken? Or is there a case for a modern version of a dichotomy between the real and the financial
spheres of the economy? A simple model shows there is a potential for the hypothesis of a separation between finance and macroeconomics. However, this potential is seriously undermined
once we allow financial markets to feed back into the economy.
Section 8.14 concludes the chapter and reviews explanations of macroeconomic developments
relying on such asset price feedbacks. [In progress]
+1 )
( )+
( +1 )
(8.1)
11
( ) = 1 + 1
In this formulation, the decision maker's attitude vis-à-vis risk is encoded into the certainty
equivalent $\hat{v}_{t+1}$ through the utility $h$ and, as is well-known (see Chapter 3), in this specific
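$$W(c, \hat{v}) = \left(c^{\rho} + \beta \hat{v}^{\rho}\right)^{1/\rho} \quad \text{and} \quad h(\hat{v}) = \frac{\hat{v}^{1-\eta}}{1-\eta}, \tag{8.2}$$
for three positive constants $\rho$, $\eta$ and $\beta$. In this formulation, risk-attitudes for static wealth
gambles still have the classical CRRA flavor. Precisely, we say that $\eta$ is the CRRA for static
wealth gambles, and $\psi \equiv (1-\rho)^{-1}$ is the EIS. We still have $\hat{v}_{t+1} = h^{-1}\left[E_t\left(h(v_{t+1})\right)\right]$, and the parametrization for $v_t$ is
$$v_t = \left(c_t^{\rho} + \beta \left(E_t\left(v_{t+1}^{1-\eta}\right)\right)^{\frac{\rho}{1-\eta}}\right)^{1/\rho}. \tag{8.3}$$
Naturally, in the absence of uncertainty, $v_t^{\rho} = c_t^{\rho} + \beta v_{t+1}^{\rho}$, another clear illustration that $(1-\rho)^{-1}$ is
the EIS. Also, it is straightforward to verify that as soon as the CRRA equals the reciprocal of
the EIS, i.e., $\eta = 1-\rho$, $v_t$ in Eq. (8.3) collapses to the standard intertemporal additive utility,
$$v_t^{1-\eta} = E_t\left(\sum_{n=0}^{\infty} \beta^{n} c_{t+n}^{1-\eta}\right).$$
Let us return to the general case. Let us recall that $v_t$ is the certainty equivalent continuation
utility expressed in consumption units; in some cases (e.g., in Section 8.13), it is both convenient
and intuitive to formulate problems in terms of the continuation utility $\nu_t = h(v_t)$, such that
we can write Eq. (8.3) as:
$$\nu_t = \frac{1}{1-\eta}\left(c_t^{\rho} + \beta\left((1-\eta)E_t(\nu_{t+1})\right)^{\frac{\rho}{1-\eta}}\right)^{\frac{1-\eta}{\rho}}. \tag{8.4}$$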
In the appendix, we shall rely on Eq. (8.4) while deriving the asset pricing implications of
portfolio choices within a representative agent framework.
8.2.2 Testable restrictions
=(
>
(1 + r +1 )
=1
) (1 +
X
=1
+1
+1 )
(8.5)
+1
+1
+1
and
are the price and the dividend of asset at time : the usual trading convention
where
is that the portfolio +1 is chosen at time .
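Let us consider a Markov economy in which the underlying state is some process $y_t$. We
consider stationary consumption and investment plans. Accordingly, let the stationary utility be
a function $V(x, y)$ when current wealth is $x$ and the state is $y$. By Eq. (8.4),
$$V(x_t, y_t) = \max \frac{1}{1-\eta}\left(c_t^{\rho} + \beta\left((1-\eta)E_t\left(V(x_{t+1}, y_{t+1})\right)\right)^{\frac{\rho}{1-\eta}}\right)^{\frac{1-\eta}{\rho}}. \tag{8.6}$$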
In the Appendix, we show that the first-order conditions for the representative agent lead to
the following Euler equation,
[
+1 ) (1
+1
( +1 +1 )
; +1 +1 ) =
(
(
)
+1 ))]
= 1
=1
(1 +
(8.7)
1
1
+1 ))
(8.8)
This stochastic discount factor displays the interesting property of being affected by the market
portfolio return, $r_M$, at least as soon as $\theta \neq 1$. In particular, when $\theta < 1$, the stochastic
discount factor is countercyclical not only through consumption growth, but also through the
market return, by its property to give more weight to states of nature when $r_M$ is low than
to states when $r_M$ is high. Moreover, the stochastic discount factor may potentially inherit the
excess volatility of market returns in a quite natural fashion.
Note, then, the interesting fixed-point problem in this model: market returns affect the
stochastic discount factor, which affects market returns in turn! Due to this fixed point, asset
prices predicted by these models are not known in closed form, except for isolated exceptions.
Furthermore, the potential of this model to explain the empirical puzzles needs to be further
qualified, as discussed in the next section.
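The role of the market return in the stochastic discount factor can be illustrated numerically. The sketch below is a minimal example, assuming the standard Epstein-Zin form $\ln m_{t+1} = -\delta\theta - \frac{\theta}{\psi}\ln\frac{c_{t+1}}{c_t} + (\theta-1)\ln(1+r_{M,t+1})$ with $\theta = (1-\eta)/(1-1/\psi)$; the function name and all parameter values are hypothetical and chosen for illustration only.

```python
import math


def ez_log_sdf(cons_growth, market_return, delta=0.02, psi=1.5, eta=5.0):
    """Log stochastic discount factor under the assumed Epstein-Zin form.

    cons_growth   : gross consumption growth, c_{t+1} / c_t
    market_return : net return on the market portfolio, r_{M,t+1}
    delta, psi, eta are illustrative values, not calibrated ones.
    """
    theta = (1.0 - eta) / (1.0 - 1.0 / psi)
    return (-delta * theta
            - (theta / psi) * math.log(cons_growth)
            + (theta - 1.0) * math.log(1.0 + market_return))


# Same consumption growth, two different market outcomes (theta < 1 here).
low = ez_log_sdf(1.02, -0.10)
high = ez_log_sdf(1.02, 0.10)
print(low > high)  # is the low-return state given more weight?

# With eta = 1/psi we have theta = 1, and the market return drops out.
crra_low = ez_log_sdf(1.02, -0.10, eta=1.0 / 1.5)
crra_high = ez_log_sdf(1.02, 0.10, eta=1.0 / 1.5)
print(math.isclose(crra_low, crra_high))
```

In the second comparison the discount factor depends on consumption growth only, which is the standard CRRA case.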
8.2.3 Risk premiums and interest rates
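So the Euler equation is,
$$E_t\left[\beta^{\theta}\left(\frac{c_{t+1}}{c_t}\right)^{-\frac{\theta}{\psi}}\left(1+r_{M,t+1}\right)^{\theta-1}\left(1+r_{i,t+1}\right)\right] = 1, \tag{8.9}$$
where $\theta \equiv (1-\eta)/(1-\psi^{-1})$ and $\psi$ is the EIS.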
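Eq. (8.9) obviously holds for the market portfolio and the risk-free asset. Therefore, taking
logs in Eq. (8.9) for $i = M$, and for the risk-free asset, $i = 0$ say, yields the following conditions:
$$0 = \ln E_t\left[\exp\left(-\delta\theta - \frac{\theta}{\psi}\ln\frac{c_{t+1}}{c_t} + \theta R_{M,t+1}\right)\right], \quad R_{M,t+1} = \ln\left(1 + r_{M,t+1}\right), \tag{8.10}$$
where $\delta \equiv -\ln\beta$, and
$$R_{f,t} \equiv \ln\left(1 + r_{0,t+1}\right) = -\ln E_t\left[\exp\left(-\delta\theta - \frac{\theta}{\psi}\ln\frac{c_{t+1}}{c_t} + (\theta - 1)R_{M,t+1}\right)\right]. \tag{8.11}$$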
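Next, suppose that consumption growth, $\ln\frac{c_{t+1}}{c_t}$, and the market portfolio return, $R_{M,t+1}$,
are jointly normally distributed. In the appendix, we show that the expected excess return on
the market portfolio is given by
$$E_t\left(R_{M,t+1}\right) - R_{f,t} + \frac{1}{2}\sigma^2_{R_M} = \frac{\theta}{\psi}\sigma_{R_M,c} + (1-\theta)\sigma^2_{R_M}, \tag{8.12}$$
where $\sigma^2_{R_M} = var_t\left(R_{M,t+1}\right)$ and $\sigma_{R_M,c} = cov_t\left(R_{M,t+1}, \ln\frac{c_{t+1}}{c_t}\right)$, and the term $\frac{1}{2}\sigma^2_{R_M}$ in the left
hand side is a Jensen's inequality term. Note, Eq. (8.12) is a mixture of the Consumption
CAPM (for the part $\frac{\theta}{\psi}\sigma_{R_M,c}$) and the CAPM (for the part $(1-\theta)\sigma^2_{R_M}$).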
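The risk-free rate is
$$R_{f,t} = \delta + \frac{1}{\psi}E_t\left(\ln\frac{c_{t+1}}{c_t}\right) - \frac{1}{2}(1-\theta)\sigma^2_{R_M} - \frac{1}{2}\frac{\theta}{\psi^2}\sigma^2_c, \tag{8.13}$$
where $\sigma^2_c = var_t\left(\ln\frac{c_{t+1}}{c_t}\right)$.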
Eqs. (8.12) and (8.13) can be elaborated further. In equilibrium, the asset price and, hence,
the asset return, is certainly related to consumption volatility. Precisely, assume that
2
(8.14)
where 2 is a positive constant that may arise when the asset return is driven by some additional
state variable.1 Under the assumption that the asset return volatility is as in Eq. (8.14), the
equity premium in Eq. (8.12) is:
(
+1 )
1
+
2
1
2
+ (1
(8.15)
1
1
1
1
1
2
2
= +
(8.16)
1+
0
1
2
21
As we can see, we may increase the level of relative risk-aversion, $\eta$, without substantially
affecting the level of the risk-free rate. This is because the effects of $\eta$ on the risk-free rate are of
second-order importance (they multiply variances, which are orders of magnitude smaller than the
expected consumption growth, 0 ).
8.2.4 Campbell-Shiller approximation
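Consider the definition of the return on the market portfolio,
$$R_{M,t+1} \equiv \ln\frac{P_{t+1} + D_{t+1}}{P_t} = \ln\left(e^{z_{t+1}} + 1\right) - z_t + g_{t+1},$$
where $P_t$ is the value of the market portfolio, $g_{t+1} = \ln\frac{D_{t+1}}{D_t}$ is the aggregate dividend growth,
and $z_t = \ln\frac{P_t}{D_t}$ is the log of the aggregate price-dividend ratio. A first-order linear approximation
of $\ln\left(e^{z_{t+1}} + 1\right)$ around the average level of $z_t$ leaves,
$$R_{M,t+1} \approx \kappa_0 + \kappa_1 z_{t+1} - z_t + g_{t+1}, \tag{8.17}$$
where $\kappa_0 = \ln\left(e^{\bar{z}} + 1\right) - \kappa_1\bar{z}$, $\kappa_1 = \frac{e^{\bar{z}}}{e^{\bar{z}} + 1}$, and $\bar{z}$ is the average level of the log price-dividend ratio.
1 This is, for example, the case of the Bansal and Yaron (2004) model described below.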
+1
(ln
+1 )
+1
+1 ))
(1
+1
+1 ))
(8.18)
where +1 ln +1 and
(
; +1 +1 ). That is, there can be additional sources
+1
of risk the agents require to be compensated for, provided that (i) the price-dividend ratio is
random and (ii) agents have a preference for an early resolution of uncertainty ($\theta < 1$). We now
illustrate how this mechanism operates in the context of risks that may be very persistent.
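The accuracy of the Campbell-Shiller linearization in Eq. (8.17) can be checked numerically. The sketch below is a minimal illustration, assuming the linearization coefficients are $\kappa_1 = e^{\bar z}/(e^{\bar z}+1)$ and $\kappa_0 = \ln(e^{\bar z}+1) - \kappa_1 \bar z$, which is what a first-order expansion around $\bar z$ delivers; the function names and the price-dividend level of 25 are hypothetical.

```python
import math


def cs_coefficients(z_bar):
    """Linearization constants around the average log price-dividend ratio."""
    kappa1 = math.exp(z_bar) / (math.exp(z_bar) + 1.0)
    kappa0 = math.log(math.exp(z_bar) + 1.0) - kappa1 * z_bar
    return kappa0, kappa1


def exact_log_return(z_next, z_now, g_next):
    """Exact log return: ln(e^{z'} + 1) - z + g'."""
    return math.log(math.exp(z_next) + 1.0) - z_now + g_next


def approx_log_return(z_next, z_now, g_next, z_bar):
    """Campbell-Shiller approximation: kappa0 + kappa1 z' - z + g'."""
    kappa0, kappa1 = cs_coefficients(z_bar)
    return kappa0 + kappa1 * z_next - z_now + g_next


z_bar = math.log(25.0)  # illustrative average price-dividend ratio of 25
for shock in (0.0, 0.1, -0.1):
    exact = exact_log_return(z_bar + shock, z_bar, 0.02)
    approx = approx_log_return(z_bar + shock, z_bar, 0.02, z_bar)
    print(shock, exact - approx)
```

The approximation is exact at $z_{t+1} = \bar z$, and the error grows with the squared distance from $\bar z$ because the exact expression is convex in $z_{t+1}$.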
8.2.5 Risks for the long-run
Bansal and Yaron (2004) consider a model in which persistence in the expected consumption
growth has the potential to explain the equity premium puzzle. To illustrate the main points
of this explanation, assume that consumption growth is solution to,
+1
where
= ln
+1
1
2
+1
+1
+ +1
0 2
+1 =
+1
(8.19)
(8.20)
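To find an approximate solution to the log of the price-dividend ratio, replace the Campbell-Shiller approximation in Eq. (8.17) into the Euler equation (8.10) for the market portfolio,
$$0 = \ln E_t\left[\exp\left(-\delta\theta - \frac{\theta}{\psi}\ln\frac{c_{t+1}}{c_t} + \theta\left(\kappa_0 + \kappa_1 z_{t+1} - z_t + g_{t+1}\right)\right)\right]. \tag{8.21}$$
Conjecture that the log of the price-dividend ratio takes the simple form, $z_t = a_0 + a_1 x_t$, where $x_t$ is the persistent component in Eq. (8.20), and
$a_0$ and $a_1$ are two coefficients to be determined. Substituting this guess into Eq. (8.21), and
identifying terms, leaves: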
1 1
(8.22)
= 0+ 1
1
1
1
where the constant
2 In this version of the model, the equity premium and volatility are constant. Bansal and Yaron assume that consumption growth is heteroskedastic to make the model consistent with time-varying statistics.
The model discussed by Campbell, Lo and MacKinlay is one where expected returns are
directly modeled as possibly persistent processes. Instead, the model of this section is one
where expected growth is possibly persistent. We shall soon see the implications of
such a broader perspective. Note, however, a crucial point. High volatility of the price-dividend
ratio does not necessarily lead to a resolution of the equity premium puzzle. For example, in the
context of non-expected utility, Eq. (8.15) suggests that in equilibrium, relative risk-aversion
and intertemporal elasticity of substitution should play together in the right direction, and the
variance term introduced in Eq. (8.14) is equally important. For example, if $\eta = \psi^{-1}$, the price-dividend ratio would not
even enter the Euler equation, as we know. Therefore, we need to check how this high volatility
of the price-dividend ratio translates into a high equity premium.
We use the expression of $R_{M,t+1}$ in Eq. (8.17) and determine the variance term in (8.14), leading to the
model prediction regarding the equity premium and the risk-free rate through Eqs. (8.15) and
(8.16). We have
2
2
1
2
1
=
|
+1 )
{z
+
!2
That is, and as anticipated, non-expected utility can lead to a resolution of the equity premium puzzle when asset returns are driven by sources of variation in addition to consumption
growth. The previous expression for the variance term in (8.14) shows precisely how long-run risks accomplish this: the
added volatility stems from large fluctuations in the price-dividend ratio, arising due to the high
persistence of the small component $x_t$ in Eq. (8.20). An alternative, albeit entirely consistent,
way to explain these facts relies on the expression for the innovations to the stochastic discount
factor in this model: by Eq. (8.18),
ln
+1
(ln
+1 )
+1
+1 ))
(1
1 1
+1
+1 ))
That is, long-run risks are priced through $a_1$, the sensitivity of the price-dividend ratio to
changes in $x_t$ (see Eq. (8.22)): even if the innovations to $x_t$ are small, $a_1$ is large, as explained,
and leads to large compensation for this risk, provided $\theta < 1$.
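The claim that the sensitivity $a_1$ is large when expected growth is persistent can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: it takes dividends equal to consumption and an AR(1) persistent component with autoregressive coefficient `phi`, in which case identifying the terms in $x_t$ in Eq. (8.21) gives $a_1 = (1 - 1/\psi)/(1 - \kappa_1\phi)$, the form familiar from Bansal and Yaron (2004). The symbol `phi`, the function name and the numerical values are hypothetical.

```python
def pd_sensitivity(psi, kappa1, phi):
    """Sensitivity a_1 of the log price-dividend ratio to the persistent
    component, under the assumed form (1 - 1/psi) / (1 - kappa1 * phi)."""
    return (1.0 - 1.0 / psi) / (1.0 - kappa1 * phi)


# Illustrative monthly values: kappa1 close to one, EIS above one.
for phi in (0.0, 0.5, 0.9, 0.979):
    print(phi, pd_sensitivity(psi=1.5, kappa1=0.997, phi=phi))

# With an EIS below one, the sign of the sensitivity flips.
print(pd_sensitivity(psi=0.5, kappa1=0.997, phi=0.979))
```

Under these assumptions, the sensitivity rises steeply as the product of $\kappa_1$ and the persistence parameter approaches one, which is why even small innovations to $x_t$ can generate large fluctuations in the price-dividend ratio.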
[Survey briefly developments in the LRR literature]
and volatilities, through the channel of a countercyclical price of risk. It does rely on a highly
risk-averse economy, though. Chan and Kogan (2002) show that a countercyclical price of risk
might arise, without assuming the existence of a representative agent with a high risk-aversion.
They consider an economy where heterogeneous agents have preferences displaying "catching
up with the Joneses" features introduced by Abel (1990, 1999).
In this economy, there is a continuum of agents, indexed by a parameter taking values in $[1, \infty)$ and appearing
in their instantaneous utility,
1
(
)=
(8.23)
By assumption, the standard of living of others, , is a weighted geometric average of the past
realizations of the aggregate endowment , viz
Z
(
)
+
ln
with
0
ln = ln 0
0
Therefore,
satisfies
=
where
is solution to,
1
= 0
2
ln
2
0
(8.24)
This model can be interpreted as one displaying "catching up with the Joneses" features
because the utility of each agent is affected by a benchmark, which is a weighted average of
the past aggregate endowment, the latter obviously being equal to aggregate consumption
(see the constraint in [P1] below). This model is important. We already know (see Chapter 7)
that a realistically calibrated economy with habit formation and a representative agent relies
on a high risk aversion. Moreover, an economy with "catching up with the Joneses" and a
representative agent would also rely on a high risk-aversion, as argued below. Chan and Kogan
(2002) show that their model, while capturing the spirit of habit formation through "catching
up with the Joneses," does not need to rely on a high risk-aversion, once we populate the
economy with heterogeneous agents, thereby allowing habit formation and "catching up with
the Joneses" to perform a role in the explanation of the equity premium puzzle.
In this economy with complete markets, we can determine the asset price, and solve the
model by relying on the centralization of competitive equilibrium through Pareto weightings,
along lines similar to those in Theorem 2.7 of Chapter 2. As explained in the Appendix, the
equilibrium price process is the same as that in an economy with a representative agent with
instantaneous utility,
Z
Z
(
) max
(
)
s.t.
=
[P1]
1
1
where
is the marginal utility of income of the agent . The Appendix provides further
details on the derivation of the value function of the program [P1], which is:
Z
1
1
1
( )
( )
(8.25)
1
1
where is a Lagrange multiplier, a function of the state , satisfying:
Z
1
1
=
( )
(8.26)
1
Finally, the Appendix shows that the unit risk-premium predicted by this model is,
( )=
0R
1
1
( )
(8.27)
To summarize, Eq. (8.26) determines the Lagrange multiplier, ( ), which then feeds ( )
through Eq. (8.27). Empirically, the Pareto weighting function, , can be parametrized by a
function, which can be calibrated to match selected characteristics of the asset returns and
volatility. Note, finally, that this economy collapses to an otherwise identical homogeneous
economy, once the social weighting function is a Dirac mass concentrated at a single agent type. In this case,
the unit risk-premium in Eq. (8.27) is a constant. As anticipated, an economy with a single agent with "catching up with
the Joneses" is unlikely to resolve the equity premium puzzle or address other issues such as
predictability, because the unit risk-premium is then both small and constant.3
A technical, albeit crucial assumption of this model is that the standard of living of others
is a process with bounded variation (see Eq. (8.24)). This assumption implies that the standard of living is
not a risk for which the agents require a compensation. In this model, then, it is the agents'
heterogeneity that drives variation in the unit risk-premium (see Eq. (8.27)). By calibrating
the model to US data, Chan and Kogan find that the unit risk-premium is decreasing and convex in the state.4 The
mechanism of the model is an endogenous redistribution of wealth. Note that the less risk-averse agents obviously invest a higher proportion of their wealth in risky assets, compared
to the more risk-averse. In the poor states of the world, then, when stock prices decrease, the
wealth of the less risk-averse agents falls more than that of the more risk-averse. The result
is a reduction in the fraction of wealth held by the less risk-averse individuals in the whole
economy. Thus, in bad times, the contribution of these less risk-averse individuals to aggregate
risk-aversion decreases and, hence, aggregate risk-aversion increases and so does the unit risk-premium in Eq.
(8.27).
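The redistribution mechanism can be illustrated with a stylized calculation. The sketch below is not Eq. (8.27): it assumes a finite number of CRRA agent types and uses the textbook complete-markets result that aggregate risk tolerance is the share-weighted average of individual risk tolerances, with shares standing in for the agents' relative weight in the economy. The two agent types, their shares and the function name are hypothetical.

```python
def aggregate_risk_aversion(shares, risk_aversions):
    """Aggregate relative risk aversion as the reciprocal of the
    share-weighted average of individual risk tolerances (1 / gamma_i)."""
    total = sum(shares)
    tolerance = sum((s / total) / g for s, g in zip(shares, risk_aversions))
    return 1.0 / tolerance


gammas = [2.0, 10.0]  # a less and a more risk-averse type (illustrative)

good_times = aggregate_risk_aversion([0.6, 0.4], gammas)
bad_times = aggregate_risk_aversion([0.4, 0.6], gammas)  # less risk-averse lose share

print(good_times, bad_times)
print(bad_times > good_times)
```

When the less risk-averse type loses weight, the aggregate risk aversion moves toward that of the more risk-averse type, which is the countercyclical pattern described above.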
[Discuss the criticism of Xiouros and Zapatero (2010)]
Mankiw (1986) is the first to point out the asset pricing implications of idiosyncratic risks. In
his model, aggregate shocks to consumption do not affect individuals in the same way, ex-post.
Ex-ante, individuals know that the business cycle may adversely change (an aggregate shock),
although they also anticipate that the very same shock might be particularly severe to only
a portion of them (an idiosyncratic shock). To illustrate, everyone faces a positive probability
of experiencing a job loss during a recession, although then, only a part of the population will
actually suffer from a job loss. Alternatively, it is one thing to say that a recession will lead to a
salary reduction for everyone, and another to say that a number of individuals will not
even be affected by the recession while the others will bear its entire burden.
Idiosyncratic risk might significantly affect agents' portfolio choice and, therefore, rational
asset evaluation. Naturally, in the presence of contingent claims able to insure against these
shocks, idiosyncratic risk would not matter. But the point is that in reality, these contingent
claims do not exist yet, due perhaps to moral hazard or adverse selection reasons. This source
of market incompleteness might then potentially explain the aggregate stock market behavior
in a way that the model with a standard representative agent cannot.
Mankiw considers the pricing of a risky asset in a two-period model, with the first period budget constraint given by $p\theta + m = w$, and the second period consumption equal to:
$$c = y + \theta x + (w - p\theta)(1 + r),$$
where $m$ is the amount to invest in a money market account, $r$ is the safe interest rate on the money market account, normalized to zero, $w$ is the initial endowment, also normalized to zero, $p$ is the price of the risky asset, $x$ is the payoff promised by the risky asset and, finally, $\tilde{x} \equiv x - p$ is the asset net payoff. That is, we may either endogenize the price $p$, given the payoff $x$, or, then, just the net payoff, $\tilde{x}$, as described next. The asset is in zero net supply, and because agents are ex-ante identical, we have that in equilibrium, $\theta = 0$, such that $c$ equals per capita consumption, $y$.
There are two states of nature for the aggregate economy, which are equally likely. In the good state, the net asset payoff is $\tilde{x} = 1 + \pi$, and per capita consumption is $\mu$. In the bad state, the net asset payoff is, instead, $\tilde{x} = -1$, and per capita consumption is also adversely affected, being equal to $(1 - \phi)\mu$. The excess payoff in the good state, $\pi$, equals $2E(\tilde{x})$, since $E(\tilde{x}) = \frac{1}{2}(1 + \pi) - \frac{1}{2}$. Therefore, $\pi$ measures a risk premium; of course it has to be determined in equilibrium.
How does this macroeconomic shock affect asset pricing in the presence of agents who could be heterogeneous ex-post? The crucial feature of the model is the assumption that only a fraction $\lambda$ of agents will absorb the macroeconomic risk, in that, literally, a fraction $1 - \lambda$ of individuals will not be hit by the aggregate shock, and each of them will still consume $\mu$, for a total of $(1 - \lambda)\mu$. The residual portion of the population will consume the residual, $(1 - \phi)\mu - (1 - \lambda)\mu = (\lambda - \phi)\mu$, such that each agent hit by the shock will consume $(1 - \phi/\lambda)\mu$. The ratio, $\phi/\lambda$, is the per capita fall in consumption for any individual hit by the shock in the bad state of nature. If $\lambda = 1$, the aggregate shock hits everyone. The highest concentration of the shock arises when $\lambda = \phi$, i.e. when the fall in consumption is borne by the lowest possible fraction of the population. Table 8.1 summarizes payoffs, per capita consumption and individual consumption in this economy.
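Table 8.1
State | Net asset payoff | Per capita consumption | Individual consumption
Bad state | $-1$ | $(1 - \phi)\mu$ | $(1 - \phi/\lambda)\mu$ (consumed by $\lambda$); $\mu$ (consumed by $1 - \lambda$)
Good state | $1 + \pi$ | $\mu$ | $\mu$

The first-order condition for the risky asset, evaluated at the equilibrium holding $\theta = 0$, is $E[u'(c)\tilde{x}] = 0$, that is,
$$0 = \frac{1}{2}(-1)\left[\lambda u'\left(\left(1 - \frac{\phi}{\lambda}\right)\mu\right) + (1 - \lambda)u'(\mu)\right] + \frac{1}{2}(1 + \pi)u'(\mu),$$
such that
$$\pi = \lambda\,\frac{u'\left(\left(1 - \frac{\phi}{\lambda}\right)\mu\right) - u'(\mu)}{u'(\mu)},$$
where $u$ is a utility function.
Mankiw shows that for utility functions leading to prudent behavior, $u''' > 0$, the premium $\pi$ is decreasing in $\lambda$: an increase in the concentration of aggregate shocks leads to higher premiums. Moreover, it is easy to see that $\pi$ can be made arbitrarily large, for $\lambda$ arbitrarily close to $\phi$, once the utility function satisfies the Inada condition, $\lim_{c \to 0} u'(c) = \infty$, as we have that $\lim_{\lambda \to \phi}\pi = \infty$. For example, in the log-utility case, we have that $\pi = \frac{\lambda\phi}{\lambda - \phi}$.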
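As a numerical illustration of the last point, the following Python sketch evaluates the premium implied by the first-order condition of the model, under the assumption that it takes the form $\frac{1}{2}(1+\pi)u'(\mu) = \frac{1}{2}[\lambda u'((1-\phi/\lambda)\mu) + (1-\lambda)u'(\mu)]$, with $\lambda$ the fraction of agents hit by the shock and $\phi$ the per capita fall in consumption. The function names and the parameter values are hypothetical and chosen for illustration only.

```python
def mankiw_premium(lam, phi, mu=1.0, marginal_utility=lambda c: 1.0 / c):
    """Premium pi solving the (assumed) first-order condition
    (1 + pi) u'(mu) = lam * u'((1 - phi/lam) mu) + (1 - lam) * u'(mu).
    The default marginal utility, 1/c, corresponds to log utility."""
    if not (phi < lam <= 1.0):
        raise ValueError("need phi < lam <= 1 so that consumption stays positive")
    mu_hit = marginal_utility((1.0 - phi / lam) * mu)   # agents hit by the shock
    mu_safe = marginal_utility(mu)                      # agents not hit
    return lam * (mu_hit - mu_safe) / mu_safe


def log_utility_premium(lam, phi):
    """Closed form for log utility: pi = lam * phi / (lam - phi)."""
    return lam * phi / (lam - phi)


if __name__ == "__main__":
    phi = 0.05  # hypothetical 5% fall in per capita consumption
    for lam in (1.0, 0.5, 0.1, 0.06, 0.051):
        print(f"lambda = {lam:5.3f}   pi = {mankiw_premium(lam, phi):8.4f}")
```

Under these assumptions, the premium rises as the shock is concentrated on fewer agents, and grows without bound as the fraction of agents hit approaches the size of the aggregate fall in consumption.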
8.4.2 Self-insurance and persistence of idiosyncratic shocks
The critical assumption underlying Mankiw's model is that once agents are hit by an idiosyncratic shock, the game is over. What happens once we allow the agents to act in a multiperiod horizon? Intuitively, in a dynamic context, agents might implement self-insurance plans, by accumulating financial assets after good shocks and selling or short-selling after bad shocks have occurred. Telmer (1993) and Lucas (1994) show that if idiosyncratic shocks are not persistent, self-insurance is quite effective and asset prices behave substantially the same as they would do in a world without idiosyncratic risk. Therefore, to have asset prices significantly deviate from those arising within a complete market setting, one has to either (i) reduce the extent of risk-sharing, by assuming frictions such as transaction costs, short-selling constraints or in general severe forms of market incompleteness, or (ii) make idiosyncratic shocks persistent. With (i), we merely eliminate the possibility that agents may implement self-insurance plans through
capital market transactions. With (ii), we make idiosyncratic shocks so severe that no capital market transaction might allow agents to insure themselves and achieve portfolio solutions close to the complete market solution; intuitively, once any individual is hit by an idiosyncratic shock, he may short-sell financial assets in the short-run, although then, he cannot persistently do so, given his wealth constraints.
Heaton and Lucas (1996) calibrate a model with idiosyncratic shocks using PSID (Panel Study of Income Dynamics) and NIPA (National Income and Product Accounts) data. They find that idiosyncratic shocks are not quite persistent, and that a large amount of transaction costs is needed to generate sizeable levels of the equity premium. Naturally, idiosyncratic shocks are not always as in the PSID dataset analyzed by Heaton and Lucas a long time ago. Models with idiosyncratic risk would actually help think about market behavior in periods such as the Great Recession (the recession that occurred around 2009), when the persistence of idiosyncratic shocks would arguably be larger than over the Great Moderation (the period of low volatility of macroeconomic aggregates, starting after the Monetary experiment in 1982 (e.g., Bernanke, 2004) and presumably ending in 2007).
8.4.3 A model with countercyclical income inequality
Constantinides and Duffie (1996) do actually take the issue of persistence in idiosyncratic risk to the extreme, and consider a model without any transaction costs, but with permanent idiosyncratic risk. They show that in fact, given an asset price process, it is always possible to find a cross-section of idiosyncratic risk processes compatible with the asset price given in advance. We now present this elegant model, which has substantial theoretical importance per se, because it makes transparent how some state variables affecting consumer choices can be reverse-engineered from the observation of an asset price process.
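Central to Constantinides and Duffie's analysis is the assumption that each individual $i$ has a consumption equal to $c_{it}$ at time $t$, given by:
$$c_{it} = \delta_{it} C_t, \qquad \delta_{it} = \exp\left(\sum_{s=1}^{t}\left(\epsilon_{is} y_s - \frac{1}{2} y_s^2\right)\right), \qquad \epsilon_{i,t+1} y_{t+1} - \frac{1}{2} y_{t+1}^2 \,\Big|\, F_t \cup \{y_{t+1}\} \sim N\left(-\frac{1}{2} y_{t+1}^2,\; y_{t+1}^2\right) \qquad (8.28)$$
where $C_t$ is aggregate consumption and $F_t$ is the information set as of time $t$. Denoting with $\mu(i)$ the measure of agent $i$, we have that by Eq. (8.28) and the Law of Large Numbers, the cross-sectional variance of $\ln(\delta_{it}/\delta_{i,t-1})$ is
$$\int \left(\ln\frac{\delta_{it}}{\delta_{i,t-1}} + \frac{1}{2} y_t^2\right)^2 d\mu(i) = \int \epsilon_{it}^2 y_t^2 \, d\mu(i) = y_t^2.$$
The meaning of the consumption share $\delta_{it}$ is that of an idiosyncratic shock every agent receives on his consumption share at $t$. From the perspective of each agent, this shock is uninsurable, in that it is unrelated to the asset returns. Moreover, by construction, the consumption share has a unit root, as $\ln\delta_{it} - \ln\delta_{i,t-1} = \epsilon_{it} y_t - \frac{1}{2} y_t^2$: a change in $y_t$ and/or a shock in $\epsilon_{it}$ have a permanent effect on the future path of $\delta_{it}$.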
All agents have a CRRA utility function. We want to make sure this setup is consistent with any given equilibrium asset price process, by requiring two conditions: (i) $\int c_{it}\,d\mu(i) = C_t$, i.e. $\int \delta_{it}\,d\mu(i) = 1$, a condition satisfied by the law of large numbers; (ii) the cross-sectional variances $y_t^2$ are reverse-engineered so as to be consistent with any stochastic discount factor and, hence, any asset price process given in advance. To achieve (ii), note that by the law of iterated expectations, for any agent $i$, the value of an asset delivering a payoff equal to $x_{t+1}$ at time $t + 1$ is:
$$E\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\eta} E\left(e^{-\eta\left(\epsilon_{i,t+1} y_{t+1} - \frac{1}{2} y_{t+1}^2\right)} \,\Big|\, F_t \cup \{y_{t+1}\}\right) x_{t+1} \,\Big|\, F_t\right] = E\left[\beta\left(\frac{C_{t+1}}{C_t}\right)^{-\eta} e^{\frac{1}{2}\eta(\eta+1) y_{t+1}^2}\, x_{t+1} \,\Big|\, F_t\right],$$
where $\beta$ is the subjective discount factor and $\eta$ is the CRRA coefficient. It is independent of any agent $i$, such that the stochastic discount factor is:
$$m_{t+1} = \beta\left(\frac{C_{t+1}}{C_t}\right)^{-\eta} e^{\frac{1}{2}\eta(\eta+1) y_{t+1}^2}.$$
That is, given an aggregate consumption process, and an arbitrage free asset price process, there exists a cross-section of idiosyncratic risk processes that supports the given price process. As a trivial example, consider the standard Lucas stochastic discount factor, which obtains when $y_t \equiv 0$.
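The step that integrates out the idiosyncratic shock can be checked numerically. The sketch below assumes that the log-change in an agent's consumption share is $\epsilon y - \frac{1}{2}y^2$ with $\epsilon$ standard normal, and that marginal utility is CRRA with coefficient $\eta$, such that the idiosyncratic term entering the pricing equation is $E[e^{-\eta(\epsilon y - y^2/2)}]$, which should equal $e^{\frac{1}{2}\eta(\eta+1)y^2}$. The function names and the integration settings are hypothetical.

```python
import math


def idiosyncratic_term(eta, y, n=20001, width=12.0):
    """Trapezoidal approximation of E[exp(-eta * (eps * y - y**2 / 2))]
    for eps ~ N(0, 1), integrating over eps in [-width, width]."""
    h = 2.0 * width / (n - 1)
    total = 0.0
    for k in range(n):
        eps = -width + k * h
        weight = h if 0 < k < n - 1 else h / 2.0  # trapezoid end-point weights
        density = math.exp(-0.5 * eps * eps) / math.sqrt(2.0 * math.pi)
        total += weight * density * math.exp(-eta * (eps * y - 0.5 * y * y))
    return total


def closed_form(eta, y):
    """Lognormal moment: exp(eta * (eta + 1) * y**2 / 2)."""
    return math.exp(0.5 * eta * (eta + 1.0) * y * y)


if __name__ == "__main__":
    for eta, y in ((2.0, 0.2), (5.0, 0.1), (10.0, 0.05)):
        print(eta, y, idiosyncratic_term(eta, y), closed_form(eta, y))
```

The correction term is increasing in both the cross-sectional dispersion and risk aversion, and vanishes when the dispersion is zero, in which case the standard representative agent discount factor is recovered.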
Which properties of the stochastic discount factor are we looking for? Naturally, we wish to make sure it is as countercyclical as possible, which might be the case should the dispersion of the cross-sectional distribution of the log-consumption growth be countercyclical. However, Lettau (2002) shows that empirically, such a dispersion seems to be not enough, even when multiplied by $\frac{1}{2}\eta(\eta + 1)$, with $\eta$ denoting the CRRA coefficient, unless of course, we are willing to assume, again, a high level of risk-aversion. Note that Lettau analyzes a situation that favourably biases his final outcome towards not rejecting the null that idiosyncratic risk matters, as he assumes agents cannot insure themselves at all: once they are hit by an idiosyncratic shock, they just have to consume their income. Constantinides, Donaldson and Mehra (2002) consider an overlapping generations (OLG) model to mitigate the issue of persistence in the idiosyncratic risk process.
Discuss the recent literature.
[In progress]
=( + ) 0+
1+
2 = +
where 0 denotes the endowment of the risky asset, 1 and 2 are first and second period
consumption, and and are the dividends in the first and second period. We assume that
0 = 1. Because agents are ex ante identical, although ex post heterogeneous, they have no
incentives to trade. An autarkic equilibrium is therefore one where = 1, 1 = + and
2 = + . Markets are incomplete because the endowments cannot be spanned through the
risky asset. The asset price satisfies
=
( 0 ( + ) )
0( + )
Compare this price with that arising in a complete markets economy, where by Pareto optimality, the agents have the same marginal rate of substitution, state by state. In this complete
markets economy with agents displaying identical preferences, risk-pooling is the equilibrium
outcome, with each agent consuming the same, 2 = + ( ), such that the price in the
complete market setting is,
( 0 ( + ( )) )
=
c
0( + )
If 0 is convex, and as soon as dividend and endowment risks are independent, ( 0 ( +
( )) )
( 0 ( + ) ), thereby leading to (i) the risk-free rate in the incomplete market
case lower than in the complete case, and (ii) c
.
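The inequality behind this comparison is an application of Jensen's inequality to a convex marginal utility. The following Python sketch illustrates it for a CRRA marginal utility, $u'(c) = c^{-\gamma}$, and independent two-point distributions for the dividend and the endowment. All names and numbers are hypothetical: the sketch compares the expected marginal-utility-weighted dividend when each agent bears his own endowment risk (incomplete markets) with the same quantity when endowment risk is pooled (complete markets).

```python
from itertools import product


def marginal_utility(c, gamma=2.0):
    """CRRA marginal utility, convex in c for gamma > 0."""
    return c ** (-gamma)


# Hypothetical (value, probability) pairs; dividend and endowment are independent.
DIVIDENDS = [(0.5, 0.5), (1.5, 0.5)]
ENDOWMENTS = [(0.5, 0.5), (1.5, 0.5)]


def incomplete_markets_term(dividends=DIVIDENDS, endowments=ENDOWMENTS):
    """E[u'(d + y) * d]: the agent consumes his own risky endowment."""
    return sum(pd * py * marginal_utility(d + y) * d
               for (d, pd), (y, py) in product(dividends, endowments))


def complete_markets_term(dividends=DIVIDENDS, endowments=ENDOWMENTS):
    """E[u'(d + E(y)) * d]: endowment risk is pooled across agents."""
    mean_y = sum(py * y for y, py in endowments)
    return sum(pd * marginal_utility(d + mean_y) * d for d, pd in dividends)


if __name__ == "__main__":
    print("incomplete markets:", incomplete_markets_term())
    print("complete markets:  ", complete_markets_term())
```

With these numbers the incomplete markets term exceeds the complete markets one. The same comparison applied to a unit payoff would deliver the lower risk-free rate under incomplete markets.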
The interpretation of the condition that $u'$ be convex relates to prudence, similarly as for the condition of the previous section that ensures the equity premium in Mankiw's model is increasing in the concentration of aggregate shocks. In fact, Weil argues that the risk-free rate and equity premium puzzles might be rationalized by agents facing incomplete markets while engaging in precautionary savings, and restricted by preferences exhibiting decreasing absolute risk aversion (DARA) and decreasing absolute prudence (DAP). A utility function $u$ exhibits decreasing absolute prudence if the coefficient of absolute prudence, $-u'''(x)/u''(x)$, is decreasing. The mechanism is quite illuminating. Incomplete markets lead agents to act more prudently than they would if they had to face complete markets and as a result, to increase their demand of both the riskless and the risky assets. Clearly, then, the riskless interest rate decreases. However, to ensure an increase in the equity premium, the increase in the price of the risky asset needs to be less than that of the riskless asset, which is ensured by DARA and DAP: our agents need to be less prone to bear dividend risk once they have nontraded labor income.
While the model seems somehow restrictive due to its reliance on DARA and DAP, many standard utility functions satisfy DARA and DAP. Duffie (1992) develops a model with a finite number of agents without exploring its implications. Note that these models are static. It is not clear whether self-insurance would play a role in these models where idiosyncratic and aggregate
risk are unrelated. The model in the next section has a completely different mechanism, relying on the assumption that some agents would never be able to implement any self-insurance plans, living in a world with an extreme form of market incompleteness, with all the macroeconomic risk being borne by a small portion of the remaining agents.
8.5.2 A two-agents economy
The economy of this section relies on the presence of heterogeneous agents. One agent has access to the market for the risky asset, while a second agent has not. The equity premium is the expected excess return the first agent requires to enter the risky asset market, and can be quite large, even in the presence of small risk-aversion, because this agent has to take on the entire aggregate macroeconomic risk.
Basak and Cuoco (1998) consider the following model. An agent does not invest in the stock market, and has logarithmic instantaneous utility, $\ln c$. From his perspective, markets are incomplete. A second agent, instead, can invest in the stock market, and has instantaneous utility equal to $(c^{1-\eta} - 1)/(1 - \eta)$. Both agents are infinitely lived. The competitive equilibrium of this economy cannot be Pareto efficient, and so aggregation results such as those underlying the economy in Section 8.2 cannot obtain. However, Basak and Cuoco show that aggregation still obtains in this economy, once we define social weights in a judicious way.
Let be the general equilibrium allocation of agent , =
. In equilibrium, + = ,
where is the instantaneous aggregate consumption, taken to be a geometric Brownian motion
with parameters 0 and 0 ,
=
Dene
(8.29)
(8.30)
where
( ) =
(8.31)
and, nally,
+ (1
(8.32)
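$$R(s) = \rho + \frac{\eta g_0}{s + \eta(1 - s)} - \frac{1}{2}\,\frac{\eta(1 + \eta)\sigma_0^2}{s\,(s + \eta(1 - s))} \qquad (8.33)$$
and
$$\lambda(s) = \frac{\eta\sigma_0}{s} \qquad (8.34)$$
where $s$ is the consumption share of the market participant, $\rho$ is the subjective discount rate, and $g_0$ and $\sigma_0$ are the drift and volatility of aggregate consumption.
The expressions for $R$ and $\lambda$ in Eqs. (8.33)-(8.34) are derived below. Appendix 2 provides a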
further derivation relying on the existence of a representative agent, as originally put forward
by Basak and Cuoco (1998), and explained below.
In this economy, the marginal investor bears the entire macroeconomic risk. The risk premium he requires to invest in the aggregate stock market is large when his consumption share, $s$, is small. With just a risk aversion of $\eta = 2$, and a consumption volatility of 1%, this model can explain the equity premium, as the plot of Eq. (8.34) in Figure 8.1 illustrates. For example, Mankiw and Zeldes (1991) estimate that the share of aggregate consumption held by stock-holders is approximately 30%, which in terms of this model, would translate to an equity premium of more than 6.5%.
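The orders of magnitude in this example can be reproduced with a short Python sketch. It assumes that Eq. (8.34) takes the form $\lambda(s) = \eta\sigma_0/s$, which is consistent with the numbers quoted above and with the range of Figure 8.1, and that Eq. (8.33) takes the form $R(s) = \rho + \eta g_0/(s + \eta(1-s)) - \frac{1}{2}\eta(1+\eta)\sigma_0^2/(s(s + \eta(1-s)))$, where $s$ is the participant's consumption share. Function and argument names, and the values chosen for the discount rate and consumption growth, are hypothetical.

```python
def unit_risk_premium(share, eta, sigma0):
    """Assumed form of Eq. (8.34): lambda(s) = eta * sigma0 / s."""
    if not 0.0 < share <= 1.0:
        raise ValueError("the consumption share must lie in (0, 1]")
    return eta * sigma0 / share


def short_rate(share, eta, rho, g0, sigma0):
    """Assumed form of Eq. (8.33); with share = 1 it reduces to the
    representative agent rate rho + eta*g0 - eta*(1+eta)*sigma0**2/2."""
    if not 0.0 < share <= 1.0:
        raise ValueError("the consumption share must lie in (0, 1]")
    denom = share + eta * (1.0 - share)
    return rho + eta * g0 / denom - 0.5 * eta * (1.0 + eta) * sigma0 ** 2 / (share * denom)


if __name__ == "__main__":
    eta, sigma0 = 2.0, 0.01          # values used in Figure 8.1
    rho, g0 = 0.01, 0.02             # hypothetical discount rate and growth
    for share in (0.2, 0.3, 0.4, 0.5, 0.6):
        lam = unit_risk_premium(share, eta, sigma0)
        rate = short_rate(share, eta, rho, g0, sigma0)
        print(f"s = {share:.1f}   lambda = {lam:.4f}   R = {rate:.4f}")
```

At a consumption share of 30%, the unit risk premium is about 0.067, in line with the figure of more than 6.5% mentioned in the text.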
Guvenen (2009) makes an interesting extension of the Basak and Cuoco model. He considers two agents, in which only the rich invests in the stock market, and is such that EIS_rich > EIS_poor. He shows that for the rich, a low EIS is needed to match the equity premium. However, US data show that the rich have a high EIS, which cannot deliver the equity premium. (Guvenen considers an extension of the model where we can disentangle EIS and CRRA for the rich.)
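[Figure 8.1: plot of lambda (vertical axis, from 0.04 to 0.10) against the consumption share of the stock market participant (horizontal axis, from 0.2 to 0.6).]
FIGURE 8.1. The equity premium in the Basak and Cuoco (1998) model, for a risk aversion equal to 2 and a consumption volatility equal to 1%.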
To derive Eqs. (8.33)-(8.34), note that the consumption of the agent not participating in the
stock market satisfies, by Eq. (8.31):
=(
(8.35)
=
=
, satises
) (1
))
1
0
(8.36)
where the second equality follows by the definition of in Eq. (8.30), and by Eqs. (8.29) and
(8.35). Moreover, by the first order conditions of the market participant in Eq. (8.31), and the
CRRA assumption for ,
1 2
+
ln =
(8.37)
ln =
2
2
1
Using the relation, ln =
, then identifying terms in Eq. (8.36) and Eq. (8.37),
2
delivers the two expressions for the short-term rate and the unit risk-premium in Eqs. (8.33)-(8.34).
How do these results technically relate to aggregation? Basak and Cuoco define the instantaneous utility of a representative agent, a social planner, as:
(
( )+
max [
+
where
( )
=
0 ( )
( )]
(8.38)
( )
is a stochastic social weight and, once again, and are the private allocations, satisfying the
first order conditions in Eqs. (8.31). By the definition of , and Eqs. (8.31),
is solution to,
=
(8.39)
Then, the equilibrium in this economy is supported by a fictitious representative agent with
utility (
). Intuitively, the social planner allocations satisfy, by construction,
0
(
0 (
)
=
)
0
0
( )
=
( )
where the starred variables denote the social planner's allocations. In other words, Basak and Cuoco's approach is to find a stochastic social weight process such that the first order conditions of the representative agent lead to the market allocations. The utility in Eq. (8.38) can then be used to compute the short-term rate and risk premium, and leads precisely to Eqs. (8.33)-(8.34), as shown in Appendix 2.
the absence of frictions. Section 8.6.3 also deals with issues of aggregation in markets with
heterogeneous beliefs, providing a general framework to address a number of issues, from the
analysis of survival of irrational traders, to themes such as the equity premium puzzle or excess
volatility in irrational markets.
Note that the vast majority of the explanations in this section have a behavioral slant. We
are about to assume that agents have psychological biases, in that they make systematic mistakes whilst assessing the probability distribution of the fundamentals, by emphasizing aspects
of the markets such as correlations or informativeness of signals, which are less pronounced than
in the real markets they operate in. In Section 8.7, we move to an alternative approach, relying
on Knightian uncertainty: while agents still have a poor knowledge of the complex environment
in which they take decisions, they behave rationally, by explicitly acknowledging their ignorance
and acting while fearing the errors possibly arising from it.
8.6.1 Learning with multiple signals
We state filtering results on Bayesian learning that generalize those in Chapter 7 while taking
into account the possibility that agents might update beliefs relying on multiple signals.
Consider the following result, a special case of Theorem 12.7 in Liptser and Shiryaev (2001;
page 36). Suppose that some unobservable process is solution to
=(
(8.40)
and
( |F )
where F is the information set available at time . Then, is a diffusion process with time
varying but deterministic volatility.
Assuming enough time has elapsed to have made the variance of stationary, we have
=(
where
:0=2
>
>
>
>
>
) 1( S
>
(8.42)
(8.43)
Note that the Brownian motions driving the unobservable and the signals S are (potentially) the same, such that the interpretation of > into the variance term of Eq. (8.42) is that
of the instantaneous covariance between and S ,
(
S)=
>
(8.44)
The next sections rely on these filtering results and evaluate how agents update their beliefs
in light of new information in a number of models.
Scheinkman and Xiong (2002) consider a market in which the cumulative dividend process
of an asset satisfies
=
+
(8.45)
is a Brownian motion, is a constant, and
is the expected instantaneous dividend.
where
Note that in all the models dealt with in these lectures so far, cumulative dividend is locally
deterministic, i.e.
= 0 in Eq. (8.45). So this model differs, and relies on the additional
assumption that
is not observed (only
is), although it is known to be a mean-reverting
process,
=
(8.46)
+
is a standard Brownian motion, and and
are two positive constants.
where
There are two sets of risk-neutral agents, and , who observe signals on , which satisfy:
=
(8.47)
However, agent thinks of
as being his own signal and believes its volatility is , for some
1, not , and agent thinks the same regarding his own signal . This overconfidence
is the source of a behavioral bias similar to that considered in previous work by Kyle and Wang
(1997) and Odean (1998): agents make systematic mistakes about the asset fundamentals while
processing information, by assigning higher weight to a signal they arbitrarily perceive as their
own.
These assumptions are those appearing in the working paper version of Scheinkman and Xiong
(2002). In the published version, the assumption is that agents perceive signals to be correlated
with the fundamentals even if they are not. These two alternative assumptions give rise to the
same conclusions. We maintain the assumptions in this section because we deal with the latter
assumption in Section 8.6.3, while discussing another model, thereby providing the reader with
an additional exercise about learning in an overconfidence context.5 We now apply the filtering
results in the previous section and analyze how overconfidence leads to disagreement formally.
We then turn to the determination of an equilibrium, and explain the mechanism leading to
bubbles.
8.6.2.1 The inference process
2 2
is the positive root of 0 =
+2
where
early models of learning reviewed in Chapter 7.
(
2
(8.48)
(signal
Next, assume that agents also observe the signals in Eq. (8.47) but have no overconfidence.
Then, below, we shall argue that all agents update priors through new information as follows,
1
1
1
= (
) +
(8.49)
+
+
is a Brownian motion under the information set F available to the agents, and is
where
defined as:
1
)
(
The two Brownian motions
are defined similarly, and will be dealt with later (see Eqs.
(8.53) below). Note that Eq. (8.49) collapses to Eq. (8.48), once we assume that agents have no
access to the two signals in Eq. (8.47) or, equivalently, that these signals are uninformative,
i.e.
.
Finally, consider the case in which agents have overconfidence. Agent , to start with, thinks
of the signal
as being generated by Eq. (8.47), but regarding , he assumes that,
=
(8.50)
In terms of the filtering problem in Section 8.6.1, the information set of this agent includes
realizations of S = [
] up to time = , where
is solution to Eq. (8.50) and
is
solution to Eq. (8.47), such that and in Eq. (8.40) and (8.41) are,
=[
0 0 0]
0
0
0
0
0
0
0
0
| F , satises
1
1
= (
) +
(8.51)
+
+
where the Brownian motions will be dened in a moment, and
is the positive root of,
2
1
1
2
2
.
0 = ( 2 + 2 + 2) + 2
1
1
= (
) +
(8.52)
+
+
Finally, the Brownian motions in Eqs. (8.51)-(8.52) satisfy,
=
where,
=
1
1
if =
6
if =
(8.53)
and
(8.54)
That is, a negative value of is interpreted as a state in which agent is more optimistic
than .
We wish to express the dynamics of
under the probability space of agent , and the
dynamics of
under the probability space of agent . Let us consider the terms of disagreement
between the two types of agents. As regards the dynamics of cumulative dividends, we have,
by Eq. (8.53), that:
1
=
and concerning the signals,
=
(8.55)
The distinctive mark of the model is that trading occurs between agents due to differences in beliefs, and leads to bubbles arising as a result of short-selling constraints, consistent with previous insights of Harrison and Kreps (1979). Intuitively, at any point in time, any asset is held by the relatively more optimistic agent, whose asset evaluation is higher than his own assessment of the fundamentals, expecting as he is to sell the asset to some future relatively more optimistic agent. Prices then deviate from any agent's fundamental evaluation because short-selling constraints bias the price towards the more optimistic agents.
What is the optimal time at which to sell? It is a real option problem, of the kind introduced
in Chapter 4. Let
denote the asset price at that agents in group are willing to pay,
such that,
Z
= sup
(8.56)
where
denotes the time conditional expectation of agents ,
is the cumulative dividend
process in Eq. (8.45), is the constant interest rate, and is a transaction cost, to be discussed
below.
6 The
1+
+ ( 1+2
constants are
2
1
2
and
).
1)
1+
In words, the price agents are willing to pay at reflects the dividends paid out over the
holding period and the price agents are willing to pay at time + , net of the transaction
cost. The conjecture is that the solution
contains a bubble component,
=
+B( )
(8.57)
The first two terms on the R.H.S. of Eq. (8.57) amount to the payoff of the asset over its life expected by agents in group , i.e. the fundamentals. According to the conjecture, agents are willing to pay more than the fundamentals, with B ( ) representing a bubble, a function of , the difference in beliefs in Eq. (8.54). The bubble arises exactly because agents in group bid up the price to their own evaluations and agents in group cannot sell short.
By replacing Eq. (8.57) into Eq. (8.56), and using the definition of in Eq. (8.54) leaves:
+
B ( ) = sup
+B( + )
(8.58)
+
0
The bubble is a re-sale option: it arises because the current asset holders have the option to
re-sell it in the future at a price higher than their own evaluation. Eq. (8.58) shows that its value
is determined through an optimal stopping time to sell, similarly as with a perpetual American
option (see Chapter 4), with the complication that the strike is endogenous, reflecting the
value of the bubble as perceived by the future buyer at the optimal stopping time.
From results in Chapter 4 (see Section 4.6), one can conjecture that there exists a region
of inaction, where the asset owner holds the asset, and a threshold , such that the optimal
stopping time in Eq. (8.58) is
= inf ( : +
). In particular, in the continuation region,
where the agent does not sell, the (discounted) bubble is a martingale, such that by Itô's lemma
and Eq. (8.55),
L [B]
B=0
(8.59)
+B( )
(8.60)
+
Finally, a solution to this optimal stopping time satisfies the usual smooth-pasting conditions:
(i) the function solution to Eq. (8.59) for is the same as the function satisfying Eq. (8.60)
at = ; and (ii) the derivatives of these two functions are the same at = . Scheinkman
and Xiong (2002) show that there is a solution to this problem, and that
0, and that
= 0 only when the transaction cost = 0. To illustrate, agent needs to become sufficiently
pessimistic (
0) to be able to justify the transaction cost while selling the asset.
Intuitively, Brownian motions hit zero infinitely many times, such that agents will trade very
often, driven by their ever switching differences in opinions. The presence of a transaction cost
mitigates the occurrence of such a trading frenzy.
B( ) =
about at least since Friedman (1953): how long would irrational investors survive in financial markets?
The technical issue arising while addressing this question links to how the agents' priors are aggregated in equilibrium: irrational agents are those who systematically believe in a model deviating from the truth, and never wish to learn, thereby holding different beliefs than those of the rational agents. Section 8.6.3.1 studies such a model and Section 8.6.3.2 generalizes it to a context of learning, in which there are no agents who are more or less rational than others, along the lines of Scheinkman and Xiong (2002), but within a framework of frictionless markets without short-sales constraints. The sentiment model of Dumas, Kurshev and Uppal (2009) is a special case of this framework, and vividly illustrates the main predictions arising therefrom. It will be discussed in Section 8.6.3.3.
8.6.3.1 Difference in beliefs and extinction of irrational traders
Kogan, Ross, Wang and Westerfield (2006) (KRWW, in the sequel) analyze the market impact of a drastic source of differences in beliefs, arising due to the irrationality of some agents. We briefly review this model both because of its outstanding economic importance and the aggregation techniques needed to solve it. The ensuing aggregation results are generalized in the following subsection devoted to the analysis of disagreement and learning in general equilibrium with multiple agents and frictionless markets.
Consider an economy in which there is a riskless asset in zero net supply, and one risky asset that entitles to an instantaneous dividend flow. Dividends follow a geometric Brownian motion
under the physical probability , with parameters and ,
=
(8.61)
with obvious notation. Rational agents correctly believe dividends are as in Eq. (8.61). However,
irrational agents think that the dividend process has a higher drift than in Eq. (8.61),
=
(8.62)
where F is the information set available to all agents, and the density process
=
is solution to
(8.63)
In this model, irrational agents do not attempt to learn from the history of dividends and
their cognitive bias is not eliminated as a result.
KRWW assume that all agents consume at some time , that they have CRRA equal to
, and that they have the same endowments.7 Therefore, the rational and irrational investors
maximize,
1
1
1
2
1
2
2
and
=
(8.64)
1
1
1
respectively, where 2 denotes the expectation under 2 taken by the irrational agent, and
the equality follows by a change of probability. Note that because
is strictly positive, the
irrational agent becomes more and more aggressive as he disagrees more with the rational,
where disagreement is captured by .
Because markets are complete, consumption allocations can be determined by solving a central planner program as explained in Chapter 2, whose value is:
=
( )
max
=1 2 : 1
1
1
1
2
[8.P1]
denotes the reciprocal of the marginal utilities of income for the two agents. The
where
solution is
1
( )1
2
(8.65)
1 =
2 =
1
1
1
1+( )
1+( )
As anticipated, the irrational agent receives an allocation that increases with his disagreement: it is as if his utility function in (8.64) had a random weight, , arising through how
much he will disagree regarding the dividend path leading to
at the consumption date .
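The sharing rule can be verified with a short Python sketch. It assumes that the planner maximizes $c_1^{1-\eta}/(1-\eta) + k\,c_2^{1-\eta}/(1-\eta)$ subject to $c_1 + c_2 = D$, where $k$ stands for the (random) weight attached to the irrational agent, so that the first-order conditions give $c_2/c_1 = k^{1/\eta}$. The names `dividend`, `weight` and `eta` are hypothetical.

```python
def planner_allocations(dividend, weight, eta):
    """Allocations maximizing c1**(1-eta)/(1-eta) + weight * c2**(1-eta)/(1-eta)
    subject to c1 + c2 = dividend, for eta > 0 and weight > 0."""
    if dividend <= 0.0 or weight <= 0.0 or eta <= 0.0:
        raise ValueError("dividend, weight and eta must be positive")
    k = weight ** (1.0 / eta)
    c1 = dividend / (1.0 + k)          # first (unweighted) agent
    c2 = dividend * k / (1.0 + k)      # second agent, whose utility carries the weight
    return c1, c2


if __name__ == "__main__":
    for weight in (0.5, 1.0, 2.0, 4.0):
        c1, c2 = planner_allocations(dividend=1.0, weight=weight, eta=2.0)
        print(f"weight = {weight:3.1f}   c1 = {c1:.4f}   c2 = {c2:.4f}")
```

A larger weight tilts the allocation towards the second agent, which is the sense in which the allocation of the irrational agent increases with his disagreement.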
We need to determine the ratio of marginal utilities of income, . Replace the consumption
allocations in (8.65) into the central planner program [8.P1], leaving:
(
1
1
+(
1
2)
such that, denoting partial derivatives with subscripts, the pricing kernel is given by
=
(
1
)
(
)]
(1 + ( )1 )
h
(1 + ( )1 )
(8.66)
Note that because consumption only takes place at the terminal date, , the denominator of
contains the expectation of the representative agent's marginal utility, not its value at zero as
is instead the case in the standard setting (Chapter 7, Section 7.5.2). As shown in Chapter 4 (Appendix
2),
is the pricing kernel that determines the equilibrium prices using the money market
account as the numeraire. KRWW show that (see the Appendix for a few steps underlying this
derivation),
2
= (1 )
(8.67)
Eqs. (8.66) and (8.67) can now be used to determine the asset price and the relative consumption
share. Regarding the asset price process (expressed in terms of the money market account
7 Work on extinction by Cvitanić and Malamud (2011) relaxes the assumption that the masses of rational and irrational agents are the same.
1+
1+
2(
(8.68)
where
denotes the price in an economy that is only populated by rational agents, i.e. =
0. That is, the presence of optimistic agents (
0) inflates asset prices beyond rational
evaluations.
The density process can, thus, be interpreted as sentiment, leading as it does to the following euphoria. Suppose a positive shock hits dividends; then the asset price goes up because
both
and go up in Eq. (8.68). That is, prices are driven both by rational evaluations and
sentiment. The term sentiment has been used in a context with irrational agents and learning
by Dumas, Kurshev and Uppal (2009) (see Section 8.6.3.3).
This property can be generalized to the case of any CRRA coefficient. Indeed, given the
expression for in Eq. (8.66), we can rely on results in Chapter 7 (Section 7.5.2), and determine
the unit risk-premium
in this economy. It is:
=
(
1+(
)1
2
1
(8.69)
and is less than in the rational economy, due to the permanent difference in beliefs,
2
( )
( )=
0. All in all, the aggregate risk-premium in this economy is lower,
reflecting the optimistic view of the irrational agents: asset prices (expressed in terms of the
money market numeraire) are higher than in the purely rational economy. Note, also, that the aggregate
risk-appetite is time-varying and driven by market sentiment, . In good times (i.e., after
a positive dividend shock), sentiment increases, such that
lowers in Eq. (8.69), leading to
speculative enthusiasm and higher asset evaluation.
Would irrational traders ever disappear in the long-term? Dene the relative consumption
share of the rational agent against the irrational, which by Eq. (8.65) and then (8.63) and (8.67)
is,
1
2
(1 + 12 ) 2 + 1
= ( )1 =
1
2
KRWW define relative extinction of irrational traders as occurring whenever, lim
= 0
1
almost surely. By the Strong Law of Large Numbers for Brownian motions (Karatzas and
Shreve, 1991), we have that given two constants 0 and 1 ,
0
0
0
0 + 1
=
lim
0
0
We consider an economy in which agents hold different beliefs regarding the fundamentals of
the economy, generalizing the two-person setting originally formulated by Detemple and Murthy
(1994), Zapatero (1998), and others such as Basak (2000, 2005), Berrada (2006), or Buraschi
and Jiltsov (2006). For example, some agent, the benchmark, may think that output has
expected growth and constant volatility 0 ,
=
(8.70)
We define the agents with these beliefs as the benchmark, in that any other agent's beliefs are gauged against theirs, as explained below. Accordingly, consider the Radon-Nikodym derivative of the probability of agent $i$ against that of the benchmark, $i = 1$, and its density process which, by Girsanov's theorem, satisfies
=
(8.71)
(8.72)
=
+ 10 is a Brownian motion under . This example generalizes the model
where
in the previous section, in which and are constant, as summarized by Eq. (8.63).
We refrain from specifying additional examples and explaining the mechanisms leading to disagreement until the next subsection, and focus on the determination of the equilibrium in this economy. Without loss of generality, define the first agent as the benchmark. We assume that markets are complete and that there is symmetric albeit incomplete information, and denote with F the information set available to all agents. The Radon-Nikodym derivatives,
{2 }
(8.73)
1
F
formalize the notion of disagreement between agent $i$ and the benchmark at time $t$, a notion made more precise below.
Each agent maximizes his intertemporal utility, subject to the budget constraint,
Z
Z
max
( )
s.t.
= 0
[8.P2]
(
where
denotes expectation under , 0 is the initial wealth available to agent , and
is
the private state-price process, or pricing kernel, of agent , a concept we elaborate on in a
moment. Remaining notation is straightforward.
We know from Part I of the lectures that a state-price process links to the valuation of a consumption unit at some future pre-specified state of nature. Heuristically, it is,
0
(8.74)
where denotes the risk-neutral probability as usual, which links to the value of Arrow-Debreu
securities as we know. Note that
in Eq. (8.74) can vary across agents due to differences in opinions, although the agents still need to agree on the price of the assets they observe: they simply have different perspectives regarding the future developments of the asset prices.
To formalize the idea that agents disagree against a common benchmark, note that,
=
and cast the program [8.P2] under the probability space of the benchmark agent, such that
each agent (note, ) maximizes his intertemporal utility subject to his budget constraint,
Z
Z
1
1
( )
s.t.
= 0
[8.P3]
max
1
(
The program [8.P3] is a standard complete markets problem in which each agent has an
instantaneous utility function equal to
( ) and a commonly agreed state-price process 1 ,
that of the benchmark agent. All agents do agree on the current price of all assets, although they now act more aggressively on their consumption as their divergence of opinion against the first agent widens, as summarized by a utility distortion factor, the density process of Eq. (8.71). The program [8.P3] is the intermediate consumption, infinite horizon extension to the program of maximizing terminal utility in (8.64).
Because markets are complete, we can study the asset pricing implications of this economy
by centralizing it as usual. Consider a representative agent with instantaneous utility equal to,
= max
( )
=1
( )
s.t.
=1
[8.P4]
=1
where
denotes output at , and 1 is the marginal utility of income of agent . We simplify
the presentation, and to focus on the salient aspects of disagreement, we set
for all .
The first order conditions of the program [8.P4] are,
0
( )
1
1
=
Solving the previous equation for the consumption allocation to agent $i$, and plugging it into the constraint of [8.P4], we obtain
=
=P
)1
=1
)1
(8.75)
Intuitively, and as anticipated, the more aggressive agents are compared to the benchmark (i.e., the higher their density process), the higher their consumption allocation. Eq. (8.75) is, naturally, reminiscent of the consumption allocation in Eq. (8.65) applying to the KRWW model. However, disagreement is now allowed to take a more general form than in (8.63): it is time-varying and reflects the agents' learning as in (8.71).
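The sharing rule can be illustrated numerically. The following is a minimal sketch rather than the derivation in the text: it assumes a common CRRA coefficient (`gamma`) and identical marginal utilities of income across agents, so that a planner maximizing the sum of `eta[i] * c_i**(1 - gamma) / (1 - gamma)` subject to the resource constraint allocates consumption in proportion to `eta[i]**(1 / gamma)`. The function name and the sample numbers are hypothetical.

```python
def consumption_shares(eta, gamma):
    """Return each agent's share of aggregate output.

    Assumption (illustrative): all agents have CRRA utility with the same
    coefficient gamma and the same marginal utility of income, so the planner's
    first-order conditions give c_i proportional to eta_i ** (1 / gamma).
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if any(e <= 0 for e in eta):
        raise ValueError("density-process values must be positive")
    weights = [e ** (1.0 / gamma) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values: the benchmark agent has eta = 1 by construction.
shares = consumption_shares([1.0, 1.5, 0.7], gamma=2.0)
```

Under these assumptions, an agent whose density process is above that of the benchmark receives a larger share of output, and the shares always sum to one.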
The value function for the central planner is obtained by replacing Eq. (8.75) into the maximand of the program [8.P4],
(
( ) =1 )
=1
(8.76)
By the usual arguments (see Section 7.5.2 in Chapter 7), the pricing kernel in this economy is
given by 1
, where
(
( ) =1 )
(8.77)
=
( 0 ( 0 ) =1 )
0
where the subscript denotes the partial derivative with respect to the first argument.
Note that to price any asset, we still need to determine the marginal utility of income for each agent $i$. One approach could be to search for a cross-sectional distribution of Pareto weights that best fits selected moments of the price distribution, as Chan and Kogan (2002) did in their model surveyed in Section 8.3. Note that the cross-section of Pareto weights depends not only on the wealth distribution, but also on beliefs. Indeed, by results in Chapter 4 (see Section 4.5.2), we have that the reciprocal of the marginal utility of income is,
Z
1
1 1
1
1
= 0
1
0
The asset market equilibrium can now be studied by determining the risk-premiums,
and interest rates as implied by the pricing kernel for = 1 in Eq. (8.74),
0
0
1
1
say,
where,
(8.78)
Finally, Eq. (8.77) shows that agents' disagreement, as summarized by the cross-section of density processes, is a potential source of increased volatility of the pricing kernel and, hence, a potential resolution of the equity premium and other puzzles. The model of the next section is a special case of this framework that seems to illustrate this role.
8.6.3.3 Two-person equilibrium
Dumas, Kurshev and Uppal (2009) (DKU, in the sequel) consider a model in which agents disagree on the fundamentals due to overconfidence, similarly as in the model of Scheinkman and Xiong (2002) discussed in Section 8.6.2. This model is a special case of that in the previous section: there are two agents with the same CRRA coefficient, with one of them holding different beliefs due to overconfidence, as explained below.
The key assumptions of the model are that the expected growth of output is unobserved, and that the only available information is the observation of output and one additional signal that is totally uninformative about the state of the economy. However, an overconfident investor believes this signal is correlated with the economic fundamentals, whence a difference in beliefs between him and a second, rational investor, the benchmark.
= (
where 0 , , and
are constant, and 0 and
also observe an uninformative signal, solution to
(8.79)
+
are standard Brownian motions. Investors
=
where the signal is driven by another standard Brownian motion. However, the overconfident investor believes that this signal is correlated with the fundamentals, in that
p
2
= :
=
1
+
These assumptions substantially match those in the published version of Scheinkman and Xiong
(2002), as mentioned in Section 8.6.2.
Let us solve for the inference problem of the irrational investor, the problem of the rational
being a special case, notably for = 0. In terms of the learning problem of Section 8.6.1, we
have that the vector Brownian motion is
=[
], such that and in Eq. (8.40) and
8
(8.41) are
0
0
0
p
(8.80)
= [0 0
]
=
2
0
1
( | F ) is
= (
(8.81)
where F denotes the information set available to the irrational agent. Instead, the rationally
expected output growth is
( | F ) and satises
= (
(8.82)
0
where F is the information set available to the rational agent. Remaining notation is as follows:
and
are solutions to Eq. (8.43), with 0 = , 1 =
, = [1 0], and the matrix
is as in Eqs. (8.80), where
is determined by setting = 0. Below, we shall dene the two
Brownian motions in Eqs. (8.81) and (8.82).
We take the rational agent to be the benchmark. In terms of the model in the previous subsection, he thinks output is as in Eq. (8.70), where the expected growth is his expected dividend growth, solution to Eq. (8.82); by construction, the two Brownian motions in Eqs. (8.70) and (8.82) coincide. The second, overconfident investor disagrees, and thinks that the expected dividend growth is the solution to Eq. (8.81).
As in the previous subsections, the density process summarizing the difference in opinions between the irrational agent and the benchmark is the process solution to Eq. (8.71) with
8 Note that in Section 8.6.1, the dynamics of the signals, S, is expressed in basis point terms, as opposed to the dynamics of output in Eq. (8.79). However, the inference yields the same result after a suitable definition of the Brownian motions.
2
where
+
(
) 0.
0 and
DKU term the density process "sentiment" to emphasize that it arises through overconfidence in the agent's processing of publicly available signals. Because markets are complete, the asset pricing implications of this economy are obtained by relying on the instantaneous utility of a representative agent, which by Eq. (8.76) is,
1
1
1
=
+ ( 2)
1
1
1
1
+
(
)
2
1
=
(8.83)
1
1
0
0
+
(
)
0 2
1
)=
(
1
1
+(
)1
2
(8.84)
)1
X 2
1
1
+ ( 2)
= 1
1
=0
Relying on this formula and the expression for the pricing kernel in Eq. (8.83), one finds that the price-dividend ratio satisfies
2
Z
Z
X
1 0
1
0
0
0
0
0
0
2
=0
1+ 1 0
(
0 )
where the dependence of on 0 and 0 arises through the expectation inside the integral, which
DKU calculate in closed form.
Asset prices are, then, driven by three state variables: (i) sentiment, (ii) difference in opinions, and (iii) the correctly specified expected output growth. The model generates plausible values of both volatility and the equity premium. The main determinant of the equity premium is the rational agent's risk-aversion to the sources of risk introduced by the irrational agent, summarized by sentiment and difference in opinions, as illustrated by the risk-premium in Eq. (8.84). Expected returns and volatility are, then, high compared to an economy with rational agents only, with rational investors increasing the proportion of their wealth in equity only when their evaluation of the expected fundamentals increases by a considerable amount. DKU also find that the time needed for the extinction of the irrational investors is quite long.
This section still studies asset prices in economies where agents have limited knowledge of the statistical laws for the fundamentals. The new element is that agents give up thinking of a single model to decipher the signals they receive. Rather, they formulate multiple priors underlying the laws of the fundamentals, and act while being averse to the uncertainty inherent in their own priors.
Note the difference between this approach and that in the previous section. In the previous section, limited knowledge of the fundamentals leads the agents to disagree on the "right" model. Naturally, disagreement is not a logical necessity given limited knowledge; however, it is a natural assumption in this context, as exemplified by the overconfidence bias models dealt with in Section 8.6.2 (see also footnote 9). Still, the previous models with disagreement rely on the assumption that agents have a unique prior with which they interpret the complex world where they live. Instead, in this section, limited knowledge leads agents to be skeptical about their own ability to process complicated pieces of information. The agents acknowledge that many explanations are possible regarding how the economy works. We survey models that allow for this line of reasoning but, to simplify, consider only one representative agent.
The context in which agents operate in this section has come to be dubbed Knightian uncertainty (Keynes, 1921; Knight, 1921), or ambiguity, that is, uncertainty that cannot be
9 Differences in beliefs do not necessarily arise through overconfidence. Cujean (2013) develops a two-agent model in which expected dividend growth is unknown, with one agent thinking it is a continuous process, and another thinking it is a discrete Markov chain instead.
quantified probabilistically. In fact, one may argue that, since the dawn of financial economics, the building blocks of our models have relied on the assumption that everything could be quantified probabilistically; the leading examples in these early developments are easy to detect in the precursory mean-variance model of Markowitz (1952) and in the subsequent work leading to the CAPM (see Chapter 1).
But introducing Knightian uncertainty in economics is not a trivial task. We would need to know how to model rational decision makers who face uncertainty. Standard decision theory would not be helpful while thinking of uncertainty. We now proceed to a short survey of the literature on ambiguity, while insisting on the key references in decision theory on which models can be built. In subsequent sections, we show how these decision-theoretic foundations can be relied upon to build models that could be used to address the typical issues arising in financial economics.
8.7.1.2 Survey notes regarding theoretic aspects of Knightian uncertainty
From the economic-theoretic standpoint, the important issue regards how to model aversion to uncertainty.10 One of the first approaches to emerge relies on the so-called capacities, as explained in more detail below. This approach goes back to at least Schmeidler (1982, published in 1989). The idea underlying capacities is to make rigorous use of non-additive measures to formalize the concept of "loss in probability" as an attitude towards uncertainty in the context of decision theory. Dow and Werlang (1992) are the first to analyze the implications of this theory in the context of portfolio selection. First, they show, agents do not trade when prices are not favourable enough. Second, and within their simple introductory example, they vividly illustrate a fundamental albeit somewhat technical result known since at least Schmeidler (1986) and further elaborated by Gilboa and Schmeidler (1989) in a context of decision theory: once capacities are convex, the agent's behaviour is the same as that of an agent who has a max-min criterion. Max-min criteria are best described as decision rules the agents implement while believing nature will draw worst-case scenario events. Agents then take robust decisions, in that their choices will lead to outcomes that are the ideal ones in bad times.
Max-min criteria lead to an analytically convenient framework utilized in both finance and macroeconomics, as explained in this section. The approach to max-min in situations of uncertainty was originally advocated by Wald (1950), Ellsberg (1961) and Rawls (1971), and axiomatized by Gilboa and Schmeidler (1989). The max-min criterion of choice has been extended to smoother formulations that allow one to disentangle a cognitive notion of uncertainty from the attitude towards it.
Provide references in decision theory and on work in macroeconomics and finance.
[In progress]
8.7.1.3 Plan of the section
[In progress]
8.7.2 Uncertainty aversion and Ellsberg paradox
The Ellsberg paradox (Ellsberg, 1961) describes situations in which agents prefer to take a considerable amount of risk rather than engage in situations plagued with ambiguity. To illustrate
10 Gilboa and Marinacci (2011) provide a survey of Knightian uncertainty in economics and decision theory.
this paradox, consider the following example relying on a two-period market, zero interest rates, and a risk-neutral Robinson Crusoe agent.
We assume that there are three Arrow-Debreu securities that pay off in three mutually exclusive states of the world, states A, B and C (see Table 8.2 below). We initially assume that the probability of state A is $\frac{1}{3}$ and that of state B is $\frac{1}{4}$, such that the securities' values, $q$ say, are
$$q_A = \frac{1}{3}, \quad q_B = \frac{1}{4}, \quad q_C = \frac{5}{12} \qquad (8.85)$$
with straightforward notation. Obviously, then, the value of a security paying off in states A or C is higher than that paying off in states B or C,
$$q_{A \cup C} = \frac{9}{12} > q_{B \cup C} = \frac{8}{12} \qquad (8.86)$$
That is, the ranking of two portfolios is preserved once we include the same additional assets in each of these portfolios, provided the additional assets pay in states of the world in which the initial portfolios do not.
TABLE 8.2.
Arrow-Debreu          states        probability
securities          A    B    C     of payoff
A                   1    0    0     1/3
B                   0    1    0     unknown
C                   0    0    1     unknown
A∪C                 1    0    1     1 − p, with p the unknown probability of state B
B∪C                 0    1    1     2/3
How would we expect Mr Crusoe to rank these assets? We are actually stuck without making additional assumptions regarding his attitude vis-à-vis uncertainty. We may proceed as follows. We assume that Mr Crusoe is risk-neutral but at the same time so averse to uncertainty that, if hypothetically asked to go long on these assets, he would evaluate them at the worst-case scenario. To formalize this uncertainty, we may assume that Mr Crusoe conceives a band within which the unknown probability $p$ of state B lies, $p \in (\underline{p}, \bar{p})$. The wider this band, the more averse he is to uncertainty regarding $p$. Note that $\underline{p}$ and $\bar{p}$ can be interpreted both in terms of uncertainty aversion and in cognitive terms, that is, in terms of the extent of ignorance about $p$. We address this issue later; for now, we interpret this band as indicating Mr Crusoe's aversion to uncertainty.
How does asset evaluation reflect uncertainty aversion? Assuming that Mr Crusoe evaluates the asset at worst-case scenarios,
$$q_B = \min_{p \in (\underline{p}, \bar{p})} p = \underline{p} \quad \text{and} \quad q_{A \cup C} = \min_{p \in (\underline{p}, \bar{p})} (1 - p) = 1 - \bar{p}$$
An interesting phenomenon results. The ranking summarized by Eqs. (8.85)-(8.86) may break down. In particular, if Mr Crusoe sets his band such that $\underline{p} = \frac{1}{3} - \epsilon_1$ and $\bar{p} = \frac{1}{3} + \epsilon_2$, for two positive numbers $\epsilon_1$ and $\epsilon_2$ small enough, then,
$$q_A > q_B \quad \text{and} \quad q_{A \cup C} < q_{B \cup C} \qquad (8.87)$$
That is, uncertainty aversion might actually undermine a "sure thing principle," a concept we shall return to in a moment.
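The reversal in (8.87) can be checked with exact arithmetic. The sketch below assumes, as in the example, that the probability of state A is known to be 1/3 while the probability of state B is only known to lie in a band; the function and parameter names (`worst_case_values`, `p_low`, `p_high`) and the band half-width of 1/12 are hypothetical choices made for illustration.

```python
from fractions import Fraction


def worst_case_values(p_low, p_high, p_a=Fraction(1, 3)):
    """Worst-case values of the Arrow-Debreu securities for a risk-neutral agent.

    p_a is the known probability of state A; the probability p of state B is
    only known to lie between p_low and p_high (hypothetical parameter names).
    """
    if not (0 <= p_low <= p_high <= 1 - p_a):
        raise ValueError("band must satisfy 0 <= p_low <= p_high <= 1 - p_a")
    return {
        "A": p_a,              # pays in A only: probability is known
        "B": p_low,            # pays in B only: worst case is the lowest p
        "A or C": 1 - p_high,  # pays with probability 1 - p: worst case is the highest p
        "B or C": 1 - p_a,     # pays in B or C: probability is known
    }


# Band from 1/3 - 1/12 to 1/3 + 1/12 around the probability of state B.
values = worst_case_values(Fraction(1, 3) - Fraction(1, 12),
                           Fraction(1, 3) + Fraction(1, 12))
reversal = values["A"] > values["B"] and values["A or C"] < values["B or C"]
```

With this band, security A is valued above B, yet adding the same state-C payoff to both reverses the ranking, which is the pattern described in the text.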
The previous example is consistent with experimental evidence initially provided by Ellsberg (1961), who showed that individuals tend to avoid situations where it is difficult to describe events probabilistically. Klibanoff, Marinacci and Mukerji (2005) explain this context by relying on the following lottery counterpart to the Arrow-Debreu security example in Table 8.2.
                 states
Lotteries      A    B    C
f              1    0    0
g              0    1    0
f'             1    0    1
g'             0    1    1
TABLE 8.3. Lotteries in an uncertain environment.
Savage's axiom P2, known as the sure thing principle, would tell us that in this context and for any decision maker,
$$f \succeq g \Longrightarrow f' \succeq g'$$
Yet assume that the probability of state B is unknown, similarly as in the example of Table 8.2. Experimental evidence is consistent with the hypothesis that in this case, decision makers would prefer the risky lottery $g'$ (paying off $1 with probability $\frac{2}{3}$) rather than the lottery $f'$ (paying off $1 with unknown probability), even if their preferences would lead them to choose $f$ over $g$. In other words, aversion to uncertainty entails the following counterpart to the inequalities in (8.87),
$$f \succ g \quad \text{and} \quad f' \prec g'$$
The previous examples reveal that there are new elements in asset evaluation in the presence
of aversion to Knightian uncertainty. We now proceed with a few key models that provide new
predictions in this context.
[In progress]
8.7.3 Portfolio selection and market participation
8.7.3.1 A static model
Dow and Werlang (1992) deal with capacities. Heuristically, capacities are non-additive measures, in that they do not sum up to one from the perspective of an uncertainty-averse decision maker. For example, an investor may be unaware of the distribution of the asset discounted value, yet he may consider that the asset discounted value is high ($\bar{x}$) with probability no lower than $\pi$, or low ($\underline{x}$) with probability no lower than $\pi'$, with these probabilities being such that $\pi + \pi' \le 1$. In other words, the (unknown) probability that the asset value is $\bar{x}$, $p$ say, satisfies $\pi \le p \le 1 - \pi'$. We now illustrate how the fact that $\pi + \pi' \le 1$ reflects the investor's aversion to uncertainty. We shall explain that while the investor does not know the true distribution of the asset value, he evaluates the asset by assigning low chances of good outcomes.
Suppose a risk-neutral agent is contemplating buying the asset. The worst-case scenario for him is that the asset value is $\underline{x}$; however, there are chances this worst-case scenario could improve by an amount equal to $\bar{x} - \underline{x}$. That is, this improvement occurs with probability at least equal to $\pi$. The minimum expected improvement is, thus, $\pi(\bar{x} - \underline{x})$, such that the minimum expected return from being long the asset is $(\underline{x} + \pi(\bar{x} - \underline{x})) - q$, where $q$ denotes the asset price, as usual. The expected improvement and expected returns are minimum because by assumption, the true probability is taken to satisfy $p \ge \pi$. Now, consider a risk-neutral agent who calculates expected returns at these minimum levels; he will buy the asset if this minimum expected return is positive, i.e., whenever the price satisfies
$$q < \underline{x} + \pi(\bar{x} - \underline{x}) \qquad (8.88)$$
Similarly, consider a risk-neutral agent who contemplates selling the asset. His worst-case scenario is that the asset is actually good, in which case his payoff is $q - \bar{x}$ and the minimum expected improvement is $\pi'(\bar{x} - \underline{x})$, such that the expected return from being short the asset is $-(\bar{x} - \pi'(\bar{x} - \underline{x})) + q$. Similarly as in the buy-case, the expected improvement is the minimum one because by assumption, the true probability that the asset is $\bar{x}$ is $p \le 1 - \pi'$. Our agent would now sell the asset when this expected return is positive, i.e., when the price satisfies
$$q > \bar{x} - \pi'(\bar{x} - \underline{x}) \qquad (8.89)$$
Note how aversion to uncertainty operates. The conditions (8.88) and (8.89) tell us that the lower $\pi$ and $\pi'$, the more averse to uncertainty the agent is. Aversion to uncertainty leads the agent to presume that $\pi + \pi' < 1$. Let us elaborate. Denote with $q_b \equiv \underline{x} + \pi(\bar{x} - \underline{x})$ and $q_s \equiv \bar{x} - \pi'(\bar{x} - \underline{x})$ the two thresholds in (8.88) and (8.89). By (8.88), the agent buys when $q < q_b$ and by (8.89), the agent sells when $q > q_s$, where $q_b \le q_s$, with an equality holding only when $\pi + \pi' = 1$. That is, and unless the agent is uncertainty neutral (i.e., $\pi + \pi' = 1$), the agent will not participate in the market when the price $q \in [q_b, q_s]$. When $q \in (q_b, q_s)$, indeed, the price is too high to break even while deciding to buy, and too low to break even while deciding to sell. Dow and Werlang (1992, p. 200) define the quantity $1 - \pi - \pi'$ as the amount of probability "lost" by the presence of uncertainty aversion.
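The buy, sell and non-participation regions implied by (8.88) and (8.89) can be sketched as a simple decision rule. This is an illustration under assumed names, not notation taken from Dow and Werlang: `pi_high` and `pi_low` stand for the minimum probabilities attached to the high and low asset values, and the numbers in the usage lines are hypothetical.

```python
def trade_thresholds(x_low, x_high, pi_high, pi_low):
    """Return the (buy, sell) price thresholds of conditions (8.88) and (8.89).

    pi_high: minimum probability that the asset value is x_high.
    pi_low:  minimum probability that the asset value is x_low.
    """
    if pi_high < 0 or pi_low < 0 or pi_high + pi_low > 1:
        raise ValueError("need pi_high, pi_low >= 0 and pi_high + pi_low <= 1")
    if x_low > x_high:
        raise ValueError("need x_low <= x_high")
    spread = x_high - x_low
    return x_low + pi_high * spread, x_high - pi_low * spread


def decision(price, x_low, x_high, pi_high, pi_low):
    """Classify a price as 'buy', 'sell' or 'abstain' for a risk-neutral agent."""
    buy_below, sell_above = trade_thresholds(x_low, x_high, pi_high, pi_low)
    if price < buy_below:
        return "buy"
    if price > sell_above:
        return "sell"
    return "abstain"


# Hypothetical numbers: asset worth 0 or 100, with 0.3 of probability "lost".
buy_below, sell_above = trade_thresholds(0.0, 100.0, pi_high=0.3, pi_low=0.4)
choices = [decision(q, 0.0, 100.0, 0.3, 0.4) for q in (20.0, 45.0, 70.0)]
```

The width of the non-participation interval equals the lost probability times the spread between the high and low values, so it shrinks to a point when the two minimum probabilities sum to one.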
8.7.3.2 Worst-case scenario interpretation
We can re-interpret the previous behavior in terms of decisions made under worst-case scenarios. That is, we may assume, now, that the agent relies on a set of additive priors, i.e., on fully specified probability distributions. The previous model can then be interpreted as one in which the agent picks the worst-case distribution according to the acts he makes (buy or sell).
Let $p$ be the unknown probability that the asset value is good, such that the expected profits can be written as
$$\theta \, (p \bar{x} + (1 - p) \underline{x} - q)$$
where $\theta$ denotes the position in the asset, with $|\theta| = 1$ because we are assuming that the agent can only buy or sell one unit of the asset. Note that the probabilities sum up to one now.
This model can be made consistent with the previous capacity framework. In the previous model, the asset is worth $\bar{x}$ with a probability at least $\pi$, meaning that the asset is worth $\underline{x}$ with probability at most $1 - \pi$. Moreover, in the previous model, the asset is worth $\underline{x}$ with probability at least $\pi'$, meaning that the asset is worth $\bar{x}$ with probability at most $1 - \pi'$. Therefore, when the uncertainty averse agent buys, he does so while thinking that the probability $p \in (\pi, 1 - \pi')$ that the asset value is $\bar{x}$ is as small as possible. Similarly, when he sells, he does so while thinking that the probability $1 - p \in (\pi', 1 - \pi)$ that the asset value is $\underline{x}$ is also as small as possible. That is, when the agent buys, the profits he expects are:
$$\min_{p \in (\pi, 1 - \pi')} (p \bar{x} + (1 - p) \underline{x} - q) = \underline{x} + \pi(\bar{x} - \underline{x}) - q$$
and when he sells, they are:
$$\min_{p \in (\pi, 1 - \pi')} -(p \bar{x} + (1 - p) \underline{x} - q) = q - \bar{x} + \pi'(\bar{x} - \underline{x})$$
Thus, this model collapses to the previous capacity model, as claimed. It also provides additional rationale regarding the probability bands underlying the Arrow-Debreu asset evaluation in Section 8.7.2.
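The equivalence between the worst-case prior and the capacity thresholds can be verified by brute force. The sketch below minimizes expected profits over a grid of admissible priors and leaves the comparison with the closed-form expressions to the reader; the names (`worst_case_profit`, `pi_high`, `pi_low`) and all numerical values are assumptions made for illustration.

```python
def expected_profit(position, p, price, x_low, x_high):
    """Expected profit of a unit position (+1 long, -1 short) under prior p."""
    return position * (p * x_high + (1 - p) * x_low - price)


def worst_case_profit(position, price, x_low, x_high, pi_high, pi_low, steps=1000):
    """Minimum expected profit over priors p between pi_high and 1 - pi_low.

    The minimum is approximated on an evenly spaced grid that includes
    both end points of the band.
    """
    lo, hi = pi_high, 1 - pi_low
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return min(expected_profit(position, p, price, x_low, x_high) for p in grid)


# Hypothetical numbers: asset worth 0 or 100, price 45, band [0.3, 0.6] for p.
long_worst = worst_case_profit(+1, 45.0, 0.0, 100.0, pi_high=0.3, pi_low=0.4)
short_worst = worst_case_profit(-1, 45.0, 0.0, 100.0, pi_high=0.3, pi_low=0.4)
```

Because expected profit is linear in the prior, the minimum always sits at an end point of the band: the lowest admissible probability of the good state for a buyer, and the highest for a seller. At the price used here both worst-case profits are negative, so the agent stays out of the market.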
A technical detour: Provide definitions/relations amongst capacities, convex measures, cores, etc.
[In progress]
8.7.3.3 Maxmin preferences
where
=(
) , and
denotes the expectation taken under the probability law that
results when = .
Knightian uncertainty arises in this model due to the lack of knowledge regarding the dividend distribution: while the agent knows the dividend is normally distributed, he is unaware of its exact location, $\mu$. The presumption that $\mu \in [\underline{\mu}, \bar{\mu}]$ might indicate his aversion to this uncertainty: the wider he presumes this band is, the more uncertainty averse he is. One alternative interpretation of the band is that it plays a merely cognitive role, in that the agent only knows that there are two bounding constants to the truth, $\underline{\mu}$ and $\bar{\mu}$. In this model, it is actually impossible to disentangle the cognitive notion of uncertainty from the agent's attitude towards it. There are more general models of ambiguity in which these two notions can be separated, surveyed in the next subsection.
The optimization problem in (8.90) is solved as follows. First, the agent solves for the inner problem; that is, he takes a portfolio choice as given and, then, figures out what a malevolent Nature would pick for him given his portfolio choice. For example, if the agent buys the asset, $\theta > 0$, it is easy to see that the solution for the inner problem is $\mu = \underline{\mu}$. Analogously, if $\theta < 0$, then, the inner solution is $\mu = \bar{\mu}$. In other words, the agent's aversion to uncertainty leads him to presume that Nature will pick a bad asset for him when he buys ($\mu = \underline{\mu}$), and a good one when he sells ($\mu = \bar{\mu}$).
Given the solution to the inner problem, the agent proceeds with solving for his portfolio choice. The first order conditions lead to
=
( )
We have that 0
( )
and
the overall optimization problem in (8.90) is
( )
for
for
( )
2
( )
( )
for
( )]
( )
( )
(buy region)
(non-participation region)
(sell region)
The interpretation of the agent's behavior is similar to that regarding the previous model. The agent participates in the market if the price is sufficiently favourable to him, compared to the worst-case scenario: he buys (sells) when the price is lower (higher) than his own most pessimistic (optimistic) expectation of the asset payoff. In case the price is not favourable enough, the agent does not participate.
It is interesting to examine the equilibrium implications of the model. Suppose that the asset
is in positive supply, , say. In equilibrium, = . Then, the equilibrium price is
=
( )
That is, the equilibrium price reflects the most pessimistic evaluation of the dividend. This property will extend to the dynamic models below.
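The three regions and the equilibrium price can be sketched in a few lines. The code below is an illustration under assumptions that go beyond what the text shows explicitly: it takes the agent to have CARA utility with coefficient `risk_aversion` and the payoff to be normal with variance `variance`, so that for a given mean the objective is mean-variance and the first order condition gives a demand of (worst-case mean minus price) over (risk aversion times variance). All names and numbers are hypothetical.

```python
def maxmin_demand(price, mu_low, mu_high, risk_aversion, variance):
    """Demand of a maxmin agent whose payoff mean lies in [mu_low, mu_high].

    Assumes CARA utility and a normal payoff (an illustrative specification).
    """
    if mu_low > mu_high or risk_aversion <= 0 or variance <= 0:
        raise ValueError("need mu_low <= mu_high and positive risk aversion/variance")
    scale = risk_aversion * variance
    if price < mu_low:       # buy region: Nature picks the lowest mean
        return (mu_low - price) / scale
    if price > mu_high:      # sell region: Nature picks the highest mean
        return (mu_high - price) / scale
    return 0.0               # non-participation region


def maxmin_equilibrium_price(supply, mu_low, risk_aversion, variance):
    """Price at which the agent holds a positive supply of the asset."""
    if supply <= 0:
        raise ValueError("supply must be positive")
    return mu_low - risk_aversion * variance * supply


# Hypothetical parameters.
price = maxmin_equilibrium_price(supply=1.0, mu_low=9.0, risk_aversion=2.0, variance=0.5)
holding = maxmin_demand(price, mu_low=9.0, mu_high=11.0, risk_aversion=2.0, variance=0.5)
```

With the asset in positive supply the agent must be in the buy region, so only the lowest admissible mean enters the price, in line with the statement that the equilibrium price reflects the most pessimistic evaluation of the dividend.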
8.7.3.4 Smooth ambiguity aversion
Klibanoff, Marinacci and Mukerji (2005) (KMM, in the sequel) introduce a model of ambiguity in which the cognitive notion of uncertainty can be disentangled from the attitude towards it. We illustrate the main features of this model while extending the previous market, as follows. Conditionally on $\mu$, the asset payoff is still normally distributed, but uncertainty aversion is modeled assuming that there exists an increasing and concave function $\phi: \mathbb{R} \to \mathbb{R}$, such that a decision maker prefers the portfolio holdings $\theta'$ to $\theta$ if and only if
0
(
))]
(8.91)
M[ (
M
where $\mathcal{M}$ denotes the set of priors on $\mu$. For example, Mele and Sangiorgi (2015) assume that $\mu \sim N(\mu_0, \sigma_\mu^2)$ in their model with asymmetric information, an assumption that will be used in this section too.
Concavity of $\phi$ is crucial to the definition of ambiguity aversion. Let us explain. Lack of knowledge of $\mu$ is the source of uncertainty in the model, and concavity of $\phi$ implies that a decision maker dislikes mean-preserving spreads in expected utility values that arise due to $\mu$: he is thus ambiguity averse, unless $\phi$ is linear, in which case he is ambiguity neutral. Therefore, in this model, Knightian uncertainty arises because $\mu$ is unknown, with $\sigma_\mu^2$ measuring how acute uncertainty is. What makes the model new compared to one with only second-order uncertainty (i.e., the presence of a stochastic mean in the asset payoff) is the aversion to this second-order uncertainty, arising through the concavity of $\phi$.
Regarding the functional form for $\phi$, one may consider $\phi(u) = -\frac{1}{\alpha}(e^{-\alpha u} - 1)$, where the parameter $\alpha$ measures absolute ambiguity aversion, with the model collapsing to one with an ambiguity neutral agent only when $\alpha = 0$. Based on KMM (Proposition 3), one can show that maxmin expected utility obtains for $\alpha$ large. This section follows Mele and Sangiorgi, in that $\phi$ is taken to lead to constant relative ambiguity aversion, i.e. $\phi(u) = -(-u)^{\gamma}$, for some constant $\gamma \ge 1$, with the model collapsing to a description of an ambiguity neutral agent only if $\gamma = 1$. Otherwise, the higher $\gamma$, the more averse to ambiguity the agent is, independently of the extent of parameter uncertainty about $\mu$, which is summarized by $\sigma_\mu^2$, as explained.
Thus, the agent solves the following optimization problem:
= arg max M
(8.92)
where the Appendix shows that
= exp
M
1
) +
2
( var ( ) + (1
) var ( | ))
(8.93)
Thus, the program of this ambiguity-averse agent resembles that of a mean-variance agent, although the variance term is replaced by a combination of the unconditional variance $\mathrm{var}(D) = \sigma^2 + \sigma_\mu^2$ and the conditional variance $\mathrm{var}(D|\mu) = \sigma^2$, with $D$ denoting the asset payoff, such that $V \equiv \gamma \, \mathrm{var}(D) + (1 - \gamma) \, \mathrm{var}(D|\mu) = \sigma^2 + \gamma \sigma_\mu^2$. If the agent were ambiguity neutral, $\gamma = 1$, $V = \sigma^2 + \sigma_\mu^2$, such that the problem would be indistinguishable from a standard mean-variance program with an increased variance: in other words, second-order uncertainty would not matter. Instead, second-order uncertainty matters when the agent is ambiguity averse, in which case the solution in (8.92) is
0
2
+ 2
The more averse to ambiguity the agent is, the higher $\gamma$, and the less aggressive his portfolio holding will be for a given uncertainty level $\sigma_\mu^2$. The equilibrium implications of the model are straightforward. With the asset in positive supply, the equilibrium price equals the prior mean of the payoff less a premium proportional to $\sigma^2 + \gamma \sigma_\mu^2$, with the term in $\gamma \sigma_\mu^2$ an uncertainty premium component, being clearly related to both uncertainty, $\sigma_\mu^2$, and uncertainty aversion, $\gamma$.
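A short numerical sketch may help to see how ambiguity aversion scales the uncertainty premium. It rests on assumptions that are labeled here rather than taken from the text: CARA utility with coefficient `risk_aversion`, a normal prior on the payoff mean with mean `mu0` and variance `sigma_mu2`, and an effective variance equal to the payoff variance plus the ambiguity parameter times the prior variance. All function names, parameter names and values are hypothetical.

```python
def effective_variance(sigma2, sigma_mu2, ambiguity):
    """Variance term faced by the ambiguity-averse agent (assumed form).

    ambiguity = 1 corresponds to ambiguity neutrality, where the term is
    just the unconditional variance sigma2 + sigma_mu2.
    """
    if ambiguity < 1:
        raise ValueError("ambiguity parameter must be at least 1")
    return sigma2 + ambiguity * sigma_mu2


def smooth_ambiguity_demand(price, mu0, risk_aversion, sigma2, sigma_mu2, ambiguity):
    """Mean-variance style demand with the effective variance above."""
    v = effective_variance(sigma2, sigma_mu2, ambiguity)
    return (mu0 - price) / (risk_aversion * v)


def smooth_ambiguity_price(supply, mu0, risk_aversion, sigma2, sigma_mu2, ambiguity):
    """Price clearing a market with the asset in positive supply."""
    v = effective_variance(sigma2, sigma_mu2, ambiguity)
    return mu0 - risk_aversion * v * supply


# Hypothetical parameters: compare an ambiguity-neutral and an averse agent.
params = dict(mu0=10.0, risk_aversion=2.0, sigma2=0.5, sigma_mu2=0.25)
price_neutral = smooth_ambiguity_price(1.0, ambiguity=1.0, **params)
price_averse = smooth_ambiguity_price(1.0, ambiguity=3.0, **params)
uncertainty_premium = price_neutral - price_averse
```

Under these assumptions the price gap between the two agents is the extra ambiguity parameter times the prior variance, scaled by risk aversion and supply, so it vanishes either when the agent is ambiguity neutral or when the mean of the payoff is known.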
=
+
(8.94)
0)
(8.95)
In this family of models, the distortion function generates deviations from the benchmark. It captures the idea that agents face Knightian uncertainty, in that they have limited knowledge regarding whether their reference model is correctly specified. Moreover, the agents assume that this distortion function satisfies
2
( )
( )
for all
(8.96)
and for some known function. In words, Eq. (8.95) contains model specifications that are statistically close to the reference model (8.94): they are so close that it is actually difficult to distinguish them statistically. Naturally, in the absence of ambiguity, the distortion is identically zero, such that this model would collapse to those seen in Chapter 7.
Ambiguity, as described until now, is a source of second-order uncertainty, i.e., one that adds uncertainty (the agents' acknowledgement of dealing with model misspecification) to an already unobserved process (the drift of dividend growth), which they may wish to learn about. The crucial point is how the agents behave vis-à-vis this added uncertainty. The behavioral assumption is that they fear model misspecification, and choose consumption and portfolio policies that maximize their worst-case scenario welfare, i.e., their lifetime utility arising when a malevolent Nature chooses the worst possible model in (8.95) for them:
Z
max inf
( )
(8.97)
(
(0
probability
Pr( = F ), where F denotes the information set available to agents at
time . Then, by results given in Chapter 7, we have that
(1
( )
where
= 01
.
An equilibrium is one in which (i) a representative agent solves the optimization problem (8.97) and (ii) his optimal consumption equals dividends at all times. To solve the optimization problem, and in analogy with the static maxmin model of Section 8.7.3, the agent first determines the infimum in (8.97), while taking as given the consumption and portfolio choices in the outer optimization problem. Given the functions thus determined, the agent solves for the outer portfolio policies while imposing the equilibrium condition that consumption equals dividends. Given this equilibrium condition, we have that the infimum is attained with
Z
() arg inf
( )
0
= . Note that because the
Let ( )
: (8 96) holds and assume that ( )
drift of
in (8.95) is increasing in (), then, by a comparison theorem (e.g., Karatzas and
Shreve (1991, p. 291-295)), the previous expectation is increasing in (), such that13
p
( )=
( )
(8.98)
In other words, asset prices are now evaluated as if the aggregate dividends dynamics were
fully observed, but with left-tilted bounds to growth,
=
=
(
0
(8.99)
)(
E =
Vol
1
( + 1)
2
= +
2
0
Vol =
( )
( )
( )
(8.100)
where () denotes the diffusion coefficient of in (8.99) and is the price-dividend ratio.14
By results given in Chapter 7 (see Section 7.4), the price-dividend ratio is an affine function of
the dividend growth expected under the agents' worst-case scenario probability, . Appendix
3 shows that
( ) =
| {z }
1
(1
) (
+
1
2
2
0)
13 Precisely,
| {z }
1
(1
)(
1
2
2
0)
(8.101)
=1
note
The price-dividend ratio is a weighted average (weighted with the posterior ) of the discounted lifetime expected dividends conditional upon the up () and down ( ) states of the
world, and under the worst-case drifts. Provided
1, the price-dividend ratio is decreasing
15
in the degree of ambiguity aversion, ().
Note that the equity premium in (8.100) originates from the perspective of the worst-case
probability in the family of models included in (8.95). Appendix 3 shows that the equity premium under the reference model (8.94) is
E =E
Vol A
() + (1
) ( )
(8.102)
where E is the worst-case scenario equity premium in (8.99) and A is the average size of
ambiguity aversion. By Eq. (8.98), A
0, such that ambiguity aversion contributes positively
to the equity premium thus defined.
The rationale behind the definition of E in (8.102) relies on the following interpretation of
the reference model. The ambiguity-averse agent prices the asset at the worst-case scenario, yet
Nature draws aggregate dividends according to Eqs. (8.94). Naturally, the agent is unaware of
this statistical law, and places a band around , a band that could be symmetric or asymmetric,
reflecting the modeling assumption that the agent only knows that for each regime (up or down),
the expected dividend growth belongs to a certain band.16
regime,
= .
FIGURE 8.2. Left panel: The solid line depicts the equity premium under the reference
probability, E in Eq. (8.102), as a function of the expected growth, ; the dashed
line is the equity premium under the worst-case probability, E in Eq. (8.100). Right
panel: return volatility, Vol in Eq. (8.100). In both panels, parameter values are = 0.005,
= 0.03, 0 = 0.01, = 1/2, = 0.04, and, finally, ( ) = = = 0.05.
Note that the equity premium is inverse-U shaped against , a property arising mainly
because return volatility is inverse-U shaped. The origins of this property are clarified in Chapter
7 (Section 7.5.4). Models with multiple states (such as those considered by LTV) and mean-reverting behavior (ensured while assuming is a Markov chain) are natural candidates to make
volatility and equity premiums visit their descending parts more often than their ascending
parts, thereby leading to a countercyclical behavior.
8.8 Production
Consider an economy with one representative firm producing one single good, as in Section
3.4.1.2 of Chapter 3, and paying off a dividend (
) in each period , expressed as a
function of capital
and investment , with partial with respect to capital
equal to
(
):
(
)
(
( ))
( )
(
)
(
( ))
Remember, Tobin's marginal q and average q are the same, by Theorem 3.2, meaning that the
stock market value of the firm, ( ), coincides with the value of installed capital, ( ) =
+1 , where
collapses to Tobin's q, once we fix the price of uninstalled capital to one,
1,
which is the case as soon as the firm produces uninstalled capital, simply. A few calculations
allow us to define equity returns in this economy. First, we note that:
(
)=
=
=
=
=
+1
[
[
[
[
(
(
(
+1 (
(
+1 (
+1 ( (
+1
+1
+1 )
+1
+1 )
+ (1
)
+1 + +1 (
+1
+1 )
+1
+1 )
+1
+1
+1
+1
+1 )]
+1
+1 ))]
+2
+1 ))]
+1 ))]
where the second line follows by the q theory, as developed in Chapter 3, the third and fourth
lines by the law of capital accumulation, and the expression for ( +1 ), the fifth line by the
condition +1 =
( +1 +1 ), and the homogeneity of the function . Therefore, equity
returns are:
+1
+1
+1 )
(
(
+1
+1 )
+1 )
)
+1
+1
+1
+1 )
))
To match the volatility of equity returns, a model without adjustment costs would require
a counterfactually large volatility of the marginal product of capital. Therefore, adjustment
costs are needed not only to rationalize the existence of time-varying market-to-book ratios:
they also have the potential to boost return volatility. Conversely, the equity
premium puzzle can only be exacerbated in a setting without adjustment costs. Note, indeed,
that by the usual representation of the equity premium in Section 6.5 of Chapter 6,
+1
corr (
+1
+1 )
Std (
(
+1 )
+1 )
Std
+1
where denotes the equity return in excess of the risk-free rate. Unless the excess returns predicted by the model co-vary substantially, and negatively,
we need to introduce some sort of hindrance to the adjustment of capital supply to shocks.
For example, Jermann (1998) assumes the presence of adjustment costs. Instead, Boldrin,
Christiano and Fisher (2001) assume, among other things, that investment decisions can be
thought of as determined prior to the realization of the shocks. Both Jermann (1998) and Boldrin,
Christiano and Fisher (2001) consider economies with habit persistence anyway, which allows
them to generate variability in the demand for capital and, hence, boost price volatility.
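The representation of the equity premium invoked above rests on the pricing condition that the stochastic discount factor times the excess return has zero mean, which implies that the premium equals minus the correlation between the two, times the volatility of the excess return, times the ratio of the volatility to the mean of the discount factor. The following Python sketch checks this decomposition on simulated data. It is only an illustration under stated assumptions: the names m (discount factor) and excess (excess return), the lognormal draws and all numerical values are hypothetical and are not taken from the models discussed in this section.

```python
import math
import random


def simulate(seed=0, n=5000):
    """Draw a hypothetical discount factor m and an excess return.

    The excess return is recentred so that the sample analogue of the
    pricing condition E(m * excess) = 0 holds exactly. This is an
    assumption of the sketch, not a calibration of any model in the text.
    """
    rng = random.Random(seed)
    m, raw = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        m.append(math.exp(-0.02 - 0.3 * z))                # m is low in good states
        raw.append(0.15 * z + 0.1 * rng.gauss(0.0, 1.0))   # payoff is high in good states
    shift = sum(a * b for a, b in zip(m, raw)) / sum(m)
    excess = [b - shift for b in raw]
    return m, excess


def equity_premium_decomposition(m, excess):
    """Return (mean excess return, -corr(m, excess) * Std(m) / E(m) * Std(excess))."""
    n = len(m)
    mean_m = sum(m) / n
    mean_r = sum(excess) / n
    cov = sum((a - mean_m) * (b - mean_r) for a, b in zip(m, excess)) / n
    std_m = math.sqrt(sum((a - mean_m) ** 2 for a in m) / n)
    std_r = math.sqrt(sum((b - mean_r) ** 2 for b in excess) / n)
    corr = cov / (std_m * std_r)
    return mean_r, -corr * (std_m / mean_m) * std_r
```

The two numbers returned by equity_premium_decomposition coincide up to rounding, which makes the point of the text concrete: with smooth returns, a sizeable premium requires either a very volatile discount factor or a strongly negative correlation.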
[In progress]
make reference to the Menzly, Santos and Veronesi (2004) economy in Section 7.5.4 of Chapter
7. We denote the equilibrium surplus consumption ratio with =
, where, as explained
extensively in Chapter 7, is the solution to,
1
1
1
1
1
=
0
( + )
)+
(1
1
= (
+
)
=
=
+
( + )
where
Vol
= lim
where Vol ( ) =
and
+(
= lim
Vol
| {z
0
0 15
0+
Vol ( )
0+
, such that
=
Vol ( )
0 +
|{z}
| {z }
+
|
{z
}
}
=0 01
510 3
= endog. P/D uct. 26 31
+
Vol ( )
| {z }
+
+
|{z}
{z
}
|
510 3
Vol ( )
= leverage multiplier
(8.103)
0 24
11 08
where we have indicated the approximate average values taken by the variables of interest, and
obtained by calibrating the model with the values of Table 8.2 in Section 8.9 below. Note, also,
that the leverage ratio, , is endogenous and equal to,
=
(
+
+
(
)
+
and Vol
, as the surplus changes.
In other words, we only see what happens to
As the numerical values in Eq. (8.103) show, much of the action in this model derives from the
0
large swings in the price-dividend ratio, (( )) = + . What is the statistical relation between
and return volatility that the model predicts? Figure 8.3 depicts values
the leverage ratio
of the leverage ratio and volatility consistent with the model. Note that Figure 8.3 does not
depict a causal relation, as leverage and equity volatility are both driven by the same state
variable, the surplus consumption ratio.17
[Figure 8.3: return volatility, Vol (vertical axis), against the leverage ratio (horizontal axis).]
8.10.3 Bankruptcy
The previous model has no role for bankruptcy, which plays an obviously fundamental role,
as developed in great detail in Chapter 13. Let us consider bankruptcy in a simple setting.
Consider a two-date economy, and suppose that the value of the firm in one year is , which
equals bad
Nominal debt, with probability , and good
Nominal debt, with probability
1 . We assume risk-neutrality and that there are no bankruptcy costs. Let = 1 0 0 be the equity
q
return, where 1 is the equity value at the second period. Then, we have that vol( ) =
.
1
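The volatility of equity returns in this two-date setting can be checked with a short numerical sketch. The code assumes that equity is wiped out in the bad state (firm value below nominal debt), that equity is the residual claim in the good state, that there are no bankruptcy costs, and that the risk-neutral discount rate is zero. The names p (probability of the bad state), v_good and debt are hypothetical labels. Under these assumptions, the volatility of the equity return works out to the square root of p/(1 - p), and does not depend on the level of firm value or debt.

```python
import math


def equity_return_vol(p, v_good, debt):
    """Volatility of the equity return (E1 - E0) / E0 in a two-state sketch.

    Assumptions of this sketch: equity is worth zero in the bad state
    (probability p), v_good - debt in the good state, and today's equity
    value E0 is the undiscounted risk-neutral expectation of E1.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if v_good <= debt:
        raise ValueError("the good-state firm value must exceed nominal debt")
    e1_bad, e1_good = 0.0, v_good - debt
    e0 = p * e1_bad + (1.0 - p) * e1_good
    r_bad = e1_bad / e0 - 1.0      # total loss for equity holders
    r_good = e1_good / e0 - 1.0    # equals p / (1 - p)
    mean = p * r_bad + (1.0 - p) * r_good
    var = p * (r_bad - mean) ** 2 + (1.0 - p) * (r_good - mean) ** 2
    return math.sqrt(var)
```

For instance, equity_return_vol(0.2, 150.0, 100.0) agrees with math.sqrt(0.2 / 0.8): a higher bankruptcy probability raises equity volatility in this sketch.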
For example, if
1
1 2
( )= +
( (1
)
) + ( ln )
(8.104)
0
0
2
2
17 Does debt maturity lead to a greater contribution of leverage to volatility? It does not. Given the model's parameters,
the effects of debt maturity on leverage can be shown to be quite limited.
where is the surplus consumption ratio, is a constant, and all the remaining parameters are
as in Section 7.5.2 of Chapter 7. Wachter (2006) analyzes the term-structure implications of
this model in detail, both real and nominal, within an environment with time-varying expected
inflation.
Note, the constant does not depend on anything relating to the agents' preferences. Its mere
role is to make interest rates time-varying. How can we ensure that Eq. (8.104) is consistent with
optimizing behavior? As explained in Chapter 7, the short-term rate depends on the sensitivity
of habit to consumption shocks, a function of , ( ), through an effect due to precautionary
savings: the higher this sensitivity, the higher the volatility of habit and, hence, the propensity to
save, which drives interest rates down. This sensitivity ( ) is free, in that it is not restricted by
the theory: Campbell and Cochrane simply guide us with heuristic considerations leading to it.
One of these considerations is that the short-term rate also relates to habit, due to intertemporal
substitution effects, and negatively, due to mean-reversion. Campbell and Cochrane choose ( )
such that intertemporal substitution effects exactly offset precautionary savings, thereby making
the short-term rate constant or, at most, affine in the log surplus consumption ratio, as in Eq.
(8.104). Naturally, the sensitivity, ( ), is a function of , once this reverse engineering has
unfolded, as shown in Appendix 5 of Chapter 7.
The question arises as to which sign we should expect from the parameter , empirically. Are
real interest rates countercyclical? They are. This is somewhat puzzling from the perspective of
the basic production economies analyzed in Chapter 3, where real interest rates are procyclical,
being positively related to the marginal product of capital and, hence, to productivity shocks.
However, economies with habit formation might be capable of generating countercyclical real
rates, due to intertemporal substitution effects. This is the case, for example, for the models with
frictions in the adjustment of capital supply to shocks of Boldrin, Christiano and Fisher (2001).
In endowment economies with habit formation, countercyclical real rates are, then, quite likely
to arise. Consider, for example, the Menzly, Santos and Veronesi (2004) model of external habit
formation presented in Section 7.5.4. Recall that this model predicts that the short-term
rate is:
2
2
( )= + 0
(8.105)
+
1
1
0
0
Figure 8.4 depicts the short-term rate as a function of , obtained using the parameter values
in Table 8.2, which are similar to those used by Menzly, Santos and Veronesi.
FIGURE 8.4. The short-term rate, R(s), predicted by the Menzly, Santos and Veronesi (2004) model
of external habit formation, with parameter values as in Table 8.2.
0.03 0.01 0.04 0.15 0.03 40 0.05 0.60
0
TABLE 8.2. Parameter values utilized for the Menzly, Santos and Veronesi (2004) model
of external habit formation.
The fourth term of Eq. (8.105) reflects intertemporal substitution effects, and is the dominating term, leading to countercyclical interest rates, due to the mean reversion in the surplus
consumption ratio, similarly to the Campbell and Cochrane model, as explained in Section 7.5.2 of Chapter 7. Finally, the catching-up model of Chan and Kogan (2002) reviewed in
Section 8.3 leads to the same prediction: real interest rates are countercyclical.
[In progress]
1
1
( +1 ) 1
+1
+1
(8.106)
=
=
+1
+1
1
+1
where denotes the certainty equivalent continuation utility expressed in consumption units,
1
) , which is obviously ordinally equivalent
and is solution to Eq. (8.3). Next, define = (1
to and satisfies
1
= (1
) +
( 1+1 ) 1
ln
+1
(8.107)
+1
= ln
1
2
+1
+1
(8.108)
for some constant . Note that Eq. (8.108) is a special case of the dynamics of consumption
in the Bansal and Yaron (2004) model in Section 8.2: it does not include the small persistent
component
of Eq. (8.19). Guessing that
= + ln , and solving for the undetermined
2
(1 ) (1
) 2
coefficients and using Eq. (8.107), delivers =
and = 1 1 , such that
3
2(1 )
the stochastic discounting factor in Eq. (8.106) for = 0 is
+1
+1
+1
=
+1
+1 )
=
+1
We have,
(
+1 ) =
(
(
+1
+ 12 (2
+1 ) =
+1 )
2 2
+1
(8.109)
By increasing , the volatility of the stochastic discounting factor increases although, then,
( +1 ) remains substantially flat. Once again, this property is what makes non-expected
utility address the interest rate puzzle (see Section 8.2). As also discussed in Section 8.2, non-expected utility does not necessarily imply a resolution of the equity premium puzzle: Eqs.
(8.109) clearly indicate that large values of are needed to inflate the equity premium to empirically plausible levels. Alternatively, we need additional sources of variation in the primitives
of the economy: for example, after Tallarini, long-run risks are proposed by Bansal and Yaron
(2004).
8.13.2 Preferences for robustness
Hansen and Sargent (2008, page 317) demonstrate that this model can be understood as leading to a resolution of the equity premium puzzle even without long-run risks, by simply reinterpreting the parameter
in Eq. (8.107) as a multiplier for an individual with ambiguity-averse
multiplier preferences.
8.13.3 Irrelevance
The main point of Tallarini is an irrelevance result: we can understand the dynamics of asset
markets independently of those of real economic aggregates, due to the linearity of preferences
resulting from the assumption that intertemporal elasticity of substitution is one, = 0. To extend the previous model to one with real aggregates, Tallarini considers the following extension
to Eq. (8.107),
0
+1
= ln + ln + 0 ln
(8.110)
dimensions we should consider, to conclude on any model's prediction about asset prices. For
example, Tallarini's assumption of no adjustment costs implies Tobin's q is one. Moreover,
welfare calculations such as those in Lucas (19??) are likely to change, as Alvarez and Jermann
(20??) demonstrate.
[In progress]
There are two types of agents: (i) productive or farmers and (ii) unproductive or
gatherers. Farmers invest into a linear technology, obtaining output +1 ,
=
+1
(8.111)
, their
(8.112)
where
1
(8.113)
Productivity is random: in each period, farmers may become gatherers, and gatherers may
become farmers. We shall develop more details regarding this assumption below. Intuitively,
switching types is required to make sure both types survive.
All agents maximize the same intertemporal utility of consumption subject to their budget
constraint. For example, the farmers' plan is
!
X
+1
max
s.t.
ln +
+ = +
(8.114)
(
=0
=0
=(
(8.115)
The farmers' plan is then as in (8.114), subject to the flow-of-funds constraint in Eq. (8.115).
The solution to this program is well-known (see, e.g., Chapter 3). Optimal consumption is =
(1
) . Moreover, gatherers do not participate in this economy. Therefore, in equilibrium,
= 0 for all , such that by the constraint in (8.114),
=
. Aggregate investments are
18 A formal proof of this statement relies on the first order conditions of the gatherers' problem (the equivalent to (8.114)) with
respect to and +1 and based on the production function in (8.112).
, where
denotes the aggregate net-worth, which equals aggregate output,
because aggregate borrowing is zero. To sum up,
+1
+1
+1 ,
(8.116)
(8.117)
+1
We search for an equilibrium in which = around the stationary state, and (8.117) holds
as an equality. Below, we confirm that such an equilibrium exists, by just verifying that the
amount of the gatherers' investments is bounded and bounded away from zero.
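The consumption rule used for a log-utility maximizer, whereby consumption is a constant fraction of net worth, can be illustrated numerically. The sketch below is a simplified setting of our own, not the program (8.114) itself: it assumes a constant gross return on savings, no borrowing constraint, and it reads the fraction as one minus the discount factor. The names beta, gross_return and w0 are hypothetical. The code verifies that this rule satisfies the Euler equation 1/c(t) = beta * R / c(t+1) along the whole path.

```python
def consumption_path(beta, gross_return, w0, periods):
    """Simulate (net worth, consumption) pairs under c = (1 - beta) * w.

    Savings w - c earn a constant gross return, so that net worth evolves
    as w' = gross_return * (w - c). Both are assumptions of this sketch.
    """
    wealth, path = w0, []
    for _ in range(periods):
        c = (1.0 - beta) * wealth
        path.append((wealth, c))
        wealth = gross_return * (wealth - c)
    return path


def euler_residuals(path, beta, gross_return):
    """Residuals of the log-utility Euler equation 1/c_t - beta * R / c_{t+1}."""
    return [
        1.0 / c_now - beta * gross_return / c_next
        for (_, c_now), (_, c_next) in zip(path, path[1:])
    ]
```

All residuals are zero up to rounding, whatever the value of the gross return: with logarithmic utility, income and substitution effects cancel, which is why the propensity to consume out of net worth does not depend on the return on savings.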
Note that (8.117) is the friction that prevents farmers from borrowing an infinite amount
even while the interest rate is less than their productivity.19 Thus, the farmers have access to
good financing conditions but they cannot borrow an infinite amount as they are borrowing-constrained by (8.117). They will then borrow as much as they can, i.e.,
+1
+1
(8.118)
The previous constraint imposes an upper limit to the farmers' investment, . Using the
budget constraint in (8.114), the equilibrium condition = , and Eq. (8.118), it is:
=
(8.119)
Investments
are just mechanically derived from the budget constraints. The numerator
in (8.119) equals the farmers' savings. It is the down-payment that is used to invest over and
+1
above the amount borrowed,
. Furthermore, to ensure this investment plan is part of the
equilibrium, it must be that
: the collateralized return (i.e., the amount of borrowable
funds) needs to be less than the interest rate to ensure that the current borrowing is less than
. The flow-of-funds accounting is now
+1
+1
+1
= (1
(1
1
(8.120)
The program of the farmers in this economy is to maximize the intertemporal utility in
(8.114) subject to (8.120). We solve for consumption , which then determines
in (8.119).
The solution for consumption follows by the usual argument. For a log-utility maximizer, =
(1
) . Replacing into (8.119), and aggregating over all farmers leaves:
=
(8.121)
1
(8.122)
=
=
1
such that aggregate output satises: +1 =
+1
=
+(
+1
+1
, or,
(8.123)
+1
1
1
Moreover, assume that productivity is persistent, in that Pr (
Pr ( +1 | ) Pr ( +1 | ), i.e.,
1
+1 |
Pr (
+1 |
) and
(8.124)
= (1
)(
+1
+1 )
+1
+1 )
+1
+1
(8.125)
where capitalized letters denote aggregate variables, and the second equality holds in equilibrium.
Requiring that agents can switch their types (
0) ensures that a stationary steady state
exists, which is compatible with an equilibrium in which both types survive. Intuitively, if = 0,
the farmers' net-worth would dominate, driven by the farmers' higher productivity. Moreover,
persistence in productivity shocks (1
) is needed to ensure that a stationary steady
state exists.
To corroborate these claims, we determine the net-worth of both the farmers and gatherers,
and replace them into the farmers' law of net-worth accumulation, Eq. (8.125). Using the
optimal consumption = (1
) into (8.120) and aggregating yields the farmers' aggregate
net-worth before the productivity shock,
+1
+1
(1
1
(8.126)
The gatherers' aggregate net-worth is obtained by plugging Eq. (8.121) into the farmers'
aggregate debt obtained through Eq. (8.118), and plugging
from Eq. (8.122) into
+1 =
, leaving:
(
)
(8.127)
+1 +
+1 =
Finally, we plug Eqs. (8.126)-(8.127) into Eq. (8.125) and use the expression for
(8.123), obtaining,
(1
)
(1
) +
( )
+1 =
+(
)
+1
in Eq.
(8.128)
An equilibrium is a sequence ( ) =0 for any given initial condition 0 . Note that (0) =
and (1) = 1
. Therefore, (8.124) guarantees that a fixed point
exists, and is such that
= ( ), with = 1 when = 0. Moreover, under conditions, there is a unique
. So
productivity switches are needed to prevent this economy from reaching the trivial steady state
with gatherers' extinction. Moreover, the persistence of these switches must not be too large.
Suppose we are at the steady state,
, and that there is an aggregate productivity shock
at , in that both and fall by . After this shock, the farmers' aggregate net worth is, by
Eq. (8.125):
) ((1
)
+
) (1
) +1
+1 = (1
That is, +1 obviously goes down, although proportionately more than aggregate output,
. It will now take
+1 =
+1 +
+1 , due to leverage. Therefore, after the shock,
+1
some time for the farmers' aggregate share of net-worth
to converge towards its steady
state , through Eq. (8.128). This is propagation due to credit constraints. Through leverage,
a temporary productivity shock makes farmers' aggregate net worth fall more than output,
leading to deviate from its steady state. It will now take time for to catch up to . During
this recovery process, output growth, +1 in Eq. (8.123), will obviously be lower than that in
the steady state, and will only gradually converge to it. In contrast, a temporary shock in an
unconstrained economy will only have a temporary impact.
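The role of leverage in this argument can be made concrete with a stylized one-period balance sheet. The sketch below is not Eqs. (8.125)-(8.128): it simply assumes that next-period net worth equals output (productivity times capital) minus the repayment of debt at a fixed gross rate, and that a shock lowers productivity by a given fraction. All names and numbers are hypothetical.

```python
def net_worth_next(productivity, capital, gross_rate, debt):
    """Next-period net worth: output minus debt repayment (sketch assumption)."""
    return productivity * capital - gross_rate * debt


def shock_responses(productivity, capital, gross_rate, debt, shock):
    """Proportional falls in output and in net worth after a productivity shock.

    `shock` is the fraction by which productivity falls; debt repayments
    are fixed in advance and do not adjust.
    """
    base_output = productivity * capital
    base_net_worth = net_worth_next(productivity, capital, gross_rate, debt)
    if base_net_worth <= 0:
        raise ValueError("baseline net worth must be positive")
    shocked = productivity * (1.0 - shock)
    output_drop = 1.0 - shocked * capital / base_output
    net_worth_drop = (
        1.0 - net_worth_next(shocked, capital, gross_rate, debt) / base_net_worth
    )
    return output_drop, net_worth_drop
```

With productivity 1.2, capital 100, a gross rate of 1.05 and debt of 60, a 5% productivity shock lowers output by 5% but net worth by 6/57, that is, by more than 10%. With zero debt, the two proportional falls coincide, consistent with the statement that a temporary shock has only a temporary impact in an unconstrained economy.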
8.14.2 Amplification
The model in the previous section illustrates how propagation mechanisms operate in an economy with borrowing constraints. However, the main idea underlying the financial accelerator
doctrine is that of contagion, that is, spillover of capital markets shocks into the real sphere
of the economy. In the examples of the previous section, the farmers' borrowing capacity can
depend on the value of some collateral, say land. In bad times, when the land value drops, the
borrowing capacity decreases, which makes land value decrease even more. Land is a stand-in
for assets.
We explain how this channel works by relying on the Kiyotaki and Moore (1997) model,
which is actually simpler than the model in the previous section.
[In progress]
8.14.3 Additional literature
Adrian and Shin (2011); Brunnermeier and Sannikov (2013); Danielsson, Zigrand and Shin
(2011); Geanakoplos (2010); Gertler and Kiyotaki (2011); He and Krishnamurthy (2012); Hugonnier
and Prieto (2012); Shin (2010).
[In progress]
X
+1 +
= 1+
X
(
= 1+
+1
+1
+1
+1
+1
+1
+1
)=
1
1
) = max W (
( (
+1
+1 )))
) )
( (
+1
+1
+1
, the denition of
+1 ) (1
1
+1 )))
yields
+ ((1
= W2 (
( (
+1 )))
+1
1(
+1
+1 ))]
(8A.1)
where subscripts denote partial derivatives. Thus, optimal consumption is some function of the state
(
) such that,
(
)) (1 +
( +1 ))
+1 = (
By differentiating the value function with respect to
1(
) = W1 ( (
+ W2 ( (
( (
+1
+1 ))) 1 (
( (
+1
+1 )))
,
)
1(
+1 ) (1 +
+1
+1 ))] (1
1(
))
where subscripts denote partial derivatives. Replacing Eq. (8A.1) into the previous equation leaves
the envelope condition for the dynamic programming problem,
1(
) = W1 ( (
( (
+1
+1 )))
(8A.2)
By replacing Eq. (8A.2) back into Eq. (8A.1), and rearranging terms,
W2 ( (
W1 ( (
)
)
(
(
))
W1 ( (
))
+1
+1 )
+1
+1 )) (1 +
( +1 )) = 1
) (
))
W2 ( (
W1 ( ( +1 +1 ) ( +1 +1 )) (1 + ( +1 )) = 1
(8A.3)
W1 ( (
) (
))
) = max W (
( (
= max W (
+1
P
where +1 =
( +1 +
first order conditions are
+1
+1 )))
+1
+1 )
:0=
+1
W1 ()
+ W2 ()
1(
+1 ) (
+1
( (
+1
+1
+1
+1 )))
+1
+1 )]
) (
) (
))
W1 ( (
))
+1 )
+1
+1
+1 ))
+1
+1
=1
Derivation of Eq. (8.7). We need to explicitly determine the stochastic discount factor in Eq.
(8A.3)
;
+1
+1 )
W2 ( (
W1 ( (
)
)
(
(
))
W1 ( (
))
+1 )
+1
+1
+1 ))
Note that
W1 (
)=
W2 (
)=
such that
(
+ ((1
) )1
+ ((1
) )
W2 (
+1 ) =
W1 (
+1
1
1
+1 )
+1
1(
) = W1 ( (
(
+1 ))
+1
+1
+1 )
( (
W(
)
+1 )
))
(1
+1 )
+1
(8A.4)
+1
)). Therefore,
1
+1
(8A.5)
(1
1
1
+1 ))
+1
(1
=W( (
=
+1 ) =
( ( +1
( +1
( (
(
) )
)=W( (
=W(
((1
+1
)
W1 (
)
(
).
where
Along any optimal consumption path,
(
where the first equality follows by Eq. (8A.2), the second by the first of Eqs. (8A.4), and the last is
1
the optimality condition. We conjecture that (
) = ( ) 1 , such that the previous expression
delivers the agent's optimal consumption:
)= ( )
( )
( ) (1
)(
1)
(8A.6)
Therefore,
+1
= (1
( ))
(1 +
+1 ))
(8A.7)
+1 ))
+1 )
+1 ) (1 +
1
+1 ))
(8A.8)
( +1 ))1
)). Using the expression of W in Eq. (8.6) and
+1 ) (1 +
= ( )
=
=
1
1
1
1
( )
( )
+1 ) (1 +
(
(
1
+1
+1 )
+1 ) (1 +
+1 ) (1
( ))
+1 ))
(1 +
( +1 ))1 =
( +1 ) (1 +
(
1
( ( +1 +1 ))
=
( +1 +1 )
(1
( )) (
+1 ) (1 +
+1
+1 ))
(1
+1 ))
)(
(8A.9)
(1
+1 ))
( ))
(1 +
1
(1 +
+1 ))
1
1
(
! (1
)(
1)
(8A.10)
! (1
+1 ))
)(
1)
)(
+1
(1
( )
( )
( )
+1
1)
! (1
)(
1)
1)
where the first equality follows by Eq. (8A.6) and the second by Eq. (8A.7). The result follows by
replacing the previous expression into Eq. (8A.5).
Proof of Eqs. (8.12) and (8.13). By the standard property that if is normally distributed,
(), we can elaborate on Eq. (8.10), obtaining,
ln ( ) = () + 12
ln( +1 )+
+1
0 = ln
!
2
2
1
+1
2
2 2
ln
+
(
+
2
=
(8A.11)
+1 ) +
2
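The proof relies on the standard property that, for a normally distributed variable x, the logarithm of the expectation of exp(x) equals the mean of x plus one half of its variance. The following Python sketch checks this property by comparing the closed form with a direct numerical integration against the normal density. The function names, the integration grid and the numerical values are hypothetical choices made for illustration.

```python
import math


def log_mgf_normal(mean, var):
    """Closed form: ln E[exp(x)] = mean + var / 2 for x ~ N(mean, var)."""
    return mean + 0.5 * var


def log_expectation_numeric(mean, var, n=4001, width=12.0):
    """ln E[exp(x)] by trapezoidal integration of exp(x) times the normal density.

    The integration range covers `width` standard deviations on each side
    of the mean, which makes the truncation error negligible here.
    """
    sd = math.sqrt(var)
    lo = mean - width * sd
    h = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        density = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += weight * math.exp(x) * density
    return math.log(total * h)
```

The variance term is the source of the Jensen's inequality adjustments that appear in the expressions of this proof, for example in the risk-free rate.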
We do the same in Eq. (8.11), and obtain:
+1
= +
ln
(
1) (
+1 )
1
2
+(
1)
By replacing Eq. (8A.12) into Eq. (8A.11), we obtain Eq. (8.12) in the main text.
in Eq. (8.13), we replace the expression for (
To obtain the risk-free rate
into Eq. (8A.12).
1)
!
(8A.12)
+1 )
in Eq. (8.12)
ln( +1 )+ 1 1 +1
1 +
+1
) + ln
0 = ( 0 (1
1) 0
1
1 2
1 1
=
( 0 (1
+ ln
)+ 1
1) 0
0
2
1
+ ( 1
1) 1 + 1
1 1
const1 + const2
where the second equality follows by Eqs. (8.19) and (8.20). Note, then, that this equality can only
hold if the two constants, const1 and const2 are both zero. Imposing const2 = 0 yields,
1
1
1
as in Eq. (8.22) in the main text. Imposing const1 = 0, and using the solution for
for the constant 0 .
1,
1
)
1
+
1
2
=
(
)
( )k k
with
+
2
=0
Now, (
) is the aggregator, with being a variance multiplier, placing a penalty proportional to
) is the continuous time counterpart to the aggregator
utility volatility k k2 . The aggregator (
( ) of the discrete time case. The solution to the previous stochastic differential utility is:
Z
1
2
( )k k
(
=
)+
2
For example we can take,
(
)=
with =
6 0 and
1. The standard additive utility case is obtained once
case, ( ) = ( )
(see, e.g., Duffie and Epstein (1992, p. 367)).
[In progress]
= 0 and
= , in which
[8A.P1]
where
is the aggregate endowment in the economy.
We explain how the equilibrium price system is determined as the Arrow-Debreu state price density
in an economy with a single agent endowed with the aggregate endowment , instantaneous utility
is the reciprocal of the marginal utility of
function ( ), and where the social weighting function
income for agent
. Indeed, the first order conditions for the program [8A.P1] are
0
for all
(8A.13)
with respect to
, and
is a Lagrange multiplier.
,
=
(8A.14)
( ) denotes the partial derivative of with respect to . The second equality in Eq. (8A.14)
where
follows by the optimality conditions (8A.13), and the third, by differentiating the constraint of the
social plan [8A.P1]. Combining Eqs. (8A.13) and (8A.14) leaves
(
(
)
=
0)
(
(
)
0)
for all
(8A.15)
0
0
(
(
)
0)
for all
(8A.16)
To show that the allocations in Eq. (8A.15) and in Eq. (8A.16) are the same, we need to show that
(
)=
(8A.17)
1 , where
We show that Eq. (8A.17) holds true once the social weight
=
is the marginal
utility of income of agent . Indeed, in this case, Eq. (8A.13) and the optimality conditions for the
decentralized economy lead to,
for all
)=
and
)=
(8A.18)
Aggregating the first and the second of the previous conditions leaves:
Z
Z
=
(
)
=
where
and denote the inverse functions for consumption, as implied by the social and private
allocations in (8A.18). Eq. (8A.17) follows by an argument similar to that used to show Theorem 2.7 in
Chapter 2, and by the envelope theorem in Eq. (8A.14). Therefore, the pricing kernel in this economy
with heterogeneous agents and complete markets has the same pricing implication as the economy
with a single agent as explained, viz
( )
=
(8A.19)
( 0)
0
The practical merit of this approach is that while the marginal utility of income is unobservable, the
Arrow-Debreu state price density thus constructed depends on the infinite-dimensional parameter,
, which could be calibrated to match selected quantitative features of consumption and asset price
data.
We now apply this approach and derive the equilibrium conditions in two models, and then move
on to study the allocation process in an incomplete market setting.
Catching up with the Joneses (Chan and Kogan, 2002). In this model, markets are
(
) =
complete, and we have that
= [1 ] and the instantaneous utility of agent
is,
(1
), where is the standard of living of others, as explained in the main text.
( / )1
The static optimization problem for the social planner in [8A.P1] can be written as,
Z
Z
( / )1
=
(8A.20)
(
) = max
s.t.
1
1
1
The first order conditions for this problem lead to,
(8A.21)
2
( (
))
( (
))
(
) ln
(8A.22)
( (
)) =
0
2
where
that:
( ) is the value function in Eq. (8.25). To evaluate the previous expression for , note, rst,
Z
0
1
( (
))
1 1
( (
))
=
( (
))
(8A.23)
1
Moreover, by differentiating Eq. (8.26) with respect to , using Eq. (8A.23), and rearranging terms,
leads to
( (
))
= ( (
)) . Di erentiating this expression for
with respect
to
again, produces:
0
2 ( (
))
( (
)) 1
(8A.24)
=
2
Replacing Eqs. (8A.23)-(8A.24) into Eq. (8A.22) yields Eq. (8.27) in the main text. In this fictitious
representative agent economy, the short-term rate is the expectation of the stochastic discount factor.
It equals, again by results given in Section 7.5.1 of Chapter 7,
2
)=
( (
( (
))
))
( (
1+
))
( (
))
( (
))
))
(
)
))
( (
( (
1
2
2
0
1
2
2
0
00
( (
( (
))
( (
( (
))
))
0
( (
))
))
It is instructive to compare the first order conditions of the social planner in Eq. (8A.21) with
those in the decentralized economy. Because markets are complete, the optimality conditions in the
decentralized economy satisfy:
m
1
s
(8A.25)
(8A.26)
=
1,
(8A.27)
Comparing the two implications in (8A.26) and (8A.27) leads to the conclusion that the social
allocations with = 1 , are the same as the private, and that the counterpart to Eq. (8A.19) is,
1
(
)
=
=
1
( 0 )
0
0
0
Restricted stock market participation (Basak and Cuoco, 1998). Given Eq. (8.39), and
results given in Section 7.5.1 of Chapter 7, the unit risk-premium, (
), solves a xed point problem:
(
)=
11 (
)
)
1(
11 (
12 (
1(
)
)
That is,
(
)=
1(
1(
12 (
We claim that:
1(
)=
( ), and
1(
1(
)
)
11 (
12 (
0
1(
(8A.28)
)
=
)
00
( )
(8A.29)
where
and
are the social planner consumption allocations. By replacing Eqs. (8A.29) into Eq.
and , leads to the expression for in Eq. (8.34). The expression
(8A.28), and using the definition of
for the short-term rate in Eq. (8.33) can be found similarly, and again, through the results given in
Section 7.5.1 of Chapter 7.
We now show that Eqs. (8A.29) hold true. Consider the Lagrangean for the maximization problem
in Eq. (8.38),
( )+ (
)
= ( )+
( )
where is a Lagrange multiplier, = 0 ( ) , and and are the market consumption allocations.
= (
) and
The first order conditions for the social planner lead to social allocation functions
= (
), and Lagrange multiplier (
), satisfying:
0
)) = (
)=
))
))
(8A.30)
)=
)) +
such that
1(
)=
))
))
))
+
)
(
(
))
(8A.31)
where the second equality follows by the first order conditions in Eq. (8A.30), and the third equality
holds by differentiating the equilibrium condition
(
)+
(8A.32)
with respect to .
Eq. (8A.31) establishes the first claim in Eqs. (8A.29). To prove the second claim, invert, first,
(
) = 0 1[ (
)] and
(
) =
the first order condition with respect to , obtaining,
0 1[ (
) ]. Replace, then, these inverse functions into Eq. (8A.32),
=
0 1
[ (
)] +
0 1
[ (
(8A.33)
)=
00 (
1
(
))
and
1=
00 (
1
(
12 (
))
and
)+
11 (
1
)
12 (
+
0 = 00
( (
))
)
11 (
00
)) =
1(
(8A.34)
1
( (
)+
))
00
1
00 ( (
11 (
1
( (
))
1(
(8A.35)
(8A.36)
))
12 (
1(
The second relation in Eqs. (8A.29) follows by rearranging terms in the previous relation.
Extinction I: derivation of Eq. (8.67). Denote with 0 the initial wealth of agent . Note
), = 1 2, such that by using
that the budget constraints faced by the two agents are: 0 = (
and
in Eqs. (8.65) and (8.66),
the expressions for
1
1
1
1
(
) (1 + (
) )
(
02
2 )
(8A.37)
1=
=
=
(
01
1 )
(1 + (
)1 ) 1 1
where the first equality follows by the assumption that rational and irrational agents have the same
initial endowments.
Let us introduce a new probability, , defined through the Radon-Nikodym derivative,
1
=
( 1 )
F
such that Eq. (8A.37) can be re-written as
(
)1 (1 + (
)1 )
= (1 + (
)1 )
(8A.38)
where denotes the expectation under . By Girsanov's theorem, we have that under ,
to
2 1 2 2
) +
2
= ((1 )
where is a Brownian motion under . That is, utilizing the expression for
(8A.38) can be written as
1 2 2
+
1 (1 + 1 ) 1 = (1 + 1 ) 1
2
is solution
(8A.39)
Extinction II: derivation of Eq. (8.68). We determine the equilibrium asset price in the
logarithmic preferences case. We have:
=
)=
(1 + )
(1 + )
1+
(1 + )
where the second line follows by the martingale property of . To determine the denominator of the
previous ratio, note that,
"
"
1#
1#
(
)
+
(1 + )
=
=
"
(
2(
where the second equality follows by a change in probability and the third, by Girsanov's theorem and
Eq. (8.62). Therefore,
=
1+
(1 + )
1+
1+
=
Replacing this value for
2(
= 0 and
(8A.40)
= 1 for all ,
=
M
1 2
2
=
exp
( ( | )
) +
var ( | )
2
Note that ( | ) is normally distributed with, say, mean
var [ ( | )], such that,
1 2 2
) +
=
exp
(
M
2
[ ( | )] and variance
2
1
+
2
var ( | )
(8A.41)
[ ( | )] =
Regarding
( )=
and
[var ( | )]
+ var ( | )
implied by the previous two equalities into Eq. (8A.41) leaves Eq.
Solution of the multiple likelihood model. Consider the APT equation delivering E , i.e.,
the asset expected returns under the distorted probability in (8.95),
E =
Vol
1
( + 1)
2
2
0
0(
)
( )
( )
(8A.42)
where
denotes the infinitesimal generator of ( ) in (8.99), and the last two lines follow by Eqs.
(8.100). That is, the price-dividend ratio is solution to the following partial differential equation:
1
0
2
2
( + 1) 0 + 0
) ( )
+
( )+1
0=
( ) + 0 ( ) (1
2
Conjecture that ( ) = 0 + 1 , for two positive constants. Plugging this guess back into the
previous equation confirms that the price-dividend ratio is indeed affine in the expected growth. To
determine 0 and 1 , one may proceed through the following intuitive arguments. Note that
1
Z
1
1
(0 ) =
+ (1
= 0
0)
1
2)
1
2)
0
0
(1
)(
(1
) (
0
2
0
2
where the first equality is a standard evaluation formula, and the second relies on Eq. (8.99). Replacing
) , delivers
the definition of expected dividend growth into the previous equation, = + (1
Eq. (8.101).
Next, we determine the expression for the equity premium under the reference model, i.e., E in
( | ( ) ) = + (1
) and A is the average
(8.102). Note that = + A 0 , where
size of ambiguity aversion, also defined in (8.102). Therefore, by Girsanov's theorem (see Chapter 4),
we can define a probability
with Radon-Nikodym derivative against the worst-case probability
(i.e., the probability under which ( ) are solution to (8.99)),
= 12 0 A2
0 A
such that
0A
= =
. Then, under
, we have that
(8A.43)
A ( )
+ ( )
where () denotes as usual the diffusion coefficient of in (8.99). The APT equation delivering the
is
asset expected returns under
+
E
=
=
=
0 ()
1
() 0 +
()
0 ()
A 0 + +
()
()
0 ()
A ( ) A 0 +
+ +
1
( )=
where
+ (1
)(
(1
) (
1
2
0.
2)
0
(1
)(
1
2
2)
0
(8A.44)
References
Abel, A.B. (1990): Asset Prices under Habit Formation and Catching Up with the Joneses.
American Economic Review Papers and Proceedings 80, 38-42.
Abel, A.B. (1999): Risk Premia and Term Premia in General Equilibrium. Journal of Monetary Economics 43, 3-33.
Adrian, T. and H. S. Shin (2011): Financial Intermediaries and Monetary Economics. In
B. M. Friedman and M. Woodford (Editors): Handbook of Monetary Economics (North-Holland Elsevier), Vol 3A, Chapter 12, 601-650.
Alvarez, F. and U.J. Jermann (20??):
Bansal, R. and A. Yaron (2004): Risks for the Long Run: A Potential Resolution of Asset
Pricing Puzzles. Journal of Finance 59, 1481-1509.
Basak, S. (2000): A Model of Dynamic Equilibrium Asset Pricing with Heterogeneous Beliefs
and Extraneous Risk. Journal of Economic Dynamics and Control 24, 63-95.
Basak, S. (2005): Asset Pricing with Heterogeneous Beliefs. Journal of Banking and Finance
29, 2849-2881.
Basak, S. and D. Cuoco (1998): An Equilibrium Model with Restricted Stock Market Participation. Review of Financial Studies 11, 309-341.
Bernanke, B.S. and M. Gertler (1989): Agency Costs, Net Worth, and Business Fluctuations.
American Economic Review 79, 14-31.
Bernanke, B.S. (2004): The Great Moderation: Remarks by the Federal Reserve Board Governor, The Meetings of the Eastern Economic Association, Washington, DC, February
20.
Bernanke, B. S., M. Gertler and S. Gilchrist (1999): The Financial Accelerator in a Quantitative Business Cycle Framework. In J.B. Taylor and M. Woodford (Editors): Handbook
of Macroeconomics (North-Holland Elsevier), Vol. 1C, Chapter 21, 1341-1393.
Berrada, T. (2006): Incomplete Information, Heterogeneity, and Asset Pricing. Journal of
Financial Econometrics 4, 136-160.
Black, F. (1976): Studies of Stock Price Volatility Changes. Proceedings of the 1976 Meeting
of the American Statistical Association, 177-81.
Boldrin, M., L. Christiano and J. Fisher (2001): Habit Persistence, Asset Returns and the
Business Cycle. American Economic Review 91, 149-166.
Brunnermeier, M. K., T. M. Eisenbach and Y. Sannikov (2012): Macroeconomics with Financial Frictions: A Survey. Working Paper Princeton University.
Brunnermeier, M. K. and Y. Sannikov (2013): A Macroeconomic Model with a Financial
Sector. Working Paper Princeton University.
Buraschi, A. and A. Jiltsov (2006): Model Uncertainty and Option Markets with Heterogeneous Beliefs. Journal of Finance 61, 2841-2897.
Campbell, J.Y., A. W. Lo and C. MacKinlay (1997): The Econometrics of Financial Markets.
Princeton: Princeton University Press.
Campbell, J.Y. (2003): Consumption-Based Asset Pricing. In Constantinides, G. M., M.
Harris and R. M. Stulz (Editors): Handbook of the Economics of Finance (North-Holland
Elsevier), Vol 1B, Chapter 13, 803-887.
Campbell, J.Y., and J.H. Cochrane (1999): By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior. Journal of Political Economy 107, 205-251.
Campbell, J.Y. and R. Shiller (1988): The Dividend-Price Ratio and Expectations of Future
Dividends and Discount Factors. Review of Financial Studies 1, 195-228.
Chan, Y.L. and L. Kogan (2002): Catching Up with the Joneses: Heterogeneous Preferences
and the Dynamics of Asset Prices. Journal of Political Economy 110, 1255-1285.
Christie, A.A. (1982): The Stochastic Behavior of Common Stock Variances: Value, Leverage,
and Interest Rate Effects. Journal of Financial Economics 10, 407-432.
Cochrane, J. H., F. A. Longstaff, and P. Santa-Clara (2008): Two Trees. Review of Financial
Studies 21, 347-385.
Constantinides, G.M. and D. Duffie (1996): Asset Pricing with Heterogeneous Consumers.
Journal of Political Economy 104, 219-240.
Constantinides, G.M., J.B. Donaldson and R. Mehra (2002): Junior Can't Borrow: A New
Perspective on the Equity Premium Puzzle. Quarterly Journal of Economics 117, 269-296.
Cujean, J. and M. Hasler (2001): Fear of Recessions, Heterogenous Beliefs, and Stock Price
Under/Over-Reaction. Working Paper Swiss Finance Institute EPFL.
Cvitanic, J., and S. Malamud (2011): Price Impact and Portfolio Impact. Journal of Financial Economics 100: 201-225.
Danielsson, J., J.-P. Zigrand and H. S. Shin (2011): Balance Sheet Capacity and Endogenous
Risk. Working Paper London School of Economics and Princeton University.
Detemple, J. and Murthy, S. (1994): Intertemporal Asset Pricing with Heterogeneous Beliefs
Journal of Economic Theory 62, 294-320.
Dow, J. and S. Werlang (1992): Uncertainty Aversion, Risk Aversion, and the Optimal Choice
of Portfolio. Econometrica 60, 197-204.
Duffie, D. (1992): The Nature of Incomplete Security Markets. In J-J Laffont (Editor):
Advances In Economic Theory, 6th World Congress, Vol. II, Chapter 4, 214-262.
Duffie, D. and L.G. Epstein (1992a): Asset Pricing with Stochastic Differential Utility. Review of Financial Studies 5, 411-436.
Duffie, D. and L.G. Epstein (with C. Skiadas) (1992b): Stochastic Differential Utility. Econometrica 60, 353-394.
Dumas, B., A. Kurshev and R. Uppal (2009): Equilibrium Portfolio Strategies in the Presence
of Sentiment Risk and Excess Volatility. The Journal of Finance 64, 579-629.
Ellsberg, D. (1961): Risk, Ambiguity and the Savage Axioms. Quarterly Journal of Economics 75, 643-69.
Epstein, L.G. and S.E. Zin (1989): Substitution, Risk-Aversion and the Temporal Behavior of
Consumption and Asset Returns: A Theoretical Framework. Econometrica 57, 937-969.
Epstein, L.G. and S.E. Zin (1991): Substitution, Risk-Aversion and the Temporal Behavior of
Consumption and Asset Returns: An Empirical Analysis. Journal of Political Economy
99, 263-286.
Friedman, M. (1953): The Case for Flexible Exchange Rates. In Essays in Positive Economics.
Chicago: University of Chicago Press.
Gallmeyer, M., Aydemir, A.C. and B. Hollifield (2007): Financial Leverage and the Leverage
Effect: A Market and a Firm Analysis. Working Paper Carnegie Mellon.
Geanakoplos, J. (2010): The Leverage Cycle. In D. Acemoglu, K. Rogoff and M. Woodford
(Editors): NBER Macroeconomic Annual 2009 (University of Chicago Press), Vol 24,
1-65.
Gertler, M. and N. Kiyotaki (2011): Financial Intermediation and Credit Policy in Business
Cycle Analysis. In B. M. Friedman and M. Woodford (Editors): Handbook of Monetary
Economics (North-Holland Elsevier), Vol 3A, Chapter 11, 547-599.
Gilboa, I. and M. Marinacci (2011): Ambiguity and the Bayesian Paradigm. In Advances
in Economics and Econometrics: Theory and Applications, Tenth World Congress of the
Econometric Society. Cambridge: Cambridge University Press.
Gilboa, I. and D. Schmeidler (1989): Maxmin Expected Utility with a Non-Unique Prior.
Journal of Mathematical Economics 18, 141-153.
Guvenen, F. (2009): A Parsimonious Macroeconomic Model for Asset Pricing. Econometrica
77, 1711-1740.
Harrison, J.M. and D.M. Kreps (1978): Speculative Investor Behavior in a Stock Market with
Heterogeneous Expectations. Quarterly Journal of Economics 92, 323-36.
Hansen, L. P. and T. J. Sargent (2008): Robustness. Princeton: Princeton University Press.
He, Z. and A. Krishnamurthy (2012): Intermediary Asset Pricing. Forthcoming in the American Economic Review.
Heaton, J. and D.J. Lucas (1996): Evaluating the Effects of Incomplete Markets on Risk
Sharing and Asset Pricing. Journal of Political Economy 104, 443-487.
Huang, C.-f. (1987): An Intertemporal General Equilibrium Asset Pricing Model: the Case
of Diffusion Information. Econometrica 55, 117-142.
Hugonnier, J. and R. Prieto (2012): Arbitrageurs, Bubbles, and Credit Conditions. Working
paper SFI- EPFL (Lausanne) and Boston University.
Jermann, U.J. (1998): Asset Pricing in Production Economies. Journal of Monetary Economics 41, 257-276.
Karatzas, I. and S.E. Shreve (1991): Brownian Motion and Stochastic Calculus. New York:
Springer Verlag.
Keynes, J.M. (1921): A Treatise on Probability. London: MacMillan and Co.
Kiyotaki, N. (1998): Credit and Business Cycles. Japanese Economic Review 49, 18-35.
Kiyotaki, N. and J. Moore (1997): Credit Cycles. Journal of Political Economy 105, 211-248.
Klibanoff, P., M. Marinacci, and S. Mukerji (2005): A Smooth Model of Decision Making
under Ambiguity. Econometrica 73, 1849-1892.
Knight, F.H. (1921): Risk, Uncertainty, and Profit. New York: Houghton Mifflin.
Kogan, L., S. Ross, J. Wang, and M. Westerfield (2006): The Survival and Price Impact of
Irrational Traders. Journal of Finance 61, 195-229.
Kyle, A.S. and F.A. Wang (1997): Speculation Duopoly with Agreement to Disagree: Can
Overconfidence Survive the Market Test? Journal of Finance 52, 2073-90.
Leippold, M., F. Trojani and P. Vanini (2008): Learning and Asset Prices under Ambiguous
Information. Review of Financial Studies 21, 2565-2597.
Lettau, M. (2002): Idiosyncratic Risk and Volatility Bounds, or, Can Models with Idiosyncratic Risk Solve the Equity Premium Puzzle? Review of Economics and Statistics 84,
376-380.
Liptser, R.S. and A.N. Shiryaev (2001): Statistics of Random Processes, Vol. II (Applications).
Berlin, Springer-Verlag.
Lucas, R.E. (19??):
Lucas, D.J. (1994): Asset Pricing with Undiversifiable Income Risk and Short Sales Constraints: Deepening the Equity Premium Puzzle. Journal of Monetary Economics 34,
325-341.
Mankiw, N.G. (1986): The Equity Premium and the Concentration of Aggregate Shocks.
Journal of Financial Economics 17, 211-219.
Mankiw, N.G. and S.P. Zeldes (1991): The Consumption of Stockholders and Non-Stockholders.
Journal of Financial Economics 29, 97-112.
Markowitz, H. (1952): Portfolio Selection. Journal of Finance 7, 77-91.
Weil, Ph. (1989): The Equity Premium Puzzle and the Risk-Free Rate Puzzle. Journal of
Monetary Economics 24, 401-421.
Weil, Ph. (1992): Equilibrium Asset Prices with Undiversifiable Labor Income Risk. Journal
of Economic Dynamics and Control 16, 769-790.
Xiouros, C. and F. Zapatero (2010): The Representative Agent of an Economy with External
Habit Formation and Heterogeneous Risk Aversion. Review of Financial Studies 23,
3017-3047.
Zapatero, F. (1998): Effects of Financial Innovations on Market Volatility when Beliefs are
Heterogeneous. Journal of Economic Dynamics and Control 22, 597-626.
9
Information and other market frictions
9.1 Introduction
In the economies of the previous chapters, the equilibrium outcomes do not convey more information than that available to each agent because information, whilst sometimes incomplete, is
disseminated symmetrically across decision makers. For example, asset prices aggregate information that the agents already know, such that the agents' inference on the asset fundamentals
would not improve were it also based on the equilibrium prices.
This chapter considers markets in which equilibrium outcomes aggregate information dispersed across the market, which agents find useful while updating beliefs. This information is
useful because the pieces of information agents have access to are not the same, such that, now,
asset prices contain information about the fundamentals that some agents might not directly
observe, and which is made publicly available (so to speak) through trading activity. We study
markets with asymmetric information, in which some agents have more precise information than
others, and markets with differential information, in which agents know different pieces of information that have the same quality. In the economies of the previous chapters, the price only
determines the budget constraints. While the price still determines budget constraints in the
markets that we analyze in this chapter, this price now plays a new, additional, fundamental
role: it conveys information to investors.
Note how subtle the equilibrium concept needs to be in the markets of this chapter. If agents
find it useful to condition their choices upon the information conveyed by an equilibrium outcome, these very same choices may well affect the information contained in the equilibrium
outcome, over a fixed point. Provided it exists, this fixed point leads to implications regarding
the informational role of asset prices, that is, the equilibrium amount of information. In the
asymmetric information case, we say that the price transmits information from the more
informed investors to the less informed. In the differential information case, we say that the price aggregates information dispersed amongst investors. Both cases play outstanding roles in economics.
Consider the asymmetric information case: if uninformed investors can learn from the equilibrium price, what are the incentives left to purchase information? In other words, what is
the value of information? The differential information case is equally important. The welfare
theorems reviewed in Chapter 2 suggest that a Pareto optimal allocation can be centralized.
However, this solution cannot be implemented while portfolio decisions are made on the basis
of local information. The market solution proves useful as the price would now aggregate
dispersed information, making it available to investors initially not having direct access to it.
These ideas go back to Hayek (1945) at least.
There are indeed a number of conceptual issues that arise in these markets. Namely, how much
information does a price need to convey for an equilibrium to exist? Intuitively, if the asset
price conveys too precise information, and becomes a sufficient statistic for the asset payoff,
the agents would not even need to condition upon their own information while formulating
portfolio decisions. But then, if agents trade while only relying on the equilibrium price (and
not on their own signals), how can the price aggregate information? This is the Grossman
(1976) paradox. Note, also, that the price cannot reveal all private information when information
is costly, for otherwise there would not be incentives left to purchase information in the first
place: this is the Grossman and Stiglitz (1980) paradox.
A standard approach to deal with these paradoxes is to assume that prices convey noisy
signals about the information that investors have. As Black (1986) discussed, noise makes
markets function when information problems would otherwise lead them not to arise in the first
place. The mechanism is simple. In equilibrium, the price incorporates the information on which
informed investors trade, but also other factors that are possibly unrelated to the fundamentals,
such as liquidity shocks, i.e. noise. An uninformed investor, now, cannot tell whether a large asset
price swing is due to information or to a liquidity shock, as the price only partially reveals
the information informed investors have. So uninformed investors learn from the price, but the
information conveyed by the price is imperfect, such that informed investors still have superior
information, even after the uninformed have learned from the price. Thus, it can pay to purchase information. All
in all, partial revelation is the key to making asset markets work in this context.
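The partial-revelation mechanism can be illustrated with a minimal sketch. It assumes, purely for illustration, that the uninformed investor reads from the price a signal equal to the payoff plus independent normal noise (the liquidity shock), and that the prior on the payoff is normal. The function name and all parameter values are hypothetical and are not taken from the models of this chapter.

```python
def posterior_payoff(price_signal, prior_mean, prior_var, noise_var):
    """Normal-normal update of the payoff, given price_signal = payoff + noise.

    Assumption (illustrative): payoff ~ N(prior_mean, prior_var) and the noise
    is independent N(0, noise_var). Requires prior_var > 0.
    """
    # Weight placed on the signal: 1 when there is no noise, near 0 when noise dominates.
    weight = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + weight * (price_signal - prior_mean)
    # Residual uncertainty of the uninformed investor after observing the price.
    post_var = prior_var * noise_var / (prior_var + noise_var)
    return post_mean, post_var


# With zero noise the price is fully revealing (posterior variance 0);
# with more noise the uninformed investor learns less from the price.
for noise_var in (0.0, 0.5, 4.0):
    mean, var = posterior_payoff(1.2, prior_mean=1.0, prior_var=1.0, noise_var=noise_var)
    print(f"noise_var={noise_var}: posterior mean={mean:.3f}, posterior variance={var:.3f}")
```

In this sketch, an investor who observes the payoff directly keeps an informational advantage whenever the noise variance is positive, which is the sense in which information retains a value.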
The equilibrium concept to deal with these markets is an extension of the Rational Expectations Equilibrium (REE) the previous chapters have relied upon, and is called Noisy REE
(NREE). It appears that macroeconomics contains the first example of an equilibrium in
which agents confuse fundamentals with noise. In an attempt to explain the relation between
the conduct of monetary policy and the business cycle, Phelps (1970) and Lucas (1972) consider an economy in which agents have imperfect information regarding the fundamentals of
the economy. This information-based approach to the business cycle is summarized in Lucas
(1981), and was somehow abandoned in favor of the real business cycle theory (see Chapter 3),
perhaps because information could hardly be regarded as the main engine of macroeconomic
fluctuations. However, the interplay between macroeconomics and information helped Lucas
(1972) develop the notion of rational expectations within a noisy economy. In Section 9.2, we
present a simplified version of the Lucas framework, which helps pave the way to the study of
asset markets in subsequent sections.
A central theme of this chapter is the role of information in asset markets. In many of the
models of this chapter, noise is what makes these markets work, as explained. Liquidity shocks
are the most natural example that illustrates this concept. In fact, the models in this chapter
make sharp predictions on how liquidity shocks affect asset prices. For example, we shall explain that,
in non-competitive markets (markets beyond the standard NREE), uninformed trades may have
a price impact because market makers confuse information with noise: being unsure about the
nature of the orders they see (information-driven or liquidity shocks), market makers price the
assets in a way that even a liquidity shock (noise) gets impounded in equilibrium, a clear case
of adverse selection. Liquidity in asset markets therefore constitutes the other side of the same
coin (against information) in this chapter.
The core of this chapter is then to analyze how information affects asset markets. Section
9.3 discusses the classical notions of informational efficiency in the early empirical literature,
and how the subsequent information literature has helped shape these notions. This literature
is reviewed next. First, Sections 9.4 and 9.5 explain that the standard notions of a Walrasian
equilibrium or REE are not sufficient to deal with markets in which agents have asymmetric
or diverse information. Section 9.6 then deals with NREE. Sections 9.7 and 9.8 cover markets
with non-competitive players.
While information does play a fundamental role to explain prices and liquidity in the microstructure of asset markets, information cannot be the only driver of market liquidity. Liquidity in markets driven by macroeconomic news (government bonds, for example) cannot be
only driven by investors with superior or diverse information. Search and bargaining are
alternative explanations in these markets. Section 9.9 studies liquidity in markets where information plays a more limited role, with a focus shifted to the search nature of OTC markets.
Section 9.10 concludes the chapter and examines a number of additional mechanisms that
could potentially affect the asset price formation process, such as the presence of irrational (or
noise) traders and additional capital market imperfections that lead to limits to arbitrage.
Eq. (9.1) follows, approximately, once we assume that the average price, , is common knowledge, as for example in the monopolistic competition model of Blanchard and Kiyotaki (1987).
When is not common knowledge as in the analysis of this section, Eq. (9.1) can still be
thought of as arising through a plausible decision mechanism. The specific functional form for
the average price is the most important approximation made while deriving Eq. (9.1) based
on a rigorous micro-founded setup. Appendix 3 to this chapter contains a discussion of these
issues.
Information is disseminated differentially (not asymmetrically), in that producers in the -th
island are not aware of the price in the remaining islands, and make statistical inference on
economic developments occurring in the other islands with the same precision. We conjecture
and, later, verify, that all variables, exogenous and endogenous, are normally distributed.
We shall show that this normality property implies the price index gathers all information
efficiently, i.e. is a sufficient statistic for all that information.
By the Projection Theorem reviewed in Appendix 1, we have that:
=
| )
( ))
where we have used the fact that information is symmetrically disseminated and, then, (i)
the expectation ( ) = ( ) = ( ) for every and , and (ii) both the numerator and
denominator of the ratio, ( ) / ( ), are the same across all islands. This coefficient will
be determined below, as a part of the equilibrium.
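As a numerical illustration of the Projection Theorem for jointly normal variables, the following sketch checks that the conditional expectation of one variable given another is linear, with slope equal to the ratio of their covariance to the variance of the conditioning variable. The variables `common` (an aggregate component) and `local` (the aggregate component plus an idiosyncratic term) are hypothetical stand-ins for an average and an island-specific quantity; the standard deviations 1 and 2 are arbitrary.

```python
import numpy as np


def projection(x, y):
    """Return (intercept, slope) of the linear projection of x on y.

    slope = cov(x, y) / var(y); intercept = mean(x) - slope * mean(y).
    """
    c = np.cov(x, y)  # 2x2 sample covariance matrix of (x, y)
    slope = c[0, 1] / c[1, 1]
    intercept = x.mean() - slope * y.mean()
    return intercept, slope


rng = np.random.default_rng(0)
n = 200_000
common = rng.normal(1.0, 1.0, n)           # aggregate component, variance 1
local = common + rng.normal(0.0, 2.0, n)   # local observation, idiosyncratic variance 4

intercept, slope_hat = projection(common, local)
slope_theory = 1.0**2 / (1.0**2 + 2.0**2)  # var(common) / var(local) = 0.2

print(f"estimated slope {slope_hat:.4f} versus theoretical slope {slope_theory:.4f}")
```

The slope is strictly between zero and one: a local observation is informative about the aggregate, but only partially, because it also reflects idiosyncratic variation.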
Aggregating across all islands yields the celebrated Lucas supply equation:
1X
( ))
(9.2)
=1
Next, assume the demand for the good produced in the -th island is given by:
=
+
(
) where
0 2
and
(9.3)
( )+
where
(9.4)
P
are sectoral shocks in that:
= 0.
Finally, we assume that ( ) = 0, and that
=1
The functional form for the demand function, , follows after assuming the goods in the islands
are imperfect substitutes (see, e.g., Blanchard and Kiyotaki, 1987).
The equilibrium price in the islands plays two roles in this economy. A first, and standard
role, is to clear the markets, being such that = , or:
(
( )) =
) , for all
(9.5)
The second role the price performs is to convey information to agents regarding the two
shocks: (i) the macroeconomic, monetary shock, ; and (ii) the real shocks in all the islands,
, = 1 . We conjecture that the only real shock that determines the price in the -th
island is , i.e. that the price is a function
(
). We also conjecture this price is affine,
in and , viz
(
)= + +
(9.6)
where the coefficients , and have to be determined in equilibrium. Under these conditions,
the average price is a function
( ) satisfying
( ) = +
(9.7)
Let us replace Eqs. (9.6), (9.7) and (9.4) into the equilibrium condition, Eq. (9.5). By rearranging terms,
0=( +
1) + ( +
1) +
( )
The previous equation has to hold for all
and
=
and
. Therefore,
( )
1
1
=
(9.8)
1+
+
We are left with determining , which given Eqs. (9.6)-(9.7), and Eq. (9.8), is easily shown to
equal:
2
)
(
=
=
(9.9)
2
( )
+
2 +
2
1+
=
The previous equation has a unique and positive fixed point for , which can then be replaced
back into Eqs. (9.8), yielding the solutions for and , which are both positive.
We can now figure out the implications of this equilibrium. Replacing Eqs. (9.6)-(9.7) into
the Lucas supply equation (9.2) leaves:
=
This is Lucas' celebrated neutrality result. Anticipated monetary policy, ( ), does not affect
the equilibrium outcome, . It is the monetary shock that affects . Agents in any island do
not observe the price in the remaining islands and, hence, the aggregate price level, . Therefore,
they are unable to tell whether an increase in the price of the good they produce, , is due to
a real shock, , or to a monetary shock, . In other words, they cannot disentangle a monetary
shock from a real shock. If the agents were informed about real shocks, they would of course
infer , and a monetary shock would not exert any effect on the equilibrium production.
In other words, in equilibrium, the price difference is
=
, which does not depend on
. It is a dichotomy prediction reminiscent of the classical theory. Note, however, that
is not observed, as is not observed. Instead, the producers in the -th island can only make
inference on,
2 2
( | )=
2 2
2 2
(
) = 2 2.
The previous term co-varies positively with the observed price, ,
The covariance is zero precisely when the assumption is removed of imperfect knowledge regarding the real shocks, 2 = 0, in which case = 0. In contrast, and assuming imperfect
knowledge, producers act so as to compensate for their partial lack of knowledge, and produce
to the maximum extent they can justify, on the basis of the positive statistical co-movements,
(
) 0. Note, if ( ) =
1 , i.e. money supply in the previous period, then from
Eq. (9.7), the inflation rate,
=
+ (1
) 1 . Therefore, output and inflation are positively
correlated, and generate a Phillips curve, which policy makers cannot exploit anyway,
as anticipated monetary policy, ( ), is rationally factored out, and does not affect output.
This is the essence of the Lucas critique (Lucas, 1977).
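The neutrality logic can be checked in a deliberately simplified island economy. The sketch below is not the specification of this section: it assumes a hypothetical demand function equal to money minus the local price plus a real shock, which delivers a closed-form solution rather than a fixed point. All names (`beta`, `m_bar`, `eps`, `z`, `theta`, `b`) are illustrative. Supply in each island is `beta` times the gap between the local price and the expected average price, where the expectation uses the projection coefficient `theta`.

```python
import numpy as np


def island_equilibrium(beta, var_eps, var_z):
    """Projection coefficient and price sensitivity in the simplified economy.

    Assumed prices: p_i = m_bar + b * (eps + z_i), average price p = m_bar + b * eps,
    so that theta = cov(p, p_i) / var(p_i) = var_eps / (var_eps + var_z).
    Requires var_eps + var_z > 0.
    """
    theta = var_eps / (var_eps + var_z)
    b = 1.0 / (1.0 + beta * (1.0 - theta))
    return theta, b


def island_markets(beta, m_bar, eps, z, var_eps, var_z):
    """Supply and demand in every island, given money m = m_bar + eps and real shocks z."""
    theta, b = island_equilibrium(beta, var_eps, var_z)
    p_i = m_bar + b * (eps + z)                 # conjectured affine local prices
    expected_p = m_bar + theta * (p_i - m_bar)  # inference on the average price
    supply = beta * (p_i - expected_p)          # Lucas-type supply
    demand = (m_bar + eps) - p_i + z            # hypothetical demand function
    return supply, demand


z = np.array([-0.5, 0.1, 0.4])  # real shocks, summing to zero across islands
for m_bar in (0.0, 5.0):        # two levels of anticipated money
    supply, demand = island_markets(2.0, m_bar, eps=0.3, z=z, var_eps=1.0, var_z=3.0)
    print(f"m_bar={m_bar}: markets clear={np.allclose(supply, demand)}, "
          f"average output={supply.mean():.4f}")
```

Under these assumptions, average output is the same for both values of `m_bar` and depends only on the unanticipated component `eps`. Setting `var_z` to zero gives `theta = 1`, so that supply no longer responds to the monetary shock at all.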
In the next sections, we analyze asset markets that work due to similar mechanisms. Why
should we ever buy some assets from those people insisting on selling them? Trading seems to
be a difficult phenomenon to explain, in a world with imperfect information. Yet trading does
occur, if imperfect information has the same nature as that of the Phelps-Lucas model. Agents
might well be imperfectly informed about the nature of, say, unusually high market orders. For
example, sell orders might arrive to the market, either because the asset is a lemon or because
the agents selling it are hit by a liquidity shock. In the models of this chapter, an equilibrium
with rational expectations exists, precisely because of this noise (liquidity, in this example).
There is a chance the sell order arrives to the market, simply because the agents selling it are
hit by a liquidity shock. Imperfectly informed agents, therefore, might be willing to buy, if it is
in their interest to do so.
We would say asset markets are informationally efficient if prices reflect available information,
accurately and rapidly. This definition is obviously loose, but purposely so, as our
objective, now, is to illustrate how the process of narrowing it down leads to topics that have
been the object of controversy. In particular, we need to qualify the (i) type and (ii) quality of
information embedded into and revealed by the price: What type of information does the price
reveal? How accurately can the price convey information?
The first question, relating to the type of information, has been addressed in a famous contribution by Fama (1970), who considers three forms of informational efficiency. First, strong
efficiency, arising when the price reflects all private and public information. Second, semi-strong
efficiency, the situation in which the price conveys all public information. Third, weak efficiency,
arising when the price only conveys information regarding past (price) data.
At least initially, the motivation to define efficient markets (in the informational sense) was
to illustrate that markets cannot be in a state of disequilibrium. For example, if asset prices
tend to be high on Friday and to decline on Monday, a profitable trading opportunity might
seem to arise. An equilibrium, the motivation goes, is informationally efficient (weakly so, in
this example), should the average return from this opportunity be small enough to discourage
trading on it. We know this reasoning has fallacies. Even if the average gain is statistically large,
we might have no agent attempting it, due to risk-aversion or trading frictions. Therefore,
we would never know whether even a potentially large gain (in statistical terms) is a market
inefficiency or rather, say, compensation for risk, a classical joint hypothesis problem.
The theoretical literature has refined the notion of informational efficiency, by shifting its
focus to the second question formulated at the beginning of this section: How accurately can
the price convey information? The approach followed to address this question relies on models
in which rational agents obviously trade on their information, but also on the information
the price reveals in equilibrium, which depends on the agents' portfolio decisions, over a fixed
point, as anticipated in the Introduction. This fixed point can lead to asset prices that are fully
revealing, in that they reveal all private and public information. These models rely on rational
agents and, not surprisingly, predict that no money could ever be left on the table. We also
have a refinement of this fully revealing concept: we say markets with fully revealing prices
are strongly informationally efficient if the prices reveal a sufficient statistic for all the private
information. Note that strong efficiency in this theoretical sense differs from its meaning in the
empirical literature, signifying as it does, now, that a simple statistic (say, the average of the
signals dispersed across the agents' population) is enough to forecast the asset fundamentals.
As discussed in the Introduction, markets with fully revealing prices are problematic. They
lead to paradoxes. First, the Grossman (1976) paradox. If markets are strongly efficient, the
agents should abandon their own signals while formulating portfolio decisions, although in this
case the price should not contain any piece of private information, contradicting the initial
presumption that markets are strongly efficient.
Second, the Grossman and Stiglitz (1980) paradox. If markets are strongly efficient, informed
agents make losses once information is costly, and would rather become uninformed, free-riding on an informative price, although then, in this case, the price will not contain information
anymore. To resolve this paradox, Grossman and Stiglitz (1980) propose markets which they
famously describe (p. 393) to be ones in which there is an "equilibrium degree of disequilibrium."
That is, in their model, prices cannot be fully revealing, but partially revealing, meaning that the
informed agents do not give all of their information away to the uninformed. "Disequilibrium"
means prices are not fully revealing, and "equilibrium degree of disequilibrium" means that
the price informativeness depends on how many agents are informed, which is an endogenous
variable in the model. Note, now, that disequilibrium allows markets to function, a perspective
somehow distinct from the early attempts to define efficiency in the empirical literature.
The following three sections aim to formalize these ideas. Section 9.4 shows that money is
indeed left on the table once agents take portfolio decisions while ignoring the information
content of asset prices. Then, we introduce progressively more appropriate notions of equilibrium
in which prices are fully (Section 9.5) and partially (Section 9.6) revealing.
= arg max
( | )+ 12 2 ( | )
= arg max
=
|
(9.11)
|
|
= +
+
and
(9.12)
for all
(9.13)
There are also uninformed agents, who do not observe the signal in Eq. (9.10). Their
risky asset demand is therefore a special case of the informed asset demand in Eq.
(9.11), namely for
= 0,
=
(9.14)
The equilibrium price of the risky asset is found by aggregating demand, setting supply equal
to aggregate demand,
X
=
+
(9.15)
0
=1
and solving for the price after substituting Eq. (9.11) and Eq. (9.14) into Eq. (9.15) and using Eqs.
(9.12)-(9.13), yielding,
= +
(9.16)
(( + ) +
)
(( + ) +
)
where $\bar{s}$ denotes the average signal across the $N_I$ informed agents, viz.
$$\bar{s} \equiv \frac{1}{N_I}\sum_{i=1}^{N_I} s_i. \quad (9.17)$$
The equilibrium price in Eq. (9.16) has three components. The first is the discounted payoff.
The second term reveals that the price aggregates information dispersed across the informed
investors through the average signal. That is, the average signal is a sufficient statistic for all of the signals
observed by the informed agents, with respect to the equilibrium price. This second
term thus adds or subtracts value according to whether the average signal is higher or lower than
the unconditional guess of the payoff. The third term is a risk premium: the higher the average supply,
the higher the risk the agents have to bear in equilibrium, which is evaluated proportionally to
their risk aversion.
The main issue with a Walrasian equilibrium is that while the price conveys information
informed investors have, the uninformed investors do not condition upon this informative price
while formulating their asset demand in Eq. (9.14). They only use the price to determine their
budget constraint.
Consider the following additional issue, arising when the number of agents gets large. In this
case, we have that by the Law of Large Numbers,
plim =
+
( + )+ 1
( + )+ 1
where $\lambda$ denotes the limiting proportion of informed agents in the market. That is, in this limit, the price perfectly reveals the asset payoff,
provided of course the proportion of informed agents is asymptotically significant, $\lambda > 0$.
This is an arbitrage opportunity. Any rational investor who understands this market can make
large profits whenever the price differs from the discounted payoff it reveals. He will observe the price and infer the payoff. For example, if
the price is below the discounted payoff,
he will borrow and use the proceeds to invest in the risky asset. In the second period, he will pay back
the loan and receive the asset payoff, with a sure and strictly positive profit.
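The role of the Law of Large Numbers in this argument can be illustrated with a small simulation. The sketch below is only illustrative and rests on assumptions of ours: each informed agent's signal equals the payoff plus independent normal noise, and the names `payoff`, `noise_std` and `n_informed` are hypothetical. It shows that the average signal, which is what the price aggregates, gets arbitrarily close to the payoff as the number of informed agents grows.

```python
import random


def average_signal(payoff, n_informed, noise_std, rng):
    """Average of n_informed signals, each equal to payoff + independent normal noise."""
    total = 0.0
    for _ in range(n_informed):
        total += payoff + rng.gauss(0.0, noise_std)
    return total / n_informed


def revelation_errors(payoff=1.5, noise_std=1.0, sizes=(10, 1_000, 100_000), seed=0):
    """Absolute gap between the average signal and the payoff, for each market size."""
    rng = random.Random(seed)
    return {n: abs(average_signal(payoff, n, noise_std, rng) - payoff) for n in sizes}


if __name__ == "__main__":
    for n, err in revelation_errors().items():
        print(f"informed agents: {n:>7d}   |average signal - payoff| = {err:.4f}")
```

With many informed agents the gap is of the order of `noise_std / sqrt(n_informed)`, which is why an investor who reads the price can, in the limit, infer the payoff.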
= +
(9.18)
for two constants to be determined. However, we conjecture that these two constants are not the
same as those in Eq. (9.16). This is natural: if we assume the agents make inference about the
asset payoff while also conditioning on the equilibrium price, this equilibrium price should then
differ from that arising in the Walrasian case.
To determine the equilibrium, first note that the informed agents formulate a demand function
equal to
=
(9.19)
|
|
This demand schedule generalizes that in Eq. (9.11), in that the conditioning information now
contains both the signal available to the informed investor and the equilibrium price in
(9.18) or, equivalently, and conjecturing that the price coefficient on the average signal is nonzero, the average signal. However, it is easy
to check that by the projection theorem,
| = ( | ) = +
(9.20)
+
and,
(9.21)
That is, the average signal is a sufficient statistic for the distribution of the asset payoff. We
shall discuss the economic implications of this conclusion below, once we have solved for
the equilibrium.
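The sufficiency of the average signal can be checked numerically. The sketch below is a minimal illustration under assumptions we make explicit: the payoff is normal with prior mean `mu_f` and precision `rho_f`, and each signal equals the payoff plus independent normal noise with common precision `rho_eps` (all names are hypothetical). It applies the projection theorem one signal at a time, and compares the result with a single update that only uses the average signal.

```python
def update(mean, precision, signal, signal_precision):
    """Normal-normal updating: posterior mean and precision after observing one signal."""
    post_precision = precision + signal_precision
    post_mean = (precision * mean + signal_precision * signal) / post_precision
    return post_mean, post_precision


def update_sequentially(mu_f, rho_f, signals, rho_eps):
    """Condition on every individual signal, one after the other."""
    mean, precision = mu_f, rho_f
    for s in signals:
        mean, precision = update(mean, precision, s, rho_eps)
    return mean, precision


def update_with_average(mu_f, rho_f, signals, rho_eps):
    """Condition only on the average signal, whose noise has precision n * rho_eps."""
    n = len(signals)
    avg = sum(signals) / n
    return update(mu_f, rho_f, avg, n * rho_eps)


if __name__ == "__main__":
    signals = [1.2, 0.8, 1.6, 0.9]
    print(update_sequentially(1.0, 2.0, signals, 0.5))
    print(update_with_average(1.0, 2.0, signals, 0.5))
```

Both routes deliver the same posterior mean and precision, which is the sense in which an agent who knows the average signal can discard his own signal.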
Regarding the uninformed investors, their demand schedule differs from that in (9.14), in that
they will update their beliefs about the asset payoff after having learnt the price realization
in (9.18) or, equivalently, and conjecturing that the price coefficient on the average signal is nonzero, the average signal. But then note that
each uninformed agent will have exactly the same demand schedule as the informed in (9.19),
due to Eqs. (9.20)-(9.21), viz,
= =
| (
( | )
= 1
= 1
We can determine the two coefficients of the equilibrium price in (9.18) by plugging the
previous demand schedules into the market clearing condition, Eq. (9.15), leaving,
(9.22)
= 1
= 1
The equilibrium price in (9.22) fully reveals the average information disseminated in the
market, just as in the Walrasian case (see Eq. (9.16)), although of course this REE differs
from the Walrasian equilibrium. The REE collapses to the Walrasian equilibrium only (i) in the absence of uninformed
agents attempting to extract information from the price, and (ii) when the informed
agents have access to the same signal (in which case we would set the number of informed agents equal to one in
(9.22)). One implication is that absent these two conditions, arbitrage is ruled out now as the
number of agents increases, as,
plim
The mechanism in this model is the following. While trading, the informed agents transmit
pieces of information into the equilibrium price: they set the information content of the
price system, so to speak. The uninformed agents free-ride on these prices. The price now
performs two roles: one, classical, is to determine the agents' budget constraint; and a second,
less mechanical, is to inform the uninformed agents about the information other investors possess.
Thus when an uninformed agent observes a low price, he will increase his demand as his budget
constraint softens. However, a low price reveals to the uninformed agents that the informed
might have received bad information about the asset quality, which decreases their demand.
The two effects exactly offset each other, leading to a price-inelastic demand: each uninformed agent
simply holds his initial asset endowment. Given this, the informed agents can only be allocated their initial asset
endowment as well.
This model leads to a number of puzzling predictions. First, a feature of the model pointed
out earlier, known as Grossman's paradox (Grossman, 1976). By Eqs. (9.20) and (9.21),
any informed agent gives up his knowledge about his own signal, relying as he does on the
average signal. But if no agent uses his own information while trading, we might
wonder, now, how the price ends up aggregating information in the first place.
A second implication of the model links to the equilibrium allocation. Because each investor is
allocated his own endowment in equilibrium, informed and uninformed investors make the
same profits and, hence, have the same welfare. This raises the following issue, known as the
Grossman-Stiglitz paradox (Grossman and Stiglitz, 1980): why would anyone be willing to purchase
private information (i.e. the signals), if this information could be freely read from the
equilibrium price? And, precisely because there are no incentives to purchase private information
in this case, the price should then not reflect any.
We now explain how these paradoxes can be resolved by assuming the presence of noise in the
equilibrium, leading to a partially revealing price: when the uninformed agents make inference
on the private information investors have, based on the price, they can only extract part of this
information. While part of the informed agents' information is given away for free to the uninformed, uninformed investors
can still have incentives to acquire information.
who determine the asset demand on the basis of the probabilistic distribution of the price. But
in equilibrium, prices depend on the uninformed demand. In the Grossman and Stiglitz (1980)
market reviewed in Section 9.6.1, there is a simple solution to this information transmission
problem. In equilibrium, prices are informative, to the extent that the incentives to purchase
information decrease with the number of the already informed agents: information choices are
strategic substitutes.
In the differential information market of Section 9.6.2, there are no agents with superior information. The models that analyze this market were introduced by Hellwig (1980) and Diamond
and Verrecchia (1981). They aim to study issues regarding information aggregation: how does
the market help aggregate information dispersed across investors? This question has very distant origins in economics. A competitive equilibrium leads to Pareto optimal allocations (first
welfare theorem); conversely, a given (desired) Pareto optimal allocation can be decentralized
through a dedicated redistribution of wealth (second welfare theorem). Pushed to the extreme,
the second welfare theorem may seem to suggest that any desired market outcome could be
implemented through a centralized, socialist-type planning system, as in the planning literature
following the work of Lange (1936, 1942); that is, once the planner fixes an outcome as an objective, the very same outcome could be achieved through a dedicated redistribution of resources
that leads agents to the desired objective under laissez faire. Hayek (1945) rejects these ideas:
we cannot implement this mechanism while missing data regarding information that is local.
After all, the second welfare theorem regards Walrasian equilibria, not the equilibria we are
examining in this chapter.
The Hellwig and the Diamond and Verrecchia models formalize the process of information aggregation
in the context of financial markets. It is a complex task because, now, compared to the previous
REE markets, agents cannot extract all the information that others know from the price, and
need to condition on their own signals and forecast both the signals and the forecasts of others, over
a fixed point. An equilibrium in this market exists once all these forecasts are mutually and
internally consistent, and the price is only partially revealing of all the information disseminated
in the market.
Each informed investor agrees to pay a constant cost, and observes the same signal on the
asset payoff, so that the signals in Eq. (9.10) are identical across the informed agents (see footnote 1). That is, acquiring information only leads
to a partial resolution of risk. We assume, and later justify, that the equilibrium price in this
market does not reveal more information than is already possessed by the informed agents.
Therefore, the asset demand of the informed agents is the same as in the Walrasian market,
Eq. (9.11), viz
=
)
= 1
(9.23)
| ( ( | )
with the agents' conditional estimate and precision of the asset payoff being the same as in (9.12)-(9.13),
+
(9.24)
( | ) = +
| =
+
We know from the previous section that the assumptions formulated so far lead to an equilibrium in which the uninformed investors free-ride on the equilibrium price, as the latter is
fully revealing. Obviously, prices do reveal the information that informed investors impound
into them. The issue is to ascertain whether some of this information could fail to be revealed
for free to uninformed investors. Grossman and Stiglitz do indeed study an equilibrium with
partial information revelation. One of their key assumptions is that the asset supply is random,
such that, now, markets reveal both fundamental and non-fundamental sources of information,
which an uninformed investor cannot tell apart. Specifically, assume that the total asset supply
is random, and equal to (see footnote 2)
0 2
(9.25)
0+
We interpret the random component of the asset supply as a liquidity shock. For example, a positive and large realization of this shock
could be interpreted as an asset sell-off due to non-fundamental reasons (say, asset owners who
monetize their investments due to previously unexpected consumption contingencies).
The partial revelation mechanism operates as follows. First, note that the uninformed investors now formulate their asset demand conditional upon having learnt the equilibrium
price,
=
)
= 1
(9.26)
| ( ( | )
Replacing Eqs. (9.23)-(9.26) into the market clearing condition,
= + (1
(9.27)
reveals that the equilibrium price is informationally equivalent to a compound signal,
defined as
(
(
) 1
(9.28)
0)
The compound signal (or, equivalently, the price) does reveal information about fundamental
information; however, it does so imperfectly, conveying as it does additional
information regarding possible liquidity shocks. An uninformed investor who observes
1 In their original formulation of the model, Grossman and Stiglitz assume that the asset payoff is the sum of two normally distributed
components, and that the informed investors observe the first component for a fee. The formulation of the model in this section is
equivalent. Moreover, we are assuming that the information fee is only paid once payoffs are revealed (rather than at the initial date) to simplify
the presentation.
2 To study the model predictions as the number of agents increases, we would need to make assumptions on how the variance of the supply shock would change with it; see
Footnote 3 for one alternative assumption. In this section, we take the number of agents as given.
the price, now, cannot tell how much of a price increase is due to good news (say, a high signal) or
a negative liquidity shock (say, a low supply). Therefore, we expect the equilibrium in this market to
be one in which informed agents have higher welfare (before paying the information cost) due to their superior
information: they know the signal and, of course, they know the price, and can then infer the liquidity shock; that is, they know
everything but the noise in their signal.
We search for a linear equilibrium, that is, an equilibrium in which the asset price is a linear
function of the compound signal, in Eq. (9.28),
=
(9.29)
for two constants to be determined. We determine these constants by substituting the asset
demands of the informed and uninformed investors into the market clearing condition, Eq.
(9.27). We proceed as follows.
First, we determine the conditional expectation and the conditional precision of the payoff given the price,
in Eq. (9.26). Note that the equilibrium price in Eq. (9.29) is affine in the compound signal, which is normally
distributed. Therefore,
( | ) = +
)
2
(9.30)
where,
=
1
+
2 2
(9.31)
( +
{z
)
}
info revelation
)
|{z}
(9.32)
budget constr.
=
+
|
(9.33)
We shall develop the economic interpretation of both the uninformed and the informed asset
demand schedules below.
Second, we plug (9.32) and (9.33) into Eq. (9.27), obtaining the equilibrium price conjectured
in (9.29)
+ (1
) |
0
=
) |
) |
| + (1
| + (1
|
|
{z
}
{z
}
=
The model leads to a number of important predictions. First, in equilibrium, the price does
not reveal all private information to the uninformed agents, but only a noisy version of it, the compound signal:
we say the price is partially revealing. The model thus provides a resolution to the Grossman-Stiglitz paradox: as the next subsection will show, there exist equilibria in which it is valuable
to purchase private information.
Second, consider the uninformed demand in Eq. (9.32). On the one hand, it decreases
with the price as a mechanical implication of the agents' budget constraint. On the other hand,
the uninformed demand increases with the price, due to information. In the REE, the two effects exactly
offset each other, as explained in the previous section. Straightforward calculations
do, instead, reveal that in the equilibrium of this model, the slope of the uninformed demand with respect to the price is non-positive,
with an equality only holding once the variance of the supply shock is zero. That is, the uninformed asset demand in (9.32)
slopes down. The interpretation is simple: in the NREE, the effects of information revelation
are weaker than in the REE case, as the price is partially revealing, as explained.
Third, the constant multiplying the liquidity shock in the compound signal (9.28) determines the price impact of a liquidity shock. The higher this constant, the higher
the impact of a liquidity shock, compared to information. Its inverse is
a measure of the risk-bearing capacity of the informed agents vis-à-vis information. Note, indeed,
that the informed agents' demand in (9.33) has two components. The first relates to how high
the price is, compared to the unconditional guess of the asset value; it increases (in absolute
value) with the conditional precision of the value estimate given private information. The
second component relates to how good news are, compared to unconditional guesses; it increases
(in absolute value) with the precision of the noise in the signal (i.e. with the informational
advantage), reflecting a better quality of private information. Thus, and as anticipated, the
inverse of this constant measures the total effect of the informed agents' demand with respect to information.
The higher this term, the easier it becomes for the price system to absorb a liquidity shock.
Finally, consider the conditional precision of the asset payoff upon observation of the price,
$\rho_{f|q}$ in Eq. (9.30): it is a measure of price informativeness. In particular, the constant in
(9.31) reflects asymmetric information: it increases with the precision of private information, i.e., it decreases with the size of the
noisy component of private information. Moreover, it decreases with the price impact of liquidity shocks, i.e. it increases with
the risk-bearing capacity of the informed agents. In other words, price informativeness decreases
as private information becomes noisier and increases with the risk-bearing capacity of the
informed agents. Note, finally, that because the constant in (9.31) does not exceed one, the precision conditional on the price cannot exceed the precision conditional on the private signal, $\rho_{f|s}$:
$$\rho_{f|q} \le \rho_{f|s}. \quad (9.34)$$
That is, prices provide less information than private signals. This property, while very intuitive,
has important welfare implications, discussed below.
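The comparative statics of price informativeness can be explored numerically. The sketch below is not the chapter's derivation but an illustration under a standard CARA-normal specification that we assume: the informed agents' signal is the payoff plus noise with precision `rho_eps`; the price is informationally equivalent to the signal minus `gamma` times the liquidity shock, with `gamma = 1 / (lam * tau * rho_eps)`; the liquidity shock has precision `rho_z`; `lam` is the proportion of informed agents and `tau` is risk tolerance. All names are hypothetical.

```python
def signal_precision(rho_f, rho_eps):
    """Precision of the payoff given the private signal (projection theorem)."""
    return rho_f + rho_eps


def price_precision(lam, tau, rho_f, rho_eps, rho_z):
    """Assumed precision of the payoff given the price.

    The price reveals (signal - gamma * liquidity shock), so the noise around the
    payoff has variance 1 / rho_eps + gamma**2 / rho_z.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    gamma = 1.0 / (lam * tau * rho_eps)  # price impact of a liquidity shock
    noise_variance = 1.0 / rho_eps + gamma ** 2 / rho_z
    return rho_f + 1.0 / noise_variance


if __name__ == "__main__":
    for lam in (0.1, 0.5, 1.0):
        print(lam, price_precision(lam, 1.0, 1.0, 1.0, 1.0), signal_precision(1.0, 1.0))
```

Under these assumptions, the precision given the price rises with the proportion of informed agents and with their risk tolerance, yet it always stays below the precision given the private signal.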
9.6.1.2 The value of information
The uninformed agents do not entirely learn about the private information other investors
have acquired. Whilst observing the equilibrium price, they confuse fundamentals with
liquidity shocks and sometimes, then, trade against the informed. Uninformed investors
should therefore expect lower profits and welfare than the informed (before information costs).
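In Appendix 2, we show that,
$$E[u(W_I)] = e^{\kappa/\tau}\sqrt{\frac{\rho_{f|q}}{\rho_{f|s}}}\; E[u(W_U)], \quad (9.35)$$
where $W_I$ and $W_U$ denote the terminal wealth of the informed and the uninformed agent, $\kappa$ is the information cost, $\tau$ is the agents' risk tolerance, and $\rho_{f|s}$ and $\rho_{f|q}$ are the precisions of the payoff conditional on the private signal and on the price.
Define the ex-ante profit certainty equivalent of the informed and the uninformed agents as the
two values $C_I$ and $C_U$ that solve:
$$u(C_I) = E[u(W_I)] \quad \text{and} \quad u(C_U) = E[u(W_U)].$$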
We define the value of information as the net gain of becoming informed in terms of the
previous certainty equivalents, which by Eq. (9.35) is,
$$C_I - C_U = \frac{\tau}{2}\ln\frac{\rho_{f|s}}{\rho_{f|q}} - \kappa. \quad (9.36)$$
Because prices provide less information than private signals, $\rho_{f|q} \le \rho_{f|s}$ (see Eq. (9.34)),
the informed agent is always better off compared to the uninformed, before accounting for the
information cost $\kappa$. Moreover, note that the conditional precision of the asset value estimate
made by the informed agents, $\rho_{f|s}$, is independent of the proportion of informed agents, $\lambda$. Instead,
that of the uninformed, $\rho_{f|q}$, is increasing in $\lambda$, as discussed in the previous section. Therefore,
the value of information is strictly decreasing in $\lambda$: the higher the number of informed agents,
the less valuable it is to acquire private information. An interior equilibrium in the market for
information is given by the proportion of informed agents $\lambda^*$ such that $C_I - C_U = 0$. Figure 9.1
depicts the value of information as a function of the proportion of informed agents for given
constellations of parameter values.
[Figure 9.1: two panels, each plotting the value of information (vertical axis) against lambda, the proportion of informed agents (horizontal axis, from 0.1 to 1.0).]
FIGURE 9.1. This figure depicts the value of information, $C_I - C_U$ in (9.36), when the
cost of information is $\kappa = 0.20$, and parameters are set equal to $\tau = \rho_f = \rho_\varepsilon = \rho_z = 1$,
which is the benchmark solid line in both panels. The left and right panels compare the
benchmark with the value of information arising when the size of liquidity shocks increases
so as to make $\rho_z = 3/4$ (left panel, dashed line), and when the precision of information
acquired by the informed agent decreases such that $\rho_\varepsilon = 3/4$ (right panel, dashed line).
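The equilibrium proportion of informed agents can be located numerically. The sketch below is an illustration under assumptions we state explicitly rather than the chapter's own computation: the value of information is taken to be `(tau / 2) * ln(rho_signal / rho_price) - kappa`, where `rho_signal = rho_f + rho_eps`, and the precision given the price is the standard CARA-normal expression with price impact `gamma = 1 / (lam * tau * rho_eps)`. Function and parameter names are hypothetical. Because the value of information decreases with `lam`, a bisection suffices.

```python
import math


def price_precision(lam, tau, rho_f, rho_eps, rho_z):
    """Assumed precision of the payoff given the price (see the lead-in)."""
    gamma = 1.0 / (lam * tau * rho_eps)
    return rho_f + 1.0 / (1.0 / rho_eps + gamma ** 2 / rho_z)


def value_of_information(lam, kappa, tau=1.0, rho_f=1.0, rho_eps=1.0, rho_z=1.0):
    """Net gain of becoming informed, measured in certainty-equivalent units."""
    rho_signal = rho_f + rho_eps
    rho_price = price_precision(lam, tau, rho_f, rho_eps, rho_z)
    return 0.5 * tau * math.log(rho_signal / rho_price) - kappa


def equilibrium_share(kappa, tol=1e-10, **params):
    """Bisection on lam in (0, 1]; returns a corner when no interior root exists."""
    low, high = 1e-9, 1.0
    if value_of_information(high, kappa, **params) >= 0.0:
        return 1.0  # information is worth buying even if everybody is informed
    if value_of_information(low, kappa, **params) <= 0.0:
        return 0.0  # information is too costly for anybody to buy it
    while high - low > tol:
        mid = 0.5 * (low + high)
        if value_of_information(mid, kappa, **params) > 0.0:
            low = mid  # still profitable to become informed: the root lies to the right
        else:
            high = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    print(equilibrium_share(0.20))              # benchmark parameters
    print(equilibrium_share(0.20, rho_z=0.75))  # larger liquidity shocks
```

With the benchmark parameters of Figure 9.1, this sketch returns an interior share of roughly 0.72; larger liquidity shocks make prices less informative and raise the share of agents who find it worthwhile to become informed.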
In all cases depicted in Figure 9.1, there exist interior equilibria. For example, when the cost
of information is 0.20 and all remaining parameters are set equal to one (the benchmark),
the equilibrium obtains with $\lambda^* = 71\%$. This equilibrium is stable in the following sense. When
the proportion of informed agents is less than this equilibrium, $\lambda < \lambda^*$, there are incentives to
become informed ($C_I - C_U > 0$), such that more agents will become informed until $\lambda^*$ is reached.
The opposite occurs when $\lambda > \lambda^*$. There might be non-interior equilibria. For example, the cost
= 1
(9.37)
|
=
|
where, as usual, the price entering the demand is the equilibrium price, and where the conditional precision of the
payoff estimate is assumed to be the same for each agent. This conjecture on
the conditional precision will be confirmed below.
We shall formulate conjectures regarding the price below. For now, we assume that the price is a
random variable that helps each investor forecast the asset payoff. The rationale behind this
conjecture is that in equilibrium,
X
=
(9.38)
=1
such that the clearing price aggregates the signals all investors observe, which are potentially
useful in each investor's inference problem.
Furthermore, we conjecture that the conditional expectation,
+ (
|
= +
)
(9.39)
3 The two models are similar except that Diamond and Verrecchia consider endowment shocks as the channel leading to partially
revealing prices. The inference process of the agents in Diamond and Verrecchia is also more elaborate than in Hellwig, as each
agent's information set includes the price, the signal on the dividend, and the endowment shock. In Hellwig, there are no endowment
shocks as explained.
=
(
(9.40)
0)
where the average signal is that defined in Eq. (9.17). The rationale underlying this conjecture is
the following. Replacing Eq. (9.39) into the asset demand in (9.37), and then
plugging the result into the equilibrium condition (9.38) yields, after rearranging terms,
+
{z
| {z }
{z
0)
(9.41)
where the identities under the brackets follow by the conjecture made in (9.40).
The rst restriction in (9.41) delivers the value of in (9.40),
(9.42)
0
|
=
|
(9.43)
,
where the price coefficients and the agents' updating coefficients depend on each other, over a fixed point.
Intuitively, the price coefficients depend on the agents' updating coefficients, as the
agents' conditional expectation of the dividend contributes to the equilibrium
price, through the equilibrium condition (see Eq. (9.38)). In turn, the agents' updating coefficients
depend on the statistical distribution of the equilibrium price (and, hence, on the price coefficients), as the latter
is a conditioning variable for the agents' expectations of the dividends. Appendix 2 shows that
there is a unique solution to this fixed point, and that the price in Eq. (9.40) can be expressed
as
+ 2
1 1
(9.44)
=
1 1
+1
+ 2( + )
where
0)
(9.45)
and
is a strictly positive and bounded constant defined in the Appendix (see Eq. (9A.18)).
The fact that this constant is strictly positive implies that the price only partially reveals the average signal.
Similarly as for the liquidity-shock constant in the Grossman and Stiglitz (1980) model (see Eq. (9.28)), this
constant determines the price impact of a liquidity shock. Below, we shall see a
striking similarity between these two constants in the limiting case of a large number of agents.
The compound signal in Eq. (9.45) is the differential information counterpart to that in (9.28).
That is, in the differential information market of this section, the price is partially revealing,
as it conveys noisy information about the asset fundamentals. The equilibrium in this market
then addresses a number of issues that the REE cannot. First, Eqs. (9.44) and (9.45) reveal
that the two price coefficients in (9.40) are both positive. Moreover, the agents' updating coefficients
on their own signal and on the price in (9.39) are both positive, provided that the
conditional precision is positive (see footnote 4), a condition that can be shown to hold. That is, every agent
now conditions on both his own signal and the price while formulating his demand function: a
resolution to the Grossman (1976) paradox. The coefficient on the agent's own signal is positive because the agents'
signals are obviously positively correlated with the dividend; and the coefficient on the price is also
positive, because the equilibrium price is positively correlated with the asset dividend.
The solution in Eq. (9.44) is not known in closed form because the constant defined in the Appendix is not. Note
that this constant determines the price sensitivity to the average signal and liquidity shocks,
and also the conditional precision in (9.42). However, a solution is available in
the limiting case in which the number of agents is large. This case provides intuition regarding
the finite case.
So suppose that the number of agents is large, such that by the Law of Large Numbers, the price
in (9.44) satisfies (see footnote 5)
1
(1 + 2
)
plim =
+
(9.46)
0
2
|
( + (1 +
) )
|
{z
}
|
{z
}
where
and
0)
= lim
=(
)+
(9.47)
The price now reveals better information about the asset fundamentals than in the finite
agents case, as the compound signal embeds more precise information about the asset payoff than in
(9.45). Naturally, each agent still finds it useful to condition his demand on both his own signal
and the equilibrium price. Moreover, the price is partially revealing in this limiting case too.
Note, now, that the limiting coefficient on the liquidity shock is the same as in the Grossman and Stiglitz (1980)
model when $\lambda = 1$ (see Eq. (9.28)). In other words, in this market, liquidity shocks are absorbed
through a mechanism quite comparable to that in the asymmetric information market.
Finally, the conditional precision in (9.47) is the sum of the precision of the payoff given each
agent's signal, $\rho_{f|s_i} = \rho_f + \rho_\varepsilon$, plus a second component, $\tau^2\rho_\varepsilon^2\rho_z$, where $\tau$ is risk tolerance and $\rho_f$, $\rho_\varepsilon$ and $\rho_z$ are the precisions of the payoff, the signal noise and the liquidity shock. This second component is
increasing in $\tau$: the agents' demand becomes more aggressive as risk-tolerance increases, leading
to a more informative equilibrium price. The second component is also increasing in both $\rho_z$
and $\rho_\varepsilon$. The effects regarding an increase in $\rho_z$ are explained by the price being more responsive
to fundamentals than noise as the variance of liquidity shocks decreases. The effects regarding an increase in $\rho_\varepsilon$ link to each
agent providing more information over equilibrium aggregation while trading more aggressively
on less noisy signals.
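The decomposition of the limiting conditional precision is simple enough to tabulate. The sketch below assumes, as an illustration, that the second component takes the form `tau**2 * rho_eps**2 * rho_z`, with `tau` denoting risk tolerance and `rho_f`, `rho_eps`, `rho_z` the precisions of the payoff, the signal noise and the liquidity shock; these names are hypothetical labels of ours.

```python
def large_market_precision(tau, rho_f, rho_eps, rho_z):
    """Limiting precision of the payoff given an agent's own signal and the price."""
    own_signal_part = rho_f + rho_eps               # what the agent learns from his signal
    price_part = tau ** 2 * rho_eps ** 2 * rho_z    # what the price adds on top of it
    return own_signal_part + price_part


if __name__ == "__main__":
    base = dict(tau=1.0, rho_f=1.0, rho_eps=1.0, rho_z=1.0)
    print("benchmark:", large_market_precision(**base))
    for name in ("tau", "rho_eps", "rho_z"):
        bumped = dict(base, **{name: 2.0})
        print(f"{name} doubled:", large_market_precision(**bumped))
```

Doubling the precision of the signal noise raises both components, whereas doubling risk tolerance or the precision of liquidity shocks only raises the component contributed by the price.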
4 Assume momentarily that
0. Because
0, the second of Eqs. (9.43) implies that
0,
0. The fact that
0 follows by Eq. (9A.12) in Appendix 2.
(9.43), and
5 By
1
2
from zero. For example, this case arises when the shock
, where
0 2 .
=1
in (9.25) is the sum of the endowment shocks a ecting each agent, say
liquidity trader, an assumption that parallels those leading to partial revelation in the NREE
models of the previous section. As usual, informed traders are those who observe a signal on
the asset payoff, $f$. Denote with $F_t$ the information set that is available to market makers at
time $t$, and assume buy or sell orders arrive sequentially to them. Clearly, $F_t = F_{t-1} \cup \{x_t\}$, where
$x_t$ can be either a buy or a sell order, i.e. $x_t \in \{\text{buy}, \text{sell}\}$. For simplicity, we assume interest
rates are zero.
Suppose, now, that $F_t$ is the same as that available to all traders: that is, there are no
informed traders. Then, if dealer $i$ receives a buy order, and is risk-neutral, he will set the
ask price at $t$, $a_{i,t}$ say, such that,
$$0 = E(a_{i,t} - f \mid F_{t-1}, x_t = \text{buy}) = a_{i,t} - E(f \mid F_{t-1}), \quad (9.48)$$
where the second equality follows because the buy order is totally uninformative, that is, it has
no predictive power on $f$.
Likewise, if dealer $i$ receives a sell order, he will set the bid price at $t$, $b_{i,t}$ say, such that,
$$0 = E(f - b_{i,t} \mid F_{t-1}, x_t = \text{sell}) = E(f \mid F_{t-1}) - b_{i,t}. \quad (9.49)$$
The expected profits in (9.48) and (9.49) are set equal to zero, due to the assumption dealers
are perfectly competitive, and the second equalities in both (9.48) and (9.49) follow by the
assumption that trades arrive to dealers while not carrying any information about $f$. Therefore,
the bid-ask spread is zero for each dealer, and the asset price collapses to
$$q_t = E(f \mid F_{t-1}) = E(f \mid F_t).$$
Trades arrive sequentially to the market makers. The latter believe there is (i) a probability
$\mu$ the order at $t$ is information driven; (ii) a probability $1 - \mu$ the order at $t$ is liquidity driven, and
in this case, a 50% chance the liquidity motivated order is a buy or a sell. Note that the order,
now, might contain private information regarding the asset payoff. While market makers are
not sure about whether the order is informative, they post order-contingent rules that reflect
updated information: a bid price if the order they receive is a sell, and an ask price if the order
is a buy.
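For example, and regarding a sell order, competition leads a representative market maker to
post the following bid price, with $\bar f$ and $\underline f$ denoting the high and the low value of the asset payoff,
$$b_t = E(f \mid F_{t-1}, x_t = \text{sell}) = \pi_t(\text{sell})\,\bar f + (1 - \pi_t(\text{sell}))\,\underline f, \quad (9.50)$$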
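$$\pi_t(\text{sell}) \equiv \Pr(f = \bar f \mid F_{t-1}, x_t = \text{sell}) = \frac{\Pr(x_t = \text{sell} \mid f = \bar f, F_{t-1})\,\pi_{t-1}}{\Pr(x_t = \text{sell} \mid F_{t-1})}, \quad (9.51)$$
with $\pi_{t-1} \equiv \Pr(f = \bar f \mid F_{t-1})$ denoting the probability the market maker attaches to the high payoff before the order arrives.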
Once again, note that the market maker posts bid and ask prices before knowing whether he
will receive a sell or buy order. As explained, his is a posting rule and actually a regret-free rule:
once the order is filled, market makers will not regret having executed at the prices they had
previously posted given the information of the time. This rule is, of course, entirely consistent
with market practice.
To determine the probabilities of the ratio in (9.51), note that conditionally on the payoff being high, $f = \bar f$,
informed traders would never sell, such that the probability to receive a sell order in this case
is simply the probability that the order is liquidity motivated, $1 - \mu$, times the probability that
a liquidity trader sells, $\frac{1}{2}$,
$$\Pr(x_t = \text{sell} \mid f = \bar f, F_{t-1}) = \tfrac{1}{2}(1 - \mu). \quad (9.52)$$
Furthermore,
$$\Pr(x_t = \text{sell} \mid F_{t-1}) = \pi_{t-1}\,\tfrac{1}{2}(1 - \mu) + (1 - \pi_{t-1})\left(\mu + \tfrac{1}{2}(1 - \mu)\right). \quad (9.53)$$
The first term on the RHS of the previous equality arises because conditionally upon the payoff being high
(which has likelihood $\pi_{t-1}$ given $F_{t-1}$), the only sell trades can be liquidity motivated, which
occurs with probability $\frac{1}{2}(1 - \mu)$, as explained while deriving Eq. (9.52). The second term on
the RHS of (9.53) is due to the fact that conditionally on the payoff being low (which has likelihood $1 - \pi_{t-1}$
given $F_{t-1}$), the probability of receiving a sell order is the sum of (i) the probability of an informed
trade, $\mu$, plus (ii) the probability of a liquidity motivated sell order, $\frac{1}{2}(1 - \mu)$.
Replacing Eqs. (9.52) and (9.53) into Eq. (9.51) leaves
$$\pi_t(\text{sell}) = \frac{\frac{1}{2}(1 - \mu)}{\frac{1}{2}(1 - \mu) + \mu(1 - \pi_{t-1})}\,\pi_{t-1}. \quad (9.54)$$
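An analogous reasoning leads to the following expression for the ask price posted by a representative market maker
$$a_t = E(f \mid F_{t-1}, x_t = \text{buy}) = \pi_t(\text{buy})\,\bar f + (1 - \pi_t(\text{buy}))\,\underline f, \quad (9.55)$$
where
$$\pi_t(\text{buy}) \equiv \Pr(f = \bar f \mid F_{t-1}, x_t = \text{buy}) = \frac{\Pr(x_t = \text{buy} \mid f = \bar f, F_{t-1})\,\pi_{t-1}}{\Pr(x_t = \text{buy} \mid F_{t-1})} = \frac{\frac{1}{2}(1 + \mu)}{\frac{1}{2}(1 - \mu) + \mu\,\pi_{t-1}}\,\pi_{t-1}. \quad (9.56)$$
Comparing (9.54) and (9.56), $\pi_t(\text{buy}) > \pi_{t-1} > \pi_t(\text{sell})$.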
That is, the order flow is informative. Moreover, the model predicts the price is asymptotically
strongly efficient, in that the probability the asset is good converges to zero if the payoff is low
and to one if the payoff is high
(see Eqs. (9.60) below).
Intuitively, chances are high the asset payoff is low, after a large number of sell orders arriving
to the market makers' trading desk.
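The informativeness of the order flow can be illustrated by simulating the market maker's learning. The sketch below is a minimal illustration, assuming the Bayesian updates take the form implied by the reasoning above: `pi_prev` is the prior probability of a high payoff, `mu` is the probability that an order is information driven, and liquidity orders are buys or sells with equal chances. Function and parameter names are hypothetical.

```python
import random


def update_belief(pi_prev, mu, order):
    """Posterior probability of a high payoff after observing a buy or a sell order."""
    if order == "sell":
        post = 0.5 * (1 - mu) * pi_prev / (0.5 * (1 - mu) + mu * (1 - pi_prev))
    elif order == "buy":
        post = 0.5 * (1 + mu) * pi_prev / (0.5 * (1 - mu) + mu * pi_prev)
    else:
        raise ValueError("order must be 'buy' or 'sell'")
    return min(1.0, max(0.0, post))  # guard against floating-point rounding


def draw_order(payoff_is_high, mu, rng):
    """Informed traders trade in the direction of the payoff; liquidity traders flip a coin."""
    if rng.random() < mu:
        return "buy" if payoff_is_high else "sell"
    return "buy" if rng.random() < 0.5 else "sell"


def simulate_beliefs(payoff_is_high, mu=0.3, pi0=0.5, n_orders=2000, seed=1):
    """Path of the market maker's belief along a simulated sequence of orders."""
    rng = random.Random(seed)
    pi = pi0
    path = [pi]
    for _ in range(n_orders):
        pi = update_belief(pi, mu, draw_order(payoff_is_high, mu, rng))
        path.append(pi)
    return path


if __name__ == "__main__":
    print("payoff low,  final belief:", simulate_beliefs(False)[-1])
    print("payoff high, final belief:", simulate_beliefs(True)[-1])
```

A sell order lowers the belief and a buy order raises it; along a long sequence of orders the belief drifts towards zero when the payoff is low and towards one when it is high.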
The model implies adverse selection costs are borne by liquidity motivated orders. Consider,
indeed, the fair value of the asset, i.e. that arising in the absence of any information frictions
(i.e. prior to possibly informed trades arriving at $t$), defined as,
$$q_{t-1} \equiv E(f \mid F_{t-1}) = \pi_{t-1}\,\bar f + (1 - \pi_{t-1})\,\underline f. \quad (9.57)$$
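It is the price we would observe absent any adverse selection. We have, accordingly, and using
Eq. (9.50), Eq. (9.54) and Eq. (9.57),
$$b_t = q_{t-1} - s_t^{b}, \quad (9.58)$$
where,
$$s_t^{b} \equiv (\pi_{t-1} - \pi_t(\text{sell}))(\bar f - \underline f) = \frac{\mu\,\pi_{t-1}(1 - \pi_{t-1})}{\frac{1}{2}(1 - \mu) + \mu(1 - \pi_{t-1})}\,(\bar f - \underline f).$$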
The bid price is less than the fair value because once a market maker receives a sell order,
he will be unsure about whether the sell order is liquidity motivated: it could be information
motivated. This correlation between the order flow and the asset value leads the market maker
to value the asset less than he would based on his information set prior to the received order. The
spread between the fair value and the bid
(posted at $t-1$ in anticipation of a sell order possibly occurring at $t$) reflects
the updated beliefs of the market maker after receiving a sell order. Adverse selection costs
are borne by liquidity traders because those affected by a liquidity shock will sell the asset
at a price lower than the fundamental value according to the previous information set not
including their own order, i.e. the fair value in (9.57).
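Likewise, and using Eq. (9.55), Eq. (9.56) and Eq. (9.57),
$$a_t = q_{t-1} + s_t^{a}, \quad (9.59)$$
where,
$$s_t^{a} \equiv (\pi_t(\text{buy}) - \pi_{t-1})(\bar f - \underline f) = \frac{\mu\,\pi_{t-1}(1 - \pi_{t-1})}{\frac{1}{2}(1 - \mu) + \mu\,\pi_{t-1}}\,(\bar f - \underline f).$$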
Now, those who want to buy an asset for liquidity reasons and, accordingly, place a buy order,
will buy at a price higher than the fair value in (9.57), bearing an adverse selection cost.
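The adverse selection costs on the two sides of the market can be computed directly. The sketch below assumes, for illustration only, the Bayesian updates discussed above, with `pi_prev` the prior probability of a high payoff, `mu` the probability of an information driven order (taken to be strictly less than one), and `f_high`, `f_low` the two payoff values; these names are hypothetical. It compares the distance of each quote from the fair value with a closed-form expression that is proportional to `mu * pi_prev * (1 - pi_prev)`.

```python
def posterior(pi_prev, mu, order):
    """Posterior probability of a high payoff after a buy or a sell order."""
    if order == "sell":
        return 0.5 * (1 - mu) * pi_prev / (0.5 * (1 - mu) + mu * (1 - pi_prev))
    if order == "buy":
        return 0.5 * (1 + mu) * pi_prev / (0.5 * (1 - mu) + mu * pi_prev)
    raise ValueError("order must be 'buy' or 'sell'")


def quotes(pi_prev, mu, f_high, f_low):
    """Bid, fair value and ask posted before the order arrives."""
    def expected_payoff(p):
        return p * f_high + (1 - p) * f_low

    bid = expected_payoff(posterior(pi_prev, mu, "sell"))
    ask = expected_payoff(posterior(pi_prev, mu, "buy"))
    return bid, expected_payoff(pi_prev), ask


def closed_form_spreads(pi_prev, mu, f_high, f_low):
    """Sell-side and buy-side adverse selection costs in closed form."""
    common = mu * pi_prev * (1 - pi_prev) * (f_high - f_low)
    sell_side = common / (0.5 * (1 - mu) + mu * (1 - pi_prev))
    buy_side = common / (0.5 * (1 - mu) + mu * pi_prev)
    return sell_side, buy_side


if __name__ == "__main__":
    bid, fair, ask = quotes(0.4, 0.2, 110.0, 90.0)
    print("bid, fair value, ask:", bid, fair, ask)
    print("closed-form spreads: ", closed_form_spreads(0.4, 0.2, 110.0, 90.0))
```

Both costs vanish when no order is information driven (`mu = 0`) or when the market maker is already certain about the payoff (`pi_prev` equal to zero or one).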
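To summarize, Eq. (9.57), Eq. (9.58) and Eq. (9.59) form a stochastic difference system,
where the conditional probability the asset is good, $\pi_t$, is,
$$\pi_t = \begin{cases} \pi_t(\text{sell}) \text{ as in Eq. (9.54)}, & \text{if } x_t = \text{sell} \\ \pi_t(\text{buy}) \text{ as in Eq. (9.56)}, & \text{if } x_t = \text{buy} \end{cases} \quad (9.60)$$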
Note, also, that the two spreads,
and
1
2
1 if the asset
The bid-ask spread is increasing in the volatility of the asset payoff, consistent with empirical evidence, being
zero only when $\mu = 0$ (no asymmetric information) or asymptotically, when either $\pi_t \to 0$ or
$\pi_t \to 1$.
The insights of the model are very important although the model has limitations. First, the
only source of the bid-ask spread relies on information asymmetries and the inherent adverse
selection: the market maker may receive buy orders by investors who know the asset payoff is
good, or sell orders by investors who know the asset is a lemon. That is, the order flow correlates
with the asset payoff, which leads to price impacts of trades not related to fundamentals (the
liquidity shocks); that is, whenever a liquidity trader trades, he will move the price above or
below the fundamentals because the market maker anticipates that the order he observes could
be information driven (see footnote 6). Furthermore, two assumptions of the model are that informed traders
do not act strategically, and that the order size is fixed. The next section analyzes markets in
which traders understand their trades can have a price impact and can, accordingly, optimize
on their order size and also distribute their order sizes over time optimally.
Kyle (1985) solves for the equilibrium of this game and shows that the price impact and,
hence, liquidity conditions are tightly linked to the extent of the information asymmetry between
traders and dealers. In Sections 9.8.1 and 9.8.2, we analyze the static, baseline version of his
model as well as a few variants of it. In Section 9.8.3, we analyze and discuss dynamic extensions
of the baseline model, explaining the model's implications in terms of trading patterns and
general market behavior.
9.8.1 The Kyle baseline model
Kyle (1985) considers a market in which prices are determined in a sequential equilibrium,
similarly to the Glosten and Milgrom (1985) market. An insider trader knows the value of
the asset payoff, $v$, and submits his order, $x$. The market maker (many of them, actually)
observes the aggregate order flow, defined as the sum of the insider trade and a liquidity shock,
$u$, i.e. $y \equiv x + u$, where $v \sim N(\bar{v}, \sigma_v^2)$ and $u \sim N(0, \sigma_u^2)$,
and fundamentals ($v$) and noise ($u$) are independent.
Given the order flow they observe, and perfect competition, the representative market maker
sets the asset price according to semi-strong informational efficiency,
\[ p = E(v \mid y), \tag{9.61} \]
which we conjecture to be linear in the order flow,
\[ p = \bar{v} + \lambda y, \tag{9.62} \]
where the coefficient, $\lambda$, known as Kyle's lambda, measures the price impact of the order
flow. In other words, its inverse, $\lambda^{-1}$, is a measure of market depth, i.e. the order flow
that is needed to induce prices to change by one dollar. Because the order flow obviously contains
liquidity shocks, the higher $\lambda$, the less liquid the market is. We shall determine the value of
$\lambda$, below, as part of the equilibrium of the game between the informed trader and the market
maker.
The insider trader chooses his order size to maximize his profits while anticipating that he has
price impact,
\[ \max_{x} E[(v - p)\,x \mid v] = \max_{x}\,(v - \bar{v} - \lambda x)\,x, \]
with the price being as in Eq. (9.62). The first-order conditions of the previous program lead
to the following demand schedule,
\[ x = \beta\,(v - \bar{v}), \qquad \beta \equiv \frac{1}{2\lambda}. \tag{9.63} \]
That is, the insider trade is proportional to his informational advantage, $v - \bar{v}$, which is the
difference between his information about the asset payoff, $v$, and the unconditional guess of the
asset payoff, $\bar{v}$, the fair value of the asset. Naturally, the coefficient of proportionality,
$\beta$, has still to be determined.
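Regarding the market maker's beliefs, we have, by the projection theorem, that,
\[ p = E(v \mid y) = \bar{v} + \frac{\mathrm{cov}(v, y)}{\mathrm{var}(y)}\,y
     = \bar{v} + \underbrace{\frac{\beta \sigma_v^2}{\beta^2 \sigma_v^2 + \sigma_u^2}}_{\equiv \lambda}\,y, \tag{9.64} \]
where the identity under the brace follows by the conjecture made in (9.62).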
To summarize, the equilibrium in this market is one in which the market maker sets the price
as in Eq. (9.62), and the optimal trading size of the insider is given by $x$ in Eq. (9.63). The
price impact, $\lambda$, in (9.62) and the trading aggressiveness coefficient, $\beta$, in (9.63), satisfy:
\[ \beta = \frac{1}{2\lambda}, \qquad \lambda = \frac{\beta \sigma_v^2}{\beta^2 \sigma_v^2 + \sigma_u^2}. \tag{9.65} \]
The first condition says that when markets are deep (i.e. when the price impact $\lambda$ is low), the
price is obviously less likely to move against the insider, who will therefore trade aggressively
on his informational advantage, $v - \bar{v}$. The second condition says that the market maker makes
the markets deep both when the insider trades little on his information advantage (i.e. when
$\beta$ is low) and when he trades so aggressively as to reveal much of his information (i.e. when
$\beta$ is high). An equilibrium of this game is the solution to the system (9.65),
\[ \beta = \frac{\sigma_u}{\sigma_v}, \qquad \lambda = \frac{1}{2}\frac{\sigma_v}{\sigma_u}. \tag{9.66} \]
The insider trades more and more aggressively as liquidity trades become large (i.e. as $\sigma_u$
becomes large), because it is easier to hide his information in this case. Alternatively, the
probability that the order flow contains information decreases with $\sigma_u$, making adverse
selection less acute, leading the market maker to lower $\lambda$ and, then, the informed trader to
trade more aggressively. Likewise, the insider's informational advantage increases with $\sigma_v$,
leading the market maker to raise the adverse selection costs, $\lambda$, and, then, the insider to
trade less aggressively.
Price discovery halves the initial uncertainty about the asset payoff,
\[ \mathrm{var}(v \mid y) = \sigma_v^2 - \frac{\mathrm{cov}(v, y)^2}{\mathrm{var}(y)} = \frac{1}{2}\sigma_v^2, \tag{9.67} \]
where the first equality follows by the projection theorem, and the second by a direct calculation.
Finally, the expected profits to the trader are, unconditionally,
\[ E((v - p)\,x) = \frac{1}{2}\sigma_v \sigma_u. \]
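The closed-form equilibrium of the static model can be checked numerically. The sketch below assumes the standard Kyle notation (payoff volatility `sigma_v`, liquidity-trade volatility `sigma_u`, aggressiveness `beta`, price impact `lam`); the function names are hypothetical and the code only verifies that the closed forms satisfy the insider's first-order condition and the market maker's projection condition.

```python
def kyle_static(sigma_v, sigma_u):
    """Closed-form equilibrium of the static single-insider model."""
    beta = sigma_u / sigma_v                 # trading aggressiveness
    lam = 0.5 * sigma_v / sigma_u            # Kyle's lambda (price impact)
    resid_var = 0.5 * sigma_v ** 2           # var(v | y): half the prior variance
    profit = 0.5 * sigma_v * sigma_u         # insider's unconditional expected profit
    return beta, lam, resid_var, profit


def satisfies_equilibrium(beta, lam, sigma_v, sigma_u, tol=1e-12):
    """Check the two conditions: beta = 1/(2*lam) and lam = cov(v, y)/var(y)."""
    foc_ok = abs(beta - 1.0 / (2.0 * lam)) < tol
    projection = beta * sigma_v ** 2 / (beta ** 2 * sigma_v ** 2 + sigma_u ** 2)
    return foc_ok and abs(lam - projection) < tol
```

For instance, `kyle_static(2.0, 3.0)` returns an aggressiveness of 1.5 and a price impact of one third, and `satisfies_equilibrium` confirms that this pair solves both conditions.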
When $N$ traders observe the asset payoff, $v$, the market maker will see an order flow equal to
$y = \sum_{i=1}^{N} x_i + u$, and will set the price as in Eq. (9.61). We still conjecture that this
price is linear in $y$, as in Eq. (9.62), and that each insider trades on his information advantage
according to,
\[ x_i = \beta\,(v - \bar{v}). \tag{9.68} \]
Note, now, that while formulating his trading rule, each insider knows he will have a price
impact but that the other agents have price impact too. A Cournot-Nash equilibrium is one
in which (i) each trader formulates his optimal trading decision whilst taking the trading rules
adopted by his peers as being as in (9.68) and still, (ii) finds (9.68) to be optimal for him.
This market is referred to as Cournotian because it parallels the Cournot
(1838) model in Industrial Organization, in which a finite number of firms compete for the same
product (and, hence, the same price),7 and act by maximizing their profits based on residual
demands, just as in the model of this section. Naturally, traders formulate their strategies
while also conjecturing that the market maker bases his inference on the aggregate order flow,
as explained.
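Each trader maximizes his expected profits, as follows:
\[ \max_{x_i} E[(v - p)\,x_i \mid v], \qquad
   p = \bar{v} + \lambda\Big(x_i + \sum_{j \neq i} x_j + u\Big)
     = \bar{v} + \lambda\big(x_i + (N - 1)\beta\,(v - \bar{v}) + u\big), \]
such that,
\[ x_i = \underbrace{\frac{1 - \lambda\beta\,(N - 1)}{2\lambda}}_{\equiv \beta}\,(v - \bar{v}), \]
where the identity under the brace follows by the conjecture made in Eq. (9.68). That is,
\[ \beta = \frac{1}{\lambda\,(N + 1)}, \tag{9.69} \]
and one determines $\lambda$ similarly as in (9.64),
\[ \lambda = \frac{\mathrm{cov}(v, y)}{\mathrm{var}(y)}
           = \frac{N \beta \sigma_v^2}{N^2 \beta^2 \sigma_v^2 + \sigma_u^2}. \tag{9.70} \]
Solving Eqs. (9.69)-(9.70) leaves,
\[ \beta = \frac{1}{\sqrt{N}}\frac{\sigma_u}{\sigma_v}, \qquad
   \lambda = \frac{\sqrt{N}}{N + 1}\frac{\sigma_v}{\sigma_u}. \tag{9.71} \]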
Finally, competition amongst traders leads them to experience lower profits as their number
increases,
\[ E((v - p)\,x_i) = \frac{1}{\sqrt{N}\,(N + 1)}\,\sigma_v \sigma_u. \]
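The effect of the number of insiders on the equilibrium can be sketched in a few lines. The code assumes the symmetric linear equilibrium with `n` identically informed traders and the standard notation (`sigma_v`, `sigma_u`, `beta`, `lam`); names are hypothetical, and the checker simply re-evaluates the first-order and projection conditions.

```python
import math


def kyle_cournot(n, sigma_v, sigma_u):
    """Closed-form symmetric equilibrium with n identically informed traders."""
    beta = sigma_u / (math.sqrt(n) * sigma_v)                 # per-trader aggressiveness
    lam = math.sqrt(n) / (n + 1) * sigma_v / sigma_u          # price impact
    profit_each = sigma_v * sigma_u / (math.sqrt(n) * (n + 1))
    return beta, lam, profit_each


def satisfies_cournot_conditions(n, beta, lam, sigma_v, sigma_u, tol=1e-12):
    """Check beta = 1/(lam*(n+1)) and lam = cov(v, y)/var(y) with y = n*beta*(v - mean) + u."""
    foc_ok = abs(beta - 1.0 / (lam * (n + 1))) < tol
    projection = n * beta * sigma_v ** 2 / (n ** 2 * beta ** 2 * sigma_v ** 2 + sigma_u ** 2)
    return foc_ok and abs(lam - projection) < tol
```

With `n = 1` the output coincides with the single-insider case, and the per-trader profit shrinks as `n` grows.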
We now explain that some of these properties do not hold in markets where traders have
heterogeneous information.
9.8.2.2 Heterogeneity in private information
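Each trader $i$ observes a signal, $s_i$, and maximizes his expected profits conditional on it,
\[ \max_{x_i} E[(v - p)\,x_i \mid s_i], \qquad
   p = \bar{v} + \lambda\Big(x_i + \sum_{j \neq i} x_j + u\Big). \]
It is instructive to analyze the first-order conditions of this problem,
\[ 0 = E(v \mid s_i) - \bar{v} - 2\lambda x_i - \lambda \sum_{j \neq i} E(x_j \mid s_i). \tag{9.72} \]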
Note that while determining his optimal asset demand, $x_i$, each agent needs to forecast the
demand of his peers given the information he has access to, $E(x_j \mid s_i)$, while at the same time
fully understanding that his peers are doing exactly the same, in that $x_j$ reflects agent $j$'s
expectation of agent $i$'s (plus all other agents') expectations given $s_j$. Forecasting the
forecasts of others leads to an infinite regress problem, whereby agents end up making conjectures
about the conjectures others are making about them, ad infinitum.
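There is no guarantee that a fixed point exists to this reasoning. However, in a linear equilibrium,
a solution exists and is easy to describe. Conjecture that each agent trades on his informational
advantage according to,
\[ x_i = \beta\,(s_i - \bar{v}). \tag{9.73} \]
Then, each agent's forecast of the forecasts of others simplifies, collapsing as it does to
forecasting the informational advantages of his peers, $E(x_j \mid s_i) = \beta\,(E(s_j \mid s_i) - \bar{v})$.
Therefore, in this linear equilibrium, each agent can determine his optimal demand from the
first-order conditions (9.72), as follows:
\[ x_i = \frac{1}{2\lambda}\big(E(v \mid s_i) - \bar{v}\big) - \frac{N - 1}{2}\,\beta\,\big(E(s_j \mid s_i) - \bar{v}\big)
       = \underbrace{\frac{\phi\,(1 - \lambda\beta\,(N - 1))}{2\lambda}}_{\equiv \beta}\,(s_i - \bar{v}), \tag{9.74} \]
where we have used the conjecture that each agent trades according to (9.73) and the symmetry
of $E(s_j \mid s_i)$ across $j \neq i$; the second equality follows because, by the projection theorem,
\[ E(v \mid s_i) - \bar{v} = E(s_j \mid s_i) - \bar{v}
   = \underbrace{\frac{\sigma_v^2}{\sigma_v^2 + \sigma_\epsilon^2}}_{\equiv \phi}\,(s_i - \bar{v}), \tag{9.75} \]
where $\sigma_\epsilon^2$ denotes the variance of the noise in each signal; finally, the identity under
the brace in (9.74) follows by the conjecture made in Eq. (9.73).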
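Note that Eq. (9.75) says that the slope regression estimates of dividends and signals are the
same, and equal to $\phi$. Moreover, $\phi$ is also the correlation coefficient between the signals
traders have access to. The model collapses to the identical information market in the previous
section once $\sigma_\epsilon^2 = 0$, i.e. $\phi = 1$.
By Eq. (9.74),
\[ \beta = \frac{\phi}{\lambda\,(2 + \phi\,(N - 1))}, \tag{9.76} \]
and regarding the price impact, the Appendix shows that,
\[ \lambda = \frac{\mathrm{cov}(v, y)}{\mathrm{var}(y)}
           = \frac{N \beta \sigma_v^2}{N \beta^2 (\sigma_v^2 + \sigma_\epsilon^2)(1 + \phi\,(N - 1)) + \sigma_u^2}, \tag{9.77} \]
such that,
\[ \beta = \sqrt{\frac{\phi}{N}}\,\frac{\sigma_u}{\sigma_v}, \qquad
   \lambda = \frac{\sqrt{N\phi}}{2 + \phi\,(N - 1)}\,\frac{\sigma_v}{\sigma_u}. \tag{9.78} \]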
In equilibrium, trading aggressiveness, $\beta$, increases with the correlation of the traders' signals.
Intuitively, in this model, the signals the traders observe become less and less correlated as their
quality deteriorates,
\[ \phi = \frac{\kappa}{1 + \kappa}, \]
where $\kappa \equiv \sigma_v^2 / \sigma_\epsilon^2$ is a measure of the signals' quality. That is, as the
noise component increases (i.e. $\sigma_\epsilon^2$ increases), each trader has access to more and more
idiosyncratic, albeit noisy, pieces of information. Each trader then trades less aggressively as his
signal becomes less precise.8
The price impact in (9.78), $\lambda$, is increasing in $\phi$, provided $N$ is small enough. The mechanism
is the following. The market maker anticipates that an increase in the quality of the traders'
information (reflected by a higher $\phi$) results in more informative trading, which makes adverse
selection more severe: $\lambda$ increases with $\phi$ as a result. In other words, market makers face few
traders ($N$ is small), who in addition are equipped with high quality information, and $\lambda$ increases
with $\phi$ as a result. Note that when $N$ is large, $\lambda$ can be decreasing in $\phi$ provided the latter is
large enough. The reason is that as $N$ increases, the order flow becomes more informative (as
$N\beta$ is increasing in $N$): informed traders are then revealing their presence, and an increase
in $\phi$ would make their presence revealed even more, so to speak: adverse selection costs fall as
a result.
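The non-monotonic response of the price impact to signal quality can be explored with a short sketch. It assumes that each of the `n` traders observes the payoff plus independent noise, so that `phi` is both the projection slope and the signal correlation, and it uses closed forms obtained by solving the first-order and projection conditions under that assumption; they should be read as an illustration rather than as the text's own expressions. All names are hypothetical.

```python
import math


def kyle_heterogeneous(n, phi, sigma_v, sigma_u):
    """Assumed closed forms for the linear equilibrium with noisy private signals."""
    beta = math.sqrt(phi / n) * sigma_u / sigma_v
    lam = math.sqrt(n * phi) / (2 + (n - 1) * phi) * sigma_v / sigma_u
    return beta, lam


def satisfies_conditions(n, phi, beta, lam, sigma_v, sigma_u, tol=1e-12):
    """Re-check the trader's first-order condition and the projection formula for lam."""
    foc_ok = abs(beta - phi / (lam * (2 + (n - 1) * phi))) < tol
    var_signal = sigma_v ** 2 / phi                      # sigma_v^2 + noise variance
    var_y = n * beta ** 2 * var_signal * (1 + (n - 1) * phi) + sigma_u ** 2
    projection = n * beta * sigma_v ** 2 / var_y
    return foc_ok and abs(lam - projection) < tol
```

Evaluating the price impact at a low and a high value of `phi` shows the pattern discussed above: with two traders it rises with `phi`, while with ten traders it falls once `phi` is high enough.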
Price discovery deteriorates as the correlation decreases, as it can be easily verified that:
\[ \mathrm{var}(v \mid y) = \sigma_v^2 - \frac{\mathrm{cov}(v, y)^2}{\mathrm{var}(y)}
   = \frac{2 - \phi}{2 + \phi\,(N - 1)}\,\sigma_v^2. \]
Finally, the expected profits of each trader are,
\[ E((v - p)\,x_i) = \sqrt{\frac{\phi}{N}}\,\frac{\sigma_v \sigma_u}{2 + \phi\,(N - 1)}. \tag{9.79} \]
When $N$ is small, expected profits increase as $\phi$ increases, reflecting the fact that agents observe
signals with better quality. As $N$ increases, the expected profits might actually fall with $\phi$ when
$\phi$ is sufficiently high, reflecting a lower informational advantage, as explained earlier while
commenting on $\lambda$ in (9.78).
8 In Section 9.8.3, we discuss alternative signal structures such that a decrease in correlation does
not necessarily imply less precise signals but an increase in monopolistic information power.
9.8.2.3 Multiple dealers
Kyle (1989) considers a model in which uninformed and informed traders co-exist and act
strategically. Foucault, Pagano and Roell (2013) consider a simplified version of this model, in
which all traders are risk-neutral, and the uninformed traders are interpreted as dealers.
These dealers compete through a call auction mechanism, which leads to an equilibrium price such
that the dealers make profits on average. As in Kyle (1985), the order flow is
$y = \sum_{i=1}^{N} x_i + u$, but in the presence of $K$ dealers acting in an imperfectly competitive
market,
\[ y = \sum_{k=1}^{K} y_k(p), \tag{9.80} \]
where $y_k(p)$ is the supply schedule submitted by dealer $k$, a function of the price. The auctioneer
sets a price such that the schedules of the dealers sum up to the order flow.
Each dealer maximizes his expected profits taking the other dealers' (and the informed
traders') actions as given, as in a Cournot-Nash market,
max
((
| )
(9.81)
and one constraint that takes into account the nature of imperfect competition amongst dealers,
as we now explain.
Assume that the other dealers supply schedules are, for each ,
( )=
6=
(9.82)
for some to be determined in the equilibrium of the game. The meaning of is that dealer
sells if
0, and buys if
0. Replacing Eq. (9.82) into Eq. (9.80) leaves the residual
demand function for any dealer ,
= +
1)
(9.83)
1
1
(
1)
( | ) = (1
(
1) )
(9.84)
2
2
where the last equality follows by the projection theorem and, as usual, and assuming for
simplicity a single insider trader,
2
+
= 2 2
=
(9.85)
+
+ 2
=
The previous expression is the same as the usual Kyle's lambda (see Eq. (9.64)), although it
is not the price impact in this model unless the number of dealers is large.
We determine in Eq. (9.82) and then, we search for an equilibrium price by solving Eq.
(9.80). First, eliminate from Eq. (9.83) and Eq. (9.84), such that,
(1
( )
1)
(
1) ) (
1+ (
1)
{z
where the identity under the brackets follows by the conjecture made in Eq. (9.82). Therefore,
2
(9.86)
1)
Finally, Eq. (9.80), Eq. (9.82), and Eq. (9.86) imply that the equilibrium price is,
= +
1
2
(9.87)
To determine the overall equilibrium, we need to determine in (9.85), and then, in Eq.
(9.87). The usual maximization problem of the insider trader leads to,
1
=
2
(9.88)
=1
2
(9.89)
Dealers' market power makes asset markets less deep than in the perfect competition case of
Kyle (1985), because the price impact is decreasing in the number of dealers, as (9.89) reveals.
However, price discovery remains the same as in the Kyle (1985) model (see Eq. (9.67)),
( | )=
1
2
Intuitively, the insider trades through half of the market depth both in the perfectly competitive
case (see (9.63)) and in the imperfectly competitive case (see (9.88)).
Finally, the expected profits to the insider trader are,
((
1
) )=
2
(
(
2)
1)2
Consider an asset that pays off a dividend $v$ at some point in time, $T$. The asset can be
traded in $N$ distinct periods. A trader's cumulative position at the end of the trading period $n$
is $X_n = \sum_{i=1}^{n} \Delta x_i$, where $\Delta x_i$ denotes the position in the asset at the trading
period $i$ (these positions are not part of a self-financed strategy). His final profit is,
\[ \pi = v X_N - \sum_{n=1}^{N} p_n \Delta x_n, \]
where $\sum_{n=1}^{N} p_n \Delta x_n$ denotes the sum of the values of all the positions over the trading
period. Therefore,
\[ \pi = \sum_{n=1}^{N} (v - p_n)\,\Delta x_n. \tag{9.90} \]
Below, we shall utilize the continuous time counterpart to this expression while reviewing a
dynamic extension of Kyle's model.
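The accounting identity behind the trader's final profit is easy to verify on a small example. The sketch below uses hypothetical names (`prices` for the execution prices, `trades` for the position changes) and computes the profit both as the liquidation value of the final position minus the cost of the trades, and as the sum of per-period gains.

```python
def profit_from_position(v, prices, trades):
    """Liquidation value of the final position minus the total cost of the trades."""
    final_position = sum(trades)
    total_cost = sum(p * dx for p, dx in zip(prices, trades))
    return v * final_position - total_cost


def profit_from_period_gains(v, prices, trades):
    """Sum over periods of (payoff - execution price) times the trade in that period."""
    return sum((v - p) * dx for p, dx in zip(prices, trades))
```

With a payoff of 10, prices `[8, 9, 11]` and trades `[1, 2, -1]`, both functions return 5.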
9.8.3.2 Monopolistic trader
How does the insider trader dilute his information when allowed to trade over different batch
auctions? We analyze the continuous time version of Kyle's model, in which trading occurs over
a finite horizon fixed to [0, 1]. As in the baseline market, the insider trades based on his available
information, and the market maker (many of them, actually) updates his beliefs regarding the
fundamental asset value based on having observed the aggregate order flow.
The aggregate cumulative orders at any $t \in [0, 1]$, $y_t$ say, comprise two components: (i) the
insider's cumulative demand, $x_t = \int_0^t dx_s$, and (ii) the cumulative orders by the liquidity
traders, $\sigma_u W_t$, where $\sigma_u$ is a constant, and $W_t$ is a standard Brownian motion. Therefore,
$y_t = x_t + \sigma_u W_t$, such that the aggregate order flow satisfies:
\[ dy_t = dx_t + \sigma_u\,dW_t. \tag{9.91} \]
The market maker does not make any profits while he sets the price according to the semi-strong
efficiency rule paralleling that in Eq. (9.61),
\[ p_t = E\big(v \mid (y_s)_{s \in [0, t]}\big). \tag{9.92} \]
The insider trader maximizes his expected gains from trade, the continuous time version of
Eq. (9.90):
Z 1
(9.93)
)| =0
max
(
)
(
=
( )
0
[0 1]
Motivated by the market behavior in the static setting of the previous sections, we conjecture
that the optimal trading strategy in (9.93) is
=
for some deterministic function
(9.94)
)
(9.95)
| ( ) [0 ] +
+ =
(
| ( ) [0 ] )
+
( +
)
=
| ( ) [0 ] +
( +
| ( ) [0 ] )
=
(9.96)
where the second equality follows by the projection theorem, and the third by Eq. (9.92), Eq.
(9.95) and the following denition of residual variance,
(
)2 ( ) [0 ]
(9.97)
The residual variance, $\Sigma_t$, can be interpreted as a gauge of informational efficiency, or of the
price discovery process, i.e. the market maker's inference about the asset payoff. We shall see that
in equilibrium, price discovery will be complete, in that $\lim_{t \to 1} \Sigma_t = 0$.
In the limit, and disregarding 2 terms, the price in (9.96) satises,
=
(9.98)
We now determine and , and then use (9.98) to determine , thereby having determined
the trading strategy in Eq. (9.94). Later, we shall verify that the thusly determined strategy
is indeed optimal. Regarding , suppose that
becomes available over the time interval .
Then,
(
)2 ( ) [0 + ]
+ =
2
(
| ( ) [0 ] )
+
2
) ( ) [0 ]
=
(
( +
| ( ) [0 ] )
2
2
2
in (9.98),
2 2
(9.99)
(1
(9.101)
That is, the pace of information revelation is constant in this market. Eqs. (9.100), (9.101)
and (9.98) now imply that the trading aggressiveness in (9.94) is,
=
1
(1
(1
(9.102)
Eqs. (9.100), (9.101) and (9.94) complete the description of the market equilibrium, once we
prove that the trading strategy in Eq. (9.94) is optimal with the trading aggressiveness as in
(9.102). The proof proceeds in two steps. First, we show that the proposed strategy leads the price
process to converge to the fundamental value (no money left on the table); and second, we show that
any strategy is optimal when it satisfies this property.
Regarding convergence to fundamental value, note that by replacing (9.91) into (9.98), using
the expressions for in (9.94), and the expressions for and
in (9.100) and (9.102),
=
(9.103)
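Convergence of the price to the fundamental value can be visualized with a simple simulation. The sketch below relies on the standard equilibrium of the continuous time Kyle model, taken here as an assumption: a constant price impact and a trading aggressiveness proportional to 1/(1 - t), so that the price follows dp = (v - p)/(1 - t) dt + sigma0 dW, a Brownian bridge pinned at the payoff `v` at time 1. The function and parameter names are hypothetical, and the Euler scheme is only an approximation.

```python
import math
import random


def simulate_kyle_price(v, p0, sigma0, n_steps=10_000, seed=0):
    """Euler discretization of dp = (v - p)/(1 - t) dt + sigma0 dW on [0, 1]."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    dt = 1.0 / n_steps
    p = p0
    path = [p]
    for k in range(n_steps):
        t = k * dt                     # t < 1 at every step, so 1 - t > 0
        drift = (v - p) / (1.0 - t)    # pull towards the payoff, stronger near t = 1
        shock = sigma0 * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        p += drift * dt + shock
        path.append(p)
    return path
```

The terminal value of the simulated path lies close to `v` whatever the starting price, which is the "no money left on the table" property in discrete form.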
Now, by Karatzas and Shreve (1991, Corollary 6.10, p. 359), the process,
=
, is a
1
(
(9.104)
) + (
)+ (
)
+
(
) 2 2
0 = max
2
where the expectation is taken conditional upon the price, and where we have used
=
and the expression of (9.91) for . As originally noted by Back (1992), the maximand in (9.104)
is linear in such that the terms multiplying must sum up to zero, with the remaining terms
summing up to zero as well, viz
+
=0
(9.105)
1
( )+ 2
( ) 2 2=0
Eqs. (9.105), and the boundary condition
the value function in (9.93),
(
)=
1
(
2
(1
1)
)2 +
=
+
(1
, say
)
(
(9.106)
). By Itos lemma,
where
denotes the usual innitesimal generator, which by Eq. (9.93) satises 0 =
(
) . Therefore,
Z 1
=
=
(
)
(0
)
(1
)
0
1
9 Note
of
2
Note that by evaluating Eq. (9.106) at = 1, we conclude that (1 1 ) = 21 (
0.
1)
Foster and Viswanathan (1996) generalize the dynamic market of Kyle (1985), by relying on
a discrete time setting with multiple traders ( say), and a general correlation structure of
the signals (still assumed to be Gaussian). They assume that each trader observes a signal
correlated with the asset payo , which has variance 0 , and that any two signals have the same
0 10
correlation equal to
.
0
One leading example of signals in this setting is the exhaustive information structure,
arising when the sum of all signals is the truth, i.e. all traders would form a mega Kyle
trader upon information collusion, $v = \sum_{i} s_i$, where $s_i$ is the signal available to trader $i$.
It is immediate to verify that in this case, for each $i$,
(
)=
+(
1)
1)
,
1
(
(9.107)
1)
2
0
The correlation coefficient between signals is a measure of monopolistic information power amongst
traders. The lower the correlation, the more unique a signal is to each trader, and the higher is his
monopolistic power. Foster and Viswanathan (1996) show, numerically, that in a dynamic context,
traders engage in a rat race once the correlation is high: because traders have comparable
information, they trade very aggressively from the beginning of the trading period to avoid being
preempted by others, thereby revealing virtually all their information over the first trading rounds.
In fact, Back, Cao and Willard (2000) consider a continuous time model along the same lines, and
conclude that an equilibrium fails to exist when the signals are perfectly correlated, due to this
behavior.
However, when the correlation is small, and the degree of monopolistic information power is high as
a result, traders engage in a waiting game, trading little at the beginning of the trading period, and
aggressively towards its conclusion. These trading patterns lead to a price discovery process that
occurs at a slow pace at the beginning of the trading period, and at a high pace at the end. Such
patterns of price discovery cannot be generated by the constant information flow predicted by the
single insider trader's model of Kyle (1985), as summarized by Eq. (9.101).
10 The
variance-covariance matrix of the signals available to all traders is, then, invertible, provided
1)
0.
We have,
( | )=
(9A.1)
and
1
( | )=
(9A.2)
A proof of this result can be obtained as follows. Consider the following regression
)+
= (
( )=
(9A.3)
, which
( | )
= +
0 2
= 1
(9A.4)
When
= 1,
( | )=
( | )=
( | )=
( | )=
=1
1
P
=1
1
+
(9A.5)
1
(9A.6a)
(9A.6b)
=1
Eqs. (9A.6a)-(9A.6b) simplify once the precisions of the signals are all the same,
leaving,
1X
=1
1
+
for all ,
(
| )+ 12 2
(
| )
U =
=
where, by the expression of in (9.26),
(
| )=(
such that,
U =
)2
( ( | )
(
0+ 0
1
2
| )=
( ( | )
)2
( ( | )
)2
(9A.7)
Instead, by the Law of Iterated Expectations, the expected utility of the would-be informed investors
before accounting for information costs is
U =
1
( 0+ 0 )
)2
2 |( ( |)
=
1
( 0+ 0 )
)2
| ( ( | )
2
(9A.8)
=
To determine the inner expectation in the last line of (9A.8), we rely on the following distributional
result, shown below,
)
(
)
| ( ( | )
(9A.9)
|
)
1
| ( ( | )
|
), then
1 2
2
2
1
2 1+
1
1+
2
1
|
( 0+ 0 )
((
(
|
)
))
=
U =
2 |
|
|
|
where the second equality follows by the expression of U in Eq. (9A.7). Eq. (9.35) follows by the
previous expression of U .
We are left to show that (9A.9) holds true. Regarding the conditional expectation , note that
( | )= ( |
), such that, by the Law of Iterated Expectations and the fact that the information
content of is coarser than that of (
),
(
Regarding the expression for
( | )| ) =
( |
)| ) =
( | )
( | )
=
=
(
(
( | )| ) +
( | )| ) +
(
2
|
( | )| )
( | )| ) =
2
|
=
=
1
( )
( )
( )
2(
( )
(9A.10)
=
) =
=
=
)
=
1
1X
=
=
2
2
2
+2
2+
2
+
=1
(9A.11)
+(
2 2
1)
+(
1)
Plugging (9A.11) into Eq. (9A.10), leaves, after tedious but straightforward calculations,
=
=
2 2 2
(9A.12)
where
( )
2 2
(9A.13)
Next, replace (9A.12) into the expressions of the price coe cients, Eqs. (9.43), which leaves the
following expressions for and :
2 2 2
1
2
=
(
|
|
|
, prior to determining
21
and
2 2
(9A.14)
in (9A.14). By the
2 2
(9A.15)
21
=
and
21
1
1
2
2
(9A.16)
2 2
2 2
2 2
2 2
2 2
(9A.17)
To show that there exists a unique solution for and , dene the constant through the following
relation,
1
1
21 1
+ 2 2 2
1 1 2+ 2 2 2
=
=
(9A.18)
2 2
2 2
where the rst equality follows by Eqs. (9A.16) and the second by the very same denition of . It is
say. Given , we determine
easy to see that there exists a unique solution for to Eq. (9A.18),
in (9A.16) by expressing it as a function of , and relying on the denition of
in (9A.13), as
follows:
2 2 2
2 2 2
=
where
1
, and then
1
2
2 2
1
2
(9A.19)
2 2 2
Eq. (9.44) follows after using the denitions of precisions in the previous expression, and after rearranging terms.
The limiting case in Eq. (9.46) is obtained after taking the limits in Eq. (9.44) for large, and
noting that,
lim
Finally, the limiting conditional precision in (9.47) is obtained after taking the limit in Eq. (9A.17)
for large, and using the previous denition of .
Proof of Eq. (9.77). By Eq. (9.73), the order ow is,
=
1 X
=1
(9A.20)
=1
1 X
=1
(9A.21)
)=
(9A.22)
and,
( )=
2 2
() +
2 2
1)
2
(9A.23)
1 2
(9A.24)
Proof of Eq. (9.79). Note that by Eq. (9A.20), the equilibrium price is,
= +
+
1
2
1
((
) )=
1
2
=
(1
) 2
=
(2
)
2+(
1)
where the second line follows by the expression for the average signal in Eq. (9A.21) and by rearranging
terms, and the third by using the expression for
in Eq. (9.76). Eq. (9.79) follows by the denition
of in (9A.24) and the expression for in (9.78).
[0 1]
(9A.25)
is the nominal money holdings, is a general price index (to be determined below; see Eq.
where
is consumption of a basket of goods (to be dened in a moment), and
is labor supplied;
(9A.32)),
the parameters satisfy
(0 1),
0,
1. In words, each producer enjoys consuming a basket of
goods but suffers a disutility while working.
We assume that the basket of goods contains CES (Constant Elasticity of Substitution) substitutes
as in Dixit and Stiglitz (1977), that is,
=
1)
(9A.26)
(9A.27)
is the price of his produced good and is his money endowment. It is easy to see that
where
the solution to this problem is symmetric, in that each consumer-producer chooses the same basket
composition (albeit achieving different consumption quantities of this same basket, depending on
income). We can, then, define the price index in (9A.25) as the minimal expenditure needed to
purchase a unit of the composite good
in Eq. (9A.26),
:
(9A.28)
We now determine demand for both the single and composite goods.
Goods demand and price index. We determine
rst. Replacing Eq. (9A.28) into the budget
+
= , leads
constraint (9A.27) and maximizing (9A.25) subject to the resulting constraint,
to a standard result:
=
= (1
)
(9A.29)
. It is
=
(9A.30)
where the second equality of the constraint follows by Eqs. (9A.29). Note that this program is simply
maximization of (9A.25) under the constraint (9A.27). The rst order conditions lead to
1
1 1
=
whence, the denition of as the CES between any two consumption goods. Note that to simplify
. Replacing the previous optimality condition into
notation, we are now reverting to writing
the constraint of (9A.30), and using the denition of the composite good in Eq. (9A.26), leads to
=
=
(9A.31)
where the second equality follows by the second equality of the constraint in (9A.30).
The price index is obtained by solving for $P$ in Eq. (9A.28) while relying on the first equality in
Eq. (9A.31), leaving
\[ P = \left( \int_0^1 p_j^{1 - \theta}\,dj \right)^{\frac{1}{1 - \theta}}. \tag{9A.32} \]
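The defining property of the CES price index, namely that it is the minimal expenditure needed to buy one unit of the composite good, can be checked numerically. The sketch below assumes the standard Dixit-Stiglitz aggregator with elasticity `theta` and approximates the unit continuum of goods with a finite list of equally weighted goods; the function names and this discretization are assumptions made for illustration.

```python
def ces_price_index(prices, theta):
    """Price index: equally weighted power mean of order (1 - theta) of the prices."""
    n = len(prices)
    return (sum(p ** (1 - theta) for p in prices) / n) ** (1 / (1 - theta))


def ces_demands(prices, theta, composite):
    """Demand for each good when the consumer targets `composite` units of the basket."""
    index = ces_price_index(prices, theta)
    return [(p / index) ** (-theta) * composite for p in prices]


def ces_aggregate(quantities, theta):
    """Composite good implied by a list of quantities (equal weights 1/n)."""
    n = len(quantities)
    e = (theta - 1) / theta
    return (sum(c ** e for c in quantities) / n) ** (1 / e)
```

For any list of positive prices, the demands returned by `ces_demands` deliver exactly `composite` units of the basket, and their average cost equals the price index times `composite`.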
Finally, we determine the indirect utility of consumer-producer , by replacing Eqs. (9A.29) into
Eq. (9A.25),
=
(9A.33)
+
where is a constant and where we have used the agent budget constraint and the production technology = . Note that this expression resembles that of a maximizing rm. One issue is whether the
in
is a ected by his decisions regarding
consumer-producer has pricing power, that is, whether
. We assume it is the case below, although then this is not crucial for the interpretation of the pricing
equations in Section 9.2.
Production and equilibrium. The aggregate demand for the good produced by the consumerproducer is obtained by aggregating the individual demands for this good,
Z 1
Z 1
=
=
(9A.34)
0
where we have used the second equality in (9A.31), and where the last equality holds by the following
denition of aggregate demand, ,
Z 1
Z 1
Z 1 Z 1
=
=
(9A.35)
0
Let denote the aggregate money endowment. We can solve for , by noticing that
Z 1
Z 1
+
=
=
+
=
0
R1
0
, such that,
Z 1
=
=
0
(9A.36)
condition equilibrium,
= 0
; and the second by (9A.35). By Eq. (9A.36), the solution for
is then
=
1
which replaced into Eq. (9A.34) leaves the aggregate demand facing consumer-producer
=
(9A.37)
In the context of monopolistic competition of this appendix, one could solve for the optimal pricing
rule. The latter is obtained by replacing Eq. (9A.37) into Eq. (9A.33) and maximizing with respect to
, leaving
"
# 1
1 1+ ( 1)
=
(9A.38)
(
1)
1
We now proceed with the interpretation of some basic assumptions in Section 9.2 in light of the
framework developed so far.
Relations with model in Section 9.2. Eq. (9A.37) provides foundations to the demand for each
product in the model of Section 9.2 (see Eq. (9.3)). Regarding the supply equation (see Eq. (9.1)), note
that assuming market power and that is common knowledge leads to Eq. (9A.38), which could be
replaced back into Eq. (9A.37) to determine the equilibrium production in each market. Alternatively,
one could simplify and remove the assumption of market power. Assume, further, that is not common
knowledge, in which case producers maximize their expected utility in Eq. (9A.33),
!
arg max
=
Assume that ln
| )
(9A.39)
References
Back, K. (1992): Insider Trading in Continuous Time. Review of Financial Studies 5, 387-409.
Back, K., C.H. Cao and G.A. Willard (2000): Imperfect Competition Among Informed
Traders. Journal of Finance 55, 2117-2155.
Black, F. (1986): Noise. Journal of Finance 41, 529-543.
Blanchard, O.J. and N. Kiyotaki (1987): Monopolistic Competition and the Effects of Aggregate Demand. American Economic Review 77, 647-666.
Blanchard, O. and S. Fischer (1989): Lectures on Macroeconomics. Cambridge, MIT Press.
Chamberlin, E.H. (1933): The Theory of Monopolistic Competition: A Re-orientation of the
Theory of Value. Harvard: Harvard University Press.
Cournot, A.A. (1838): Recherches sur les Principes Mathematiques de la Theorie des Richesses.
Paris: Hachette.
DeLong, J.B., A. Shleifer, L.H. Summers and R.J. Waldmann (1990): Noise Trader Risk in
Financial Markets. Journal of Political Economy 98, 703-738.
Diamond, D.W. and R.E. Verrecchia (1981): Information Aggregation in a Noisy Rational
Expectations Economy. Journal of Financial Economics 9, 221-235.
Dixit, A.K. and J.E. Stiglitz (1977): Monopolistic Competition and Optimum Product Diversity. American Economic Review 67, 297-308.
Duffie, D. (2012): Dark Markets: Asset Pricing and Information Transmission in Over-the-Counter Markets (Princeton Lectures in Finance). Princeton: Princeton University Press.
Duffie, D., N. Garleanu and L.H. Pedersen (2005): Over-the-Counter Markets. Econometrica
73, 1815-1847.
Duffie, D., N. Garleanu and L.H. Pedersen (2007): Valuation in Over-the-Counter Markets.
Review of Financial Studies 20, 1865-1900.
Fama, E. (1970): Efficient Capital Markets: A Review of Theory and Empirical Work. Journal of Finance 25, 383-417.
Foster, F.D., and S. Viswanathan (1996): Strategic Trading When Agents Forecast the Forecasts of Others. Journal of Finance 51, 1437-78.
Foucault T., M. Pagano and A. Roell (2013): Market Liquidity: Theory, Evidence and Policy.
Oxford: Oxford University Press.
Garleanu, N., L.H. Pedersen and A.M. Poteshman (2009): Demand-Based Option Pricing.
Review of Financial Studies 22, 4259-4299.
Glosten, L.R. and P.R. Milgrom (1985): Bid, Ask and Transaction Prices in a Specialist
Market with Heterogeneously Informed Traders. Journal of Financial Economics 14,
71-100.
Greenwood, R. and D. Vayanos (2014): Bond Supply and Excess Bond Returns. Review of
Financial Studies 27, 663-713.
Gromb, D. and D. Vayanos (2002): Equilibrium and Welfare in Markets with Financially
Constrained Arbitrageurs. Journal of Financial Economics 66, 361-407.
Grossman, S.J. (1976): On the Efficiency of Competitive Stock Markets where Traders Have
Diverse Information. Journal of Finance 31, 573-585.
Grossman, S.J. and J.E. Stiglitz (1980): On the Impossibility of Informationally Efficient
Markets. American Economic Review 70, 393-408.
Hayek, F.A. (1945): The Use of Knowledge in Society. American Economic Review 35,
519-530.
Karatzas, I. and S.E. Shreve (1991): Brownian Motion and Stochastic Calculus. New York:
Springer Verlag.
Kyle, A.S. (1985): Continuous Auctions and Insider Trading. Econometrica 53, 1335-55.
Kyle, A.S. (1989): Informed Speculation with Imperfect Competition. Review of Economic
Studies 56, 317-356.
Hellwig, M.F. (1980): On the Aggregation of Information in Competitive Markets. Journal
of Economic Theory 22, 477-498.
Lange, O. (1936): On the Economic Theory of Socialism: Part I. Review of Economic Studies
4: 53-71.
Lange, O. (1942): The Foundations of Welfare Economics. Econometrica 10: 215-228.
Lucas, R.E. (1972): Expectations and the Neutrality of Money. Journal of Economic Theory
4, 103-124.
Lucas, R.E. (1973): Some International Evidence on Output-Inflation Tradeoffs. American
Economic Review 63, 326-334.
Lucas, R.E. (1977): Econometric Policy Evaluation: A Critique. Carnegie-Rochester Conference Series on Public Policy 1, 19-46.
Lucas, R.E. (1981): Studies in Business-Cycle Theory. Boston, MIT Press.
Phelps, E.S. (1970): Introduction. In: Phelps, E. S. (Editor): Microeconomic Foundations of
Employment and Inflation Theory, New York: W. W. Norton.
Part III
Asset pricing and reality
10
Options and volatility
10.1 Introduction
This is the first of four chapters devoted to illustrating how financial theory can be applied to
cope with the pricing of derivatives and related instruments. We actually know that much of the
theory in Part I of these lectures was motivated as an attempt to rationalize the breakthrough
made by Black and Scholes (1973) and Merton (1973) to price European options. How come
we could even price an asset without making reference to any risk-aversion correction? The
theory in Part I of these lectures explains the rationale behind this and related results. We now
apply the theory to explain how to price assets in markets more realistic than those originally
idealized by Black, Scholes and Merton.
We face a paradox known since at least Hakansson (1979). If the Black & Scholes formula
is true, we should acknowledge that markets are complete, as market completeness is the
assumption needed to argue about the redundancy of the option and, then, the whole Black &
Scholes theoretical construct. But if the option is redundant, why would we be willing to trade it
in the first place? Alternatively, the option is not redundant (and options are massively traded
indeed), but then the Black & Scholes formula is wrong, in that it relies on the counterfactual
assumption that markets are complete. Indeed, in practice, many derivatives are traded
over-the-counter, with financial intermediaries specializing in providing counterparties with payoffs.
Financial intermediation is also about matching the clients' needs regarding the obtention of
dedicated payoffs against a fee, on top of the fair value of the derivative. The fee might be
justified by the specialization required to cope with the sources of market incompleteness, as
well as by the risk that the intermediary will incur losses due to its obligation to honour the
payoffs promised to clients.
This chapter analyzes a form of market incompleteness, arising when the volatility of the
assets underlying these derivatives is random, and cannot be hedged through the underlying
assets.
[Introduction in progress]
[Plan of the chapter]
Let $S^{co}_T$ and $S^{ch}_T$ be the prices of cocoa and chocolate at time $T$, and $F^{co}$ and
$F^{ch}$ be the corresponding forward prices, set at time $t$. To insure against the vagaries of
cocoa prices, the producer can go long a forward contract. This contract guarantees a payoff equal
to $S^{co}_T - F^{co}$ at time $T$. This payoff, minus the unit input cost $S^{co}_T$ to incur at
time $T$, leaves exactly $-F^{co}$. That is, a forward contract allows the producer to freeze the
sure amount $F^{co}$ to be paid at $T$.
Likewise, the producer may wish to short a forward on the price of his own chocolate; indeed,
this position would allow him to receive a payoff equal to $F^{ch} - S^{ch}_T$ at $T$, which added
to the price of chocolate sold at $T$, $S^{ch}_T$, leaves the sure amount $F^{ch}$.
10.2.1 Forwards: definition and pricing in frictionless markets
If markets are frictionless, and the underlying asset is traded (a stock, say), forward contracts
can be synthesized as follows. Let $b(t,T)$ be the price at time $t$ of a bond expiring at time $T$
and $S(t)$ the price of a stock. Assuming the short-term rate is constant, we have
$b(t,T) = e^{-r(T-t)}$, where $r$ denotes the short-term rate, which is the same for borrowing and
lending. So at time $t$, borrow $b(t,T)F$ and buy the stock, choosing $F$: $b(t,T)F - S(t) = 0$.
The value of this portfolio at time $T$ is $S(T) - F$. But the portfolio is worthless at time $t$,
so this trade is the same as a forward. Therefore, we have $F = F_t(T)$, where:
$$F_t(T) = e^{r(T-t)}S(t). \qquad (10.1)$$
Therefore, forwards are insensitive to volatility in this introductory example. They actually
might be volatility-sensitive under circumstances clarified below (see the discussion after Eq.
(10.2)) and in the following sections.
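As a minimal numerical sketch of the replication argument behind Eq. (10.1), the snippet below computes the forward price of a traded asset and checks that buying the stock with borrowed money pays the stock price at expiration minus the forward price, whatever the terminal price. The function names and all numbers are illustrative assumptions, not part of these lectures.

```python
import math

def forward_price(spot, r, tau):
    """Forward price of a traded asset under a constant rate r, over tau = T - t."""
    return spot * math.exp(r * tau)

def levered_stock_payoff(spot_t, spot_T, r, tau):
    """Payoff at T of buying one share at t, fully financed by borrowing at rate r."""
    debt_due = spot_t * math.exp(r * tau)  # amount to repay at T
    return spot_T - debt_due

spot, r, tau = 100.0, 0.05, 0.5  # illustrative inputs (assumptions)
F = forward_price(spot, r, tau)
for spot_T in (80.0, 100.0, 120.0):
    # The zero-cost levered position and the long forward pay the same amount
    gap = levered_stock_payoff(spot, spot_T, r, tau) - (spot_T - F)
    print(spot_T, round(gap, 12))
```

Volatility does not enter `forward_price` at all, which is the sense in which the forward is insensitive to volatility in this introductory example.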
The pricing of a forward on a stock can be extended to cases where the underlying is not
traded. Suppose that a risk materializes at time $T$, say the level of average temperature over
a certain period preceding $T$, and over a pre-specified geographical area, or say the realized
volatility experienced by the S&P 500 Index over the month preceding $T$. Denote this risk with
$S(T)$. The payoff of a forward contract from the perspective of its buyer is given by
$S(T) - F_t(T)$, such that by risk-neutral evaluation,
$$F_t(T) = \mathbb{E}_t(S(T)), \qquad (10.2)$$
where $\mathbb{E}_t$ denotes the risk-neutral expectation. Naturally, Eq. (10.2) collapses to
Eq. (10.1), should $S(T)$ be the price of a traded stock.
In general, though, Eq. (10.2) reveals that it is unlikely we could come up with a
preference-free evaluation of a non-traded risk, as expected from the theory developed in Part I of
these lectures. In these cases, the volatility of $S(T)$ may well affect the forward price
$F_t(T)$ due to risk-premiums.
There are exceptions in which the pricing in Eq. (10.2) is model-free even if $S(T)$ is not traded.
An important, relatively recent advance is to have shown how to proceed with the model-free
evaluation of non-traded risks, provided a sufficiently high number of additional derivatives
are written on this risk. The CBOE-VIX index of expected volatility does rely on this idea, as
explained in Section 10.8.
10.2.2 Forwards as a means to borrow money
Forward contracts can be used to borrow money. We can do the following: (i) go long a forward,
which at time $T$ delivers the payoff $-F_t(T) + S(T)$; (ii) short-sell the underlying asset,
which at time $T$ will give rise to a payoff of $-S(T)$. So, (i) and (ii) are such that now, we
have access to $S(t)$ dollars, due to (ii), and at time $T$, we pay $F_t(T)$, i.e. the sum of the
two payoffs resulting from (i) and (ii) is $-F_t(T)$. By Eq. (10.1), $F_t(T) = e^{r(T-t)}S(t)$, so
this is tantamount to borrowing money at the interest rate $r$.
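To make the borrowing interpretation concrete, the following sketch backs out the continuously compounded rate implied by receiving the spot price today and paying the forward price at expiration. The helper name and the numbers are hypothetical, chosen only for illustration.

```python
import math

def implied_borrowing_rate(cash_now, forward, tau):
    """Rate implied by receiving cash_now today and paying forward after tau years."""
    return math.log(forward / cash_now) / tau

S_t, r, tau = 50.0, 0.04, 2.0        # illustrative inputs (assumptions)
F = S_t * math.exp(r * tau)          # forward price of a traded asset, constant rate
print(implied_borrowing_rate(S_t, F, tau))
```

When the forward is priced off the same constant rate, the implied rate is that rate itself; a quoted forward above or below this level would signal cheaper or dearer financing than the money market.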
10.2.3 Marking to market
Consider a derivative we go long at time $t = 0$, when it is worthless. As time unfolds, its value
will change, which calls for marking it to market. Suppose the derivative pays off $S(T) - K_0$
at time $T$, where $S(T)$ is the price of some asset as of time $T$, and $K_0$ is set so as to make
the derivative worthless at time zero. Assuming that interest rates are constant, we have that
$K_0: \mathbb{E}_0[S(T) - K_0] = 0$, where $\mathbb{E}_0$ is the expectation at time $t = 0$, taken
under the risk-neutral probability. That is, $K_0 = \mathbb{E}_0[S(T)]$. The market value of the
derivative at time $t$, say $\mathrm{MtM}_t$, is simply the present value of the expected payoff at
$T$, under the risk-neutral probability, $e^{-r(T-t)}\mathbb{E}_t[S(T) - K_0]$, or
$$\mathrm{MtM}_t = e^{-r(T-t)}\left(F_t(T) - K_0\right), \qquad (10.3)$$
where $F_t(T) = \mathbb{E}_t[S(T)]$, as in Eq. (10.2).
For more elaborate payoffs, such as those depending on the realizations of the underlying risks
over the life of the contract, marking to market updates may be more intricate than that in
Eq. (10.3), as in the case of the variance contracts (see Section 10.7.3).
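The mark-to-market update can be sketched numerically as follows, assuming Eq. (10.3) is the discounted difference between the current forward price and the delivery price fixed at inception, and that the underlying is a traded asset so that the forward price follows Eq. (10.1). Names and numbers are illustrative assumptions.

```python
import math

def mark_to_market(forward_now, delivery_price, r, tau):
    """Discounted gap between today's forward price and the price fixed at inception."""
    return math.exp(-r * tau) * (forward_now - delivery_price)

r, T = 0.03, 1.0
S0 = 100.0
K0 = S0 * math.exp(r * T)            # delivery price making the contract worthless at t = 0
print(mark_to_market(K0, K0, r, T))  # zero value at inception

t, S_t = 0.5, 108.0                  # assumed asset price six months later
F_t = S_t * math.exp(r * (T - t))    # forward price at t for a traded asset
mtm = mark_to_market(F_t, K0, r, T - t)
print(mtm)
```

For a traded asset, the value reduces to the current price minus the initial price grown at the riskless rate, which gives a quick sanity check on the computation.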
10.2.4 Futures
Forward contracts are typically OTC, and not standardized, and might not be traded after
their inception. Futures are, instead, standardized. The cost of entering into a future contract
is zero, as for a forward. However, the central feature of future contracts is marking-to-market,
which forces their value to be zero at any time we wish to enter into them after their inception.
Note, in contrast, that in general, we should have to pay (or be paid) to enter into a forward
contract after its inception. Indeed, the payoff at $T$ pertaining to a forward that we enter at
time $t$ is $S(T) - F_t(T)$, such that similarly as for Eq. (10.3), the value of the forward at
$\tau \in (t,T)$ (and originated at $t$) is as in the following mark-to-market update,
$$e^{-r(T-\tau)}\,\mathbb{E}_\tau\left(S(T) - F_t(T)\right) = e^{-r(T-\tau)}\left(F_\tau(T) - F_t(T)\right).$$
By construction, forwards are not standardized, in that their value (the cost of entering
into them) depends on when we enter them! Moreover, in OTC markets, it is often the case
that the appropriate marking notion is that of a mark-to-model, rather than mark-to-market.
Futures work differently, as the cost of entering into them is always zero, as mentioned.
Precisely, let $\tilde F_t(T)$ be the price process of the future at time $t$. Intuitively, while
we hold a future position, we only pay or receive differences up to the maturity $T$,
$\tilde F_{t_i}(T) - \tilde F_{t_{i-1}}(T)$, over pre-specified periods, $t_i - t_{i-1}$.1
Finally, we could close any position at any time before $T$ by just initiating an opposite position.
For reasons developed below, let us assume that the short-term rate $r(t)$ is random. The two
defining properties of $\tilde F_t(T)$ are that (i) at maturity, the futures settle at $S(T)$, i.e.
$\tilde F_T(T) = S(T)$ (a boundary condition that is part of the futures security design), and
(ii) the gains generated by $\tilde F_t(T)$ are continuously credited to/debited from the future
holder's account, such that in absence of arbitrage,
$$0 = \mathbb{E}_t\left[\int_t^T e^{-\int_t^u r(s)ds}\,d\tilde F_u(T)\right], \qquad F_t(T) = \frac{1}{b(t,T)}\,\mathbb{E}_t\left[e^{-\int_t^T r(s)ds}\,S(T)\right],$$
where $b(t,T)$ is the price of a zero coupon bond and $F_t(T)$ is the forward price. In other
words, the future price is a martingale under the risk-neutral probability, while the forward is
not. In Chapter 12, we show that forward prices are martingales under another probability,
referred to as the forward probability (see also Chapter 4). Naturally, both futures and forward
prices are martingales under the risk-neutral probability, once interest rates are constant or
deterministic, although this case is obviously not relevant whilst dealing with the evaluation of
fixed income securities.
10.2.5 Backwardation and Contango
Let us assume interest rates are constant, such that forwards and futures are valued the same.
A natural question arises: do markets price forwards or futures at a discount or premium?
We have two notions of discounts: one, regarding expectations of future prices; and a second,
more operational, regarding current prices.
As for the first notion, we say markets are in Backwardation if the forward price is lower than
the expected spot price, $F_t(T) = e^{r(T-t)}S(t) < E_t(S(T))$, and in Contango otherwise, i.e.
$F_t(T) > E_t(S(T))$, where $E_t$ denotes the expectation taken under the physical probability.
Clearly, markets are in Backwardation if prices follow Geometric Brownian motions with $\mu > r$,
as $E_t(S(T)) = e^{\mu(T-t)}S(t)$, where $\mu$ is the drift coefficient under the physical
probability. Keynes (1930) and Hicks (1939) would refer to markets in Backwardation as being the
standard situation, one of Normal Backwardation.
1 In addition to these margins, one is also typically due to provide an initial margin, which aims to mitigate concerns regarding
the solvency of the trading parties.
[Figure: forward and spot prices plotted against time, in two panels titled Backwardation and Contango.]
The presence of a convenience yield is an alternative assumption to that of short-sale
constraints, which we can make to restore Backwardation. A convenience yield represents the flow
of services accruing to the owner of the asset (not the owner of the forward). In the presence
of a constant convenience yield, denoted with $\delta$, the forward price satisfies:2
$$F_t(T) = e^{(r-\delta)(T-t)}S(t), \qquad (10.4)$$
2 To prove Eq. (10.4), we generalize the reasoning leading to Eq. (10.1), as follows. Suppose to the contrary that
$F_t(T) < e^{(r-\delta)(T-t)}S(t)$. Then we go long a forward contract, short $e^{-\delta(T-t)}$ shares at $t$, and invest the proceeds in a money
such that we have Backwardation again, for $\delta$ large enough.3 Note, however, that Contango
would be a normal situation in storable commodities markets characterized by costs of carry
such as warehousing fees or forgone interest (a cost of carry would imply a negative $\delta$ in
Eq. (10.4)). Thus, and according to the sign and magnitude of $\delta$, we can either have
Backwardation or Contango.
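The role of the sign and magnitude of the convenience yield can be sketched as below, under the assumption that Eq. (10.4) is the cost-of-carry formula with the rate reduced by a constant yield. The classification compares the forward with the current spot, which is the second, more operational notion; function names and numbers are hypothetical.

```python
import math

def forward_with_yield(spot, r, delta, tau):
    """Forward price with constant convenience yield delta (negative for storage costs)."""
    return spot * math.exp((r - delta) * tau)

def regime(spot, r, delta, tau):
    """Compare the forward price with the current spot price."""
    f = forward_with_yield(spot, r, delta, tau)
    if f < spot:
        return "backwardation"
    if f > spot:
        return "contango"
    return "flat"

print(regime(100.0, 0.03, 0.08, 1.0))   # yield above the interest rate
print(regime(100.0, 0.03, -0.02, 1.0))  # storage cost, i.e. a negative yield
```

The first call returns backwardation and the second contango, in line with the discussion of costs of carry above.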
Some commodity markets can actually switch from being in Backwardation to Contango, and
in Backwardation again, through cycles. To account for these cycles, we may wish to consider
a model with a stochastic convenience yield; an instance of such a model is one where under
the risk-neutral probability, the spot price is solution to
=(
= (
(10.5)
and
where are standard Brownian motions under the risk-neutral probability, ,
are constant parameters, and is a risk-premium parameter arising because the stochastic
convenience yield
is not tradable and, hence,
is not a martingale under the risk-neutral probability.
The expression for the forward price would collapse to that in Eq. (10.4), once we assume
the convenience yield is constant. In the general case,
( )=E (
)=
(10.6)
That is, a stochastic convenience yield might drive the correlation between the forward and the
spot to values lower than one.
The expectation in Eq. (10.6), (
), is actually known in closed-form. It is the same
as the price of a zero coupon bond predicted by a model where the short-term rate is , and is
solution to the second equation in (10.5). This model, developed by Vasicek (1977), is discussed
in Chapter 12. Naturally, the important implication of this model is that according to the
specific values taken by the convenience yield, markets can either be in Backwardation or in
Contango, both in terms of the criterion comparing the forward price to the current spot and the
expected spot.
Finally, the nature of the relation between the fictitious bond price (
) and the
volatility parameter
is indeterminate. As discussed in Chapter 12, there are two effects that
explain how the price of any zero coupon bond links to the volatility of the short-term rate,
a convexity effect and a risk-premium effect. The former acts so as to lead (
) to
be increasing in
, and the latter acts in the same direction if
0, and in the opposite
direction if
0. For short maturities, the risk-premium effect dominates over convexity, such
that (
) and, hence, the forward price, is decreasing in the volatility parameter ,
provided
0. However, if
0, the forward price, is increasing in , for any maturity.
market account. The short-sale at $t$ implies that we need to pay out the convenience yield $\delta S(\tau)d\tau$ per share at any time
$\tau \in (t,T)$, which we finance by shorting additional $\delta\,d\tau$ shares per share already held, such that over any infinitesimal
amount of time, we short $dN(\tau) = \delta N(\tau)d\tau$ additional shares. In total, then, at time $T$, we would have shorted just one
share, as $N(T) = N(t)e^{\delta(T-t)} = 1$, which we buy back by honouring the forward, which costs $F_t(T)$, as usual. Finally, at
time $T$, the money market account would yield $S(t)e^{(r-\delta)(T-t)}$. Therefore, the net payoff at $T$ would be equal to
$S(t)e^{(r-\delta)(T-t)} - F_t(T) > 0$, an arbitrage.
3 To derive Eq. (10.4), note that by no-arbitrage, $\mathbb{E}_t(dS(t)/S(t)) + \delta\,dt = r\,dt$, where the instantaneous convenience
yield is simply $\delta\,dt$. A storage cost is a negative convenience yield.
[Figure 10.1: two panels of P&Ls plotted against $S_T$. Bullish view: long stock, long call, short put. Bearish view: short share, long put, short call.]
FIGURE 10.1. Top panel: The solid line depicts the P&L of an at-the-money European
call option when the interest rate is zero, $(S_T - K)^+ - c$, where $S_T$ is the stock price at
expiration, $K = 100$ is the strike price, and $c = 5$ is the price of a call. The dashed line is
the P&L from holding the stock, $S_T - S$, with $S = 100$. The dotted line is the payoff from
the sale of a put option, $p - (K - S_T)^+$, where by the put-call parity, $p = c - S + K = 5$.
The bottom panel depicts P&Ls from going long a put, $(K - S_T)^+ - p$ (the solid line),
shorting the stock, $S - S_T$ (the dashed line), and shorting a call, $c - (S_T - K)^+$ (the
dotted line).
The prices of the call and put options are related by the put-call parity. Let $b(t,T)$ be the
time $t$ price of a zero maturing at time $T > t$. Then, the prices of a put and a call option with
the same strike price $K$ and the same expiration $T$ satisfy,
$$p(t) = c(t) - S(t) + K b(t,T). \qquad (10.7)$$
To show Eq. (10.7), consider two portfolios: (a) long one call, short one underlying asset, and
invest $K b(t,T)$; (b) long one put. The table below gives the value of the two portfolios at time
$t$ and at time $T$.
        Value at $t$                   Value at $T$
                                       $S(T) \le K$      $S(T) > K$
(a)     $c(t) - S(t) + K b(t,T)$       $K - S(T)$        $0$
(b)     $p(t)$                         $K - S(T)$        $0$
The two portfolios have the same value in each state of nature at time $T$. Therefore, their
values at time $t$ must be identical to rule out arbitrage. Alternatively, Eq. (10.7) follows by
taking conditional expectations, under the risk-neutral probability, of the identity:
$$e^{-r(T-t)}(K - S(T))^+ \equiv e^{-r(T-t)}(S(T) - K)^+ + e^{-r(T-t)}K - e^{-r(T-t)}S(T).$$
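The state-by-state argument for the put-call parity can be checked with a few lines of code. The sketch below verifies that portfolio (a) and portfolio (b) have the same terminal value for several terminal prices, and then recovers the put price from the call price. The helper names are hypothetical, and the final line uses the zero-interest-rate numbers from the caption of Figure 10.1.

```python
def payoff_gap(S_T, K):
    """Terminal value of portfolio (a) minus that of portfolio (b)."""
    portfolio_a = max(S_T - K, 0.0) - S_T + K   # long call, short asset, K invested in the bond
    portfolio_b = max(K - S_T, 0.0)             # long put
    return portfolio_a - portfolio_b

def put_from_parity(call, spot, strike, bond):
    """Put price implied by the put-call parity, given the zero coupon bond price."""
    return call - spot + strike * bond

for S_T in (60.0, 100.0, 140.0):
    print(S_T, payoff_gap(S_T, 100.0))

print(put_from_parity(5.0, 100.0, 100.0, 1.0))  # 5.0, as in Figure 10.1
```

Since the gap is zero in every terminal state, any price discrepancy at the earlier date would be an arbitrage, which is all the parity relies on.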
By the put-call parity, the properties of European puts are easily found to follow from those
of calls. Therefore, we only focus on calls, whenever possible. The price of a European call
option, $c(t) \equiv c(S(t); K; T-t)$, satisfies the following bounds:
$$\max\{0, S(t) - K b(t,T)\} \le c(S(t); K; T-t) \le S(t). \qquad (10.8)$$
Indeed, consider two portfolios: (a) long one call; (b) long the asset underlying the call and issue
debt for an amount equal to $K b(t,T)$. The table below gives the value of the two portfolios at
time $t$ and at time $T$.
        Value at $t$              Value at $T$
                                  $S(T) \le K$      $S(T) > K$
(a)     $c(t)$                    $0$               $S(T) - K$
(b)     $S(t) - K b(t,T)$         $S(T) - K$        $S(T) - K$
The value of portfolio (a) dominates that of portfolio (b) at $T$, and the same must be true at
time $t$. Moreover, the price is nonnegative because the payoff of the option is nonnegative.
Therefore, the first inequality in (10.8) is true. As for the second inequality, suppose the
contrary, i.e. $c(t) > S(t)$. Then, at time $t$, we could sell one call and buy the underlying
asset, thus making a profit equal to $c(t) - S(t)$. Come time $T$, the option will be exercised if
$S(T) > K$, in which case we shall sell the underlying asset and obtain $K$. If $S(T) \le K$, the
option will not be exercised, and we will still hold the asset or sell it and make a profit equal
to $S(T)$. Eq. (10.8) implies the following asymptotic behavior of the call price:
(i) $\lim_{S \to 0} c(S; K; T-t) \to 0$, (ii) $\lim_{K \to 0} c(S; K; T-t) \to S$, and
(iii) $\lim_{T \to \infty} c(S; K; T-t) \to S$.
The top panel of Figure 10.2 illustrates the basic arbitrage bounds in (10.8), as well as the
limiting behavior of the price for $S$ small and for $S$ large. First, the price $c$ must be in the
region between the lower bound, $\max\{0, S - K b(t,T)\}$, and the upper bound, the 45-degree line
$c = S$. Moreover, $c$ is small when $S$ is small, and large when $S$ is large. However, $c$ cannot
lie outside the region between these two lines, which implies that $c$ gets large by sliding up
along the $S - K b(t,T)$ line.
[Figure 10.2: the call price $c(t)$ plotted against $S(t)$, with the 45-degree line and the point $K b(t,T)$ marked on the horizontal axis.]
FIGURE 10.2.
How does the option price behave in the region between the two bounds in (10.8)? We cannot
tell. We may simply say that given the boundary behavior of $c$, if $c$ is convex in $S$, it is
also increasing in $S$. Convexity of $c$ is a reasonable property, which holds for basic diffusive
models, as originally noted by Bergman, Grundy and Wiener (1996). In this case, $c$ would
behave as in the left-hand side of the bottom panel of Figure 10.2. This case seems to be
relevant, empirically, and consistent with the predictions of the celebrated Black and Scholes
(1973) formula, and some of its extensions. However, it is not a general property of option
prices. Bergman, Grundy and Wiener (1996) provide several counter-examples where $c$ can be
decreasing over some range of $S$, arising in models with jumps, or with stochastic volatility.
Theoretically, we cannot rule out that the option price behaves as in the right-hand side of the
bottom panel of Figure 10.2, as further developed in Section 10.5 [in progress].
The economic meaning of convexity is that the option is unlikely to be exercised when $S$
is small. Therefore, changes in $S$ have little effect on $c$. However, the option is likely to be
exercised when $S$ is large. A percentage increase in $S$ is then to be followed by an even higher
percentage increase in $c$. In other terms, the elasticity of the option price with respect to the
asset price is larger than one, $\frac{\partial c}{\partial S}\cdot\frac{S}{c} > 1$. Indeed, for
any increasing and convex function, which is zero at the origin, the tangent is higher than the secant.
[Figure: three curves labelled T=3m, T=6m and T=1Y, plotted over underlying asset prices ranging from 80 to 120.]
10.3.2 Hedging
Financial intermediaries such as investment banks sell options that they want to hedge against,
to avoid the exposure to losses illustrated in Figure 10.1. Hedging is important when the only
objective is to receive fees from the sale of derivatives. The portfolio that mimics the option
price must display the properties discussed in Section 10.3.1. For example, we need to ensure
that it behaves as the call price behaves in the left-hand side of the bottom panel of Figure 10.2,
which is the most relevant, empirically.
We require this portfolio to exhibit a number of properties: (i) its value, $V$, should be
increasing in $S$, which is ensured by including the asset underlying the option into the
portfolio; (ii) the sensitivity of $V$ with respect to $S$ must be positive and bounded by one,
$0 < \frac{\partial V}{\partial S} < 1$, which we can achieve once the number of underlying assets
is less than one; (iii) the elasticity of $V$ with respect to $S$ must be greater than one,
$\frac{\partial V}{\partial S}\cdot\frac{S}{V} > 1$, a condition that could be met by issuing
debt. Mathematically, the value of the replicating portfolio should be $V = \Delta S - D$, where
$\Delta$ denotes the number of the underlying assets, with $\Delta \in (0,1)$, and $D$ is debt.
In principle, this portfolio might lead these three properties to be satisfied.
In fact, hedging is dynamic in nature, because option prices obviously change over time.
Therefore, we expect $\Delta$ to be a function of the underlying asset price, $S$, and time to
expiration of the option, in accordance with theory set forth in Part I of these lectures,
especially in Chapter 4. The portfolio needs to satisfy additional properties: (iv) the number of
the underlying assets must increase with $S$, and the value of the portfolio should be virtually
insensitive to changes in $S$ when $S$ is low, and slide up through the $S - K b(t,T)$ line in
Figure 10.3 when $S$ is large. These conditions are met if $\Delta$ increases with $S$, with
$\lim_{S \to 0}\Delta(S) \to 0$ and $\lim_{S \to \infty}\Delta(S) \to 1$.
Finally, the portfolio needs to be self-financed, as the long position in the option does not
FIGURE 10.4. The payoff guaranteed by an accumulator, a structured product that is long one
call option with strike price $K_c = 100$, and short $n$ puts with strike price $K_p = 90$, with
$n = 2$ (solid line), and $n = 4$ (dashed line).
If the current market level is $S = 102$, profits are likely to be made, at least provided the
market does not fall below the strike at which the put is struck, which in this example is
$K_p = 90$. However, accumulators are quite risky: the losses they might lead to during market
downturns can be quite severe compared to possible gains in good times.
The size of the losses obviously depends on the number of puts to sell, $n$, and their strike,
$K_p$. As the previous picture reveals, losses widen as we increase the number of puts we go
short. Therefore, we can decrease the probability of experiencing any losses, by just fixing
5 Therefore, underlying Figure 10.4 is the assumption that the price of the put is half that of the call. Below, we discuss this
assumption, which we now make for illustrative purposes only, and which we shall remove in Section 10.5.3.
$K_p = 90$, and decreasing $n$, although this might entail fewer resources left over to go long
a call. A possibility might then be to go long a less expensive call, by adding a knock-out
feature into the call contract (one that says that the option becomes worthless once the market
reaches a certain level such as, say, 105 within the investment horizon), or by purchasing a call
with a higher strike price. Alternatively, we might be willing to design a product with more risk,
but also more upside, by choosing an appropriate strike for the puts. Obviously, puts become more
valuable as we increase the strike price. Therefore, we could increase $K_p$, from 90 to 95 say,
whilst keeping $n$ constant. While selling put options with higher strikes increases the
probability of losses, it also allows us to purchase more expensive calls, those with lower
strikes, which leads to a higher probability of achieving positive returns.
Naturally, the previous reasoning hinges upon the assumption that the accumulator is
self-financed at inception. This condition defines the type of options we can afford. Some call
options can be too expensive and might require a large exposure relating to the short position in
puts. For example, some calculations show that in a market with stochastic volatility such as that
of Heston (1993b), we may need to sell short approximately six puts when the current index level
is $S = 102$ and $K_p = 95$. Section 10.5.3 develops a case study where these risks are quantified
under a variety of alternative assumptions about strikes and market volatility.
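The terminal payoff of the accumulator is easy to tabulate. The sketch below assumes, as in the caption of Figure 10.4, one long call struck at 100 and a number of short puts struck at 90, and it ignores premiums on the assumption that the structure is self-financed at inception. The function name and default arguments are hypothetical.

```python
def accumulator_payoff(S_T, call_strike=100.0, put_strike=90.0, n_puts=2):
    """Payoff at expiration: long one call, short n_puts puts (premiums assumed to net to zero)."""
    long_call = max(S_T - call_strike, 0.0)
    short_puts = -n_puts * max(put_strike - S_T, 0.0)
    return long_call + short_puts

for S_T in (80.0, 90.0, 100.0, 110.0):
    print(S_T, accumulator_payoff(S_T), accumulator_payoff(S_T, n_puts=4))
```

At a terminal price of 80 the loss doubles from 20 to 40 when the number of puts goes from two to four, while the gain at 110 stays at 10, which is the asymmetry the text warns about.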
10.4 Evaluation
This section provides pricing formulae and discusses hedging in the special case of the Black
and Scholes (1973) and Merton (1973) markets. It also examines issues relating to how hedging
might possibly spill over to the volatility of the underlying asset, thereby dealing with instances
of feedback effects, whereby the presence of derivatives (and trading activities on them) might
affect the dynamics of the very same underlying, a theme sometimes referred to as endogenous
risk (see Chapter 8).
The next section provides a general evaluation formula to price futures and options, which
goes beyond the assumptions underlying the Black & Scholes and Merton market. Section 10.4.2
provides a derivation of the Black-Scholes formula relying on this general framework, and also
relying on a replication argument. Section 10.4.3 [...]
[In progress]
10.4.1 A pricing formula
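Forwards and options obviously differ, as they rely on the two distinct notions of obligation and
optionality. However, we can encompass their payoffs into a single one, the value of which will
be used a few times in this chapter. Consider, then, again, a contract similar to that in Section
10.2.1, where at time $T$, the payoff is given by $S(T) - K$, for some constant value of $K$. We
know that the current value of this payoff is:
$$e^{-r(T-t)}\,\mathbb{E}_t(S(T) - K) = S(t) - e^{-r(T-t)}K. \qquad (10.9)$$
Consider, more generally, the payoff $(S(T) - K)\,\mathbb{I}_{S(T) \ge \bar K}$, where $\mathbb{I}$
denotes the indicator function. For $\bar K = 0$, this payoff is just that of a forward, and for
$\bar K = K$, it is that of a call. To price this payoff, we proceed as follows:
$$\begin{aligned}
e^{-r(T-t)}\,\mathbb{E}_t[(S(T) - K)\,\mathbb{I}_{S(T) \ge \bar K}]
&= e^{-r(T-t)}\,\mathbb{E}_t(S(T)\,\mathbb{I}_{S(T) \ge \bar K}) - e^{-r(T-t)}K\,\mathbb{E}_t(\mathbb{I}_{S(T) \ge \bar K}) \\
&= S(t)\,\mathbb{E}_t\left(\frac{dQ_S}{dQ}\,\mathbb{I}_{S(T) \ge \bar K}\right) - e^{-r(T-t)}K\,\mathbb{E}_t(\mathbb{I}_{S(T) \ge \bar K}) \\
&= S(t)\,Q_S(S(T) \ge \bar K) - e^{-r(T-t)}K\,Q(S(T) \ge \bar K),
\end{aligned} \qquad (10.10)$$
where $Q$ is the risk-neutral probability given the information at time $t$, and $Q_S$
is a new probability, with Radon-Nikodym derivative given by
$$\frac{dQ_S}{dQ} = \frac{e^{-r(T-t)}S(T)}{S(t)}. \qquad (10.11)$$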
10.4.2.1 By replication: I
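Replicating the value of an asset while trading in other assets is a theme framed in many
junctures of Part I of these lectures. For the sake of completeness, we develop the arguments,
which ultimately lead to the Black & Scholes formula. We assume that the asset price is a
geometric Brownian motion, $dS = \mu S d\tau + \sigma S dW$.
We invest $a$ units in the asset underlying a call option, and $b$ units in the money market
account, $M$, and the value of this portfolio is $V = aS + bM$ with obvious notation. Once
self-financed, the value of this portfolio satisfies,
$$dV = (a\mu S + rbM)\,d\tau + a\sigma S\,dW, \qquad (10.12)$$
whereas by Itô's lemma, the call option price, $c(\tau, S)$ say, satisfies,
$$dc = \left(\frac{\partial c}{\partial \tau} + \mu S\frac{\partial c}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 c}{\partial S^2}\right)d\tau + \sigma S\frac{\partial c}{\partial S}\,dW. \qquad (10.13)$$
We conjecture that the value of this portfolio does exactly replicate the option value, a
conjecture verified in the subsection below. Comparing Eq. (10.12) and Eq. (10.13) leaves
(i) the matching condition for the $dW$ terms, $a = \frac{\partial c}{\partial S}$, and (ii) the
matching condition for the drift terms, which, after using $bM = c - aS$, gives
$$\frac{\partial c}{\partial \tau} + rS\frac{\partial c}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 c}{\partial S^2} = rc, \qquad (10.14)$$
with boundary condition $c(T, S) = (S - K)^+$. The solution is the Black & Scholes formula,
$$c(t, S) = S\,\Phi(d_1) - Ke^{-r(T-t)}\,\Phi\left(d_1 - \sigma\sqrt{T-t}\right), \qquad d_1 = \frac{\ln(S/K) + \left(r + \frac{1}{2}\sigma^2\right)(T-t)}{\sigma\sqrt{T-t}}, \qquad (10.15)$$
where $\Phi$ denotes the standard normal cumulative distribution function.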
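A short numerical sketch may help connect the pricing formula with the partial differential equation it solves and with the arbitrage bounds in (10.8). It assumes Eq. (10.15) is the standard Black-Scholes call price and Eq. (10.14) is the usual pricing equation with the riskless rate as drift; the finite-difference step sizes and the parameter values are arbitrary choices made for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes price of a European call with time to expiration tau."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def pde_residual(S, K, r, sigma, tau, h=0.01, dt=1e-4):
    """Left minus right side of the pricing equation, by central finite differences."""
    c = bs_call(S, K, r, sigma, tau)
    c_up, c_dn = bs_call(S + h, K, r, sigma, tau), bs_call(S - h, K, r, sigma, tau)
    c_S = (c_up - c_dn) / (2 * h)
    c_SS = (c_up - 2 * c + c_dn) / h ** 2
    # Calendar-time derivative: time to expiration shrinks as calendar time grows
    c_t = (bs_call(S, K, r, sigma, tau - dt) - bs_call(S, K, r, sigma, tau + dt)) / (2 * dt)
    return c_t + r * S * c_S + 0.5 * sigma ** 2 * S ** 2 * c_SS - r * c

S, K, r, sigma, tau = 100.0, 100.0, 0.05, 0.2, 1.0
price = bs_call(S, K, r, sigma, tau)
lower_bound = max(0.0, S - K * math.exp(-r * tau))
print(price, pde_residual(S, K, r, sigma, tau))
print(lower_bound <= price <= S)
```

The residual is zero up to discretization error, and the price sits between the lower bound and the asset price, as required by (10.8).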
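Eq. (10.15) holds even without requiring that a market for the option exists over the option's
life, or that the pricing function $c(\tau, S)$ is differentiable. That the option price is
differentiable is a result, not an assumption. Let us define the function $c(\tau, S)$ that solves
Eq. (10.14), with boundary condition $c(T, S) = (S - K)^+$. Note, we are not assuming this function
is the option price. Rather, we shall show this is the option price. Consider a self-financed
portfolio of bonds and stocks, with $a = \frac{\partial c}{\partial S}$. Its value satisfies,
$$dV = \left(\frac{\partial c}{\partial S}\mu S + r\left(V - \frac{\partial c}{\partial S}S\right)\right)d\tau + \frac{\partial c}{\partial S}\sigma S\,dW.$$
Moreover, by Itô's lemma,
$$dc = \left(\frac{\partial c}{\partial \tau} + \mu S\frac{\partial c}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 c}{\partial S^2}\right)d\tau + \frac{\partial c}{\partial S}\sigma S\,dW,$$
where $c(\tau, S)$ is solution to Eq. (10.14). Hence, we have that
$$d(V_\tau - c(\tau, S_\tau)) = r\,(V_\tau - c(\tau, S_\tau))\,d\tau,$$
so that $V_\tau - c(\tau, S_\tau) = (V_0 - c(0, S_0))\,e^{r\tau}$, for all $\tau \in [0, T]$. Next,
assume that $V_0 = c(0, S_0)$. Then, $V_\tau = c(\tau, S_\tau)$ and
$V_T = c(T, S_T) = (S_T - K)^+$. That is, the portfolio $a = \frac{\partial c}{\partial S}$
replicates the payoff underlying the option contract. Therefore, $V_0$ is the value of the
option at time zero, even when a market for the option does not exist over its life.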
10.4.2.3 By probabilistic arguments
Alternatively, we can use the general framework in Section 10.4.1 and arrive at Eq. (10.15).
We simply set the threshold in the indicator function equal to the strike, $\bar K = K$, in
Eq. (10.10), and calculate the dynamics of the stock price under $Q$ and under $Q_S$, and determine
$Q(S(T) \ge K)$ and $Q_S(S(T) \ge K)$ in Eq. (10.10). Under $Q$, the stock price is the usual
geometric Brownian motion with drift equal to $r$, and then,
$Q(S(T) \ge K) = \Phi(d_1 - \sigma\sqrt{T-t})$, which explains the second term in Eq. (10.15). As
for the first term, we can show that the Radon-Nikodym derivative,
$\eta(T) = \left.\frac{dQ_S}{dQ}\right|_{\mathcal{F}_T}$, is such that $\eta(\tau)$ is solution to:
$$\frac{d\eta(\tau)}{\eta(\tau)} = \sigma\,dW(\tau),$$
such that by the Girsanov theorem reviewed in Chapter 4, the stock price is solution to:
$$d\ln S(\tau) = \left(r + \frac{1}{2}\sigma^2\right)d\tau + \sigma\,dW^S(\tau),$$
where $W^S$ is a Brownian motion under $Q_S$. It easily follows that
$Q_S(S(T) \ge K) = \Phi(d_1)$, thereby completing the proof of the Black-Scholes formula.
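$$\frac{1}{d\tau}E_\tau\left(\frac{dc}{c}\right) - r = \underbrace{\frac{\partial c}{\partial S}\frac{S}{c}\,\sigma}_{\equiv\,\mathrm{Vol}_c}\cdot\lambda, \qquad (10.16)$$
where $\mathrm{Vol}_c$ is, simply, by Itô's lemma, the instantaneous volatility of the option
returns, and $\lambda$ is the unit risk-premium related to the fluctuations of the asset price,
$\lambda = (\mu - r)/\sigma$.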
10.4.4 Future options and Black's formula
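Consider a future option, one that gives the buyer the right, not the obligation, to enter into a
future contract for a specified price $K$, at time $T$, such that the payoff at time $T$ is
$(F_T(T_F) - K)^+$, where $F_t(T_F)$ denotes the future price in Eq. (10.1),
$F_t(T_F) = e^{r(T_F - t)}S(t)$, and $T_F \ge T$ is the expiration of the future contract. It is
easy to see that $F_t(T_F)$ is a martingale under the risk-neutral probability $Q$. For example,
assuming that $S(t)$ is a Geometric Brownian motion with volatility $\sigma$, we have that,
$$\frac{dF_\tau(T_F)}{F_\tau(T_F)} = \sigma\,dW(\tau), \qquad (10.17)$$
where $W$ is a Brownian motion under $Q$. The price of the future option, $c^F(t)$ say, is then
$$c^F(t) = e^{-r(T-t)}\,\mathbb{E}_t\left(F_T(T_F) - K\right)^+ = e^{-r(T-t)}\left[F_t(T_F)\,\Phi(d_1) - K\,\Phi\left(d_1 - \sigma\sqrt{T-t}\right)\right], \qquad (10.18)$$
where
$$d_1 = \frac{\ln\left(F_t(T_F)/K\right) + \frac{1}{2}\sigma^2(T-t)}{\sigma\sqrt{T-t}},$$
and the second equality of Eq. (10.18) follows by the Black & Scholes formula, Eq. (10.15).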
Eq. (10.18) is the celebrated Black's (1976a) formula, which turns out to be very useful in
the context of fixed income security pricing, as explained in Chapter 12. Appendix 2 provides
an alternative derivation of Eq. (10.18), based on the pricing approach of Section 10.2, and
the slightly more general assumption that the volatility of the future price in Eq. (10.17) is
time-varying but deterministic.
Chapter 12 explains that the property that future prices are martingales under the
risk-neutral probability generalizes to one holding when interest rates are time-varying, under a
certain probability called forward probability (see Chapter 12, Section 12.2).
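As a hedged illustration of Black's formula, the sketch below prices a call on a future with constant volatility and constant interest rate, in the standard textbook form. As a consistency check, when the future expires together with the option and its price is the spot price grown at the riskless rate, the result coincides with the Black-Scholes call price. Function names and numbers are assumptions made for the example.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black76_call(F, K, r, sigma, tau):
    """Black's price of a call on a future with current price F, option expiring in tau years."""
    d1 = (math.log(F / K) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return math.exp(-r * tau) * (F * norm_cdf(d1) - K * norm_cdf(d2))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price, used here only as a benchmark."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

S, K, r, sigma, tau = 100.0, 100.0, 0.05, 0.2, 1.0
F = S * math.exp(r * tau)   # future expiring together with the option (assumption)
print(black76_call(F, K, r, sigma, tau), bs_call(S, K, r, sigma, tau))
```

The two printed numbers agree because the drift of the spot price is absorbed into the future price, leaving only the volatility term inside the formula.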
10.4.5 Hedging
The cloning arguments in Section 10.4.2 suggest how to replicate a call option in a Black
and Scholes market. We set up a portfolio with an amount
in the underlying asset and the
remaining in the money market account, where
BS
BS
(10.19)
)=
BS
BS
(10.20)
Comparing Eq. (10.20) with the Black-Scholes formula in Eq. (10.15), produces,
$$\Delta^{BS} = \Phi(d_1). \qquad (10.21)$$
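The delta in Eq. (10.21) can be cross-checked against a finite-difference derivative of the Black-Scholes call price. The sketch assumes the delta is the standard normal distribution function evaluated at the usual first argument of the formula; the bump size and parameters are arbitrary illustrative choices.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1(S, K, r, sigma, tau):
    return (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price."""
    x = d1(S, K, r, sigma, tau)
    return S * norm_cdf(x) - K * math.exp(-r * tau) * norm_cdf(x - sigma * math.sqrt(tau))

def bs_delta(S, K, r, sigma, tau):
    """Black-Scholes delta of a call: number of shares in the replicating portfolio."""
    return norm_cdf(d1(S, K, r, sigma, tau))

S, K, r, sigma, tau = 100.0, 100.0, 0.05, 0.2, 1.0
h = 1e-4
fd_delta = (bs_call(S + h, K, r, sigma, tau) - bs_call(S - h, K, r, sigma, tau)) / (2 * h)
print(bs_delta(S, K, r, sigma, tau), fd_delta)

# The remainder of the replicating portfolio sits in the money market account
money_market = bs_call(S, K, r, sigma, tau) - bs_delta(S, K, r, sigma, tau) * S
print(money_market)
```

The money market position is negative, i.e. the replicating portfolio of a call is a levered position in the stock, in line with the properties required in Section 10.3.2.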
But why do we need to replicate derivatives, in practice? Because most of them are dealt with
by investment banks, which simply act as financial intermediaries, trading derivatives on behalf
of third parties, and being compensated through fees. Suppose an investment bank receives an
order to sell a put. The bank would like to hedge against this put by creating a replicating
portfolio such that the value of this portfolio be the same as the final payoff to be paid
to honour the sale. So hedging is needed to replicate the final payoffs required to honour the
contracts giving rise to these payoffs. Standard market practice is to use the Black-Scholes
delta in Eq. (10.21).
Note that at the same time, investment banks, not to mention funds, can undertake
speculative trading activities aimed at implementing specific views, such as those described in
Section 10.5.5 below, in which case hedging doesn't necessarily need to be implemented. However,
even in this case, hedging might be required to isolate the particular views a trading desk of the
bank is taking. For example, Section 10.5.5 will explain that to express the view that equity
volatility will rise, say, we cannot simply go long call options, because call prices are increasing
both in volatility and in the price underlying the option. A better solution is to go long an option,
delta-hedged through Black-Scholes, as we shall explain.
A well-known denition is that of the Gamma of a derivative, which is the second order partial of
the derivative price with respect to the underlying. The Gamma is always positive for long calls
and puts, as these derivatives have positive convexity, as illustrated by Figure 10.1. Naturally,
short calls and puts have negative gamma. In order for the statement when gamma is negative,
delta hedging involves buying on the way up and selling on the way down to be true, we also
have to consider whether the delta is positive or not (that is, whether the derivative price is
increasing or decreasing in the underlying asset price). So we have four instances of hedging
portfolios:
(i) Positive gamma: Buying on the way up and selling on the way down.
(i.1) Hedging portfolios with positive delta, as required, for example, to hedge against the
sale of a call. Positive delta means that the hedging portfolio relies on buying the
assets underlying the call. When the price of these assets are up, the delta is also
up, which implies we need to keep on buying even more of the assets underlying the
hedging portfolio. On the other hand, when prices are down, the delta is also down,
which implies holding less of the assets underlying the hedging portfolio, thereby
leading to sell some these assets precisely when the market is down.
(i.2) Hedging portfolios with negative delta, as required, for example, to hedge against
the sale of a put. Negative delta means that the hedging portfolio relies on selling
the assets underlying the put. In this case, delta is up when when prices are up.
478
c
by
A. Mele
10.4. Evaluation
However, this now simply means that we need to sell less! For example, delta might
have been -1/2 before the market was up and now delta is -1/4: that is, we need to buy
back some of the assets underlying the hedging portfolio. When, instead, prices are
down, delta is also down, which means we need to sell even more into a depressed
market.
(ii) Negative gamma: Buying on the way down and selling on the way up.
(ii.1) Hedging portfolios with positive delta, as required, for example, to hedge against
having gone long a put. Positive delta means that the hedging portfolio relies on
buying assets underlying the put. Negative gamma now means that as soon as the
price of these assets goes up (resp. down), we need to buy less (resp. buy more), so
we sell when prices go up and buy when prices go down.
(ii.2) Hedging portfolios with negative delta, as required, for example, to hedge against
having gone long a call. We are now selling the assets underlying the call. Negative
gamma, here, means that as the price of these assets goes up (resp. down), we need
to sell more (resp. sell less), so once again, we sell when prices go up and buy when
prices go down.
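The rebalancing directions listed above can be checked numerically. The following is a minimal sketch, assuming Black-Scholes deltas and hypothetical parameter values (the strike, rate, volatility and maturity below are illustrative and not taken from the text). It ignores the passage of time and only tracks how the hedge of a sold call and the hedge of a bought call change as the underlying moves.

```python
from math import log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal distribution function


def bs_call_delta(S, K, r, sigma, T):
    """Black-Scholes delta of a European call on a non-dividend paying asset."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return N(d1)


# Hypothetical parameters, chosen only for illustration.
K, r, sigma, T = 100.0, 0.01, 0.20, 0.5
prices = [90.0, 100.0, 110.0]

# Hedging a sold call: hold +delta units of the underlying (positive delta, positive gamma).
short_call_hedge = [bs_call_delta(S, K, r, sigma, T) for S in prices]
# Hedging a bought call: hold -delta units of the underlying (negative delta, negative gamma).
long_call_hedge = [-d for d in short_call_hedge]

for S, h_short, h_long in zip(prices, short_call_hedge, long_call_hedge):
    print(f"S={S:6.1f}  hedge of sold call: {h_short:+.3f}  hedge of bought call: {h_long:+.3f}")
```

As the price rises, the first hedge requires buying more of the underlying, while the second requires selling more of it, in line with cases (i.1) and (ii.2).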
How to implement these hedging portfolios, in practice, is still an open question, as this
issue is necessarily model-based. Section 10.5.4, for example, shows that delta hedging under
the Black-Scholes assumptions would lead the bank to eliminate the risk of fluctuations in the
underlying stock price. At the same time, however, hedging through Black-Scholes leaves the
derivatives book quite messy once the fundamental assumption underlying the Black-Scholes
world, constant volatility, does not hold, that is, once volatility changes randomly. In this case, hedging would
rather look like a volatility view. To appropriately hedge, one has to rely on more complicated
hedging strategies. For example, to hedge against an option in a world of stochastic volatility,
we would need to use a stock, a bond, and another ... option!
10.4.6.2 Crashes
[In progress]
Use a simple model, by Grossman, to illustrate how volatility is pumped-up by automatic
mechanisms. Then, discuss a streamlined version of Gennotte and Leland, with asymmetric
information, to illustrate the 1987 crash.
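10.4.7 Properties of options in diffusive models
We consider a simple model in which the stock price is solution to,
\[ \frac{dS_t}{S_t} = \mu(S_t)\,dt + \sqrt{2\sigma(S_t)}\,dW_t \tag{10.22} \]
and in which the option price, $C(S,t)$, satisfies, with $r$ denoting the constant short-term rate,
\[ \frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \sigma(S)S^2\frac{\partial^2 C}{\partial S^2} = rC \tag{10.23} \]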
10.4.7.1 Passage of time
Call options are sometimes referred to as "wasting assets" because their value tends to decrease over time, due to a decrease in the value of the optionality, in a sense to be explained
next. Define $\epsilon \equiv \frac{\partial C}{\partial S}\frac{S}{C}$ as the elasticity of the option price, $C$, with respect to the asset price, $S$. For a
call option, the elasticity $\epsilon \geq 1$ as noted in Section 10.3, such that Eq. (10.23) and convexity of $C$ leave:
\[ \frac{\partial C}{\partial t} = -r(\epsilon - 1)C - \sigma(S)S^2\frac{\partial^2 C}{\partial S^2} \leq 0 \tag{10.24} \]
Furthermore, note that convexity increases as time to maturity decreases, with the limit case
arising at maturity, when the second-order partial of the option price with respect to the underlying
shrinks to Dirac's delta. Therefore, the drop in value is the most
severe in correspondence of shorter maturities. What happens, then, if we sell call options to buy
them back later? Do options provide us with arbitrage opportunities? Obviously not. Selling
a call and buying it back later leads to profits only if market volatility is stable and the underlying
asset price does not move too much as a result. This remark is indeed the motivation of some
basic trading strategies known as calendar spreads, further explained in Section 10.6.1. A
trading strategy is not an arbitrage opportunity though, only a way to implement a particular
view.
Note that for a European put option, the elasticity of the put price with respect to the asset
price is negative, and can actually lead the RHS of Eq. (10.24) to change sign, especially for deep
in-the-money options, as Figure 10.5 reveals.
[Figure 10.5: two curves, labeled T=3m and T=1Y.]
FIGURE 10.5. The value of a put option with strike = 100, as time to maturity shrinks,
as predicted by the Black & Scholes model, with volatility parameter equal to 20% and
short-term rate = 1%. The solid line is the price corresponding to time to maturity =
one year, and the dashed line is the price corresponding to time to maturity = three
months.
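The pattern in Figure 10.5 can be reproduced with a few lines of code. The following is a minimal sketch, assuming the standard Black-Scholes formula for a European put on a non-dividend paying asset and the parameter values stated in the caption (strike 100, volatility 20%, short-term rate 1%); the two asset prices, 70 and 84, are illustrative choices within the range covered by the figure.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal distribution function


def bs_put(S, K, r, sigma, T):
    """Black-Scholes price of a European put on a non-dividend paying asset."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return K * exp(-r * T) * N(-d2) - S * N(-d1)


# Parameters from the caption of Figure 10.5.
K, r, sigma = 100.0, 0.01, 0.20

for S in (70.0, 84.0):  # illustrative asset prices
    p_1y = bs_put(S, K, r, sigma, 1.0)
    p_3m = bs_put(S, K, r, sigma, 0.25)
    print(f"S={S:5.1f}  put(T=1Y)={p_1y:7.3f}  put(T=3m)={p_3m:7.3f}")
```

For the lower asset price the three-month put is worth more than the one-year put, whereas the ranking is reversed for the higher asset price: the passage of time does not always erode the value of a European put.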
10.4.7.2 Comparative statics of dynamic models
We derive properties of option prices in the context of diffusion processes, relying on methods
suggested by Bergman, Grundy and Wiener (1996), and in Chapter 7 of the lectures (see
Proposition 7.1). We establish that if the stock price is solution to Eq. (10.22), the price of a
European-style option inherits the properties of the final payoff: it is increasing and convex in
2
2
+
( ) +
( )
(10.25)
+
0=
)=
subject to the boundary condition (
) = 0 ( ) 0. Therefore, we have that (
(
) 0 for all , due to results reviewed in Chapter 7 (see Proposition 7.1 and Appendix 1 in Chapter 7). That is, in a scalar diffusion setting, a European-style option price is
increasing in the underlying whenever the payoff is increasing.
Next, we find conditions under which the option price is convex in the underlying. We differentiate Eq. (10.25) with respect to
the underlying asset price, and the resulting second-order partial satisfies,
2
2 2
2
+2
( )
+
( )
(
)
(10.26)
0= +
2
p
2 (
2
+
+
+
0=
( )
2
( ) for all .
(10.27)
By the same results used to analyze Eq. (10.25) and (10.26), we now have that the partial of the option price
with respect to volatility is non-negative whenever the second-order partial with respect to the underlying is
non-negative. That is, if option prices are convex in the underlying price, they are increasing in the
volatility of the asset price.
This result is reminiscent of the theory of mean-preserving spreads as explained in Chapter
7. By increasing the volatility of the underlying, the holder of a European call (say) would
benefit from the upside while not suffering losses on the way down: risk-neutral evaluation of
traded assets benefits from an increasing volatility. We will see that this conclusion might
be reversed when it comes to assessing how the price of fixed income instruments reacts to
changes in the volatility of the underlying fundamentals.
10.4.7.3 Counterexamples
[In progress]
10.4.7.4 Recovering risk-neutral probabilities
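Let $C(S_t,t,T;K)$ be the price of a European call with strike $K$ expiring at $T$, and let $q(\cdot\,|\,S_t)$ be the risk-neutral density of $S_T$ given $S_t$, such that $C(S_t,t,T;K) = e^{-r(T-t)}\int_K^{\infty}(x-K)\,q(x\,|\,S_t)\,dx$. Differentiating with respect to $K$ leaves,
\[ e^{r(T-t)}\frac{\partial C(S_t,t,T;K)}{\partial K} = -\int_K^{\infty} q(x\,|\,S_t)\,dx = -Q(S_T > K\,|\,S_t) \]
We can check this relation holds true in the Black-Scholes model, in Eq. (10.20). Let us differentiate again,
\[ e^{r(T-t)}\frac{\partial^2 C(S_t,t,T;K)}{\partial K^2} = q(K\,|\,S_t) \tag{10.28} \]
Eq. (10.28) allows us to recover the risk-neutral density using option prices. The Arrow-Debreu
state density, $D_{AD}(S_T = S^+\,|\,S_t)$, is given by,
\[ D_{AD}(S_T = S^+\,|\,S_t) = e^{-r(T-t)}\,q(K\,|\,S_t)\big|_{K=S^+} = \frac{\partial^2 C(S_t,t,T;K)}{\partial K^2}\bigg|_{K=S^+} \]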
These results are quite useful in applied work. They also help deal with the pricing of volatility
contracts reviewed in Section 10.6, as explained in Appendix 4.
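The recovery of the risk-neutral density from option prices can be illustrated numerically. The following is a minimal sketch, assuming Black-Scholes call prices play the role of observed prices, so that the density recovered from the second strike derivative can be compared with the log-normal density it should match. The step size `h` and all parameter values are hypothetical choices made for this illustration.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal distribution function


def bs_call(S, K, r, sigma, T):
    """Black-Scholes price of a European call on a non-dividend paying asset."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)


def implied_density(S, K, r, sigma, T, h=0.05):
    """e^{rT} times the second strike derivative of the call price (central difference)."""
    c_up = bs_call(S, K + h, r, sigma, T)
    c_mid = bs_call(S, K, r, sigma, T)
    c_dn = bs_call(S, K - h, r, sigma, T)
    return exp(r * T) * (c_up - 2.0 * c_mid + c_dn) / h ** 2


def lognormal_density(x, S, r, sigma, T):
    """Risk-neutral density of the terminal price in the Black-Scholes model."""
    m = log(S) + (r - 0.5 * sigma ** 2) * T
    s = sigma * sqrt(T)
    return exp(-((log(x) - m) ** 2) / (2.0 * s * s)) / (x * s * sqrt(2.0 * pi))


S, r, sigma, T = 100.0, 0.01, 0.20, 0.5  # hypothetical values
for K in (90.0, 100.0, 110.0):
    print(K, implied_density(S, K, r, sigma, T), lognormal_density(K, S, r, sigma, T))
```

With market prices, the same finite difference would be applied to a smoothed cross-section of quotes across strikes, which is where most of the practical difficulty lies.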
Asset returns have time-varying volatility and their distributions are both peaked and
heavy-tailed, as reviewed in Chapter 7. These empirical regularities are very well-known at least
since the seminal work of Mandelbrot (1963) and Fama (1965). Engle (1982) and Bollerslev
(1986) introduce the first parametric models aiming to capture these stylized facts through
the celebrated Auto Regressive Conditionally Heteroskedastic (ARCH) models. ARCH models
have played a prominent role in the analysis of many aspects of financial econometrics, such as
the term structure of interest rates, the pricing of options, or the presence of time varying risk
premiums in the foreign exchange market, as summarized by the classic survey of Bollerslev,
Engle and Nelson (1994).
An ARCH model works as follows. Let $\{y_t\}_{t=1}^{T}$ be a record of observations on some asset
returns, $y_t = \ln(S_t/S_{t-1})$, where $S_t$ is the asset price. The variance of $y_t$ is, then, modeled as an
autoregressive process, as follows:
\[ y_t = a + \epsilon_t,\qquad \epsilon_t\,|\,F_{t-1} \sim N(0,\sigma_t^2),\qquad \sigma_t^2 = w + \alpha\epsilon_{t-1}^2 + \beta\sigma_{t-1}^2 \tag{10.29} \]
where $a$, $w$, $\alpha$ and $\beta$ are parameters and $F_t$ denotes the information set as of time $t$. This model is
known as the GARCH(1,1) model (Generalized ARCH). It was introduced by Bollerslev (1986),
and collapses to the ARCH(1) model introduced by Engle (1982) once we set $\beta = 0$. In other
words, the variance of the distribution of asset returns tomorrow is linear in the expectation
error, $\epsilon_t^2 - E_{t-1}(\epsilon_t^2)$, and rises linearly with the current realized variance, $\sigma_t^2$, viz
\[ \sigma_{t+1}^2 = w + (\alpha+\beta)\sigma_t^2 + \alpha\left(\epsilon_t^2 - E_{t-1}(\epsilon_t^2)\right) \]
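The variance recursion of a GARCH(1,1) model is easy to simulate. The following is a minimal sketch, assuming the standard recursion in which next period's variance equals a constant plus a loading on the current squared shock plus a loading on the current variance. The names `w`, `alpha`, `beta` and the numerical values are hypothetical, and the process is started at its unconditional variance, w / (1 - alpha - beta).

```python
import random
from math import sqrt


def simulate_garch(n, w, alpha, beta, seed=0):
    """Simulate n shocks and conditional variances of a GARCH(1,1) model.

    variances[t] is the variance of eps[t] conditional on information at t-1.
    """
    rng = random.Random(seed)
    sigma2 = w / (1.0 - alpha - beta)  # start at the unconditional variance
    eps, variances = [], []
    for _ in range(n):
        e = sqrt(sigma2) * rng.gauss(0.0, 1.0)
        eps.append(e)
        variances.append(sigma2)
        sigma2 = w + alpha * e * e + beta * sigma2  # variance for the next period
    return eps, variances


# Hypothetical parameter values, for illustration only.
w, alpha, beta = 0.05, 0.10, 0.85
eps, variances = simulate_garch(1000, w, alpha, beta)

# Tomorrow's variance is linear in today's variance and in the expectation error.
t = 10
lhs = variances[t + 1]
rhs = w + (alpha + beta) * variances[t] + alpha * (eps[t] ** 2 - variances[t])
print(lhs, rhs)
```

Setting `beta = 0` in this sketch gives the ARCH(1) special case.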
The quintessence of ARCH models is to make volatility dependent on the variability of past
observations. An alternative formulation, initiated by Taylor (1986), makes volatility driven
by some unobserved components. This formulation gives rise to the stochastic volatility model.
Consider, for example, the following stochastic volatility model,
\[ y_t = a + \epsilon_t,\quad \epsilon_t\,|\,\sigma_t \sim N(0,\sigma_t^2);\qquad \ln\sigma_t^2 = w + \alpha\ln\epsilon_{t-1}^2 + \beta\ln\sigma_{t-1}^2 + \eta_t,\quad \eta_t \sim N(0,\sigma_\eta^2) \]
where $a$, $w$, $\alpha$, $\beta$ and $\sigma_\eta^2$ are parameters. The main difference between this model and the
GARCH(1,1) model in Eq. (10.29) is that the volatility as of time $t$, $\sigma_t^2$, is not predetermined
by the past forecast error, $\epsilon_{t-1}$. Rather, this volatility depends on the realization of the stochastic
volatility shock at time $t$. This makes the stochastic volatility model considerably richer than
a simple ARCH model. As for the ARCH models, SV models have also been intensively used,
especially following the progress accomplished in the corresponding estimation techniques. The
seminal contributions related to the estimation of this kind of model are mentioned in Mele
and Fornari (2000). Early contributions that relate changes in volatility of asset returns to
economic intuition include Clark (1973) and Tauchen and Pitts (1983), who assume that a
stochastic process of information arrival generates a random number of intraday changes of the
asset price.
10.5.1.2 ARCH and diffusive models
Under regularity conditions, ARCH models and stochastic volatility models behave essentially
the same as the sampling frequency gets sufficiently high. Precisely, Nelson (1990) shows that
ARCH models converge in distribution to the solution of stochastic differential equations, in
the sense that the finite-dimensional distributions of the volatility process generated by ARCH
models converge towards the finite-dimensional distributions of some diffusion process, as the
sampling frequency goes to infinity. Mele and Fornari (2000) (Chapter 2) contains a review of
results relating to this type of convergence, and Corradi (2000) develops a critique related to
the conditions underlying these convergence results. To illustrate, heuristically, consider the
following model,
ln
= ( ) +
(10.30)
2
2
= (
) + 2
and
are correlated, with correlation , and
where
Consider, further, the ARCH model:
+1 =
+1
2
+
(|
|
)2 +
+1 =
, and
NID (0 1)
2
(10.31)
where
)
(ln (
)), and
refer to the indexing of ob+1 = ln (
+1
+1
served data and the sampling frequency (weekly, say), and
,
,
are positive parameters,
possibly depending on the sampling frequency, and
( 1 1). The parameter allows to
capture the Black-Christie-Nelson leverage e ect (Black, 1976b; Christie, 1982; Nelson, 1991)
discussed in Chapter 8. Note that the second of Eqs. (10.31) can be written as:
2
2
2
1
1
=
(|
|
)2
1
+1
+
1
2
(10.32)
and
(|
|
)2
(|
|
)2 . The first two terms define the drift
term for the variance process, and the last term is the diffusive component. Suppose that
1
1
lim 0
=
, lim 0
(|
|
)2
1
=
, and, nally,
1 2
=
, where
var (
lim 0
2 ). Then, under regularity conditions,
2
the sample paths of and
in Eqs. (10.31) converge to those of and 2 in Eqs. (10.30),
with a well-defined correlation coefficient (see Fornari and Mele, 2006).6
10.5.2 Implied volatility, smiles and skews
Parallel to time-series research into asset volatilities reviewed in the previous section, research
on option prices over the 1980s challenged the assumption of a constant volatility in the Black
& Scholes and Merton model. As we know, the Black & Scholes model relies on the assumption
that the price of the underlying asset is a geometric Brownian motion with constant volatility,
\[ \frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t \]
where $W_t$ is a Brownian motion, and $\mu$, $\sigma$ are constants. As we also know, $\sigma$ is the only parameter
to enter the option pricing formula, which leads to a crucial point. Not only is the assumption
of a constant $\sigma$ inconsistent with the time-series evidence reviewed in the previous section. It
is also inconsistent with empirical evidence on the cross-section of option prices. Let $C_t^{\$}(K,T)$
denote the time $t$ market price of a call expiring at $T$ with strike $K$, and consider the price
predicted by the Black-Scholes formula, $C_{BS}(S_t,t;K,T,\sigma)$ in Eq. (10.15). Define the Black-Scholes
implied volatility as the value of $\sigma$ that equates the Black-Scholes formula to the option market
price, IV say,
\[ \text{IV}:\quad C_t^{\$}(K,T) = C_{BS}(S_t,t;K,T,\text{IV}) \tag{10.33} \]
We know from Section 10.4.7 that the Black-Scholes option price is strictly increasing in $\sigma$.
Therefore, this definition of implied volatility makes sense, in that there exists a unique value
for IV such that Eq. (10.33) holds true. In fact, the market practice is to quote options in terms
of implied volatilities, not prices. Moreover, implied volatility is the same for both the call and
the put. Indeed, by the put-call parity in Eq. (10.7), viz
\[ P_t^{\$}(K,T) = C_t^{\$}(K,T) - S_t + Ke^{-r(T-t)} \tag{10.34} \]
This equation also holds for the Black-Scholes model for each $\sigma$, i.e. $P_{BS}(S_t,t;K,T,\sigma) = C_{BS}(S_t,t;K,T,\sigma) - S_t + Ke^{-r(T-t)}$. Subtracting this equation from Eq. (10.34) shows that
the implied volatilities for a call and for a put option are the same.
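Because the Black-Scholes price is strictly increasing in volatility, implied volatility can be found by a simple bracketing search. The following is a minimal sketch using bisection, with hypothetical inputs: the "market" call price is generated from a known volatility, and the "market" put price is obtained from put-call parity, so that the two implied volatilities can be compared. The search bounds and tolerance are arbitrary choices for this illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal distribution function


def bs_call(S, K, r, sigma, T):
    """Black-Scholes price of a European call on a non-dividend paying asset."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)


def bs_put(S, K, r, sigma, T):
    """Black-Scholes price of a European put on a non-dividend paying asset."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return K * exp(-r * T) * N(-d2) - S * N(-d1)


def implied_vol(price, price_fn, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on volatility; price_fn must be increasing in volatility."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if price_fn(mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical market data, for illustration only.
S, K, r, T, true_sigma = 100.0, 105.0, 0.01, 0.5, 0.25
call_mkt = bs_call(S, K, r, true_sigma, T)
put_mkt = call_mkt - S + K * exp(-r * T)  # put-call parity

iv_call = implied_vol(call_mkt, lambda s: bs_call(S, K, r, s, T))
iv_put = implied_vol(put_mkt, lambda s: bs_put(S, K, r, s, T))
print(iv_call, iv_put)
```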
If the Black & Scholes model holds, implied volatilities would be the same for each $K$. Yet
empirically, and at least since 1987, the cross section of implied volatilities exhibits striking
characteristics, when gauged against the moneyness of the option defined as,
\[ \text{mo} \equiv \frac{S_t e^{r(T-t)}}{K} \tag{10.35} \]
Prior to 1987, the pattern of implied volatilities was unclear or U-shaped in $\text{mo}^{-1}$ at best:
a "smile." After the 1987 crash, the smile pattern turned into a "smirk," also referred to as
"volatility skew." One possible explanation for these facts might refer to the fact that call
6 For
example, if
1
the moment condition for the di usive component is = lim
0
centered chi-square variates with one degree of freedom (and variance
motion increments
in the second of Eqs. (10.30).
), and
and put options that are deep-in-the-money and call or put options that are deep-out-of-the-money
are relatively less liquid than at-the-money options, thereby commanding a liquidity risk premium. Since the Black-Scholes option price is increasing in volatility, the implied volatility
is U-shaped in the inverse of moneyness.
Figures 10.6 and 10.7 illustrate how smiles and smirks arise in a different context, one where
asset returns exhibit random volatility. We rely on the celebrated Heston's (1993) model in
which volatility is random, and refer the reader to Section 10.5.4 (see Eq. (10.54)) for technical
details regarding this model.7
[Figure 10.6: two panels, rho = 0 (left) and rho = -0.5 (right), each plotting implied volatility against K e^{-r(T-t)}/S.]
FIGURE 10.6. Smile and smirk predicted by the Heston model in Eq. (10.54), with the speed of mean reversion fixed at 2, the long-run variance at 0.01, the volatility of variance at 0.1 and, for the left-hand panel (the smile), correlation rho = 0,
and the right-hand panel (the skew), rho = -0.5. The initial value of the asset price is 100,
the initial variance equals its long-run value, the short-term rate is 0, and the maturity
of the option is six months.
The rationale underlying the patterns in Figure 10.6 is the following. The Black & Scholes
model relies on the assumption that asset prices are log-normally distributed. However, this assumption may not be correct, as the market might be pricing through alternative distributions
where a higher weight is given to tail events, due to market fears about extreme outcomes. For
example, the market might fear the stock price will fall below a given level more than
the Black & Scholes model would predict. As a result, the market density should have a left
tail thicker than the log-normal, for values of the stock price below that level.
7 The densities in Figure 10.7 are those of 1
Eq. (10.59) for Hestonboth densities are with respect to
ln
in
This possibility is illustrated by the left panel of Figure 10.7, which depicts the risk-neutral
distributions of both the Black & Scholes model, and one model with random volatility, taken
to be the truth: a model that does generate thick tails, as discussed below. A market density
with a left tail thicker than that of Black & Scholes implies that the probability that deep-out-of-the-money puts (i.e., those with low strike prices) will be exercised is higher under the market
density than under the log-normal. For this reason, the implied volatility we need to price
deep-out-of-the-money puts is higher than that required to price at-the-money calls and puts.8
At the other extreme, the market may attach a higher likelihood that the stock price will be
above some level than predicted by Black & Scholes, which would translate into a market density
with a right tail thicker than the log-normal, for values of the stock price above that level. This characteristic implies a
higher probability that deep-out-of-the-money calls (i.e., those with high strike prices) will be
exercised, compared to the log-normal. As a result, the implied volatility needed to price deep-out-of-the-money calls exceeds the implied volatility needed to price at-the-money calls and puts, as
illustrated by the left panel of Figure 10.7. As mentioned, the second effect has disappeared
since the 1987 crash, for reasons similar to those underlying the right panel of Figure 10.7,
leaving the smirk of the right panel of Figure 10.6.
This section suggests that out-of-the-money options contain important information regarding
market expectations about future volatility. Indeed, the CBOE-VIX index does aggregate the
prices of out-of-the-money options to convey an estimate of the volatility expected to arise,
corrected by risk. Section 10.8 develops details on this index and explains that it links to
the fair value of a variance swap, i.e. a contract in which a counterparty is insured against
fluctuations in future volatility.
[Figure 10.7: two panels, rho = 0 (left) and rho = -0.5 (right), each plotting probability density against the asset price for the stochastic volatility model and the Black & Scholes model.]
8 Note that if a call (put) option is out-of-the-money for a given strike, a put (call) option is in-the-money for the same strike.
FIGURE 10.7. Risk-neutral densities predicted by the Black & Scholes model (dashed
line) and the Heston model in Eq. (10.54). The Black & Scholes volatility parameter is
9%, and Heston's parameters are fixed as follows: speed of mean reversion 2,
long-run variance 0.01, volatility of variance 0.1 and, for the
left-hand panel, correlation rho = 0, and the right-hand panel, rho = -0.5. The initial value of the asset
price is 100, the initial variance equals its long-run value, the short-term rate is 0, and maturity
is six months.
While the previous conclusions rely on numerical results, an explanation of smiles is available
since the early 1990s (see, e.g., Ball and Roma, 1994; Renault and Touzi, 1996). To illustrate,
consider the continuous time model,
=
2
+
(
(10.36)
)
+ (
2
(
)
2
The option price is, $C(S_t,\sigma_t^2,t,T) = e^{-r(T-t)}E[(S_T-K)^+]$, where $E[\cdot]$ is the
expectation under some risk-neutral probability $Q$. Assume that the correlation $\rho = 0$. Then,
the implied volatility predicted by this model satisfies,
2
IV :
= BS (
; IV)
1 2
2 !
q
2
(
)
(
)
1
2
IV (
)
( )+
( )
(10.37)
3
2
( )(
)
q
( ) = E ( ).
where
These properties link to a compelling lesson we learnt over the early
statistical literature on ARCH and random variance models: random changes in volatility lead
to a return distribution with tails thicker than the normal, one with kurtosis larger than three
(Mandelbrot, 1963; Fama, 1965; Nelson, 1990; Mele and Fornari, 2000), a feature that
Heston's model illustrates vividly in Figure 10.7. For example, we know from Nelson (1990),
that even if unexpected returns are conditionally normally distributed, they are approximately,
and unconditionally, Student's t, once we assume their variance follows a GARCH(1,1) process.
Mathematically, denote the unexpected returns as of time $t$ with $\epsilon_t$, and suppose that $\epsilon_t = \sigma_t u_t$,
where $u_t \sim \text{NID}(0,1)$ and $\sigma_t$, the conditional volatility of $\epsilon_t$, is some random process. Then
we have, by Jensen's inequality, that
\[ E(\epsilon_t^4) = E(u_t^4)E(\sigma_t^4) \geq E(u_t^4)[E(\sigma_t^2)]^2 = E(u_t^4)[E(\epsilon_t^2)]^2 \]
which is an equality when $\sigma_t$ is not random. It follows that the kurtosis, $\text{Kurt} \equiv E(\epsilon_t^4)/[E(\epsilon_t^2)]^2 \geq E(u_t^4) = 3$.
That is, random volatility makes the unconditional return density leptokurtic even
when the conditional is normal. Although these calculations relate to unconditional densities,
similar conclusions would apply to conditional ones: random volatility makes a multi-day conditional
density leptokurtic even when the one-day conditional is normal.
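The kurtosis inequality can be verified on a simple example. The following is a minimal sketch, assuming the shock is the product of a standard normal variate and an independent random volatility whose variance takes finitely many values. In that case the fourth moment of the shock equals three times the second moment of the variance, so the kurtosis can be computed exactly. The two-point distribution used below is hypothetical.

```python
def kurtosis_scale_mixture(variances, probs):
    """Kurtosis of eps = sigma * u, with u ~ N(0,1) independent of sigma.

    variances: the possible values of sigma^2; probs: their probabilities.
    Uses E(eps^2) = E(sigma^2) and E(eps^4) = 3 * E(sigma^4).
    """
    m2 = sum(p * v for v, p in zip(variances, probs))
    m4 = 3.0 * sum(p * v * v for v, p in zip(variances, probs))
    return m4 / m2 ** 2


# Constant volatility: the kurtosis is that of the normal distribution.
print(kurtosis_scale_mixture([1.0], [1.0]))

# Random volatility (hypothetical two-point distribution): the kurtosis exceeds three.
print(kurtosis_scale_mixture([0.5, 1.5], [0.5, 0.5]))
```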
As a result of this leptokurtosis in asset returns, the probability that out-of-the-money options
are exercised is larger than that implied by the log-normal distribution: the smile effect. As for
the smirk effect, we need $\rho < 0$, as shown by Figure 10.7. Intuitively, when $\rho < 0$, the left tail
of the return distribution is thicker than the right, thereby making out-of-the-money puts more
valuable.
The model in Eqs. (10.36) has been extended to one with jumps, where the variance process
follows a mean-reverting process such as:
2
2
=
+S
+
where is a Poisson process with intensity (see Section 4.7 in Chapter 4), S 0 is the size
of the jump, which we suppose to be constant for illustration purposes only, and, nally, ,
and are constants. In this model, the presence of positive jumps,
0, makes the left tail of
the return distribution thicker, when
0. Therefore, we need a high to avoid a too thicker
distribution. With = 0, instead, a thicker distribution can only be obtained through lower
values of .
Naturally, the effects illustrated in this section mostly refer to explanations related to stochastic volatility although in general, they might arise for other reasons leading to leptokurtosis,
such as feedback effects. [In progress, mention the literature on feedback effects of the 1990s.]
An even simpler channel is one in which stock prices are simply driven by jumps, which make
the left tail of the distribution thicker than the right one, as in the following model,
= ( )
S
constant is to isolate crashophobia
then that of the replicating portfolio, independently of risk-appetite.10 Naturally, markets can
be completed by the option, although in this case, option pricing is not preference-free as we
shall show in the next sections.
To summarize, stochastic volatility entails two inextricable consequences: (i) There is an
infinity of option prices consistent with absence of arbitrage, which correspond to the many risk-neutral probabilities consistent with the model: there are many risk-adjustments that we can
make to the drift term of the variance process in Eqs. (10.36); (ii) there cannot be perfect hedging
strategies only relying on the underlying asset. As regards point (ii), we might, alternatively,
either (a) use a strategy, which albeit not self-financed, would still allow for a perfect replication
of the claim, or (b) a self-financed strategy that would apply to some misspecified model. In case
(a), the strategy leads to a hedging cost process. In case (b), the strategy leads to a tracking
error process, although there might be situations where the claim can be super-replicated, as
explained below.
We start with a short detour about how to understand replicability in a general context, and
proceed to option evaluation in subsequent subsections.
10.5.3.1 Spanning and cloning
A set of securities spans a set of payoffs, if any point in that set can be generated by a linear
combination of the security prices. As explained in Chapter 4, the set of payoffs may include
those promised by a contingent claim, for example, that promised by a European call, or final
consumption, as in Harrison and Kreps (1979) and Duffie and Huang (1985). Chapter 4 relies
on this spanning property and solves for consumption-portfolio choices through martingale
methods. In this section, we show how spanning helps define replicating strategies with the
purpose of pricing redundant assets. Consider the following model (introduced in Chapter
4), where asset prices are assumed to be driven by a multidimensional Brownian motion W,
y =
(y )
(y ) W
(10.38)
where and are vector and matrix valued functions. The value of a portfolio is = + ,
where + denotes the vector of the security prices and the money market account, and is a
portfolio process with the same notation as in Chapter 4. As explained in Chapter 4 (Section
4.3.1) the value of a self-nanced portfolio satises
= + , or,
= >(
1 )+
(10.39)
+ >
where
( 1
)> ,
,
( 1
)> ,
is the price of the -th asset,
is
its drift and is the volatility matrix of the price process. Chapter 4 utilizes the risk-neutral
probability, Q, to help characterize the securities span. Let us, now, do the same under the
physical probability, . In our context, asset prices are semimartingales under ,
=
(10.40)
where is a process with nite variation, and satises regularity conditions. Let us conjecture
=
for all . By the unique decomposition property mentioned in Chapter 4 (Section
that
10 Note that the payo of the option at the expiration
is (
)+ and does not depend on 2 . However, the value of the option
at any
does depend on 2 because, intuitively, the risk-neutral expectation of (
)+ conditional on the information set
2 .
|
at is calculated through the risk-neutral transition density say, which obviously depend on 2 :
>
and
)+
>
(10.41)
, obtaining,
>
)+
(10.42)
Let us suppose that the price of the underlying asset is solution to Eqs. (10.36). The rational
pricing function of a European-style option is $C_t = C(S_t,\sigma_t^2,t,T)$. Suppose two such options
are traded, with prices $C^1$ and $C^2$ and maturities $T_1$ and $T_2$, where we take $T_1 < T_2$. We cannot replicate $C^1$ by trading
the underlying asset and the money market account. Indeed, let $V$ be the value of a self-financed
strategy including the asset price and the money market account, which obviously satisfies:
=( (
)+
(10.43)
where is the value of the invested underlying asset. Instead, the price of the rst option
satises:
1
1
=L
+ 21
(10.44)
+ 1
1
1 2 2
1
1
1
+
+
+ 12 2 21 2 +
where L is the innitesimal generator, L 1
2 + 2
1
cients of the option price in Eq.
2 . We see that we cannot match the di usion coe
(10.44) through that in Eq. (10.43).
Instead, we might replicate the price of the first option through a self-financed portfolio
strategy including (i) the underlying asset, (ii) the second option, and (iii) the money
market account.
)+
2)
2
2
2
2
(10.45)
where the two portfolio components are the value invested in the underlying asset and the value invested in the second option.
We match the diffusion coefficients of Eq. (10.44) and Eq. (10.45), and obtain:
1
1
1
2
2
2
2
(10.46)
1 =
2 =
2
2
2
2
Replacing these expressions into Eq. (10.45), and equating the drift of Eq. (10.45) to that of
Eq. (10.44), leaves:
1
2
1
1
2
2
L
(
)
(
)
L
=
(10.47)
1
2
2
These two ratios agree. They must then be equal to some process that is independent
of the maturity of the option. Therefore, we obtain that,
+
+(
1
2
1
2
2 2
(10.48)
The interpretation of this process
is that of the unit risk-premium required to face the risk of stochastic fluctuations in volatility. The problem is that absence of arbitrage does not suffice to recover a
unique risk-premium process. By the Feynman-Kac stochastic representation of the solution to a partial differential equation, there are many solutions to Eq. (10.48),
2
(
)
2
=
E
)+
(
(10.49)
where the expectation is taken under a risk-neutral probability, one of the many induced by the risk-premium processes that are consistent with absence
of arbitrage.
The previous derivation suggests two possible uses of the portfolio strategies of Eqs. (10.46).
The first, obvious, is hedging. We can always hedge the first option with the portfolio in Eqs.
(10.46). The second is more subtle. If we really think our evaluation model for the first option
is better than the market, we can always synthesize the first option with the portfolio in Eqs.
(10.46), and replicate the payoff of the first option at expiration.
Eqs. (10.47) and (10.48) can be interpreted as APT relations. Indeed, let us define the unit
risk-premium related to the fluctuations of the asset price as the expected excess return on the asset per unit of its volatility, $(\mu - r)/\sigma$. Then, Eq. (10.47)
or Eq. (10.48) imply that,
2
L( )
+
=
= +
| {z }
| {z }
2
where the first beta relates to the volatility of the option price induced by fluctuations in
the stock price, and the second beta relates to the volatility of the option price induced by
fluctuations in the return volatility. It is a generalization of the APT relation in Eq. (10.16)
that holds for the Black & Scholes model.
Hull and White (1987), Scott (1987) and Wiggins (1987) develop the first option pricing models
with stochastic volatility. Heston (1993b) provides an analytical solution assuming an affine
model for the variance process; otherwise, we need to solve through numerical methods such
as Monte Carlo simulation or the numerical solution to partial differential equations.
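Hull & White
Hull and White (1987) derive a first pricing formula based on a continuous-time model where
asset returns and volatility are uncorrelated,
\[ \frac{dS_t}{S_t} = r\,dt + \sigma_t\,dW_t,\qquad \frac{d\sigma_t^2}{\sigma_t^2} = \mu\,dt + \xi\,dW_t^{\sigma} \tag{10.50} \]
where $W_t$ and $W_t^{\sigma}$ are uncorrelated Brownian motions, defined under the risk-neutral probability.
They show that the option price takes the following form:
\[ C_t = E_t\left[C_{BS}\left(S_t,t;K,T,\sqrt{\bar V}\right)\right],\qquad \bar V \equiv \frac{1}{T-t}\int_t^T \sigma_u^2\,du \tag{10.51} \]
and $E_t$ denotes the conditional risk-neutral expectation taken with respect to the laws generating
$\bar V$. According to Eq. (10.51), the option price is simply the Black & Scholes formula averaged over all possible values taken by the future average variance, $\bar V$. Accordingly, the authors
provide a Taylor's expansion around the conditional expectation of $\bar V$, $\hat V(t) \equiv E_t(\bar V)$,
\[ C_t = C_{BS}\left(S_t,t;K,T,\sqrt{\hat V(t)}\right) + \frac{1}{2}\frac{\partial^2 C_{BS}\left(S_t,t;K,T,\sqrt{\bar V}\right)}{\partial \bar V^2}\bigg|_{\bar V = \hat V(t)} E_t\left[(\bar V - \hat V(t))^2\right] + \frac{1}{6}\frac{\partial^3 C_{BS}\left(S_t,t;K,T,\sqrt{\bar V}\right)}{\partial \bar V^3}\bigg|_{\bar V = \hat V(t)} E_t\left[(\bar V - \hat V(t))^3\right] + \cdots \tag{10.52} \]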
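The averaging of Black-Scholes prices over the future average variance can be approximated by simulation. The following is a minimal sketch, assuming (as an illustration only) that the instantaneous variance is a geometric Brownian motion with drift `mu` and volatility `xi`, uncorrelated with the asset price, and that the time integral of variance is approximated by a left Riemann sum. All parameter values are hypothetical.

```python
import random
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal distribution function


def bs_call(S, K, r, sigma, T):
    """Black-Scholes price of a European call on a non-dividend paying asset."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)


def hull_white_call(S, K, r, T, v0, mu, xi, n_paths=2000, n_steps=50, seed=0):
    """Average the Black-Scholes price over simulated paths of average variance."""
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        v, integral = v0, 0.0
        for _ in range(n_steps):
            integral += v * dt  # left Riemann sum of the variance path
            v *= exp((mu - 0.5 * xi * xi) * dt + xi * sqrt(dt) * rng.gauss(0.0, 1.0))
        avg_var = integral / T  # average variance over the life of the option
        total += bs_call(S, K, r, sqrt(avg_var), T)
    return total / n_paths


# Hypothetical inputs: with xi = 0 and mu = 0 the variance is constant,
# and the price collapses to Black-Scholes.
print(hull_white_call(100.0, 100.0, 0.01, 0.5, v0=0.04, mu=0.0, xi=0.0, n_paths=10))
print(bs_call(100.0, 100.0, 0.01, 0.20, 0.5))
print(hull_white_call(100.0, 100.0, 0.01, 0.5, v0=0.04, mu=0.0, xi=1.0))
```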
In fact, Eq. (10.52) is a general formula applying to models more general than that in Eq.
(10.50), say one of the models encompassed by Eqs. (10.36), provided of course the Brownian motions
driving returns and volatility
are uncorrelated, as explained more formally in Appendix 3. In Appendix 3, we also explain
that Hull & White Equation (10.51) can be generalized to the case in which
=
+
p
2
1
, where is a standard Brownian motion and is a stochastic correlation. Romano
and Touzi (1997) show that in this case, and provided (), and () () in Eqs. (10.36) are
independent of (),
q
2
)]
(
) = E [ BS (
;
Z
(10.53)
1
2 2
+
2
2
= 1
2
(1
)
Heston
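Heston (1993b) develops an analytical solution to a model with stochastic volatility, relying on
the following dynamics of the stock price:
\[ d\ln S_t = \left(r - \tfrac{1}{2}\sigma_t^2\right)dt + \sigma_t\,dW_{1t},\qquad d\sigma_t^2 = \kappa\left(\omega - \sigma_t^2\right)dt + \xi\sigma_t\left(\rho\,dW_{1t} + \sqrt{1-\rho^2}\,dW_{2t}\right) \tag{10.54} \]
where $W_{1t}$ and $W_{2t}$ are independent Brownian motions under the risk-neutral probability, and $\kappa$, $\omega$, $\xi$ and $\rho$ are constants.
The instantaneous variance is thus a square-root process. We provide a few hints on the
derivation of Heston's formula, relying on a general formula based on the same line of reasoning
leading to Eq. (10.10) in Section 10.2, as follows: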
(
=
=
=
)
(
E (
Z
)
)+
0
m
I
m
Z0
)+
I
m
(10.55)
2
where (
) is the risk-neutral joint density of the stock price
and variance 2 at
m
m
( ) is the risk-neutral marginal density of
, ( ) is another marginal density of
m
with Radon-Nikodym derivative with respect to
( ) given in Eq. (10.11),
( )=
m (
m
(
)
=
)
,
,
(10.56)
) and
(
) are two probabilities with densities m and m ,
and nally, (
respectively. All these densities and probabilities are conditional upon the information at time
.
By Girsanov's theorem, the density process associated to the Radon-Nikodym derivative in
Eq. (10.56) satisfies,
( )
)
= (
1
( )
such that the stock price is solution to:
(
=
+ 12 2
+
ln
2
= (
(
)
1
2
+
1 +
p
1
(10.57)
1 +
and 1 is a Brownian motion under the new probability with density m , with
1 =
.
Let
ln . In the Black-Scholes case, 2 is a constant, and the two probabilities, (
ln )
and
(
ln ), can be expressed in closed-form, using Eq. (10.57) and Eq. (10.54), respectively, leading to the celebrated formula in Eq. (10.15), as explained in Section 10.4.2.
2
(
In the Hestons model, the two probabilities, 1 (
)
ln ) and
2
)
(
ln ), are solutions to:
2(
2
2
1
=0
=0
(10.58)
2
2
with the same boundary condition
(
) = I ln , = 1 2, and where and are
the infinitesimal generators associated to Eq. (10.57) and Eq. (10.54). Indeed, we have that:
2
(i) probabilities are martingales, due to 2 (
) = E (I ln ) = Pr (
ln ), and
similarly for 1 (); and (ii) probabilities are di usion processes because the stock price is. Eq.
(10.58) then follows by the Feynman-Kac representation theorem.
The solution to the two partial differential equations is unknown in closed-form, though.
However, their characteristic functions can be shown to be exponential affine in the log-price and the variance.
Precisely, define the two characteristic functions:
i
2
2
; =E
; =E
1
i=
1
2
denotes the expectation taken with respect to m , and E denotes the conditional
where E
expectation taken against m . The two functions satisfy the same partial di erential equations
(10.58), but they can be solved in closed-form, because their boundary conditions are simply
2
(
) = i . Indeed, a fundamental denition is that a model is a ne if its characteristic
function is exponential-a ne in its state variables. A ne models were already used to analyze
the term structure of interest rates, since at least Vasicek (1977) and Cox, Ingersoll and Ross
(1985), as we shall discuss in Chapter 12. Hestons model is the option pricing counterpart to
these models of the yield curve.
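The solution to the two characteristic functions is given by:
\[
f_j(x,\sigma^2,t;\phi) = \exp\left(C_j(T-t;\phi) + D_j(T-t;\phi)\,\sigma^2 + i\phi x\right), \qquad j = 1, 2,
\]
where
\[
\begin{aligned}
C_j(\tau;\phi) &= r\phi i\,\tau + \frac{\kappa\theta}{\xi^2}\left[\left(b_j - \rho\xi\phi i + d_j\right)\tau - 2\ln\frac{1 - g_j e^{d_j\tau}}{1 - g_j}\right] \\
D_j(\tau;\phi) &= \frac{b_j - \rho\xi\phi i + d_j}{\xi^2}\cdot\frac{1 - e^{d_j\tau}}{1 - g_j e^{d_j\tau}}
\end{aligned}
\]
with
\[
g_j = \frac{b_j - \rho\xi\phi i + d_j}{b_j - \rho\xi\phi i - d_j}, \qquad
d_j = \sqrt{\left(\rho\xi\phi i - b_j\right)^2 - \xi^2\left(2u_j\phi i - \phi^2\right)}, \qquad
u_1 = \tfrac{1}{2},\; u_2 = -\tfrac{1}{2},\; b_1 = \kappa - \rho\xi,\; b_2 = \kappa,
\]
such that,
\[
P_j(x,\sigma^2,t) = \frac{1}{2} + \frac{1}{\pi}\int_0^\infty \mathrm{Re}\left[\frac{e^{-i\phi\ln K}\, f_j(x,\sigma^2,t;\phi)}{i\phi}\right] d\phi.
\tag{10.59}
\]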
[Write a small technical Appendix on inversions of characteristic functions] Replacing these two
probabilities into Eq. (10.55) yields the celebrated Heston's formula.
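The inversion step behind Eq. (10.59) can be illustrated with a minimal sketch. It assumes the Gil-Pelaez inversion formula and applies it to the Black-Scholes case, where the log-price is Gaussian and the answer, $N(d_2)$, is known in closed form. The function names (`prob_exceeds`, `bs_char_fn`), the midpoint integration rule and the truncation point of the integral are illustrative choices, not part of the original text; in Heston's model, the Gaussian characteristic function would be replaced by $f_j$.

```python
import cmath
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_exceeds(char_fn, log_strike, upper, steps=20000):
    """Pr(x_T >= ln K) = 1/2 + (1/pi) int_0^inf Re[e^{-i phi ln K} f(phi) / (i phi)] d phi.

    The integral is truncated at `upper` and computed with a midpoint rule,
    which avoids evaluating the integrand at phi = 0.
    """
    h = upper / steps
    total = 0.0
    for j in range(steps):
        phi = (j + 0.5) * h
        integrand = cmath.exp(-1j * phi * log_strike) * char_fn(phi) / (1j * phi)
        total += integrand.real
    return 0.5 + total * h / math.pi

def bs_char_fn(x0, r, sigma, tau):
    """Characteristic function of x_T = ln S_T when sigma is constant."""
    mean = x0 + (r - 0.5 * sigma**2) * tau
    var = sigma**2 * tau
    return lambda phi: cmath.exp(1j * phi * mean - 0.5 * phi**2 * var)

# Illustrative parameter values (assumptions)
S0, K, r, sigma, tau = 100.0, 100.0, 0.01, 0.10, 0.25
f2 = bs_char_fn(math.log(S0), r, sigma, tau)
p2_numeric = prob_exceeds(f2, math.log(K), upper=12.0 / (sigma * math.sqrt(tau)))

d2 = (math.log(S0 / K) + (r - 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
p2_closed = norm_cdf(d2)
print(p2_numeric, p2_closed)
```

The two printed numbers should agree up to the integration error, which is a simple check on any implementation of the inversion before it is used with a less tractable characteristic function.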
Figure 10.8 depicts the relative frequency of the P&Ls corresponding to all cases we consider.
[Figure 10.8: four histogram panels, three of them titled "Heston", "Heston, high vol" and "Heston, low vol"; vertical axes: relative frequency.]
FIGURE 10.8. Frequency of Profit and Losses regarding a six month accumulator under
different market assumptions: Black & Scholes (NW quadrant), Heston's with current
volatility fixed at its long-term value (NE), Heston's with current volatility lower than its
long-term value (SE), and Heston's with current volatility higher than its long-term value
(SW).
Note that although the average P&Ls are positive in all cases, the frequency distributions
of the P&Ls exhibit quite high standard deviations, with long left tails: downside risk is quite
substantial, consistent with the payoff structure that we depicted in Figure 10.4. In
Heston's market, average profits are higher than in Black & Scholes because accumulators
necessitate fewer puts. Within Heston's market, profits are lower when we move to either a low
volatility or a high volatility scenario. In the low volatility scenario, average profits are lower
because the number of puts in the accumulator needs to increase. In the high volatility scenario,
profits are lower because, whilst the number of puts in the accumulator decreases, a more volatile
market also makes the accumulator more likely to generate adverse outcomes, thereby leveling
down the expected profits.
(i) Vega trading, or volatility surface trading. It refers to a trade aiming to profit from a
view that the term-structure of implied volatilities will change, for example, from the
expectation of a flattening or a steepening term-structure of implied volatilities. It requires
positioning into multiple types of options according to the nature of the expectation. For
example, a bull flattener relies on the expectation that long-term implied volatilities will
decrease faster than short-term implied volatilities, leading the term-structure of implied
volatilities to flatten, which could be implemented through a portfolio which is: (i.1) long
short-term options, and (i.2) short the long-term ones. (This portfolio would need to be
delta-hedged, for reasons explained below).
(ii) Gamma trading. It is a trade that aims to generate profits from a realized volatility
exceeding the current implied volatility. It relies on directional views regarding ongoing
volatility developments. For this reason, gamma trading has a horizon that is typically
shorter than that of vega trading.
Option-based strategies might allow us to express views about these volatility developments,
and include trading straddles, strangles, butterflies, calendars, or delta-hedged option positions, as we shall explain below. They consist of portfolios comprising options and
assets underlying these options, and aim to make P&Ls consistent with views about volatility
developments.
A natural question arises. We know option prices are, generally, increasing in volatility. So
why do we need to create portfolios of options and underlyings, in order to trade volatility? The
reason is that option prices are increasing in both volatility and the asset price. For example, in
a stochastic volatility setting, the option price is $C(S_t, \sigma_t^2, t)$, and if the volatility $\sigma_t$ increases,
the option price $C(S_t, \sigma_t^2, t)$ increases as well, in general. However, it might be possible that
the increase in volatility occurs exactly when the asset price decreases. Incidentally, this circumstance is quite likely to occur, given the empirical evidence about the negative correlation
between $S$ and $\sigma$ reviewed in Section 10.5. The implication would be that the increase in $C$
determined by an increase in $\sigma$ might be offset by the fall in $C$ following the drop in $S$. To
isolate movements in the asset price volatility, we need to consider portfolios reverse-engineered
so as to be insensitive to changes in the underlying asset price. [Mention here and in the next
section Goldman Sachs approach to VIX]
To mitigate the effects of the movements in the underlying price, we may consider Black-Scholes hedges, such that the long position in the call option is offset by the short position in
the Black-Scholes replicating portfolio, which, by construction, only neutralizes movements
in $S$, not $\sigma$. An alternative is a portfolio comprising options with final payoffs driven by the
stock price, and negatively correlated, such as European put and call options. For example, a
straddle is a portfolio of one call option and one put option that have the same strike price and
the same maturity. (A strangle is the same as a straddle, with the difference that the strike of
the call differs from that of the put.)
Figure 10.9 depicts an example of payoffs arising from going long a straddle. The left panel
shows the final payoff, equal to $(S_T - K)^+ + (K - S_T)^+$, for $K = 100$, as well as the value of this
payoff at $t$, assuming a Black and Scholes market, with a risk-free rate $r = 1\%$, instantaneous
volatility $\sigma = 10\%$, and under two assumptions about the maturity of the straddle, three months
and one month. The right panel shows, instead, the P&L of this straddle, defined as
$(S_T - K)^+ + (K - S_T)^+ - e^{r(T-t)}\,\mathrm{Cost}_t$, where $\mathrm{Cost}_t = C_{BS}(S_t; K, T-t) + P_{BS}(S_t; K, T-t)$, $C_{BS}(\cdot)$
is the Black-Scholes formula in Eq. (10.15), $P_{BS}(\cdot)$ is the corresponding put price, and $T - t$ is
the maturity of the straddle, with $T_1 - t = \frac{1}{12}$ and $T_2 - t = \frac{3}{12}$. We assume that the index level
at $t$ is $S_t = 100$, such that the straddles are approximately at the money: the strike leading
to at-the-money straddles is $K = S_t e^{r(T-t)}$, but for comparison reasons, we keep on setting
$K = 100$ for both straddles.
[Figure 10.9: two panels; horizontal axes: S_T.]
FIGURE 10.9. The left panel depicts the payoff of a straddle with strike price $K = 100$
(thin dashed line), as well as the value of a straddle with maturity $T_2 - t = \frac{3}{12}$ (solid line)
and $T_1 - t = \frac{1}{12}$ (dashed line). The right panel depicts the P&Ls of roughly at-the-money
straddles bought at time $t$, defined as $(S_T - K)^+ + (K - S_T)^+ - e^{r(T-t)}\,\mathrm{Cost}_t$, where
$\mathrm{Cost}_t = C_{BS}(S_t; K, T-t) + P_{BS}(S_t; K, T-t)$, with $T_2 - t = \frac{3}{12}$ (solid
line), and $T_1 - t = \frac{1}{12}$ (dashed line), $r = 1\%$, $\sigma = 10\%$, and the index level $S_t = 100$.
The logic behind a straddle is that a call and a put have deltas that roughly compensate
each other, thereby allowing this portfolio to change primarily because of volatility movements.
Figure 10.9 illustrates that a straddle helps express views about volatility, in that it pays off
whenever the stock price moves sufficiently away from the initial level of the index, $S_t = 100$.
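The P&L profile of a long straddle can be sketched numerically as follows. This is a minimal illustration, assuming a Black-Scholes market with the parameter values quoted for Figure 10.9 ($r = 1\%$, $\sigma = 10\%$, $S_t = K = 100$) and assuming the P&L is the final payoff minus the initial cost capitalized at the risk-free rate; the function names are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes price of a European call."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def bs_put(S, K, r, sigma, tau):
    """Black-Scholes price of a European put, by put-call parity."""
    return bs_call(S, K, r, sigma, tau) - S + K * math.exp(-r * tau)

def straddle_pnl(S_T, S_t, K, r, sigma, tau):
    """Final payoff of a long straddle minus its capitalized initial cost."""
    cost = bs_call(S_t, K, r, sigma, tau) + bs_put(S_t, K, r, sigma, tau)
    payoff = max(S_T - K, 0.0) + max(K - S_T, 0.0)
    return payoff - math.exp(r * tau) * cost

for tau in (1 / 12, 3 / 12):
    row = [round(straddle_pnl(S_T, 100.0, 100.0, 0.01, 0.10, tau), 2)
           for S_T in (90.0, 95.0, 100.0, 105.0, 110.0)]
    print(f"maturity {tau:.3f} years:", row)
```

The P&L is negative when the index ends up near the strike and positive once the index moves away from it by more than the capitalized cost, in either direction.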
Note a technical complication, arising because the delta of the straddle is not always precisely
zero, especially when the index level drifts away from moneyness. By Eq. (10.21), and the put-call parity in Eq. (10.7), it is: $2\Delta(S_t) - 1$. For example, we have that in the Black & Scholes
market,
Straddle = + = (2 ( ) 1)
1
2
Straddle
2
=2
2
=2
1
1
2
2
1
1
2
2
1
1
2
(0) + (0)
2
1
2
[Figure 10.10: delta of the straddle plotted against the index level, from 96 to 105.]
FIGURE 10.10. The delta of a straddle, $2\Delta(S_t) - 1$, where $\Delta(S_t)$ is the Black and Scholes
delta in Eq. (10.21), the strike $K = 100$, for maturity $T_2 - t = \frac{3}{12}$ (solid line) and $T_1 - t = \frac{1}{12}$
(dashed line), and $r = 1\%$ and $\sigma = 10\%$.
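The behavior of the straddle delta can be checked with a short sketch. It assumes the Black-Scholes call delta equals $N(d_1)$, so that the straddle delta is $2N(d_1) - 1$, and compares this expression with a finite-difference derivative of the straddle value. Function names and the finite-difference step are illustrative assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1(S, K, r, sigma, tau):
    return (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))

def straddle_value(S, K, r, sigma, tau):
    """Black-Scholes value of one call plus one put with the same strike and maturity."""
    d_1 = d1(S, K, r, sigma, tau)
    d_2 = d_1 - sigma * math.sqrt(tau)
    call = S * norm_cdf(d_1) - K * math.exp(-r * tau) * norm_cdf(d_2)
    put = call - S + K * math.exp(-r * tau)  # put-call parity
    return call + put

def straddle_delta(S, K, r, sigma, tau):
    """Delta of the straddle: call delta N(d1) plus put delta N(d1) - 1."""
    return 2.0 * norm_cdf(d1(S, K, r, sigma, tau)) - 1.0

K, r, sigma = 100.0, 0.01, 0.10
for tau in (1 / 12, 3 / 12):
    deltas = [round(straddle_delta(S, K, r, sigma, tau), 3) for S in (96.0, 100.0, 104.0)]
    print(f"maturity {tau:.3f} years:", deltas)
```

The delta is close to zero near the money, becomes markedly negative or positive as the index drifts away, and does so faster for the shorter maturity, which is why a straddle held as a pure volatility position typically needs re-hedging.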
Naturally, shorting a straddle leads to payoffs with opposite sign to those in Figure 10.9:
shorting a straddle relies on the expectation that markets are going to be stable. Straddles
bear some inglorious history. In 1995, the 233-year old Barings Bank collapsed, because of the
famous short straddle that one of its traders, Nick Leeson, was implementing on the Nikkei Index. A
short straddle is, of course, a view that volatility will not rise. However, in January 1995, a violent
earthquake made the Nikkei index crash by almost 7% in a week. The straddle was naked
(at most, it was delta-hedged), and led to losses that Leeson was unable to absorb and that he
in fact amplified, as he insisted on the view that the Index would stabilize. The Index did not.
Potential losses arising from a short position in straddles can be reduced by going long
one additional portfolio comprising: (i) an out-of-the-money put, which pays exactly when the
underlying goes down, and (ii) an out-of-the-money call, which pays when the underlying goes
up. Combining this portfolio with a short straddle leads to what is known as a butterfly spread.
Figure 10.11 depicts payoffs and P&L relating to a butterfly, where the straddle has strike
$K = 100$ and maturity one month (as one of the straddles in Figure 10.9), and the strikes of the
out-of-the-money call and put are $K_c = 102$ and $K_p = 98$. The right panel shows the P&L of
the butterfly, defined as
$-\left[(S_T - K)^+ + (K - S_T)^+\right] + (S_T - K_c)^+ + (K_p - S_T)^+ - e^{r(T-t)}\,\mathrm{Cost}_t$,
11 Note that by standard homogeneity properties,
1
at-the-money straddle,
Straddle|
=
2
Straddle = 2 ( )
.
where $T - t = \frac{1}{12}$, and $\mathrm{Cost}_t$ is the value of the butterfly at time $t$ in the same Black & Scholes
market considered in Figures 10.9-10.10.
[Figure 10.11: two panels; horizontal axes: S_T.]
FIGURE 10.11. The left panel depicts the payoff of a butterfly with maturity equal to
one month (solid line), which is (i) short one straddle (the thin dashed line) with strike
$K = 100$, and (ii) long one out-of-the-money put, with strike $K_p = 98$, and one out-of-the-money call, with strike $K_c = 102$ (the dashed line). The right panel depicts the P&L
of the butterfly,
$-\left[(S_T - K)^+ + (K - S_T)^+\right] + (S_T - K_c)^+ + (K_p - S_T)^+ - e^{r(T-t)}\,\mathrm{Cost}_t$,
where $T - t = \frac{1}{12}$, and $\mathrm{Cost}_t$ is the value of the butterfly at time $t$, obtained in a Black &
Scholes market with $r = 1\%$ and $\sigma = 10\%$ and the index level $S_t = 100$.
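The way the out-of-the-money options cap the losses of a short straddle can be sketched as follows. The example assumes the strikes and market parameters quoted for Figure 10.11 (straddle strike 100, put strike 98, call strike 102, $r = 1\%$, $\sigma = 10\%$, one month to maturity) and a P&L defined as the final payoff minus the capitalized initial value of the position; function names are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def bs_put(S, K, r, sigma, tau):
    return bs_call(S, K, r, sigma, tau) - S + K * math.exp(-r * tau)

K, K_PUT, K_CALL = 100.0, 98.0, 102.0   # strikes as in Figure 10.11
R, SIGMA, TAU, S_0 = 0.01, 0.10, 1 / 12, 100.0

def butterfly_payoff(S_T):
    """Short one straddle struck at K, long one put at K_PUT and one call at K_CALL."""
    straddle = max(S_T - K, 0.0) + max(K - S_T, 0.0)
    return -straddle + max(S_T - K_CALL, 0.0) + max(K_PUT - S_T, 0.0)

def butterfly_cost():
    """Initial value of the position; negative, as the straddle sold is worth more."""
    straddle = bs_call(S_0, K, R, SIGMA, TAU) + bs_put(S_0, K, R, SIGMA, TAU)
    wings = bs_call(S_0, K_CALL, R, SIGMA, TAU) + bs_put(S_0, K_PUT, R, SIGMA, TAU)
    return wings - straddle

def butterfly_pnl(S_T):
    return butterfly_payoff(S_T) - math.exp(R * TAU) * butterfly_cost()

for S_T in (90.0, 98.0, 100.0, 102.0, 110.0):
    print(S_T, round(butterfly_payoff(S_T), 2), round(butterfly_pnl(S_T), 2))
```

However far the index moves, the payoff never falls below the distance between the straddle strike and the strikes of the wings, in contrast with the unbounded losses of a naked short straddle.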
Calendar spreads are alternative strategies to straddles with which to express views about
volatility. They are portfolios long one call with maturity $T_1$ and short one call with maturity
$T_2$, where $T_1 < T_2$, and where the two calls have the same strike price. If the underlying
asset price does not move too much, the calendar spread value drops, because the price decay
due to the passage of time (see Section 10.4.6.2) is more severe for the call with lower time to
maturity.
However, at any given point in time, the calendar spread increases in value as soon as the
underlying price moves, regardless of whether this movement is positive or negative. This property is due to convexity. Let us explain. Note that as the option maturity decreases, the option
value becomes more convex with respect to the underlying price. Therefore, at any given point
in time, when the underlying increases, the short-dated option value increases more than the
long-dated; and when the underlying decreases, the short-dated option value decreases less than
the long-dated.12
Therefore, going long a calendar is consistent with the view that asset prices will fluctuate
substantially away from their current levels (regardless of whether on their way up or down), that
is, that market (realized) volatility is about to increase. Figure 10.12 depicts the payoff and the
P&Ls of a calendar, with strike $K = 100$, an index level $S_t = 100$, maturities $T_1 - t = \frac{1}{12}$ and
12 This argument is best understood while comparing the intrinsic value of an option to the option value before maturity. The
former is flat for relatively small values of the underlying price. Otherwise, it increases one-to-one with the underlying.
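$T_2 - t = \frac{3}{12}$, and assuming the same Black & Scholes market as in Figures 10.9-10.11. The payoff
one month after the initial positioning is $(S_{T_1} - K)^+ - C_{BS}\left(S_{T_1}; K, \frac{2}{12}\right)$, whereas the P&L is
given by $(S_{T_1} - K)^+ - C_{BS}\left(S_{T_1}; K, \frac{2}{12}\right) - e^{\frac{1}{12} r}\left(C_{BS}(S_t; K, T_1 - t) - C_{BS}(S_t; K, T_2 - t)\right)$.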
[Figure 10.12: payoff and P&L of the calendar plotted against S_T, from 96 to 105.]
FIGURE 10.12. Payoff and P&L of a calendar, a portfolio which is (i) long a call option
with strike price $K = 100$ and time to maturity $T_1 - t = \frac{1}{12}$, and (ii) short a call option
with strike price $K = 100$ and time to maturity $T_2 - t = \frac{3}{12}$. The solid line plots the payoff
after one month from inception, defined as $\pi(S_{T_1}) \equiv (S_{T_1} - K)^+ - C_{BS}\left(S_{T_1}; K, \frac{2}{12}\right)$,
whereas the dashed line is the payoff inclusive of the initial value of the position,
$\pi(S_{T_1}) - e^{\frac{1}{12} r}\left(C_{BS}(S_t; K, T_1 - t) - C_{BS}(S_t; K, T_2 - t)\right)$, with the index level $S_t = 100$.
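A minimal numerical sketch of the calendar of Figure 10.12 follows. It assumes a position long the one-month call and short the three-month call, evaluated one month after inception in a Black-Scholes market with $r = 1\%$ and $\sigma = 10\%$; the function names are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

K, R, SIGMA, S_0 = 100.0, 0.01, 0.10, 100.0
TAU_1, TAU_2 = 1 / 12, 3 / 12

def calendar_payoff(S):
    """Value after one month: expiring long call minus short call with two months left."""
    return max(S - K, 0.0) - bs_call(S, K, R, SIGMA, TAU_2 - TAU_1)

def calendar_pnl(S):
    """Payoff minus the capitalized initial value of the position."""
    initial_value = bs_call(S_0, K, R, SIGMA, TAU_1) - bs_call(S_0, K, R, SIGMA, TAU_2)
    return calendar_payoff(S) - math.exp(R * TAU_1) * initial_value

for S in (95.0, 98.0, 100.0, 102.0, 105.0):
    print(S, round(calendar_payoff(S), 3), round(calendar_pnl(S), 3))
```

The payoff is lowest when the index stays at the strike and improves as the index moves in either direction, which is the sense in which this position expresses a view on realized volatility.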
10.6.2 P&Ls of $\Delta$-hedged strategies
Straddles, calendars, or Black-Scholes hedged strategies are not necessarily the best way to
formulate views regarding ongoing volatility developments. To understand volatility trading
through option-based strategies, consider the simplest strategy, where one buys an option and
hedges it through the Black and Scholes formula.13 Suppose we live in a world with stochastic
volatility, where the asset price moves as in Eqs. (10.36). Assume that at time $t$, we go long
a call option with market price equal to $C(S_t, \sigma_t^2, t)$. Let us build up a self-financed portfolio
with value $V_t$,
=
+
(10.60)
where
2
0 =
0
0 0
BS
; IV0 )
(10.61)
and $IV_0$ is the Black-Scholes implied volatility as of time $t = 0$, i.e. the time at which we are
to take a view on future volatility.
13 The following arguments also apply to the hypothetical situation where an investment bank, say, purchases an option for a
mere market making scope, and then tries to hedge against it through Black-Scholes. It is, however, an unrealistic situation, as
investment banks hedge through books, not through the single units adding up to the books.

Consider, first, the following heuristic arguments. Assume a Black & Scholes market, where
the short-term rate, $r$, is zero and the drift of the asset price is also zero. While volatility is constant in this market, there
might be periods where the realized instantaneous variance, $(\Delta S_t / S_t)^2$, is higher than $IV_0^2\,\Delta t$. What is
the P&L of a call option delta-hedged through Black & Scholes? Note that a call option delta-hedged with Black & Scholes is simply a portfolio with value equal to
$\Pi_t = C_t - \Delta_{BS} S_t$, such that, approximately,
\[
\Delta\Pi_t = \left[\Theta\,\Delta t + \Delta_{BS}\,\Delta S_t + \tfrac{1}{2}\Gamma\,(\Delta S_t)^2\right] - \Delta_{BS}\,\Delta S_t
= \tfrac{1}{2}\,\Gamma\, S_t^2\left[\left(\frac{\Delta S_t}{S_t}\right)^2 - IV_0^2\,\Delta t\right]
\tag{10.62}
\]
where $\Theta = \frac{\partial C}{\partial t}$, $\Delta_{BS} = \frac{\partial C}{\partial S}$, the Delta, $\Gamma = \frac{\partial^2 C}{\partial S^2}$, the Gamma, and the second equality follows
by the Black-Scholes pricing equation (10.14). Aggregating up to the maturity of the option
delivers the P&L at $T$:
\[
P\&L_T = \sum_{t=1}^{T} \Delta\Pi_t = \frac{1}{2}\sum_{t=1}^{T} \Gamma_t\, S_t^2\left[\left(\frac{\Delta S_t}{S_t}\right)^2 - IV_0^2\,\Delta t\right]
\tag{10.63}
\]
Note that the Black & Scholes delta is needed to compensate for those portions of the call
price movements arising due to the asset price movements: the term $\Delta_{BS}\,\Delta S_t$ in the brackets of
Eq. (10.62) contributes positively to the P&L when $\Delta S_t > 0$, and negatively otherwise. So the
hedging is natural: the call price can go up because of realized volatility (i.e., $(\Delta S_t / S_t)^2$) or because
the underlying price goes up. To isolate the views about volatility, we need to hedge against
movements in the underlying price. As is clear, hedging through the Black-Scholes delta helps
neutralize this effect.14
Hedging can only be effective in the very short-term, as the period-by-period profits, $\Delta\Pi_t$ in
Eq. (10.62), only depend on how far realized volatility is from the initial Black-Scholes implied
volatility.15 In general, hedging might lead to a P&L inconsistent with views, because the term
$(\Delta S_t / S_t)^2 - IV_0^2\,\Delta t$ is weighted with the Dollar Gamma, $\Gamma_t S_t^2$, which is positive as the Black-Scholes price
is convex in $S$. In other words, we may well end up with a negative P&L, even if the terms
$(\Delta S_t / S_t)^2 - IV_0^2\,\Delta t$ are positive for most of the time, a feature known as price-dependency. We
now illustrate these facts through a continuous-time model with random volatility.
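The one-period approximation in Eq. (10.62) can be checked numerically. The sketch below assumes a Black-Scholes market with a zero short-term rate, a call hedged with the Black-Scholes delta computed at a fixed implied volatility, and daily rebalancing; it compares the exact one-step change in the hedged position with the Dollar Gamma expression. All names and parameter values are illustrative assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def d1(S, K, iv, tau):
    return (math.log(S / K) + 0.5 * iv**2 * tau) / (iv * math.sqrt(tau))

def bs_call(S, K, iv, tau):
    """Black-Scholes call price with a zero interest rate."""
    return S * norm_cdf(d1(S, K, iv, tau)) - K * norm_cdf(d1(S, K, iv, tau) - iv * math.sqrt(tau))

def bs_delta(S, K, iv, tau):
    return norm_cdf(d1(S, K, iv, tau))

def bs_gamma(S, K, iv, tau):
    return norm_pdf(d1(S, K, iv, tau)) / (S * iv * math.sqrt(tau))

def hedged_pnl_exact(S0, S1, K, iv, tau, dt):
    """One-step change in value of: long one call, short delta units of the asset."""
    option_change = bs_call(S1, K, iv, tau - dt) - bs_call(S0, K, iv, tau)
    return option_change - bs_delta(S0, K, iv, tau) * (S1 - S0)

def hedged_pnl_gamma(S0, S1, K, iv, tau, dt):
    """Dollar Gamma approximation: 0.5 * Gamma * S^2 * (realized variance - IV^2 * dt)."""
    realized_var = ((S1 - S0) / S0) ** 2
    return 0.5 * bs_gamma(S0, K, iv, tau) * S0**2 * (realized_var - iv**2 * dt)

K, IV, TAU, DT = 100.0, 0.10, 0.25, 1 / 252
for S1 in (99.0, 99.8, 100.2, 101.0):
    print(S1, round(hedged_pnl_exact(100.0, S1, K, IV, TAU, DT), 4),
          round(hedged_pnl_gamma(100.0, S1, K, IV, TAU, DT), 4))
```

A daily move of 1% corresponds to a realized volatility above the 10% implied volatility and produces a gain, whereas a 0.2% move produces a loss, whatever the direction of the move.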
Consider a general situation where volatility is not constant, such that the model is misspecified. El Karoui, Jeanblanc-Picqué and Shreve (1998) make the following observations. Consider
the value of the self-financed portfolio in Eq. (10.60). Because this portfolio is self-financed,
=
=
=[
+
+ (
+ (
)
+
2
in the continuous time limit, and assuming Black & Scholes is true,
IV20 , such that Eq. (10.63) shrinks
to zero. Therefore, in the Black-Scholes market, views on realized volatility can be channelled only due to discrete-rebalancing.
Below, we shall explain that a non-zero P&L obtains even in the continuous time limit, once the stock price displays stochastic
volatility.
15 A similar P&L obtains regarding straddles and calendars.
14 Naturally,
IV0 ) =
BS
BS
1
+
2
BS
2
BS
where
BS
BS
BS
BS
BS + (
1
2
{z
2
2
BS
2
2
+ IV20
BS
2
BS
BS
1
2
+(
IV20
BS
2
2
1
2
IV20
2
2
BS
2
BS
2
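Therefore, the tracking error, or P&L, defined as the difference between the Black-Scholes price
and the portfolio value,
\[
P\&L_t \equiv C_{BS}(S_t, t; IV_0) - V_t,
\]
satisfies
\[
dP\&L_t = r\, P\&L_t\, dt + \frac{1}{2}\left(\sigma_t^2 - IV_0^2\right) S_t^2\, \frac{\partial^2 C_{BS}(S_t, t; IV_0)}{\partial S^2}\, dt.
\]
At maturity $T$,
\[
P\&L_T = \max\{S_T - K, 0\} - V_T = \frac{1}{2}\int_0^T e^{r(T-t)} \left(\sigma_t^2 - IV_0^2\right) S_t^2\, \frac{\partial^2 C_{BS}(S_t, t; IV_0)}{\partial S^2}\, dt.
\tag{10.64}
\]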
This expression is the continuous-time counterpart to Eq. (10.63) in a market with stochastic
volatility. Moreover, it can be shown that a delta-hedged straddle strategy leads to twice the
expression in Eq. (10.64), with the second partial of the straddle replacing the Black-Scholes
Gamma. Because the Black-Scholes price is convex, Eq. (10.64) tells us that even if we do not
exactly know the law of movement of volatility, but still hold the view it will be persistently
higher than the initial Black-Scholes implied volatility, we can obtain positive profits through
(i) a long position in a call, and (ii) a short position in the Black-Scholes replicating portfolio. Naturally, it isn't an arbitrage opportunity. The critical assumption is that volatility will
increase.
Eq. (10.64) is problematic. Even if the volatility $\sigma_t$ is higher than $IV_0$ for most of the time,
the final P&L may not necessarily lead to a profit. The reason is that each volatility view,
$\sigma_t^2 - IV_0^2$, is weighted by the Dollar Gamma, $S_t^2\, \frac{\partial^2 C_{BS}}{\partial S^2}$. It may be that bad realizations of the
volatility views, i.e. $\sigma_t^2 < IV_0^2$, occur precisely when the Dollar Gamma is large: this is the price-dependency issue raised whilst discussing Eq. (10.63). Moreover, the strategy is costly, as it
relies on $\Delta$-hedging. The volatility contracts of Section 10.6 overcome these difficulties.
exists a beautiful connection between the model in this section and the price of volatility, i.e.,
the price to be paid in dedicated variance swaps aiming to protect investors from changes in
volatility over a fixed horizon.
10.7.1 Issues
Stochastic volatility models might provide interesting explanations of empirical regularities such as the smile effect, as
discussed in Section 10.5.2. However, these models cannot allow for a perfect fit of the smile.
Towards the end of the 1980s and the beginning of the 1990s, a modeling approach emerged to cope
with issues relating to a perfect fit of the yield curve. As reviewed in the next two chapters,
this modeling approach was a response to the need to price interest rate derivatives while
relying on models in which the underlying assets in the banks' books (bonds, say) were priced
without errors. In the early and mid 1990s, methods were developed to deal with equity options
(Derman and Kani, 1994; Dupire, 1994; Derman, 1998), which we succinctly review in this
section.
Why is it important to exactly fit all of the already existing options? Trading deals with
both plain vanilla and less liquid, or exotic, derivatives. Suppose we wish to price exotic
derivatives. We want to make sure the model we use to price the illiquid option predicts that
the plain vanilla option prices are identical to those we are trading. How can we trust a model
that is not even able to pin down all outstanding contracts? A model like this could give rise
to arbitrage opportunities for unscrupulous users. To achieve these tasks, we need to feed the
model not only with information regarding the current price, but also with information linking
to the entire collection of available option prices.
[Plan of this Section]
10.7.2 Implied binomial trees
We begin with the usual binomial tree of Cox, Ross and Rubinstein (1979), then deal with
implied binomial trees. The idea underlying implied binomial trees is to enrich the basic Black &
Scholes model with information regarding the very same asset we are pricing (i.e., the options).
While Black-Scholes and Cox-Ross-Rubinstein feed the evaluation model with information only
relating to the initial stock price, implied binomial trees also use information available through
derivatives data, aiming to model developments in the stock price. The resulting model is one
with time-varying volatility, in which each derivative price used to feed the model is fit without
errors. The model can be used to price exotic derivatives, while being by design able to price existing
derivatives without errors.
10.7.2.1 Binomial trees
The implied binomial tree is the discrete-time version of the local volatility model in continuous time. The discrete-time
approach is due to Derman and Kani (1994) and Rubinstein (1994), and the continuous-time approach to Dupire (1994).
We assume that the short-term rate $r$ is constant, or exogenously given. We shall rely on
Arrow-Debreu securities, i.e., securities that pay off one unit of numeraire in a given state of
the world, and zero otherwise (see Part I of these lectures). Let $\lambda_i(n+1)$ be the price of an
Arrow-Debreu security paying off at node $(i, n+1)$, that is, in state $i$ at time $n+1$:

  time n                                        time n + 1
                                                $0
  (i, n):    e^{-r} (1 - p_i(n))       ↗ ↘
                                                (i, n + 1): $1
  (i-1, n):  e^{-r} p_{i-1}(n)         ↗ ↘
                                                $0

In this tree, $p_i(n)$ is the risk-neutral probability of an upward movement in the stock price at
time $n$ in state $i$. Displayed on the second row of the tree are the values of the Arrow-Debreu
security that pays at $(i, n+1)$, both at $(i, n)$ and $(i-1, n)$. By explanations given in Chapter
2, we have that
\[
\lambda_i(n+1) = e^{-r}\left[\lambda_i(n)\left(1 - p_i(n)\right) + \lambda_{i-1}(n)\, p_{i-1}(n)\right].
\tag{10.65}
\]
Eq. (10.65) is known as the forward equation for the set of all the Arrow-Debreu security
prices. It can be solved recursively once $p_i(n)$ is given, as mentioned. To illustrate,
suppose that risk-neutral probabilities are constant and equal to $p$, such that the solution to Eq.
(10.65) is, simply, the discounted value of the risk-neutral probability that $i$ upward movements
occur over $n + 1$ trials:
\[
\lambda_i(n+1) = e^{-r(n+1)} \binom{n+1}{i}\, p^{\,i}\, (1-p)^{n+1-i}.
\]
This example can be used to price Arrow-Debreu securities in the context of the Cox-Ross-Rubinstein model of the previous section.
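The forward recursion for Arrow-Debreu prices can be sketched in a few lines. The code assumes a recombining tree in which node $(i, n)$ is reached after $i$ upward movements in $n$ periods, a constant one-period discount factor, and a user-supplied function for the risk-neutral probability of an upward movement at each node; these conventions and all names are assumptions made for the illustration. With a constant probability, the recursion should reproduce the discounted binomial probabilities.

```python
import math

def arrow_debreu_forward(p_up, discount, n_steps):
    """Forward recursion for Arrow-Debreu prices on a recombining binomial tree.

    p_up(i, n) -> risk-neutral probability of an upward move at node (i, n);
    discount   -> one-period discount factor;
    returns a list `lam` with lam[n][i] the time-0 price of $1 paid at node (i, n).
    """
    lam = [[1.0]]
    for n in range(n_steps):
        prev = lam[-1]
        nxt = []
        for i in range(n + 2):
            # Node (i, n+1) is reached from (i, n) by a down move, or from (i-1, n) by an up move.
            from_same = prev[i] * (1.0 - p_up(i, n)) if i <= n else 0.0
            from_below = prev[i - 1] * p_up(i - 1, n) if i >= 1 else 0.0
            nxt.append(discount * (from_same + from_below))
        lam.append(nxt)
    return lam

# Constant probability and rate (illustrative values)
p, r, n_steps = 0.55, 0.01, 5
lam = arrow_debreu_forward(lambda i, n: p, math.exp(-r), n_steps)
closed_form = [math.exp(-r * n_steps) * math.comb(n_steps, i) * p**i * (1 - p) ** (n_steps - i)
               for i in range(n_steps + 1)]
print([round(x, 6) for x in lam[n_steps]])
print([round(x, 6) for x in closed_form])
```

Summing the Arrow-Debreu prices over all the nodes of a given date returns the price of a zero-coupon bond maturing at that date, which is a convenient check when the probabilities are node-dependent.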
We now turn to a more general problem. Suppose we are given a set of European option
prices. What are the values of $p_i(n)$ to be assigned in each node of the tree such that these
options are priced without error? More generally, what are the values of $p_i(n)$ and those of
the stock price in each node of the tree such that all the European options are priced without
error?
Our general approach in Chapter 11 is to freeze the risk-neutral probabilities while allowing
the short-term rate to be stochastic. That is, things are coupled in that the need arises to
determine an implied binomial tree for the short-term rate that allows for a perfect fit of the
entire yield curve. In this chapter, we take $r$ as given, as explained, and determine an implied
binomial tree for the risk-neutral probabilities. Suppose that we have solved everything up to
time $n$, for example, $n = 2$ as in the example depicted below.
%
&
%
&
%
&
=0
=1
% 2
&
% 1
&
% 0
&
=2
3
2
1
0
=3
+ 1) =
( )
= 0
( )
=0
( +1
) + (1 )(
) if
( )=
( +1
)
if =
0
if
(10.66)
, and expiring at
(10.67)
The strikes in Eq. (10.66) are chosen such that the options are (roughly) at-the-money one
period before their expiration and, accordingly, the expressions in Eqs. (10.67) for the prices of
+1 .
the call options one period before the expiration follow, because by construction,
While market prices for these options do not necessarily exist, we can interpolate the skew,
and predict the missing points needed to implement Eq. (10.66).
To solve for the $2n + 1$ parameters, we use (i) the pricing equations in (10.66), (ii) the
martingale conditions satisfied by the stock price at time $n$,
=
+1 + (1 )
= 0
(10.68)
and, finally, (iii) a $(2n + 1)$-th condition, a renormalization condition, which we shall discuss
below.
Replacing Eq. (10.68) into the first of Eqs. (10.67) leaves,
( )=
if
(10.69)
+ 1) =
( )
( )+
( )
( )=
( ) ( +1
= +1
)+
( )(
= +1
(10.70)
+1
(10.71)
Replacing Eq. (10.71) into Eq. (10.70) and rearranging terms leaves a recursion for the stock
price over the nodes at time $n + 1$,
(
+
1)
(
)
(
)
( ) (
)
$
= +1
+1 =
(10.72)
P
(
+
1)
(
)
(
)
(
)
(
)
$
= +1
Eq. (10.72) needs a re-normalization, as mentioned. [In progress] Once we solve for the stock prices at time $n + 1$, we
can solve for the risk-neutral probabilities, using Eq. (10.71), then update the price of the Arrow-Debreu securities in Eq.
(10.65), and solve everything recursively.
This algorithm might lead to negative risk-neutral probabilities, which implies arbitrage, in
which case we change the options to use. [In progress]
We now turn to the continuous time approach.
10.7.3 The perfect fit, in continuous time
We know that the only input to the option pricing problem is the instantaneous volatility of
the underlying asset price. Which volatility should we use? At least, we know that option prices
are a function of this volatility. The idea is to find a volatility function such that this very same
volatility delivers back the prices of all the already traded options. It's an inverse problem.
Let us outline the steps we need to price new derivatives, while avoiding any pricing errors for
the existing ones:
(i) We take as given the prices of a set of actively traded European options. Let $K$ and $T$ be
strikes and time-to-maturity of these liquid options. We aim to match the model's predictions
to data:
\[
C_\$(K, T) = C(K, T), \qquad K, T \text{ varying},
\tag{10.73}
\]
where $C_\$(K, T)$ denote market option prices, and $C(K, T)$ are the corresponding model's
predictions. Note, we are assuming a continuum of options, and that their prices are
differentiable as much as we need for the solution to our problem to be well-defined:
technically, we need Eq. (10.75) below to be well-defined. The question we now ask is
whether it is mathematically possible to consider a diffusive model for the stock price,
such that the initial collection of European option prices, $C_\$(K, T)$, is predicted without
errors by the resulting model, as in Eq. (10.73).
(ii) The answer is in the affirmative. Consider a diffusion process for the stock price:
\[
\frac{dS_t}{S_t} = r\, dt + \sigma(S_t, t)\, dW_t,
\tag{10.74}
\]
where $W$ is a Brownian motion under the risk-neutral probability. The only function to
calibrate to make Eq. (10.73) hold is the volatility function, $\sigma(S_t, t)$.
(iii) The Appendix shows that Eq. (10.73) holds if and only if $\sigma(K, T) = \sigma_{loc}(K, T)$, where:
\[
\sigma_{loc}(K, T) = \sqrt{2\,\frac{\dfrac{\partial C_\$(K, T)}{\partial T} + rK\,\dfrac{\partial C_\$(K, T)}{\partial K}}{K^2\,\dfrac{\partial^2 C_\$(K, T)}{\partial K^2}}}.
\tag{10.75}
\]
The function $\sigma_{loc}(K, T)$ is referred to as local volatility. Its square is the local variance,
defined as the conditional expectation under $Q$ of the instantaneous variance given the
market level at some future date $T$,
\[
\sigma^2_{loc}(K, T) = E\left(\sigma_T^2 \mid S_T = K\right),
\tag{10.76}
\]
where $E[\,\cdot \mid \cdot\,]$ is the conditional expectation taken under the risk-neutral probability. All in
all, local volatility is the function $\sigma_{loc}(K, T)$ in Eq. (10.75) such that the theoretical price
generated by the model in Eq. (10.74) equals the market price of all available options.
(iv) Finally, we can price illiquid options through numerical methods, for example through
simulations. In the simulations, we use
\[
\frac{dS_t}{S_t} = r\, dt + \sigma_{loc}(S_t, t)\, dW_t.
\]
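The inverse problem in steps (i)-(iii) can be illustrated with a small consistency check. The sketch assumes the local volatility formula takes the form $\sigma_{loc}^2 = 2\left(\partial_T C + rK\,\partial_K C\right) / \left(K^2\,\partial_{KK} C\right)$ for a constant short-term rate and no dividends, replaces the partial derivatives with finite differences, and feeds the formula with "market" prices generated by Black-Scholes with a constant volatility, in which case the formula should return that constant. The price surface, the step sizes and the function names are illustrative assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def local_vol(call_price, K, T, r, dK=0.5, dT=1e-4):
    """Local volatility from a call price surface (K, T) -> price, via finite differences."""
    dC_dT = (call_price(K, T + dT) - call_price(K, T - dT)) / (2.0 * dT)
    dC_dK = (call_price(K + dK, T) - call_price(K - dK, T)) / (2.0 * dK)
    d2C_dK2 = (call_price(K + dK, T) - 2.0 * call_price(K, T) + call_price(K - dK, T)) / dK**2
    return math.sqrt(2.0 * (dC_dT + r * K * dC_dK) / (K**2 * d2C_dK2))

# Hypothetical "market" surface: Black-Scholes prices with a flat 20% volatility
S0, r, true_sigma = 100.0, 0.01, 0.20
market = lambda K, T: bs_call(S0, K, r, true_sigma, T)

for K, T in ((90.0, 0.5), (100.0, 0.5), (110.0, 1.0)):
    print(K, T, round(local_vol(market, K, T, r), 4))
```

With actual market data, the surface is only observed on a grid of strikes and maturities, so the derivatives would have to be taken on an interpolated and smoothed surface; the quality of this interpolation drives the stability of the resulting local volatility.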
Empirically, the local volatility surface, $\sigma_{loc}(x, T)$, is typically decreasing in $x$ for fixed $T$, a
phenomenon known as the Black-Christie-Nelson leverage effect discussed in Chapter 8 and
Section 10.5.2. This fact might lead one to assume from the outset that $\sigma(x, t) = x^{\alpha} f(t)$, for some
function $f$ and some constant $\alpha < 0$, a simplification leading to the so-called CEV (Constant
Elasticity of Variance) model. A convenient model is one that combines local volatility with stochastic
volatility, as follows:
=
+ (
)
(10.77)
= ( ) + ( )
where is another Brownian motion, and , are some functions. The appendix shows that
in this specic case, the initial set of all European options prices is pinned down by:
loc (
)= p
E( 2|
loc (
where
loc
)
=
(10.78)
+ loc (
= ( )
+ ( )
Practitioners also rely heavily on the so-called SABR model, which is parametric in
nature. [Provide references, and explain why SABR is important, compared to local-vol.] A
note on recalibration. Local volatility surfaces are functions of the initial state where
the calibration starts off. The calibration has to be re-performed all the time to reflect new
information.
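Section 10.5.5 provides the expression of the P&L relating to a long position in a call option,
delta-hedged with Black and Scholes using an implied volatility fixed at an initial level $IV_0$,
\[
P\&L_T = \frac{1}{2}\int_0^T e^{r(T-t)} \left(\sigma_t^2 - IV_0^2\right) S_t^2\, \frac{\partial^2 C_{BS}(S_t, t; IV_0)}{\partial S^2}\, dt.
\tag{10.79}
\]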
E 0
2
hR
i
IV20 =
2 2 BS
E 0
2
=
for some deterministic
20 = IV20
. In this case, the P&L would be similar to that in Eq. (10.79), with
1
P&L =
2
2
2
E 2 BS
2
=E
=
2
E 2 BS
2
where E
=
E
BS
2
2
BS
2
2
2
BS
2
We term
(10.80)
(10.81)
, dened as,
(10.82)
So implied vols are expectations of future realized vols, but only under the Dollar Gamma
probability. Clearly, then, they cannot be used as the fair value of a variance swap, unless we
tilt the variance contract by a random multiplier coinciding with the Dollar Gamma.
2
2
E
=E E
=E
2
loc
=E
=E
Z
=
2
loc
2
loc
)
2
loc (
)
2
E
)
BS
2
( |
BS
2
0)
(10.83)
where the first equality follows by the law of iterated expectations, $q(S_t \mid S_0)$ denotes the conditional density of $S_t$ given $S_0$, and $\sigma^2_{loc}(S_t, t)$ is the local variance, as defined in Section 10.6.2.
Finally, $\tilde S_t$ is a deterministic, "most likely" path of $S_t$, after Gatheral (2006, Chapter 6), a
sort of certainty equivalent for the local variance, for a fixed $t$. We also know that at $t = T$,
2
2
is Diracs delta centered at , such that we may safely condition = and, then,
BS
.
view as a bridge starting from 0 and ending at . As a simple example,
0(
0)
As a second example, E ( |
= ), which we may approximate assuming
is a Geometric
1 2
2
. Gatheral
Brownian motion with parameters and , in which case
0
0
argues, with a numerical example, that these approximations are quite reasonable, at least for
options with time to maturity less than a year.
Using the approximation in Eq. (10.83) delivers:
Z
1
2
2
)
IV0 =
(10.84)
loc (
0
Surfaces depend on the initial state, as mentioned in Section 10.6.2. Sticky smiles can
roughly be dened as those where the skew does not depend on the initial state.
Suppose a
[In progress]
max (ln
[
min (ln
[
[Figure: path of the stock price over one month, with two points marked A and B; horizontal axis: days.]
While this contract is clearly tracking volatility, it does so imperfectly. The stock price has not
been volatile except for a few isolated days. We would like to make reference to contracts that
pay off when volatility has been sustained over the whole month. The next section describes
such contracts.
10.8.2 Fear gauge contracts
10.8.2.1 Variance swaps and the VIX index
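\[
\frac{dS_t}{S_t} = r\, dt + \sigma_t\, dW_t,
\tag{10.85}
\]
where $\sigma_t$ is $\mathcal{F}_t$-adapted, i.e. $\mathcal{F}_t$ can be larger than that generated by the stock price, $\mathcal{F}^S_t \equiv \sigma(S_\tau : \tau \le t)$.
Next, define the realized integrated variance within the time interval $[T_1, T_2]$, with $T_1 < T_2$:
\[
\mathrm{var}(T_1, T_2) \equiv \int_{T_1}^{T_2} \sigma_\tau^2\, d\tau.
\]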
We define a variance swap as a contract with zero value at the inception date, $t$, and payoff
at maturity $T$ given by:
\[
\pi_{var}(T) = \left(\mathrm{var}(t, T) - P_{var}(t, T)\right) \times N,
\tag{10.86}
\]
where $N$ is the notional value of the contract, and $P_{var}(t, T)$ is the variance swap rate agreed
at $t$, and paid off at time $T$.16 Therefore, this contract is a forward, not a swap really. If $r$ is
deterministic, then the swap rate must satisfy:
\[
P_{var}(t, T) = E_t\left(\mathrm{var}(t, T)\right).
\]
16 Note that this contract relies on some notion of realized variance, as a continuous record of returns is obviously unavailable.
Moreover, it has long been market practice to define the variance notional in such a way that $N = \frac{VN}{2\sqrt{\mathbb{P}_{\mathrm{var}}}}$, where VN is the
vega notional, that is, the notional expressed in volatility percentage points. Suppose, for example, that realized volatility
is 1 vega (i.e., one volatility point) above the square root of the variance swap rate, $\mathrm{var}(t,T) = (\sqrt{\mathbb{P}_{\mathrm{var}}} + 1)^2$, such that
$\pi_{\mathrm{var}} = \left(1 + \frac{1}{2\sqrt{\mathbb{P}_{\mathrm{var}}}}\right) VN \approx VN$. That is, vega notional is approximately the notional for each vega realized volatility that exceeds
the square root of the variance swap rate.
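The arithmetic in this footnote can be checked with a minimal sketch. The strike volatility of 20 points and the vega notional of 100,000 are hypothetical numbers chosen for illustration, and the function names are not from the text; the only inputs taken from the footnote are the convention that the variance notional equals the vega notional divided by twice the square root of the swap rate, and the payoff (realized variance minus swap rate) times notional.

```python
import math

def variance_notional(vega_notional, swap_rate):
    """Variance notional N = VN / (2 * sqrt(P_var)), with P_var in squared vol points."""
    return vega_notional / (2.0 * math.sqrt(swap_rate))

def variance_swap_payoff(realized_var, swap_rate, notional):
    """Payoff (var - P_var) * N to the long side of the variance swap."""
    return (realized_var - swap_rate) * notional

# Hypothetical inputs: strike volatility of 20 points, vega notional of 100,000.
strike_vol = 20.0
swap_rate = strike_vol ** 2          # variance swap rate, in squared vol points
vega_notional = 100_000.0

N = variance_notional(vega_notional, swap_rate)
# Realized volatility ends up one vol point above the strike volatility.
payoff = variance_swap_payoff((strike_vol + 1.0) ** 2, swap_rate, N)
print(N, payoff)
```

With these numbers the payoff is the vega notional scaled by one plus the reciprocal of twice the strike volatility, which is close to the vega notional itself, as the footnote states.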
It remains to determine $E_t(\mathrm{var}(t,T))$; below, we show that this is the same as the value of a portfolio of
out-of-the-money options. First, let us consider a simple case, in which $r = 0$. In this case, we
can solve for $E_t(\mathrm{var}(t,T))$ while relying on the results of the previous section regarding
local volatility. Indeed, note that by Eqs. (10.75) and (10.76), and the connection between
risk-neutral densities and convexity of the option price, Eq. (10.28), we have that
\[
E_t\left(\sigma_T^2\right) = 2\int_0^\infty \frac{\partial C_t(K,T)/\partial T}{K^2}\,dK, \tag{10.87}
\]
where $C_t(K,T)$ is the price as of time $t$ of a call option expiring at $T$ and struck at $K$. We
can compute the risk-neutral expectation of this realized variance, under the assumption that
$r = 0$. By Eq. (10.87),
\[
E_t\left(\mathrm{var}(T_1,T_2)\right) = 2\int_0^\infty \frac{C_t(K,T_2) - C_t(K,T_1)}{K^2}\,dK. \tag{10.88}
\]
More generally, for a constant interest rate $r$,
\[
E_t\left(\mathrm{var}(t,T)\right) = 2e^{r(T-t)}\left[\int_0^{F_t}\frac{P_t(K,T)}{K^2}\,dK + \int_{F_t}^\infty \frac{C_t(K,T)}{K^2}\,dK\right], \tag{10.89}
\]
where $F_t$ is the forward price: $F_t = e^{r(T-t)}S_t$, and $P_t(K,T)$ is the price as of time $t$ of a put
option expiring at $T$ and struck at $K$. A proof of Eq. (10.89) is in the Appendix. It is a weighted
average of out-of-the-money options, and it is a fear gauge as a result: the market assessment
of extreme movements is aptly captured by an average of out-of-the-money options.
The new VIX index maintained by the CBOE is an estimate of the square root of $E_t(\mathrm{var}(t,T))$
in Eq. (10.89), annualized:
\[
\mathrm{VIX}_t \approx \sqrt{\frac{1}{T-t}\,E_t\left(\mathrm{var}(t,T)\right)}, \tag{10.90}
\]
where $T-t$ is expressed as a fraction of a year. The approximation relies on a finite number of
out-of-the-money options.
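A minimal numerical sketch may help fix ideas on how a strip of out-of-the-money options adds up to expected variance. It assumes, purely for illustration, that option prices come from the Black and Scholes model with constant volatility, in which case the risk-neutral expected integrated variance is known to be the squared volatility times the time to maturity. The code integrates out-of-the-money prices weighted by the inverse squared strike, scales by twice the compounding factor, and compares the result with that benchmark. All function names, the grid width and the parameter values are assumptions, not part of the text.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_price(S, K, r, sigma, tau, call=True):
    """Black-Scholes price of a European call or put on a non-dividend stock."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    disc = math.exp(-r * tau)
    if call:
        return S * norm_cdf(d1) - K * disc * norm_cdf(d2)
    return K * disc * norm_cdf(-d2) - S * norm_cdf(-d1)

def model_free_expected_variance(S, r, sigma, tau, n=4000, width=10.0):
    """2 e^{r tau} * integral of OTM option prices / K^2, by the trapezoid rule.

    The integral is computed on a grid of log-moneyness x = ln(K / F),
    using dK / K^2 = dx / K. Puts are used below the forward, calls above.
    """
    F = S * math.exp(r * tau)
    half = width * sigma * math.sqrt(tau)   # truncation, in log-moneyness
    h = 2.0 * half / n
    total = 0.0
    for i in range(n + 1):
        x = -half + i * h
        K = F * math.exp(x)
        otm = bs_price(S, K, r, sigma, tau, call=(K >= F))
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * otm / K
    return 2.0 * math.exp(r * tau) * h * total

# Hypothetical market: 20% volatility, three months to expiry.
S, r, sigma, tau = 100.0, 0.02, 0.20, 0.25
expected_var = model_free_expected_variance(S, r, sigma, tau)
vix_like = 100.0 * math.sqrt(expected_var / tau)   # annualized, in vol points
print(expected_var, sigma ** 2 * tau, vix_like)
```

In this constant volatility setting the index-like number simply returns the input volatility; with a smile, the same integral would weigh the out-of-the-money wings according to market prices.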
Note an interesting point in this section. Up to the previous sections, we were used to thinking
that volatility determines option prices (see, e.g., Section 10.5). We now have a theoretical construct that makes option prices sum up to a sufficient statistic for (market-adjusted, i.e.,
risk-neutral) expected volatility.
The VIX index is typically referred to as a fear gauge. Eqs. (10.89)-(10.90) illustrate that
this definition is quite appropriate. The VIX index depends on the prices of out-of-the-money
options, that is, the options that are exercised in case of tail events. Moreover, tail events are
those arising when worst-case scenarios occur. This connection suggests an interpretation of
VIX behavior in terms of aversion to Knightian uncertainty, as surveyed in Chapter 8, an
issue not explored yet in theoretical research. It is well-known that the VIX index spikes exactly
when equity markets drop. However, Eqs. (10.89)-(10.90) indicate the index potentially reflects
market fears regarding tail events on both sides.
The previous results are extraordinary. We know since Section 10.2 that for any generic risk $X_T$,
its forward price is $E_t(X_T)$, provided interest rates are constant, such that $E_t(X_T) = e^{r(T-t)}X_t$
as soon as $X$ is traded, for otherwise we would need to rely on a model to determine the
expectation $E_t(X_T)$. Note that here, $\mathrm{var}(t,T)$ is not traded, and yet we can still express its
price, $E_t[\mathrm{var}(t,T)]$, without relying on any model. In other words, we can price volatility in a
model-free fashion: the annualized price of a variance swap is simply the square of the VIX
index.
It is useful to review the steps that lead to Eq. (10.89). The starting point is the so-called
log-contract, a concept first introduced by Neuberger (1994). This idealized contract is designed
so as to ensure a payoff equal to $\ln(S_T/F_t)$. Its fair value is obviously $e^{-r(T-t)}E_t[\ln(S_T/F_t)]$, and is negative, as
we shall show in a moment.
The intuitive reason the price of the log-contract is negative is that the payoff $\ln(S_T/F_t)$ is skewed
to the left, due to the concavity of the log function, such that the expected losses on the
downside are larger than the expected gains on the upside. These facts are confirmed by an
application of Itô's lemma to Eq. (10.85), which yields:
\[
E_t\left[\ln\frac{S_T}{F_t}\right] = -\frac{1}{2}\,E_t\left[\mathrm{var}(t,T)\right]. \tag{10.91}
\]
It is remarkable. The price of volatility is the same as the value of going short a log-contract.
Naturally, we are not done, because we still don't know how to price the log-contract in the
first place! The pricing of this, and related contracts, relies on so-called spanning arguments,
as explained for example by Bakshi and Madan (2000) and Carr and Madan (2001). In the
Appendix, it is shown that the payoff of the log-contract can be written as
\[
\ln\frac{S_T}{F_t} = \frac{1}{F_t}\left(S_T - F_t\right) - \int_0^{F_t}\left(K - S_T\right)^+\frac{1}{K^2}\,dK - \int_{F_t}^{\infty}\left(S_T - K\right)^+\frac{1}{K^2}\,dK. \tag{10.92}
\]
The expectation of the R.H.S. of Eq. (10.92) is (minus) half the R.H.S. of Eq. (10.89),
assuming interest rates are constant. Eq. (10.89) then follows by Eq. (10.91).
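The spanning of the log payoff can be verified state by state. The sketch below is an illustration under assumptions: the helper names, the midpoint integration rule and the terminal prices are arbitrary choices. It rebuilds the log of the terminal price over the forward as a forward position, minus a strip of put payoffs below the forward and a strip of call payoffs above it, each weighted by the inverse squared strike.

```python
import math

def integrate(f, a, b, n=20_000):
    """Midpoint rule on [a, b]; returns 0 for an empty interval."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def log_contract_by_spanning(S_T, F):
    """Forward leg minus strips of put and call payoffs weighted by 1/K^2."""
    forward_leg = (S_T - F) / F
    # The put payoff (K - S_T)^+ vanishes for K < S_T, so the integral over
    # (0, F] reduces to [min(S_T, F), F].
    put_leg = integrate(lambda K: max(K - S_T, 0.0) / K ** 2, min(S_T, F), F)
    # The call payoff (S_T - K)^+ vanishes for K > S_T, so the integral over
    # [F, infinity) reduces to [F, max(S_T, F)].
    call_leg = integrate(lambda K: max(S_T - K, 0.0) / K ** 2, F, max(S_T, F))
    return forward_leg - put_leg - call_leg

F = 100.0  # hypothetical forward price
for S_T in (60.0, 100.0, 150.0):
    print(S_T, log_contract_by_spanning(S_T, F), math.log(S_T / F))
```

The two printed columns agree for terminal prices below, at and above the forward, which is the pathwise content of the spanning argument before any expectation is taken.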
10.8.2.3 The market for volatility and further developments
The behavior of the VIX index has been a topic intensively studied in empirical research. [Cite
references.] The following pictures depict the time series behavior of the new VIX index since
its inception as well as dynamics of volume on VIX options.
The trading of derivatives referenced to VIX has increased at a very fast pace. These derivatives aim to replace expensive straddles and make books less messy, with outcomes consistent
with views, as they eliminate price dependency. It is not a mere theoretical curiosity. The following
table depicts transaction data for options on the VIX index as compared to other options cleared
by CBOE. (Note that the notionals relating to VIX options and futures are not the same, being
$1000 for VIX futures and $100 for VIX options, as of August 2011.)
CBOE trading volume (contracts): average per day, August 2011
Total trading volume
12,000,000
2,000,000
1,300,000
582,000
79,000
(i)
(ii)
1
6
1
2
of (i)+(ii)
of (ii)
Section 10.6.3 explains how the skew relates to local volatility, but how is the expected
variance in Eq. (10.89) related to the skew? Demeterfi, Derman, Kamal and Zou (1999) show
that if the implied volatility varies linearly with the strike,
\[
\mathrm{IV}(K) = \mathrm{IV}_{\mathrm{atm}} - b\,\frac{K - F_t}{F_t},
\]
for some constant $b$, then,
\[
\frac{1}{T-t}\,E_t\left[\mathrm{var}(t,T)\right] \approx \mathrm{IV}^2_{\mathrm{atm}}\left(1 + 3\,(T-t)\,b^2\right).
\]
That is, the existence of a skew, $b \neq 0$, increases the value of the fair variance above the
(squared) at-the-money implied volatility.
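As a rough order of magnitude, the following sketch evaluates the skew adjustment, read here as the squared at-the-money implied volatility times one plus three times the time to maturity times the squared skew slope. This reading of the approximation, the function name and the input values are all assumptions made for illustration.

```python
import math

def fair_variance_linear_skew(iv_atm, b, tau):
    """Annualized fair variance under a linear skew: IV_atm^2 * (1 + 3 * tau * b^2)."""
    return iv_atm ** 2 * (1.0 + 3.0 * tau * b ** 2)

# Hypothetical inputs: 20% at-the-money volatility, slope 0.2, three months.
iv_atm, b, tau = 0.20, 0.2, 0.25
fair_vol = math.sqrt(fair_variance_linear_skew(iv_atm, b, tau))
print(fair_vol)
```

With these inputs the fair volatility exceeds the at-the-money implied volatility by a fraction of a volatility point; the correction grows with the squared slope and with maturity.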
Sometimes it is said that variance swaps are profitable to protection sellers, because "the
derivative house has the statistical edge," meaning that the realized variance from $t$ to $T$, say,
is in general lower than the future variance expected under the risk-neutral probability, reflecting
variance risk-premiums, as shown in the following picture.
(10.94)
\[
\pi_1(2) \equiv \mathbb{P}_{\mathrm{var}}(1,2) - \mathrm{var}(1,2). \tag{10.95}
\]
Moreover, the two year variance swap we went long at time zero (component (i) of (10.94)) gives
rise to the following payoff at time 2:
\[
\pi_2(2) \equiv \mathrm{var}(0,2) - \mathbb{P}_{\mathrm{var}}(0,2). \tag{10.96}
\]
Adding Eq. (10.95) and Eq. (10.96), and using the relation $\mathrm{var}(0,2) = \mathrm{var}(0,1) + \mathrm{var}(1,2)$,
leads to:
\[
\pi_1(2) + \pi_2(2) = \mathbb{P}_{\mathrm{var}}(1,2) - \mathbb{P}_{\mathrm{var}}(0,2) + \mathrm{var}(0,1).
\]
we shorted at time zero (component (ii)
(Pvar (0 1)
var (0 1))
(10.97)
Investing (1) for a further year at the safe interest rate delivers
the total prots at time 2 are:
tot
(2) + (1)
= Pvar (1 2)
(1)
Pvar (0 2) + Pvar (0 1)
(10.98)
E (var (
(var (
) + var (
) + Pvar (
)
)
Pvar (
Pvar (
))
))
(10.99)
where E denotes the risk-neutral expectation conditional upon the information available at
time .
Marking to market suggests an alternative way to implement the forward volatility trading
of the previous section. Suppose, then, again, to have the view that markets for volatility will
be such that (10.93) holds true at time 1, and, accordingly, consider the strategy in (10.94). If
(10.93) holds true at time 1, we may close the position (i) in (10.94) at time 1. By Eq. (10.99),
the market value of the two year variance swap we went long at time 0 is,
(1)
(var (0 1) + Pvar (1 2)
Pvar (0 2))
(10.100)
At time 1, we obtain (1)+ (1), which we can invest at the safe interest rate for one additional
period, delivering the prot tot in Eq. (10.98), for time 2.
10.8.5 Stochastic interest rates
When interest rates are stochastic, but still independent of volatility, the expressions given for
the contract and indexes do not hold anymore, and there are a number of qualifications, which
we make in Remark A.1 of Appendix 5. Moreover, the forward volatility trading strategy in
(10.94) should be modified. For example, we might use the following strategy, where $P(t,T)$ denotes
the time-$t$ price of a zero coupon bond expiring at $T$:
(i) long a two year variance swap, struck at $\mathbb{P}_{\mathrm{var}}(0,2)$, with notional one
(ii) short a one year variance swap, struck at $\mathbb{P}_{\mathrm{var}}(0,1)$, with notional $P(0,2)/P(0,1)$
If, come time 1, Eq. (10.93) holds true, we may liquidate (i), thereby accessing the payoff relating
to (ii), for a total payoff equal to:
\[
\begin{aligned}
&\left(\mathrm{var}(0,1) + \mathbb{P}_{\mathrm{var}}(1,2) - \mathbb{P}_{\mathrm{var}}(0,2)\right)P(1,2) + \left(\mathbb{P}_{\mathrm{var}}(0,1) - \mathrm{var}(0,1)\right)\frac{P(0,2)}{P(0,1)} \\
&\quad = \left(\mathbb{P}_{\mathrm{var}}(1,2) - \mathbb{P}_{\mathrm{var}}(0,2) + \mathbb{P}_{\mathrm{var}}(0,1)\right)P(1,2) + \left(\mathrm{var}(0,1) - \mathbb{P}_{\mathrm{var}}(0,1)\right)\left(P(1,2) - \frac{P(0,2)}{P(0,1)}\right),
\end{aligned}
\]
where $P(\cdot,\cdot)$ are zero coupon bond prices, the first term on the left hand side arises by the liquidation of (i) and by Eq. (10.100),
and the second term on the left hand side arises by (ii). By Eq. (10.93), the first term on the
right hand side is positive. If the short-term interest rate was deterministic, $P(1,2) = \frac{P(0,2)}{P(0,1)}$,
and the second term on the right hand side would be zero. When interest rates are stochastic,
the second term can take on any sign although then, its absolute value should be quite low,
compared to the first term on the right hand side.
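The decomposition of the total payoff into a "view" term and an interest rate term is pure algebra, and can be checked numerically. In the sketch below, the function names, the variance levels and the bond prices are hypothetical; the code only assumes that liquidating the long two year swap at time 1 is worth the one year bond price times realized variance plus the new swap rate minus the old one, and that the short one year swap carries a notional equal to the ratio of the two initial bond prices.

```python
def total_payoff(var01, p01, p02, p12, B01, B02, B12):
    """Time-1 payoff: liquidation of the long 2y swap plus the short 1y swap."""
    liquidation = (var01 + p12 - p02) * B12        # mark-to-market of (i)
    short_leg = (p01 - var01) * B02 / B01          # payoff of (ii), notional B02/B01
    return liquidation + short_leg

def decomposition(var01, p01, p02, p12, B01, B02, B12):
    """Split the same payoff into a view term and an interest rate term."""
    view_term = (p12 - p02 + p01) * B12
    rate_term = (var01 - p01) * (B12 - B02 / B01)
    return view_term, rate_term

# Hypothetical realized variance, swap rates and zero coupon bond prices.
args = dict(var01=0.05, p01=0.04, p02=0.09, p12=0.06, B01=0.97, B02=0.93, B12=0.955)
view_term, rate_term = decomposition(**args)
print(total_payoff(**args), view_term + rate_term)

# With deterministic rates, the time-1 bond price equals B02 / B01,
# and the interest rate term vanishes.
det_args = dict(args, B12=args["B02"] / args["B01"])
print(decomposition(**det_args)[1])
```

The interest rate term is proportional to the gap between the realized one year bond price and its time-zero forward value, which is why it is expected to be small relative to the view term.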
10.8.6 Hedging
A financial institution might be merely interested in intermediating the contract, which then
needs to be hedged against. Suppose, for example, that the financial institution sells protection
at time $t$, thereby promising to pay the realized integrated variance $\mathrm{var}(t,T)$ at time $T$. We
want to replicate this integrated variance. By Itô's lemma:
\[
\mathrm{var}(t,T) = 2\int_t^T \frac{1}{S_\tau}\,dS_\tau - 2\ln\frac{S_T}{S_t}
= 2\left[\int_t^T \frac{1}{S_\tau}\,dS_\tau - r\,(T-t)\right] - 2\ln\frac{S_T}{F_t}. \tag{10.101}
\]
The rst term can be replicated by continuously rebalancing a stock position, which is always
long
= 2 shares of the stock, adjusted for the time value of money. Precisely, consider a
self-nanced portfolio (
), such that its value satises:
=
where
1
)
(10.102)
(10.103)
R 1
(
). In Appendix 5, we show that ( )
such that: (i) = 0, and (ii) =
is self-nanced. The bottom line is that we can hedge the rst term in Eq. (10.101) through a
self-nanced portfolio that costs nothing at time . This portfolio is simply (2 2 ).
To replicate the second term in Eq. (10.101), the payoff of the so-called log-contract, note
that we simply have to make reference to twice Eq. (10.92). Therefore, the log-contract can be
replicated by shorting $2/F_t$ units of forwards, which are of course costless at time $t$, and going
long a continuum of out-of-the-money options with weights $2/K^2$, which cost
\[
2\int_0^{F_t}\frac{P_t(K,T)}{K^2}\,dK + 2\int_{F_t}^{\infty}\frac{C_t(K,T)}{K^2}\,dK = e^{-r(T-t)}\,E_t\left[\mathrm{var}(t,T)\right],
\]
where the equality follows by Eq. (10.89). We borrow $e^{-r(T-t)}E_t[\mathrm{var}(t,T)]$ to purchase these
options, and once this is done, we are guaranteed $\mathrm{var}(t,T)$ is replicated at time $T$, as we now
have replicated both the first term and the second term in Eq. (10.101). Finally, come time $T$,
we pay back the loan, worth $E_t[\mathrm{var}(t,T)]$, and receive a payoff equal to $\mathrm{var}(t,T) - E_t[\mathrm{var}(t,T)]$,
due to the sale of insurance. Since $\mathrm{var}(t,T)$ is replicated, no additional funds are needed at
time $T$.
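The identity behind this hedge, namely that integrated variance equals twice the cumulated stock returns minus twice the log of the gross return, can be illustrated on a simulated path. The sketch below is an assumption-laden illustration: it uses a constant volatility, an Euler discretization with a fixed seed, and hypothetical parameter values, so the match with the true integrated variance is only approximate.

```python
import math
import random

def replicate_variance_on_path(sigma=0.2, r=0.01, T=1.0, n=100_000, seed=0):
    """Discretized check of var = 2 * sum(dS / S) - 2 * ln(S_T / S_0)."""
    rng = random.Random(seed)
    dt = T / n
    S0 = 100.0
    S = S0
    stock_term = 0.0                      # running sum of dS / S
    for _ in range(n):
        dS = S * (r * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        stock_term += dS / S              # gain on holding 1/S shares over dt
        S += dS
    replicated = 2.0 * stock_term - 2.0 * math.log(S / S0)
    return replicated, sigma ** 2 * T     # replicated vs. true integrated variance

replicated, true_var = replicate_variance_on_path()
print(replicated, true_var)
```

The small gap between the two numbers reflects sampling at discrete dates, which is also why actual contracts rely on some notion of realized variance rather than on a continuous record.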
(10.104)
=
, the forward rate, and the remaining usual notation. Eq. (10.104) shows
where
that any Markov payo can be spanned through a set of European options. For example, if
( ) = ln , we can price a log-contract, which leads to the new VIX index, as explained in the
previous section. We are interested in skewness
contracts.
Z
(
)
E [ v ( )] =
) Put (
)
+
) Call (
)
v (
v (
0
2
00
)
) = 2 1 ln
v (
v (
Which volatility contract does the payo v ( ) relate to? Its a contract relating to the second
moment of the cumulative return ln , rather than the realized volatility of the previous section,
R 2
dened as the sum of the instantaneous return variances,
. Precisely, note that by
Itos lemma,
" Z
2 #
Z
E [ v ( )] = E 2
(10.105)
1 ln
ln
+
This volatility contract is a bit
2unusual at thetime of
2writing, as the standard notion of variance
R
we typically price is
, rather than ln
.
The current literature and practice on skewness contracts have a similar cumulative return
flavor. Consider the following payoff, introduced by Bakshi, Kapadia and Madan (2003),
3
)
ln
E ln
sk (
sk ( ) = 0
This payoff refers to the third moment of the cumulative return over a certain investment
horizon. Instead, the notion of a realized skewness would rely on the third moments of the
instantaneous returns, averaged over the given investment horizon. Pricing results relating to
realized skewness are not available at the time of writing. Let us keep on relying on this payoff,
and consider the definition of skewness, adjusted for risk,
E[
Skew
E[
where,
vv
)]
)] 2
sk
vv
E ln
ln
vv
( )=0
and adjustment for risk relates to the fact that the expectations in the denition of Skew are
taken under the risk-neutral probability. Note that vv ( ) is the de-meaned version of v ( ),
and its expectation can be easily found as,
E[
where
log
E[
) = ln
log
vv
)] = E [
)]
E[
log
)]2
)] =
1
2
Put (
1
2
Call (
Likewise,
E[
sk
)] =
sk
sk
( )
00
sk
)=
3
2
( ) Put (
ln
E[
log
)
(
)]
sk
( ) Call (
+
)
1 2 2
+
+
2
1 2 2
+
+
2
+ 12 2 2
=
=
+
1 2 2
2
1
The last equality, and the boundary condition, lead to the Black-Scholes partial differential equation
(10.14).
E(
where
1
2
. We have
)+ = E (
Iexe )
E (Iexe )
E (Iexe )
E
Iexe
=
=
where the probability
=
,
=
(
E (Iexe )
is dened as:
Note that
(Iexe )
=
a
) = ( 2 ) and
1
2
2 1
1
2
) = ( 1 ), where
=
ln
qR
1
2
R
2
ln
ln
=
=
(
Z
)
2
1
(
2
)
)
=(
Then, we use the law of iterated expectations and elaborate on Eq. (10.49), and arrive to Eq. (10.51)
as follows:
2
(
)
2
=
E (
)+
h
(
)
2
(
)+
=E E
[ ]
2
)
= E [ BS (
;
]
) 2 ]
= E [ BS (
;
Z
q
) ( 2 )
=
;
BS (
q
)]
E [ BS (
;
(10A.1)
where ( 2 ) denotes the density of conditional upon the current level of the variance, 2 .
The third and fourth equalities follow by the assumption
and
are uncorrelated, such that
2
) = ( 2)
(
for otherwise the current level of the index, $S_t$, would help predict the realized variance. In other words, Eq. (10A.1)
reveals that the price of an option in this market with stochastic volatility is a Black & Scholes price, weighted
with the probability density of the realized variance.
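A two-scenario sketch illustrates this mixing property. It assumes, hypothetically, that the average variance over the life of the option can take only two values with equal probability; the option price is then the probability-weighted average of the two Black & Scholes prices. The scenario values and function names are assumptions made for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, tau, variance):
    """Black-Scholes call, given the average variance over the option's life."""
    vol = math.sqrt(variance)
    d1 = (math.log(S / K) + (r + 0.5 * variance) * tau) / (vol * math.sqrt(tau))
    d2 = d1 - vol * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

# Hypothetical distribution of the average variance: (probability, variance).
scenarios = [(0.5, 0.02), (0.5, 0.06)]
S, K, r, tau = 100.0, 100.0, 0.0, 1.0

mixing_price = sum(p * bs_call(S, K, r, tau, v) for p, v in scenarios)
mean_variance = sum(p * v for p, v in scenarios)
naive_price = bs_call(S, K, r, tau, mean_variance)
print(mixing_price, naive_price)
```

For this at-the-money option, the mixing price lies below the Black & Scholes price evaluated at the mean variance, because the Black & Scholes function is concave in the variance around the money.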
10.13.2 Extensions
Romano and Touzi (1997) extend the Hull & White equation to the case where asset returns and
volatility are correlated. Consider the following model:
p
2
=
+
+ 1
2
( )
+ ( )
and
is as in Eq. (10.53) of the main text. We have, using the Law of Iterated Expectations,
2
(
=E
h
E
=E[
)+
BS (
)+
q
where is as in Eq. (10.53) of the main text. The third equality follows because by assumption, both
the variance and correlation processes are independent of { } [ ] , such that conditionally upon
the variance and the correlation paths, 2 [ ] and ( ) [ ] , ln
is normally distributed
under the risk-neutral probability,
ln
= (
1
2
with
E
h
h
ln
ln
= (
i Z
=
]
]
)
2
p
1
1
(
2
)
)
=(
The fourth line follows by the same arguments leading to Eq. (10A.1).
=E[
BS (
)] =
BS (
; IV (
))
(10A.2)
ln
q
Let
( ) = E ( ) the expected average volatility, and consider a Taylors second order expansion
( ),
of the Black & Scholes function about
E[
BS (
BS (
)]
1
( )) +
2
BS (
;
2
and,
BS (
BS (
; IV (
))
;
( )) +
BS (
()
()
(IV (
q
(
( ))
)
BS /
1
( )+
2
; )
BS (
;
2
BS
2,
()
q
(
(10A.3)
)
)/
;
;
BS (
BS (
( 12
3(
2(
+ 12
2
))
2(
BS (
Replacing these expressions into Eq. (10A.3) yields the approximation in Eq. (10.37) of the main text.
(10A.4)
where
is some F -adapted process. For example,
(
) , all , where is solution to the
second of Eqs. (10.77). Next, we assume that we are given a continuum of option prices $ (
)
along the two dimensions of strikes
and time-to-maturities . We want to match the prediction of
the model with the market prices,
$(
)=
)+
E(
(10A.5)
where
is solution to Eq. (10A.4).
Let us expand the right-hand side of Eq. (10A.5) with respect to time-to-maturity, for xed
1
+
2 2
(
+I
(
) = I
+
)
2
where
)+
E(
E(
where
E
=
=
)+ +
ZZ
Z
Z
( )
(
Z
( )E
such that,
(
)
2
|
(
)
{z
joint density of (
2
|
=
)
}
( )E
)+
E(
(
1
) + E (
2
E (I
)+ +
E(
)+ +
1
) +
2
E (I
)=
$(
)+
(10A.6)
,
)
E(
)+
(10A.7)
,
(
)=
E (I
$(
)
2
( )
(10A.8)
where the second relation is simply the famous relation in Eq. (10.28) of the main text. Replacing
Eqs. (10A.5), (10A.7) and (10A.8) into Eq. (10A.6) leaves,
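\[
\frac{\partial \$(K,T)}{\partial T} + rK\,\frac{\partial \$(K,T)}{\partial K} = \frac{1}{2}\,K^2\,\frac{\partial^2 \$(K,T)}{\partial K^2}\,E\left(\sigma_T^2 \mid S_T = K\right). \tag{10A.9}
\]
This is,
\[
E\left(\sigma_T^2 \mid S_T = K\right) = 2\,\frac{\dfrac{\partial \$(K,T)}{\partial T} + rK\,\dfrac{\partial \$(K,T)}{\partial K}}{K^2\,\dfrac{\partial^2 \$(K,T)}{\partial K^2}} \equiv \sigma^2_{\mathrm{loc}}(K,T). \tag{10A.10}
\]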
That is, if Eq. (10A.5) holds true, volatility must be restricted to satisfy Eq. (10A.10). As an
example, let
(
) , where is solution to the second of Eqs. (10.77). Then,
2
)=E 2
=
loc (
= E 2(
) 2
=
)E 2
=
= 2(
)E 2
=
2loc (
0=
(
1
2
2(
)=(
[0
$(
$(
0) = (
$(
1
2
2 2 (
loc
$(
)
2
$(
)
2
), such that, by
where $S_0$ denotes the initial price. The previous partial differential equation is known as Dupire's
equation.
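Dupire's relation can be sanity-checked on a surface where the answer is known. The sketch below assumes the standard form of the relation for a non-dividend paying asset with constant interest rate, i.e. local variance equals twice the sum of the maturity derivative and the rate times strike times the strike derivative, divided by the squared strike times the second strike derivative. It feeds in Black & Scholes call prices with constant volatility and recovers that volatility through finite differences. Step sizes, names and parameter values are assumptions.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    """Black-Scholes call price, used here as a synthetic price surface."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def dupire_local_variance(price, K, T, r, dK=0.5, dT=1e-3):
    """Local variance from call prices via central finite differences."""
    c_T = (price(K, T + dT) - price(K, T - dT)) / (2.0 * dT)
    c_K = (price(K + dK, T) - price(K - dK, T)) / (2.0 * dK)
    c_KK = (price(K + dK, T) - 2.0 * price(K, T) + price(K - dK, T)) / dK ** 2
    return 2.0 * (c_T + r * K * c_K) / (K ** 2 * c_KK)

# Hypothetical flat surface: 20% volatility, 3% interest rate.
S0, r, sigma = 100.0, 0.03, 0.20
surface = lambda K, T: bs_call(S0, K, r, sigma, T)
local_var = dupire_local_variance(surface, K=110.0, T=0.5, r=r)
print(local_var, sigma ** 2)
```

On market data, the same finite differences would be applied to an interpolated surface of quoted prices, and the result would generally vary with strike and maturity.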
=2
=2
+
2
0
(
( )
(
( )
)
2
where the second line follows by Eq. (10A.10), and the third line follows by Eq. (10.28).
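Proof of Eq. (10.89). By a Taylor expansion with remainder, we have that for any function $f$
smooth enough,
\[
f(x) = f(x_0) + f'(x_0)\,(x - x_0) + \int_{x_0}^{x} (x - u)\,f''(u)\,du. \tag{10A.11}
\]
Applying this formula to $f = \ln$, with $x = F_T$ and $x_0 = F_t$, leaves:
\[
\begin{aligned}
\ln F_T &= \ln F_t + \frac{1}{F_t}\left(F_T - F_t\right) - \int_{F_t}^{F_T}\left(F_T - K\right)\frac{1}{K^2}\,dK \\
&= \ln F_t + \frac{1}{F_t}\left(F_T - F_t\right) - \int_0^{F_t}\left(K - F_T\right)^+\frac{1}{K^2}\,dK - \int_{F_t}^{\infty}\left(F_T - K\right)^+\frac{1}{K^2}\,dK \\
&= \ln F_t + \frac{1}{F_t}\left(S_T - F_t\right) - \int_0^{F_t}\left(K - S_T\right)^+\frac{1}{K^2}\,dK - \int_{F_t}^{\infty}\left(S_T - K\right)^+\frac{1}{K^2}\,dK,
\end{aligned} \tag{10A.12}
\]
where the second equality follows because
$\int_{F_t}^{F_T}(F_T - K)\frac{1}{K^2}\,dK = \int_0^{F_t}(K - F_T)^+\frac{1}{K^2}\,dK + \int_{F_t}^{\infty}(F_T - K)^+\frac{1}{K^2}\,dK$, and
the third equality follows because the forward price at $T$ satisfies $F_T = S_T$. Hence, by $E_t(S_T) = F_t$,
\[
E_t\left[\ln\frac{F_T}{F_t}\right] = -e^{r(T-t)}\left[\int_0^{F_t}\frac{P_t(K,T)}{K^2}\,dK + \int_{F_t}^{\infty}\frac{C_t(K,T)}{K^2}\,dK\right]. \tag{10A.13}
\]
On the other hand, by Itô's lemma,
\[
E_t\left[\int_t^T \sigma_\tau^2\,d\tau\right] = -2\,E_t\left[\ln\frac{F_T}{F_t}\right]. \tag{10A.14}
\]
Replacing Eq. (10A.14) into Eq. (10A.13) yields Eq. (10.89).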
Remark A1. The previous proof results hold when the short-term rate is constant. The case of
stochastic interest rates can actually be dealt with, although with some tools, which will be introduced
more systematically in Chapter 12 (Section 12.2). We anticipate how these tools work in the present
appendix, as they allow us to solve for the fair price of variance contracts even when interest rates are
stochastic. Note that if interest rates are stochastic, Eq. (10A.13) generalizes to:
Z
(
)
(
)
= E
+
ln
+
E
2
2
0
(10A.15)
E
ln
= (
)E
ln
(
)
)E
ln
(10A.16)
where E
denotes the expectation taken under a new probability, known as the forward probability.
Naturally, the rst term on the right side of Eq. (10A.15) is zero, as a forward has no value at inception.
But then, this zero value condition implies that:
!
=E
=E
That is, the forward price is a martingale under the forward probability. Therefore, Eq. (10A.14) is
replaced with,
Z
2
ln
(10A.17)
= 2E
E
now denotes the instantaneous volatility of the forward price. By combining Eqs. (10A.15),
where
(10A.16) and (10A.17), we get,
Z
Z
Z
Z
(
)
(
)
2
2
2
=E
=
+
E
2
2
(
) 0
That is, the fair price of a variance contract for a swap of forward volatility can be expressed in a
model-free format. Note that it is the price of a variance contract that we can express in a model-free
fashion, not the (undiscounted) expected realized variance. Indeed, the payoff of a variance contract
for forward realized variance is:
Z
2
Pvar (
)=
1
(
Pvar (
=E
)
Z
=
+
=
+
(10A.18)
. With (
=
=
=
), we have that:
(10A.19)
where we have used the portfolio weights in Eq. (10.102) and the expression for the portfolio value
in Eq. (10.103). Eq. (10A.19) is the same as Eq. (10A.18), once we use the portfolio weight in Eq.
(10.102). Therefore, ( ) is self-nanced.
)=
=
( )+
( )+
( )(
( )(
)+
)+
Z
Z
(
00
)
( )(
00
()
+
00
( )(
)
where
= (
, the forward rate. Multiplying both sides of this equation by
taking expectations, yields Eq. (10.104) in the main text.
)+
(
),
and
References
Bakshi, G. and D. Madan (2000): Spanning and Derivative Security Evaluation. Journal of
Financial Economics 55, 205-238.
Bakshi, G., N. Kapadia, D. Madan (2003): Stock Return Characteristics, Skew Laws, and
Differential Pricing of Individual Equity Options. Review of Financial Studies 16, 101-143.
Ball, C.A. and A. Roma (1994): Stochastic Volatility Option Pricing. Journal of Financial
and Quantitative Analysis 29, 589-607.
Bergman, Y. Z., B. D. Grundy, and Z. Wiener (1996): General Properties of Option Prices.
Journal of Finance 51, 1573-1610.
Black, F. (1976a): The Pricing of Commodity Contracts. Journal of Financial Economics
3, 167-179.
Black, F. (1976b): Studies of Stock Price Volatility Changes. Proceedings of the 1976 Meeting
of the American Statistical Association, 177-81.
Black, F. and M. Scholes (1973): The Pricing of Options and Corporate Liabilities. Journal
of Political Economy 81, 637-659.
Bollerslev, T. (1986): Generalized Autoregressive Conditional Heteroskedasticity. Journal of
Econometrics 31, 307-327.
Bollerslev, T., Engle, R. and D. Nelson (1994): ARCH Models. In: McFadden, D. and R.
Engle (Editors): Handbook of Econometrics (Volume 4), 2959-3038. Amsterdam: North-Holland.
Britten-Jones, M. and A. Neuberger (2000): Option Prices, Implied Price Processes and
Stochastic Volatility. Journal of Finance 55, 839-866.
Carr, P. and D. Madan (2001): Optimal Positioning in Derivative Securities. Quantitative
Finance 1, 19-37.
Christie, A.A. (1982): The Stochastic Behavior of Common Stock Variances: Value, Leverage,
and Interest Rate Effects. Journal of Financial Economics 10, 407-432.
Clark, P.K. (1973): A Subordinated Stochastic Process Model with Fixed Variance for Speculative Prices. Econometrica 41, 135-156.
Corradi, V. (2000): Reconsidering the Continuous Time Limit of the GARCH(1,1) Process.
Journal of Econometrics 96, 145-153.
Cox, J.C., S.A. Ross and M. Rubinstein (1979): Option Pricing: A Simplified Approach.
Journal of Financial Economics 7, 229-263.
Cox, J.C., J.E. Ingersoll and S.A. Ross (1985): A Theory of the Term Structure of Interest
Rates. Econometrica 53, 385-407.
Demeterfi, K., E. Derman, M. Kamal and J. Zou (1999): More Than You Ever Wanted To
Know About Volatility Swaps. Goldman Sachs Quantitative Strategies Research Notes.
Derman, E. (1998): Stochastic Implied Trees: Arbitrage Pricing with Stochastic Term and
Strike Structure of Volatility. International Journal of Theoretical and Applied Finance
1, 61-110.
Derman, E. and J. Kani (1994): Riding on a Smile. Risk 7, 32-39.
Duffie, D. and C-f. Huang (1985): Implementing Arrow-Debreu Equilibria by Continuous
Trading of Few Long-Lived Securities. Econometrica 53, 1337-1356.
Dumas, B. (1995): The Meaning of the Implicit Volatility Function in Case of Stochastic
Volatility. Available from:
http://www.insead.edu/facultyresearch/faculty/personal/bdumas/research/index.cfm.
Mele, A. (1998): Dynamiques non linéaires, volatilité et équilibre. Paris: Editions Economica.
Mele, A. and F. Fornari (2000): Stochastic Volatility in Financial Markets. Crossing the Bridge
to Continuous Time. Boston: Kluwer Academic Publishers.
Merton, R. (1973): Theory of Rational Option Pricing. Bell Journal of Economics and
Management Science 4, 637-654.
Nelson, D.B. (1990): ARCH Models as Diffusion Approximations. Journal of Econometrics
45, 7-38.
Nelson, D.B. (1991): Conditional Heteroskedasticity in Asset Returns: A New Approach.
Econometrica 59, 347-370.
Neuberger, A. (1994): Hedging Volatility: the Case for a New Contract. Journal of Portfolio
Management 20, 74-80.
Parkinson, M. (1980): The Extreme Value Method for Estimating the Variance of the Rate
of Returns. Journal of Business 53, 61-68.
Renault, E. (1997): Econometric Models of Option Pricing Errors. In: Kreps, D., Wallis, K.
(Editors): Advances in Economics and Econometrics (Volume 3), 223-278. Cambridge:
Cambridge University Press.
Renault and Touzi.
Romano, M. and N. Touzi (1997): Contingent Claims and Market Completeness in a Stochastic Volatility Model. Mathematical Finance 7, 399-412.
Rubinstein, M. (1994): Implied Binomial Trees. Journal of Finance 49, 771-818.
Scott, L. (1987): Option Pricing when the Variance Changes Randomly: Theory, Estimation,
and an Application. Journal of Financial and Quantitative Analysis 22, 419-438.
SEC-CFTC (2010): Findings Regarding the Market Events of May 6, 2010. A joint report
by the Securities and Exchange Commission & the Commodity Futures Trading Commission, September.
Tauchen, G. and M. Pitts (1983): The Price Variability-Volume Relationship on Speculative
Markets. Econometrica 51, 485-505.
Taylor, S. (1986): Modeling Financial Time Series. Chichester, UK: Wiley.
Vasicek, O. (1977): An Equilibrium Characterization of the Term Structure. Journal of
Financial Economics 5, 177-188.
Wiggins, J. (1987): Option Values and Stochastic Volatility: Theory and Empirical Estimates. Journal of Financial Economics 19, 351-372.
11
Engineering of fixed income securities
11.1 Introduction
This chapter is an introduction to the practice of fixed income security pricing. Fixed income
securities differ considerably from equities and equity derivatives. Consider the example of a
simple pure discount bond, which is quite difficult to price, being tied down to the time value
of money. Its value reflects intertemporal preferences and beliefs of market participants, which
are unobservable and, importantly, not traded. For this reason, the price of this bond cannot
be related to the current state of the world in a preference-free format. This is of course not
the case when we price equity derivatives in a complete market setting such as that in Black
and Scholes (1973).
This chapter reviews models where we can still price fixed income securities in a preference-free setting. We rely on no-arbitrage models, which are the fixed income counterparts to the
local volatility models reviewed in Chapter 10. Within this framework, we give up modeling
the current security prices in the first place. Rather, we take these prices as given, and exploit
all the information embedded into them so as to extract risk-neutral probabilities of future
price movements. Once risk-neutral probabilities are reverse-engineered, we can price any interest rate
product in a preference-free format. The label "no-arbitrage models" means that the only assumption we
are really making is absence of arbitrage.
The main model that illustrates this way to proceed through closed-form formulae is that of
Ho and Lee (1986). The Ho and Lee approach is an elegant way through which models can be
calibrated to data while ensuring absence of arbitrage. However, the model relies on unrealistic
assumptions, and might lead to negative interest rates. We develop a calibration approach based
on the extraction of Arrow-Debreu security prices, which can accommodate more realistic
interest rate developments. Arrow-Debreu securities are abstract securities that only pay off in
mutually exclusive states of the world, and their value then naturally relates to the risk-neutral
probability of the events where they specifically pay off (see Chapter 2). While these assets
obviously do not trade, we can extract their shadow value from the price of fixed income securities
based on a model; we can then use these extracted values to price any interest rate derivative.
We center around these themes, which we illustrate through simple numerical applications
including the pricing of interest rate derivatives such as options on bonds, swaps, caps, callable
or convertible bonds, emphasizing the joint behavior of derivative prices and the underlying:
for example, the derivative price predicted by a given model can tell us about whether the
underlying is mispriced. The framework of analysis is in discrete time, and relies on the same
implied binomial trees introduced in Chapter 10 to model equity derivatives. Implied binomial
trees simplify many conceptual intricacies; Chapter 12 deals with more advanced topics in
interest rate modeling and derivative evaluation, including the empirical motivation underlying
them as well as a systematic analysis through continuous time methods. Finally, this chapter
does not cover credit risk, which is the focus of Chapter 13. We now proceed with a number
of basic pieces of motivation, explaining in more detail a few of the very issues arising in fixed
income markets.
11.1.1 Relative pricing in fixed income markets
While current bond prices cannot be given a preference-free representation, we can still aim to
price interest rate derivatives in a preference-free fashion. The keyword is relative pricing: the
situation in which we price a number of assets given the price of others, while ensuring absence
of arbitrage, as explained in many previous junctures of these lectures. Pricing options on traded
assets is in general quite a different matter than pricing the underlying assets in the first place.
Even in the equity case, we try to evaluate an option, say, in a preference-free fashion, even if
the underlying asset cannot really be priced that way. By "preference-free," we refer to the possibility to extract
risk-neutral probabilities, or Arrow-Debreu prices, from the price of already traded assets, and
price derivatives written on the states spanned by the extracted Arrow-Debreu prices.
We wish to achieve similar objectives in the fixed income space, although the task is challenging. Consider, for example, the Black & Scholes formula. The logic leading to it cannot
exactly be applied to evaluate fixed income securities. Indeed, the Black & Scholes model relies
on the assumption of a constant volatility of the underlying price. In the context of interest
rate derivatives, the volatility of the underlying asset price depends, instead, on the maturity
of the underlying, as it tends to zero as the time-to-maturity goes to zero.
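A back-of-the-envelope sketch shows why bond price volatility dies out near maturity. It assumes, for illustration only, a zero coupon bond priced off a continuously compounded yield, so that a given yield shock moves the price by roughly the time-to-maturity times the shock. The function names and the numbers are hypothetical.

```python
import math

def zero_coupon_price(y, tau):
    """Price of a zero coupon bond with continuously compounded yield y."""
    return math.exp(-y * tau)

def relative_price_move(y, dy, tau):
    """Relative price change after a yield shock dy, for time-to-maturity tau."""
    return zero_coupon_price(y + dy, tau) / zero_coupon_price(y, tau) - 1.0

# Same one percentage point yield shock, shrinking time-to-maturity (in years).
maturities = (10.0, 1.0, 0.1, 0.0)
moves = {tau: relative_price_move(0.03, 0.01, tau) for tau in maturities}
for tau in maturities:
    print(tau, moves[tau])
```

The price response shrinks proportionally with the time left to maturity and is exactly zero at expiry, where the bond is worth its face value regardless of yields, which is the pull-to-par effect a constant price volatility cannot capture.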
More generally, pricing and hedging interest rate derivatives requires a model describing
developments of the whole yield curve, as we shall explain. It goes without saying that the
general principles underlying the APT are naturally still the same. The methods described
in this and the following chapter aim at an objective where the dynamics of the pricers under
the risk-neutral probability (e.g., the dynamics of the short-term rate) do not depend on risk-aversion corrections. The reader who is impatient for a concrete illustration
of these somewhat surprising statements is referred to Eq. (11.42) of this chapter and the
discussion around it, or in general, to Section 12.6 of the next, where we shall deal with the
celebrated Heath, Jarrow and Morton (1992) model.
11.1.2 Many evaluation paradigms
After reading this introductory chapter, you might find that many methods and models
readily become available, which could be used to price interest rate derivatives. Derivatives
houses do really have dozens of models, with different houses possibly asking for different prices
relating to the same product. This circumstance need not be an arbitrage opportunity. Rate
markets are typically over-the-counter, of the type studied in Chapter 9 of these lectures. Indeed,
this chapter relies on stylized examples of markets with different derivative prices predicted by
models that match the same set of initially given market prices. Market incompleteness and
segmentation are responsible for these interesting features, as we shall explain.
With dozens of methods available to price fixed income products, we do not see the emergence
of a single model to price all of the extant fixed income products. Typically, any bank has a
battery of different models, with pieces of this battery possibly fighting for different goals. For
example, a bank might display a preference for a certain type of models as a result of (i) its
culture and history, or (ii) the particular business it is pursuing. For example, in the next chapter,
we shall see that to price interest rate options such as caps, we may use the market model,
which relies on the Black '76 formula. However, using this model implies that we do not have
a closed-form solution for the price of swaptions, which can only be solved through numerical
methods. If the swaptions business is not important for the bank, then we may safely adopt
the market model.
[In progress]
11.1.3 Plan of the chapter
The chapter is organized as follows. Sections 11.2 through 11.3 develop the basics underlying
fixed income securities, such as interest rate and market conventions, duration, convexity, and
an introduction to basic hedging and trading strategies. Section 11.4 is the first section to deal
with models that aim to fit the initial yield curve without errors, using binomial trees. We
have two fundamental ways to achieve this goal. As for the first one, developed in Section
11.4, we freeze scenarios, i.e. the values taken by the short-term rate on the branches of a
tree, and search for the risk-neutral probabilities such that the prices predicted by the model agree
with the market. As for the second, we fix risk-neutral probabilities, and search for scenarios such
that the model and the market are the same. Section 11.5 deals with the second approach,
which is the essence of the Ho and Lee (1986) model. Naturally, we may consider situations
where we might simultaneously search for probabilities and short-term rate scenarios that make
models consistent with the markets. These situations are quite complex and necessitate a general
framework of analysis, developed in Section 11.6, and hinging upon calibration through
Arrow-Debreu securities. Section 11.7 concludes this chapter and provides numerical examples
of how to evaluate bonds with callability and convertibility features, which will receive a more
systematic theoretical treatment in the next chapters.
Financial institutions trade deposits with each other, which span maturities that range from just
overnight to one year in a given currency. The LIBOR rate reflects the rate at which financial
institutions are willing to borrow in these markets, on average. It is an average indicative quote
of the interbank lending market. It is determined as follows. A panel of major banks exists,
where each bank belonging to the panel reports the rate it expects to be charged by its peers in
case of borrowing needs for a given maturity and a given currency. The four highest and the four
lowest rates of these banks' reports are ignored, and the remaining reports are aggregated by
Thomson Reuters for ten currencies, and published daily by the British Bankers' Association.1
The left panel of Figure 11.1 plots the LIBOR referenced to US dollars.
The LIBOR is a fundamental point of reference to financial institutions, which look at it as
an opportunity cost of capital. Moreover, many fixed income instruments are indexed to the
LIBOR: forward rate agreements, interest rate swaps, or variable mortgage rates (see Chapters
12 and 13). Because the banks' submissions over the process of the LIBOR formation do not
rely on trades (only on subjective estimates of the cost of capital), there might be incentives for
report manipulation biased towards the rates that would make a given bank reap profits over
LIBOR-sensitive derivatives, as hypothesized over the LIBOR scandals emerging in June 2012.
While the LIBOR reflects market conditions inherent to the banking system for a given currency,
the US Federal Funds rate more directly links to the liquidity to be deposited within the Federal
Reserve. Banks have to maintain reserves with the Federal Reserve to partially back deposits and
to clear financial transactions, as further explained in Section 13.6 of Chapter 13. Transactions
involve banks with excess reserves with the Fed, which earn no interest, lending to banks with reserve
deficiencies. The Federal Funds rate is the overnight rate at which banks lend these reserves
to each other. It is affected by the Federal Reserve Bank of New York (FRBNY), which aims to make it
lie within a range of the target rate decided by the governors at Federal Open Market Committee
meetings. This range is maintained through open market operations.
11.2.1.2 Treasury rate
The Treasury rate is the rate at which a given government can borrow in a given currency. The left panel
of Figure 11.1 depicts the time behavior of the interest rate on short-term and long-term US
government debt: the 3-month T-bill rate and the 10-year Treasury yield.
11.2.1.3 Repo rate (or repurchase agreement rate)
A Repo agreement is a contract by which one counterparty sells some assets to another, with
the obligation to buy these assets back at some future date. The assets act as collateral. The
rate at which such a transaction is made is the repo rate. One day repo agreements give rise to
overnight repos. Longer-term agreements give rise to term repos.
11.2.1.4 Interest rate spreads
Interest rate spreads are the difference between interest rates applying to two different markets.
Thus, they have the potential to remove components that are incorporated by both markets,
thereby isolating interesting pieces of information.
One important example is the LIBOR-OIS spread, which is the difference between the 3-month
LIBOR and the 3-month OIS. The OIS (overnight indexed swap) rate is the swap
rate in a swap agreement of fixed against variable interest rate payments, where the variable
interest rate is an overnight reference, typically an average, unsecured interbank overnight rate,
such as the Federal Funds rate in the US, SONIA in the UK or EONIA in the Euro area.2 The
LIBOR reflects the monetary policy stance but should also incorporate a premium related to
counterparty risk. Instead, the OIS is a mere interest rate swap; as such, it should primarily
reflect the monetary policy stance. Therefore, the LIBOR-OIS spread has the potential to isolate
credit views on financial institutions. Historically, the LIBOR-OIS spread has behaved quite
1 Instead, the LIBID (London Interbank Bid Rate) is the rate that these financial institutions are prepared to pay to borrow
money, on average, but it does not rely on a formal setting procedure such as that leading to the daily LIBOR. Naturally, the
LIBID is less than the LIBOR.
2 See the next chapter (Section 12.8.5) for extensive discussions regarding interest rate swaps.
flat, although it then reached record high levels during the 2007 subprime crisis (see the right
panel of Figure 11.1).
Another example of an interest rate spread commonly used in empirical research is the so-called
TED spread, which is the difference between the LIBOR and the Treasury bill rate.
The TED spread captures flight-to-quality effects that typically occur during times of crisis,
when Treasuries are considered particularly valuable by investors (see the right panel of Figure
11.1). Due to this flight-to-quality reason, the TED spread might fail to isolate views about
developments in the interbank market.
FIGURE 11.1. Left panel: The 3m LIBOR referenced to USD, the 3m T-Bill rate and
the 10 year Treasury yield. Right panel: The TED spread (the difference between the 3m
LIBOR and the 3m T-Bill rate) and the LIBOR-OIS spread (the difference between the
3m LIBOR and the 3m OIS). The shaded areas mark recession periods identified by the
National Bureau of Economic Research.
On a historical note, the Federal Funds rate has been the object of much empirical research.
In an attempt to explain how the credit view contributes to growth more than Friedman's
monetary view, Bernanke and Blinder (1992) show that the Federal Funds rate makes the
predicting power of M1 growth insignificant, as we further review in Section 13.6 of Chapter
13. This finding initially spread enthusiasm about the ability of this rate to explain short-run
aggregate fluctuations. However, as surveyed for example by Stock and Watson (2003), the
explanatory power of the Federal Funds rate evaporates, once we condition on the term spread,
a fact we comment on in Section 12.2.2 of the next chapter.
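The simply compounded interest rate prevailing at time $t$ over the time interval $[t, T]$, $L(t,T)$,
is defined as the solution to the following equation:
$$P(t,T) = \frac{1}{1 + (T-t)\,L(t,T)} \qquad (11.1)$$
This definition is intuitive, and is the most widely used in market practice. For example,
LIBOR rates can be defined consistently with it, with $P(t,T)$ being the initial investment
at $t$ that delivers \$1 at $T$.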
11.2.2.2 Yield curves
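The yield-to-maturity, or spot rate, for some maturity date $T$ is the yield on the zero maturing
at $T$, denoted as $R(t,T)$. It is the solution to the following equation,
$$P(t,T) = \frac{1}{(1 + R(t,T))^{T-t}} \qquad (11.2)$$
With continuous compounding, the spot rate is, instead, $-\ln P(t,T)/(T-t)$. Consider, next, a bond
that pays the coupon rate $c(n)$ at the dates $t+1, \dots, t+n$, and the principal of one at $t+n$.
Its price at $t$ is,
$$B_n(t) = c(n)\sum_{i=1}^{n} P(t,t+i) + P(t,t+n).$$
Note that $c(n)$ is fixed at time $t$. A par bond is one that quotes at parity, $B_n(t) = 100\%$.
The par yield curve is the sequence of coupon rates $c(n)$, for $n$ varying, that correspond to
the par:
$$c(n):\ B_n(t) = 1, \quad \text{i.e.} \quad c(n) = \frac{1 - P(t,t+n)}{\sum_{i=1}^{n} P(t,t+i)} \qquad (11.3)$$
In other words, the coupon rates $c(n)$ have to adjust to make the market happy to have
the coupon-bearing bond quote at par, $B_n(t) = 1$. An interesting interpretation of par-yield
is the following. Rewrite Eq. (11.3) as follows: $1 - P(t,t+n) = c(n)\sum_{i=1}^{n} P(t,t+i)$. The
right-hand side of this equation is the present value of the flow of known coupons, $c(n)$,
receivable at the dates $t+1, t+2, \dots, t+n$. The left-hand side is the present value of the flow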
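of floating interest payments, $\frac{1}{P(t+i-1,\,t+i)} - 1$ for $i = 1, 2, \dots, n$, each receivable at $t+i$:
$$\mathrm{Val}\left(\frac{1}{P(t+i-1,t+i)} - 1,\ \text{receivable at } t+i,\ i = 1, \dots, n\right) = 1 - P(t,t+n).$$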
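The par yield in Eq. (11.3) is easy to check numerically. The following is a minimal sketch, assuming annual payment dates; the function names and the spot rates are hypothetical and only serve as an illustration.

```python
def par_yield(discount_factors):
    """Coupon rate c(n) of Eq. (11.3); discount_factors[i-1] stands for P(t, t+i)."""
    return (1 - discount_factors[-1]) / sum(discount_factors)


def coupon_bond_price(coupon, discount_factors):
    """Price of a bond paying `coupon` at each date and a principal of one at the last date."""
    return coupon * sum(discount_factors) + discount_factors[-1]


# Hypothetical spot rates for maturities of 1 to 4 years (illustrative only).
spot_rates = [0.02, 0.025, 0.03, 0.032]
discounts = [(1 + r) ** (-i) for i, r in enumerate(spot_rates, start=1)]

c = par_yield(discounts)
print(f"par yield: {c:.4%}, bond price at that coupon: {coupon_bond_price(c, discounts):.6f}")
```

By construction, the bond priced at the par yield quotes at one. With a flat yield curve, the par yield coincides with the level of the curve.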
In a forward rate agreement (FRA, henceforth), two counterparties agree that the interest
rate on a given principal (say \$1) in a future time-interval $[T, S]$ will be fixed at some level
$K$. The FRA works as follows: at time $T$, the first counterparty receives \$1 from the second
counterparty; at time $S > T$, the first counterparty pays back \$$[1 + (S-T)K]$ to the
second counterparty. The amount $K$ is agreed upon at time $t$. Therefore, the FRA makes it
possible to lock in future interest rates. We consider simply compounded interest rates because
this is the standard market practice.
The amount $K$ for which the current value of the FRA is zero is called the simply-compounded
forward rate as of time $t$ for the time-interval $[T, S]$, and is usually denoted as $F(t,T,S)$. We
can use absence of arbitrage to express $F(t,T,S)$ in terms of bond prices, as follows:
$$\frac{P(t,T)}{P(t,S)} = 1 + (S-T)\,F(t,T,S) \qquad (11.4)$$
Indeed, an investor in a zero from time $t$ to time $S$ is one who simultaneously makes (i) a
spot loan from $t$ to $T$, and (ii) a forward loan from $T$ to $S$. In the absence of arbitrage, it must
be the case that,
$$\underbrace{[1 + R(t,S)]^{S-t}}_{\text{zero loan}} = \underbrace{[1 + R(t,T)]^{T-t}}_{\text{spot loan}}\ \underbrace{[1 + (S-T)\,F(t,T,S)]}_{\text{forward loan}}$$
where $R(t,T)$ is the spot rate at time $t$ for maturity $T$. Eq. (11.4) follows by the definition of
$R(t,T)$ in Eq. (11.2).
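As an illustration of Eq. (11.4), the following sketch backs out a simply compounded forward rate from two zero prices and checks the no-arbitrage identity above. The function names and the spot rates (3% for one year, 3.5% for two years) are hypothetical.

```python
def zero_price(spot_rate, maturity):
    """P(t, T) from Eq. (11.2), with maturity = T - t in years."""
    return (1 + spot_rate) ** (-maturity)


def forward_rate(p_near, p_far, accrual):
    """Simply compounded F(t, T, S) from Eq. (11.4); accrual = S - T."""
    return (p_near / p_far - 1) / accrual


# Hypothetical spot curve: 3% for 1 year, 3.5% for 2 years.
p1, p2 = zero_price(0.03, 1), zero_price(0.035, 2)
f = forward_rate(p1, p2, 1.0)

# Zero loan over two years = spot loan over one year, rolled into the forward loan.
zero_loan = (1 + 0.035) ** 2
spot_then_forward = (1 + 0.03) * (1 + f)
print(f"forward rate: {f:.4%}", zero_loan, spot_then_forward)
```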
Alternatively, consider the following portfolio implemented at time $t$. Go long one bond
maturing at $T$ and short $P(t,T)/P(t,S)$ bonds maturing at $S$, for the time period $[T, S]$. The
initial cost of this portfolio is zero because,
$$-P(t,T) + \frac{P(t,T)}{P(t,S)}\,P(t,S) = 0.$$
At time $T$, the portfolio yields \$1, which originates from the bond purchased at time $t$. At time
$S$, we buy the $P(t,T)/P(t,S)$ bonds shorted at $t$, and maturing at the very same $S$, which
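$$B(T) = \sum_{i=1}^{T} \frac{C_i}{(1+\hat{y})^{i}} + \frac{1}{(1+\hat{y})^{T}} \qquad (11.5)$$
where $C_i$ is the coupon paid at date $i$. Eq. (11.5) differs from the price formula
$B(T) = \sum_{i=1}^{T} \frac{C_i}{(1+R(i))^{i}} + \frac{1}{(1+R(T))^{T}}$, by
utilizing the same discount rate to discount the future payments. Clearly, spot rates coincide
with the YTM on a zero, i.e. $\hat{y} = R(T)$.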
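Next, suppose that coupon payments are the same for each $i$, $C_i = C$ say, and the payment
dates are set regularly. Eq. (11.5) then collapses to,
$$B(T) = \frac{C}{\hat{y}}\left(1 - \frac{1}{(1+\hat{y})^{T}}\right) + \frac{1}{(1+\hat{y})^{T}} \qquad (11.6)$$
That is, the price of the coupon-bearing bond, $B(T)$, is a convex combination of $C/\hat{y}$, the price
of a perpetuity paying $C$, and one, with weight $(1+\hat{y})^{-T}$ on the latter.
If $C > \hat{y}$, $B(T) > 1$, and if $C < \hat{y}$, $B(T) < 1$. In the special case where $C = \hat{y}$, $B(T) = 1$.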
)=
( )+
=1
where
is a sequence of coupons paid o over some dates , = 1
, and is the
frequency of coupon payments. For example = 2 for semiannual coupon payments, in which
1
1
case
as the rst available
1 = 2 and in general,
1 = . Finally, we dene
coupon payment date a bondholder would have access to after time , i.e. :
.
1
Due to the discreteness of coupon payments, discontinuities arise, because at each coupon
payment date, the coupon is paid off, determining a discrete drop in the bond price. In other
words, we have:
(
(
)+
( )+
for
= +1
)=
(11.7)
( )+
for =
= +1
is equal to,
, for =
)=
It is market practice to avoid considering the effects relating to this drop in value while quoting
bond prices. Define the accruals as that idealized portion of the coupon payments that occurs
between
1 and ,
1
Accr
0
Accr =
for
for =
1
1
( )+
) (11.8)
= +1
= 5 years.
[Figure: dirty price and clean price of the bond (vertical axis: prices) against calendar time, in years.]
The sharp increase in the dirty price is mainly due to the term $\frac{C}{2}P(\cdot)$ in Eq. (11.7). This
term increases roughly linearly over the payment dates, and overwhelms the increase in value
of the second term of Eq. (11.7). The drop is precisely $\frac{C}{2}$ at each payment date, where the dirty
price equals the clean price.
The pattern in the picture is of course not unique. For example, the clean price decreases
over time when the yield curve is flat at a continuously compounded $r = 2\%$. When the level
of the yield curve is high, future coupons are heavily discounted, such that their value
is low when time to maturity is high. When, instead, the level of the yield curve is low, the
current value of future coupons is high, and the clean price converges to 100 from above. We can
determine a threshold value for $r$, say $\hat{r}$, such that the clean price is approximately constant at
100, using Eq. (11.6). Because we assume the yield curve is flat at $r$, continuously compounded,
and $n = 2$, we need to find $\hat{r}$: $e^{\hat{r}} = (1 + \frac{C}{2})^2$, with $\frac{C}{2} = 2\%$, whence $\hat{r} = 3.96\%$.
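The threshold can be reproduced in a couple of lines. This is only a sketch of the par condition, assuming a semiannual coupon of 2% (an annual coupon rate of 4%, as the figures in the text suggest); the variable names are ours.

```python
import math

coupon_rate = 0.04       # annual coupon rate C, so that C/2 = 2% (assumed from the text)
payments_per_year = 2    # n = 2, semiannual coupons

# Par condition: the per-period yield equals the per-period coupon,
# exp(r_hat / n) - 1 = C / n, i.e. exp(r_hat) = (1 + C / n) ** n.
r_hat = payments_per_year * math.log(1 + coupon_rate / payments_per_year)
print(f"threshold continuously compounded rate: {r_hat:.4%}")
```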
$$B(y;T) \equiv \sum_{i=1}^{T} \frac{C_i}{(1+y)^{i}} + \frac{1}{(1+y)^{T}} \qquad (11.9)$$
This function mimics how the market price $B(T)$ would behave after the initial YTM, $\hat{y}$,
changed to some value $y$ and, naturally, is such that $B(\hat{y};T) = B(T)$. The function $B(y;T)$
is thus the simplest bond pricing model we could ever formulate. It is perhaps too simple, as it
does not rely on absence of arbitrage. Nevertheless, we can use this very preliminary model to
say something about interest rate risk.
We define a measure of risk of the bond based on the sensitivity of the hypothetical bond
price $B(y;T)$ with respect to changes in $y$. We answer the following question: What happens
to the hypothetical bond price $B(y;T)$, once we perturb the one rate that discounts all the
payoffs? The sensitivity we are dealing with is simply the first partial of the bond-pricing
formula in Eq. (11.9) with respect to $y$,
$$B_y(y;T) = -\frac{1}{1+y}\left[\sum_{i=1}^{T} \frac{i\,C_i}{(1+y)^{i}} + \frac{T}{(1+y)^{T}}\right]$$
FIGURE 11.2. The relation between the YTM and the bond price, and its first-order
(duration) and second-order (convexity) approximations. The solid line depicts the price
of a zero coupon bond expiring in 10 years, as a function of the YTM, $(1 + \mathrm{YTM})^{-10}$,
and the two dashed lines are first-order and second-order Taylor's expansions around
YTM = 5%. [Horizontal axis: YTM, from 0.00 to 0.10; vertical axis: bond price, from 0.4 to 1.0.]
11.3.1 Duration
We define the Macaulay duration as,
$$D_{Mac} \equiv -\frac{B_y(y;T)}{B(y;T)}\,(1+y) = \sum_{i=1}^{T} \omega_i\, i + \bar{\omega}\, T, \qquad \omega_i \equiv \frac{C_i/(1+y)^{i}}{B(y;T)}, \quad \bar{\omega} \equiv \frac{1/(1+y)^{T}}{B(y;T)}.$$
In words, the Macaulay duration is a weighted average of the payment dates. The weights
are the discounted coupons at the various payment dates, $C_i/(1+y)^{i}$, related to the current
market value of these coupons, i.e. the bond price $B(y;T)$ when the YTM is $y$. That is, the
weights are the proportions of the bond's present value that are attributable to the payoff at
date $i$. The weights satisfy $\sum_{i=1}^{T} \omega_i + \bar{\omega} = 1$. Therefore, $D_{Mac} \le T$. The Macaulay duration is
a measure of how far in the future the bond pays off. For zeros, $D_{Mac} = T$.
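For small $y$, $D_{Mac}(y)$ is simply the semi-elasticity of the bond price with respect to the YTM.
This semi-elasticity is also referred to as modified duration:
$$D \equiv \frac{D_{Mac}}{1+y} = -\frac{B_y(y;T)}{B(y;T)}.$$
A simple computation reveals that the modified duration, $D$, satisfies:
$$\frac{\partial D}{\partial y} = -\frac{B_{yy}(y;T)}{B(y;T)} + D^2.$$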
Therefore, the modified duration is decreasing in the YTM when the bond price is sufficiently
convex in the YTM, which is surely the case for long-term maturity dates. Interestingly, the
modified duration is increasing in the YTM when the bond price is concave in the YTM,
a property featured by callable bonds and mortgage-backed securities (MBS, henceforth), as
explained in the next chapters (see also Section 11.8.1 of this chapter for a basic numerical
example). Intuitively, the incentives to proceed to early repayments kick in as the YTM
decreases, which makes the duration of the MBS decrease.
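The definitions of Macaulay and modified duration given above can be sketched in code as follows. We assume annual payment dates i = 1, ..., T and a principal of one, as in Eq. (11.9); the function names are hypothetical.

```python
def bond_price(y, coupons):
    """B(y; T) of Eq. (11.9): coupons[i-1] is paid at date i, principal of one at T = len(coupons)."""
    T = len(coupons)
    return sum(c / (1 + y) ** i for i, c in enumerate(coupons, start=1)) + 1 / (1 + y) ** T


def macaulay_duration(y, coupons):
    """Weighted average of the payment dates, with present-value weights."""
    T = len(coupons)
    weighted_dates = sum(i * c / (1 + y) ** i for i, c in enumerate(coupons, start=1))
    weighted_dates += T / (1 + y) ** T
    return weighted_dates / bond_price(y, coupons)


def modified_duration(y, coupons):
    """Semi-elasticity of the price with respect to the YTM."""
    return macaulay_duration(y, coupons) / (1 + y)


zero_5y = [0.0] * 5      # a 5-year zero
par_5y = [0.05] * 5      # a 5-year bond with a 5% coupon
print(macaulay_duration(0.05, zero_5y), modified_duration(0.05, zero_5y))
print(bond_price(0.05, par_5y), macaulay_duration(0.05, par_5y))
```

For the zero, the Macaulay duration equals the maturity, whereas for the coupon-bearing bond it is lower than the maturity, since part of the present value is paid off earlier.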
The Macaulay duration for continuously compounded rates is even simpler to calculate. First,
define the continuously compounded YTM as the single number $r$ such that
$$B(\hat{y};T) = \sum_{i=1}^{T} C_i e^{-r i} + e^{-r T},$$
where $B(\hat{y};T)$ is the market price of a bond paying off the principal of one at maturity $T$ and
the stream of payoffs $C_i$. Next, consider the function $r \mapsto B^c(r;T)$, defined by the right-hand
side of the previous equation. Define the semi-elasticity
of the bond price $B^c(r;T)$ with respect to the continuously compounded YTM $r$,
$$-\frac{B^c_r(r;T)}{B^c(r;T)} = \sum_{i=1}^{T} \omega^c_i\, i + \bar{\omega}^c\, T,$$
where $\omega^c_i = C_i e^{-r i}/B^c(r;T)$ and $\bar{\omega}^c = e^{-r T}/B^c(r;T)$. Note, the weights are such that
$\sum_{i=1}^{T} \omega^c_i + \bar{\omega}^c = 1$. Therefore, the Macaulay duration for continuously compounded rates
is equal to the semi-elasticity of the bond price with respect to the continuously compounded
YTM $r$.3 This result may simplify some calculations.
3 Mathematically, we could have obtained this result in a straightforward manner, as follows. Define the bond price function as
$B(y(r))$, where by definition, $y(r) = e^{r} - 1$. Hence, $\frac{d}{dr}B(y(r)) = B_y(y(r))\,y'(r) = B_y(y(r))\,e^{r} = B_y(y(r))\,(1+y)$.
It follows that $D_{Mac} = -\frac{B_y\,(1+y)}{B} = -\frac{dB/dr}{B}$.
11.3.2 Convexity
Convexity measures how the sensitivity, $B_y$, changes with $y$. It is the second partial of the bond
price with respect to $y$, $B_{yy}$. Positive convexity means that the interest rate sensitivity declines
as $y$ increases, as in Figure 11.2. This property arises because $\frac{\partial}{\partial y}(-B_y) = -B_{yy} < 0$. Formally,
convexity is defined as,
$$\mathcal{C} \equiv \frac{B_{yy}(y;T)}{B(y;T)},$$
such that we can expand the bond price as follows:
$$\frac{\Delta B(y;T)}{B(y;T)} \approx -D\,\Delta y + \frac{1}{2}\,\mathcal{C}\,(\Delta y)^2 \qquad (11.10)$$
That is, for very convex securities, duration may not be a safe measure of return, as Figure
11.2 illustrates.
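The quality of the two approximations in Eq. (11.10) can be checked for the 10-year zero of Figure 11.2. The sketch below uses the closed forms for a zero, i.e. modified duration $T/(1+y)$ and convexity $T(T+1)/(1+y)^2$; the function names are ours.

```python
def zero_price(y, maturity=10):
    return (1 + y) ** (-maturity)


y0, T = 0.05, 10
B0 = zero_price(y0, T)
D = T / (1 + y0)                      # modified duration of a zero
C = T * (T + 1) / (1 + y0) ** 2       # convexity B_yy / B of a zero


def first_order(y):
    """Duration-only approximation around y0."""
    return B0 * (1 - D * (y - y0))


def second_order(y):
    """Duration-plus-convexity approximation around y0, as in Eq. (11.10)."""
    return B0 * (1 - D * (y - y0) + 0.5 * C * (y - y0) ** 2)


for y in (0.01, 0.05, 0.09):
    print(f"y={y:.2f} exact={zero_price(y, T):.4f} "
          f"duration={first_order(y):.4f} duration+convexity={second_order(y):.4f}")
```

The duration-only line lies below the exact price away from 5%, which is the convexity effect discussed above.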
We can use duration to assess how exposed a bond portfolio is to movements in the interest
rates and, then, immunize this portfolio to interest rate changes. Duration is relevant to
asset-liability management. For example, pension funds have known streams of liabilities that
must be matched by the assets they hold. In words, the duration of the assets must equal
the duration of the liabilities. For example, in the UK, pension funds must mark-to-market
the liabilities. Therefore, one objective of these funds is to immunize their liabilities against
movements in the interest rates.
Alternatively, consider the following basic example. A bank borrows $100 at 2% for a year
and lends this money at 4% for 5 years, where the higher rate compensates for a variety of
factors such as business risk or the bank's market power. Assuming that the bank's borrower
does not default, the bank generates profits equal to $(4% − 2%) × 100 = $2 in the first year,
according to its books. However, the appropriate assessment of the current situation should not
make reference to past market conditions but, obviously, to current ones. Suppose, for example, that
after one year, the interest rate for borrowing increases from 2% to 5%, and remains such for 4
additional years. This assumption is unrealistic, but it gives the idea of where the action is. The
market value of the assets is, then, 100 × 1.04^5 / 1.05^4 = 100.09. Note, we discount at the 5% rate, as this is
the cost of capital for the bank.4 The market value of the liabilities is, instead, 100 × 1.02 = 102.
The bank's problem is a duration mismatch.
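The figures in this example can be reproduced as follows. This is a sketch with hypothetical variable names, valuing both sides of the balance sheet at the end of the first year.

```python
principal = 100.0
lending_rate, old_funding_rate, new_funding_rate = 0.04, 0.02, 0.05

# The loan repays principal * 1.04**5 at year 5; seen from year 1, it is
# discounted over the 4 remaining years at the new 5% cost of capital.
assets = principal * (1 + lending_rate) ** 5 / (1 + new_funding_rate) ** 4
liabilities = principal * (1 + old_funding_rate)

print(f"assets {assets:.2f}, liabilities {liabilities:.2f}, net {assets - liabilities:.4f}")
```

The net position is negative even though the books show a profit in the first year: the long-dated asset lost more value than the short-dated liability.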
Let us return to the pension fund example, and consider the following extreme case. In 30
years from now, a pension fund is due to deliver $100,000 to some future retiree. Suppose the
current market situation is such that the yield curve is flat at 4%, such that the market value
of this liability is $100,000 × (1.04)^(−30) = $30,832. Accordingly, the would-be retiree invests
$30,832 in the pension fund. So we have the following situation:

Cash      $30,832   |   Pensions   $30,832
4 Suppose, for example, that the bank wants to borrow $102 to pay off its liabilities, and for 4 additional years; then the profit
at time 1 is −1.9057. Alternatively, the 5% interest rate is just an opportunity cost of capital, defined as
max {borrowing cost, lending rate}, where the borrowing cost is that the bank might obtain from other banks, for example.
Suppose, now, that the pension fund does not invest this cash. This strategy is of course
inefficient, but it is precisely the point of this exercise to see why it is so.
Consider two extreme cases, occurring under two scenarios underlying developments in the
fixed income market. In one week,
(i) First scenario: the yield curve shifts up in parallel to 5%. Accordingly, the value of the liability
for the pension fund becomes: $100,000 × (1.05)^(−30) = $23,138.

Cash      $30,832   |   Pensions   $23,138
                    |   Profit      $7,694

(ii) Second scenario: the yield curve shifts down in parallel to 3%. Accordingly, the value of the
liability for the pension fund is: $100,000 × (1.03)^(−30) = $41,199.

Cash      $30,832   |   Pensions   $41,199
Loss      $10,367   |

A drop in the yield curve results in a loss for the pension fund: when interest rates go down,
the pension fund faces a challenging situation as it has to honour its obligations in 30 years, but
the financial market yields less than it promised one week earlier. Naturally, the pension fund
would face the opposite situation were interest rates to go up, as in the first scenario above.
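The two scenarios amount to revaluing a single cash flow at a new flat yield. A minimal sketch follows, with hypothetical names.

```python
liability_at_maturity = 100_000.0
years = 30


def present_value(rate):
    """Market value of the pension liability when the yield curve is flat at `rate`."""
    return liability_at_maturity * (1 + rate) ** (-years)


cash = present_value(0.04)   # amount received from the would-be retiree, kept uninvested
for rate in (0.05, 0.03):
    liability = present_value(rate)
    print(f"yield {rate:.0%}: liability {liability:,.0f}, profit (+) or loss (-) {cash - liability:,.0f}")
```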
There are many good ethical reasons why we might dislike that pension funds have to experience
interest rate volatility. The volatility in this very basic example stems from the simple fact that
the pension fund receives $30,832, which it then puts under the pillow. The most efficient
way to erase volatility could have been to invest $30,832 in a 30-year bond at the market
conditions of 4%. This is perfect hedging, which relies on the assumption that we have access to
such a long-term bond. How do we deal with situations in which we do not have access to such
a bond? The next sections illustrate these cases.
11.3.3.2 Hedging
Let us consider a portfolio of two bonds with different durations. Its value is given by,
$$V = \theta_1 B_1(y_1) + \theta_2 B_2(y_2),$$
where $B_1(y_1)$ and $B_2(y_2)$ are the market values of the bonds, $y_1$ and $y_2$ are the YTM on the
bonds and, finally, $\theta_1$ and $\theta_2$ are the quantities of bonds in the portfolio. Let us consider a small
change in the two YTM $y_1$ and $y_2$. We have,
$$dV = -\left[\theta_1 D(y_1)\,B_1(y_1)\,dy_1 + \theta_2 D(y_2)\,B_2(y_2)\,dy_2\right].$$
The question is: How should we choose $\theta_1$ and $\theta_2$ such that the value of the portfolio remains
the same, even after a change in $y_1$ and $y_2$?
Let us assume a parallel shift in the term structure of interest rates. In this case, $dy_1 = dy_2$.
The portfolio is said to be immunized if its value $V$ does not change when $y_1$ and $y_2$ change,
i.e. $dV = 0$, which is true when,
$$\theta_1 = -\frac{D(y_2)\,B_2(y_2)}{D(y_1)\,B_1(y_1)}\,\theta_2 \qquad (11.11)$$
A useful interpretation of this portfolio is that we may be holding a bond with some duration,
say we hold $\theta_2$ units of the second bond. Given these holdings, we may wish to sell another
bond, possibly with a lower duration, to hedge against movements in the price of the bond we
hold.
Alternatively, we can think of the second asset as a liability, with a value that fluctuates after
interest rates change. Then, we may wish to purchase some asset to hedge against the liability.
Mathematically, $\theta_2 < 0$ and $\theta_1 > 0$. Moreover, Eq. (11.11) reveals that the number of assets
to hold to hedge against the liability is high if the ratio of the two durations,
$D(y_2)/D(y_1)$, is large. In this case, the hedging position is obviously inefficient. Asset-liability
management, and immunization, is costly when we hedge high-duration liabilities with low
duration assets. We now illustrate these claims through a few basic examples.
11.3.3.3 A rst example: hedging zeros with zeros
Suppose that we hold one bond, a zero with maturity equal to 5 years. We want to hedge
this risk through another bond, a zero with maturity equal to 1 year. Let us assume that the
term-structure is flat at 5%, discretely compounded. Then,
$$B_1(y_1) = \frac{1}{1+y_1} = \frac{1}{1+0.05} = 0.95238, \qquad D(y_1) = \frac{D_{Mac}(y_1)}{1+y_1} = \frac{1}{1+0.05} = 0.95238,$$
$$B_2(y_2) = \frac{1}{(1+y_2)^5} = \frac{1}{(1+0.05)^5} = 0.78353, \qquad D(y_2) = \frac{D_{Mac}(y_2)}{1+y_2} = \frac{5}{1+0.05} = 4.7619,$$
and, with $\theta_2 = 1$:
$$\theta_1 = -\frac{D(y_2)\,B_2(y_2)}{D(y_1)\,B_1(y_1)}\,\theta_2 = -\frac{4.7619 \times 0.78353}{0.95238 \times 0.95238} \times 1 = -4.1135.$$
That is, to hedge the 5Y zero, we need to short-sell approximately four 1Y zeros. The balance
of this hedging position is,
$$V = \theta_1 B_1(y_1) + B_2(y_2) = -3.1341 \qquad (11.12)$$
This is quite an inefficient hedging position. One reason it is inefficient is that hedging long-term
bonds with short-term ones implies we should rebalance too often. Moreover, as time goes on, the
sensitivity of the short-term bonds to changes in the YTM is very small (eventually, the price
equals face value plus coupon, at maturity), compared to that of long-term bonds. Therefore,
rebalancing becomes increasingly severe as time unfolds.
Next, we study how the value of this portfolio changes after large changes in the YTM.
By the assumption that the initial term-structure is flat at 5%, $y_1 = y_2 = 5\%$. Moreover, by
rearranging Eq. (11.12),
$$B_2(y_2) = 4.1135\,B_1(y_1) - 3.1341 \qquad (11.13)$$
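The hedge ratio of Eq. (11.11) and the balance in Eq. (11.12) can be recomputed as follows. This is a sketch with hypothetical variable names, under the same assumption of a yield curve flat at 5%.

```python
y = 0.05
B1, D1 = 1 / (1 + y), 1 / (1 + y)           # 1Y zero: price and modified duration
B2, D2 = 1 / (1 + y) ** 5, 5 / (1 + y)      # 5Y zero: price and modified duration

theta2 = 1.0                                 # we hold one 5Y zero
theta1 = -(D2 * B2) / (D1 * B1) * theta2     # Eq. (11.11)
balance = theta1 * B1 + theta2 * B2          # Eq. (11.12)

print(f"theta1 = {theta1:.4f}, balance = {balance:.4f}")
```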
The left hand side of Eq. (11.13) is the price of the 5Y bond. The right hand side is the value
of a replicating portfolio, which consists of (i) approximately 4 units of the 1Y bond, and (ii)
the balance of the hedging position. Precisely, the right hand side is a net obligation: the value
of the assets we need to purchase back (approximately 4 units of the 1Y bond), net of some cash
we already have (\$3.1341), which we can use to partially purchase these assets. In other words,
we can interpret this position as a trade, where we buy the 5Y bond and sell approximately
four 1Y bonds at some initial time $t_0$. Come some time $t_1 > t_0$, we liquidate, by reversing the
position.
If interest rates do not change, then, approximately, and abstracting from passage of time,
there will be no profits or losses, once we liquidate, or mark-to-market, this position. If interest
rates change, $y \neq 5\%$, Eq. (11.13) can only approximately hold,
$$B_2(y) \approx 4.1135\,B_1(y) - 3.1341.$$
Figure 11.3 plots the left hand side and the right hand side of this relation.
FIGURE 11.3. Dashed line (top): The price of the 5Y zero, $B_2(y) = \frac{1}{(1+y)^5}$, where $y$ is
the YTM. Solid line (bottom): The value of the replicating portfolio consisting of (i)
4.1135 units of the 1Y zero, and (ii) the balance of the hedging position, which is equal
to \$3.1341, i.e. $4.1135\,B_1(y) - 3.1341$, where $B_1(y) = \frac{1}{1+y}$ is the 1Y zero price.
[Horizontal axis: YTM, from 0.00 to 0.10; vertical axis: value, from 0.6 to 1.0.]
What is going on? We are hedging the 5Y zero by selling approximately four 1Y zeros. In a
neighborhood of $y = 5\%$, the value of the synthetic 5Y zero we sold, $4.1135\,B_1(y) - 3.1341$,
behaves as $B_2(y)$. However, the 5Y zero displays more convexity than the synthetic bond.
This larger convexity implies that:
• If interest rates go down, the price of the 5Y zero bond we hold increases more than the
value of the synthetic bond we sold.
• If interest rates go up, the price of the 5Y zero bond we hold decreases less than the value
of the synthetic bond we sold.
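In both cases, we make profits. Profits are, indeed, equal to
$(B_2^{+} - 4.1135\,B_1^{+}) - (B_2 - 4.1135\,B_1) = \Delta B_2 - 4.1135\,\Delta B_1$, where
$B_1$ and $B_2$ are the prices of the 1Y and 5Y bonds, $B_i^{+}$ denotes a price after the change in
interest rates, and $\Delta B_i \equiv B_i^{+} - B_i$. But by convexity, $B_2$
increases more than $4.1135\,B_1$ when interest rates go down, and $B_2$ decreases less than $4.1135\,B_1$ when
interest rates go up, or $\Delta B_2 - 4.1135\,\Delta B_1 > 0$.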
Note that this is not an arbitrage opportunity! The previous reasoning hinges on the assumption
of a parallel shift in the term-structure of interest rates, that is $\Delta y_1 = \Delta y_2$, where $y_1$ = spot
rate for 1 year, and $y_2$ = spot rate for 5 years. While parallel shifts in the term-structure seem
empirically relevant, they are not the only shifts that are likely to occur, as explained in the
next chapter.
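The claim that the position makes profits after a parallel shift in either direction can be verified numerically. The sketch below takes the hedge ratio of 4.1135 from the text; the set of shifts and the function names are hypothetical.

```python
HEDGE_RATIO = 4.1135   # number of 1Y zeros sold per 5Y zero held, from the text


def b1(y):
    """Price of the 1Y zero."""
    return 1 / (1 + y)


def b2(y):
    """Price of the 5Y zero."""
    return 1 / (1 + y) ** 5


def profit(y_new, y_old=0.05):
    """Change in value of: long one 5Y zero, short HEDGE_RATIO 1Y zeros."""
    return (b2(y_new) - b2(y_old)) - HEDGE_RATIO * (b1(y_new) - b1(y_old))


for y_new in (0.02, 0.04, 0.06, 0.08):
    print(f"parallel shift to {y_new:.0%}: profit {profit(y_new):.5f}")
```

Profits are positive for shifts in both directions and grow with the size of the shift, but only because both yields are assumed to move by the same amount.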
To sum up, duration hedging is a useful tool, but with quite important limitations. As
Eq. (11.10) makes clear, duration is only a first-order approximation to the price of a bond.
Moreover, duration hedging obviously requires rebalancing, which might be substantial. As
we know, a conventional bond is strictly convex in the YTM. Therefore, for large changes in
the YTM, the duration-based hedging ratios should be updated. Re-adjustments are in order
anyway, independently of whether YTM change or not, as the duration of conventional fixed
income securities obviously decreases over time.
As a second example of duration hedging, consider the barbell trading, which is a way to
hedge some liability (a bullet) with duration $D_2$ through two assets with durations $D_1$ and
$D_3$, where $D_1 < D_2 < D_3$; that is, a trade where we sell $B_2$ and buy $B_1$ and $B_3$. This trade is
expected to work as soon as the yield curve flattens, with its short-end part not going too
high. Moreover, investing in the short-term segment of the yield curve allows one to invest
elsewhere relatively rapidly once the first asset expires, were the bond market to go down.
[Mention differences with flatteners and steepeners]
To illustrate a barbell trade, consider the example in the previous section, and suppose that
another bond is available for trading, a zero with maturity equal to 10 years. We aim to hedge
against movements in the price of the 5Y zero with a portfolio consisting of (i) the 1Y zero and
(ii) the 10Y zero. We keep on assuming that the yield-curve is flat at 5%, and only consider
parallel shifts in the term-structure of interest rates. We consider extensions below.
Such a butterfly trade can be implemented as follows. We look for a portfolio of the 1Y and
10Y zero with the following properties: (i) the market value of the portfolio equals the market
price of the 5Y zero,
$$B_2(y_2) = \theta_1 B_1(y_1) + \theta_3 B_3(y_3); \qquad (11.14)$$
and (ii) the local risk of the portfolio equals the local risk of the 5Y zero,
$D(y_2)\,B_2(y_2)$, i.e.:
$$D(y_2)\,B_2(y_2) = D(y_1)\,B_1(y_1)\,\theta_1 + D(y_3)\,B_3(y_3)\,\theta_3. \qquad (11.15)$$
The solution to Eqs. (11.14)-(11.15) is,
$$\theta_1 = \frac{D(y_3) - D(y_2)}{D(y_3) - D(y_1)}\,\frac{B_2(y_2)}{B_1(y_1)}, \qquad \theta_3 = \frac{D(y_2) - D(y_1)}{D(y_3) - D(y_1)}\,\frac{B_2(y_2)}{B_3(y_3)}. \qquad (11.16)$$
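By the same calculation in the example of the previous section, we have that $B_3(y_3) = 0.61391$
and $D(y_3) = 9.5238$. Using the figures in the previous example, we calculate $\theta_1$ and $\theta_3$ in Eqs.
(11.16),
$$\theta_1 = \frac{9.5238 - 4.7619}{9.5238 - 0.95238} \times \frac{0.78353}{0.95238} = 0.45706, \qquad \theta_3 = \frac{4.7619 - 0.95238}{9.5238 - 0.95238} \times \frac{0.78353}{0.61391} = 0.56724.$$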
Figure 11.4 depicts the behavior of the bullet price and the market value of the barbell as
we change the YTM. Note that the barbell portfolio is more convex than the bullet. Moreover,
the barbell trade is self-financed. By construction, the value of the bullet we sell equals the
value of the barbell portfolio. Therefore, large movements in the YTM lead to profits under the
assumption of parallel shifts in the yield curve.
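The barbell weights can be obtained by solving the two conditions of value matching and duration matching, Eqs. (11.14)-(11.15). The following sketch uses the closed-form solution for zeros under a flat 5% curve; the function names are hypothetical.

```python
def zero_price(y, maturity):
    return (1 + y) ** (-maturity)


def zero_mod_duration(y, maturity):
    return maturity / (1 + y)


y0 = 0.05
B1, B2, B3 = (zero_price(y0, m) for m in (1, 5, 10))
D1, D2, D3 = (zero_mod_duration(y0, m) for m in (1, 5, 10))

# Closed-form solution of the 2x2 system (value matching and duration matching).
theta1 = (D3 - D2) / (D3 - D1) * B2 / B1
theta3 = (D2 - D1) / (D3 - D1) * B2 / B3


def barbell(y):
    """Value of the barbell after a parallel shift of the flat curve to y."""
    return theta1 * zero_price(y, 1) + theta3 * zero_price(y, 10)


print(f"theta1 = {theta1:.5f}, theta3 = {theta3:.5f}")
for y in (0.02, 0.05, 0.08):
    print(f"y={y:.2f} bullet={zero_price(y, 5):.5f} barbell={barbell(y):.5f}")
```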
Note that in this example, as in that of the previous section, the direction of interest rate
movements does not matter for value creation. A convexity trade such as this might be the
basis of a standard non-directional strategy, resembling one where, say, we go long a number
of undervalued stocks and short a number of overvalued stocks such that the initial value
of the portfolio is zero. Then, we likely make profits: in good times, the undervalued stock
should increase in value more than the overvalued, and in bad times, the drop in value of the
undervalued stock should be less severe than that of the overvalued. For the barbell, the driver of
value is convexity: as Eq. (11.10) illustrates, the convexity term, $\mathcal{C}$, is, trivially, always positive,
independently of the sign of $\Delta y$. Therefore, as soon as we hedge a bond with a portfolio that
has the same duration as the given bond, but higher convexity, the position leads to profits,
given the assumptions made so far.
Naturally, a barbell trade does not lead to an arbitrage. The P&L summarized by Figure
11.4 relies on the assumption of parallel shifts in the yield curve. However, and as explained in
the next chapter (Section 12.3), it is not realistic to assume large and parallel movements
in the yield curve. Historically, large shifts, occurring over long horizons, are accompanied by
changes in the yield curve shape. In other words, factors affecting parallel movements in the
yield curve are frequent, albeit not the only ones. At least three factors are needed to explain
the entire variation of the yield curve.
[Figure 11.4: value in $ (vertical axis, 0.7 to 1.0) plotted against the YTM (horizontal axis, 0.00 to 0.10).]
FIGURE 11.4. Barbell trading. Dashed line (bottom): The price of the 5Y zero, $P_2(y) = \frac{1}{(1+y)^5}$, where $y$ is the YTM. Solid line (top): The value of the barbell portfolio consisting of (i) 0.45706 units of the 1Y zero and (ii) 0.56724 units of the 10Y zero, i.e. $P_1(y_1) \cdot 0.45706 + P_3(y_3) \cdot 0.56724$, where $P_1(y) = \frac{1}{1+y}$ is the 1Y zero price and $P_3(y) = \frac{1}{(1+y)^{10}}$ is the 10Y zero price.
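The comparison behind Figure 11.4 can be sketched numerically. The snippet below is only an illustration: it assumes annually compounded yields, a flat curve at 5% as the starting point, and barbell units chosen so that the barbell matches the bullet's value and dollar duration at that yield (an assumption that happens to reproduce the 0.45706 and 0.56724 units quoted in the caption). All function and variable names are hypothetical.

```python
# Barbell vs. bullet under parallel shifts of a flat yield curve (illustrative sketch).

def zero_price(y, maturity):
    """Price of a zero paying 1 at `maturity`, with annually compounded YTM y."""
    return (1.0 + y) ** (-maturity)

def modified_duration(y, maturity):
    """Modified duration of a zero: maturity / (1 + y)."""
    return maturity / (1.0 + y)

def barbell_units(y, short=1, bullet=5, long=10):
    """Units of the short and long zeros matching the bullet's value and dollar duration at y."""
    p_s, p_b, p_l = (zero_price(y, m) for m in (short, bullet, long))
    d_s, d_b, d_l = (modified_duration(y, m) for m in (short, bullet, long))
    # Amounts invested solve: w_s + w_l = p_b and w_s*d_s + w_l*d_l = p_b*d_b.
    w_l = p_b * (d_b - d_s) / (d_l - d_s)
    w_s = p_b - w_l
    return w_s / p_s, w_l / p_l

n_short, n_long = barbell_units(0.05)

def barbell_minus_bullet(y):
    """Value of the barbell minus the price of the 5Y bullet at a common YTM y."""
    return n_short * zero_price(y, 1) + n_long * zero_price(y, 10) - zero_price(y, 5)

gaps = {y: barbell_minus_bullet(y) for y in (0.01, 0.03, 0.05, 0.07, 0.09)}
for y, gap in gaps.items():
    print(f"YTM = {y:.0%}: barbell - bullet = {gap:+.5f}")
```

Under these assumptions the gap is zero at 5% and positive on both sides of it, which is the convexity gain the figure displays.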
Table 11.4 considers the case of non-parallel shifts in the term-structure. We assume that the
initial term-structure is not flat. Then, we consider two scenarios: (i) A twist in the term-structure, i.e. long-term rates lower than short-term; (ii) a steepening of the term-structure.
TABLE 11.4. Zero prices and modified durations (the 5Y zero is the bullet; the barbell value combines the 1Y zero price, $P_1(y_1)$, and the 10Y zero price, $P_3(y_3)$).

Scenario                  Maturity   YTM           Price                 Mod. dur.
Initial term-structure    1Y         $y_1 = 4\%$   $P_1(y_1) = 0.961$    $D(y_1) = 0.961$
                          5Y         $y_2 = 5\%$   $P_2(y_2) = 0.783$    $D(y_2) = 4.762$
                          10Y        $y_3 = 6\%$   $P_3(y_3) = 0.558$    $D(y_3) = 9.434$
(i) Twist                 1Y         $y_1 = 6\%$   $P_1(y_1) = 0.943$    $D(y_1) = 0.943$
                          5Y         $y_2 = 5\%$   $P_2(y_2) = 0.783$    $D(y_2) = 4.762$
                          10Y        $y_3 = 4\%$   $P_3(y_3) = 0.675$    $D(y_3) = 9.615$
(ii) Steepening           1Y         $y_1 = 4\%$   $P_1(y_1) = 0.961$    $D(y_1) = 0.961$
                          5Y         $y_2 = 5\%$   $P_2(y_2) = 0.783$    $D(y_2) = 4.762$
                          10Y        $y_3 = 7\%$   $P_3(y_3) = 0.508$    $D(y_3) = 9.346$
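To see how non-parallel shifts can affect a barbell trade, the sketch below revalues a long-barbell, short-bullet position under the three term-structures of Table 11.4. It is an illustration under assumptions that are not taken from the table: the barbell units are chosen to match the bullet's value and dollar duration at the initial term-structure, and yields are annually compounded. Names are hypothetical.

```python
# Long barbell / short bullet under non-parallel shifts (illustrative sketch).

def zero_price(y, maturity):
    """Price of a zero paying 1 at `maturity`, with annually compounded YTM y."""
    return (1.0 + y) ** (-maturity)

# (1Y, 5Y, 10Y) yields in each scenario, as in Table 11.4.
scenarios = {
    "initial": (0.04, 0.05, 0.06),
    "twist": (0.06, 0.05, 0.04),
    "steepening": (0.04, 0.05, 0.07),
}

# Assumption: units set at the initial curve by matching value and dollar duration.
y1, y2, y3 = scenarios["initial"]
p1, p2, p3 = zero_price(y1, 1), zero_price(y2, 5), zero_price(y3, 10)
d1, d2, d3 = 1 / (1 + y1), 5 / (1 + y2), 10 / (1 + y3)
w3 = p2 * (d2 - d1) / (d3 - d1)   # amount invested in the 10Y zero
w1 = p2 - w3                      # amount invested in the 1Y zero
n1, n3 = w1 / p1, w3 / p3         # units held of each zero

pnl = {}
for name, (a, b, c) in scenarios.items():
    barbell = n1 * zero_price(a, 1) + n3 * zero_price(c, 10)
    bullet = zero_price(b, 5)
    pnl[name] = barbell - bullet
    print(f"{name:>10}: barbell - bullet = {pnl[name]:+.4f}")
```

With these inputs the position gains under the twist and loses under the steepening, so the trade is no longer insensitive to the direction of the shift.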
The previous convexity trades are examples of yield curve arbitrage strategies. They may
purely rely on convexity or, as discussed in the previous section, on directional views about
interest rate movements. For example, as we have explained, we may short five-year bonds, and go
long two- and ten-year bonds, as we view that short-term interest rates will rise and medium-term
interest rates will fall. This butterfly strategy is somehow cheap, intellectually, and
not necessarily rewarding, and will be further analyzed in Section 11.6.6.
Swap spread arbitrage is a popular strategy. It was responsible for leading LTCM to a loss of
about $1.6 billion in 1998. The strategy works as follows: (i) enter a swap paying the floating
LIBOR and receiving a fixed rate, the swap rate; (ii) short a par Treasury with the same maturity as
the swap, thus paying the fixed coupon rate, and invest the proceeds at the repo rate.
Thus, the payoff of the strategy is the fixed spread to be received, i.e. the swap rate minus the
Treasury coupon rate, net of the floating spread to be paid, i.e. LIBOR minus the repo rate. So we
go long or short this strategy according to whether we view the fixed spread to be larger or
smaller than the average floating spread over the strategy horizon.
Historically, the floating spread has been volatile at times, but quite stable on average, so it is a
reasonable strategy. The problem is that, occasionally, the floating spread can attain quite large values.
Trading strategies more sophisticated than the previous ones rely on models, aiming to identify points of the yield curve that are misaligned from those predicted by the models. All in all,
the strategy is to buy the cheap and short the model-based rich, where the model-based rich
is replicated through a portfolio with cash and the bonds that are well-priced by the model,
weighted with model-based delta, as in the derivation of the bond pricing formula in Section
12.4.2.2 of the next chapter.
What happens when bond prices have negative convexity? In the next chapter, we shall see
that the value of a callable bond can be concave in the short-term rate. A similar feature is
displayed by mortgage-backed-securities (MBS, henceforth), which can now be concave in the
YTM! The reason for this negative convexity is that early repayments are likely to occur as the
YTM decreases, which entails two inextricable consequences: (i) the price of the MBS increases
less than a conventional bond price after a decline in the YTM, especially when the YTM is
low; (ii) the duration of the MBS decreases as the YTM decreases.
Hedging against MBS might lead to an increased volatility in rate markets. The mechanism
is the following. Institutions that are long MBS would typically short conventional bonds for
hedging purposes, consistently with the prediction of Eq. (11.11). However, the duration of
MBS increases as interest rates increase, due to negative convexity: roughly, $\partial \text{Duration} / \partial y \approx -\text{Convexity}$.
Therefore, an interest rate increase can lead these institutions to short additional conventional
bonds, which worsens liquidity and leads to a further increase in the interest rates, thereby
feeding a vicious circle. Perli and Sack (2003) estimate that in 2002 and 2003, this mechanism
may have amplified the volatility of long-term US rates by between 15% and 30%. It is
an instance of what is sometimes defined as endogenous risk, the circumstance that the trend
of a certain economic variable triggers actions from market participants that, in turn, reinforce
the initial trend, as in the case of the 1987 crash discussed in Section 10.4.5 of Chapter 10, or
in the case of asset sell-offs in times of crisis, discussed in Section 13.5.3 of Chapter 13.
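[Diagram: two-period binomial tree. From today, the state moves "up" with probability p or "down" with probability 1-p in the first period; in the second period, the possible paths are ("up","up"), ("up","down"), ("down","up") and ("down","down").]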
The previous diagram can be used to price options written on stocks. The stock price unfolds
through the branches of the tree. Then, we figure out the no-arbitrage movements of the option
price along the tree. Suppose, however, we wish to price an option written on a zero, a 3 Year
zero say. Can we apply the same methodology to price the option? The answer is no, and the
reason is that we cannot exogenously track the movements of the prices of the zero, as in the
case of the stock price. Instead, after one year, the 3 Year zero becomes a 2 Year zero, i.e. quite
a different asset.
These issues can be mitigated by modeling the movements of the entire yield curve. There are
two approaches, as in the diagram below. In the first, we model the dynamics of the short-term
rate, defined as the interest rate on a loan with maturity equal to the time intervals in the
tree. The resulting model, referred to as model of the short-term rate, has implications in terms
of the movements of the entire term-structure. This approach, developed in the next section,
leads to evaluation formulae in which the current prices of the zeros predicted by the model are
not necessarily equal to the market prices. A second approach, based on calibration, leads to
the so-called no-arbitrage models, where we model the dynamics of the entire term-structure.
This approach gives rise to option evaluation formulae in which the current prices of the zeros
predicted by the model are equal to the market prices. We describe this approach in the last
sections of this chapter, using binomial trees, with the next chapter developing their continuous
time counterparts.
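[Diagram: Models of the short-term rate take interest rates as input and, through no-arbitrage, deliver prices as output, which need not be market prices. No-arbitrage models take market prices as input and, through no-arbitrage, deliver interest rates as output.]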
Consider a two-period, two-state tree, where the current short-term rate is $r$. The development
of the short-term rate is uncertain. That is, the future short-term rate, $\tilde r$, is random, and can
take two values: either $r^+$ with (physical) probability $p$, or $r^-$ with probability $1-p$. We assume
that $r^+ > r^-$.
[Diagram: from the current rate $r$, the short-term rate moves up to $r^+$ with probability $p$, or down to $r^-$ with probability $1-p$.]
the portfolio in the first period equals the value of the zero we wish to price. The appendix
develops the arguments, and shows that in the absence of arbitrage, there is a function of at
most $r$, say $\lambda$, such that the expected excess return on the bond satisfies:
$$\frac{E_p[P(\tilde r)] - (1+r)\,P(r)}{P(r)} = \lambda \cdot \frac{\mathrm{Vol}(P(\tilde r))}{P(r)} \qquad (11.17)$$
We refer to $\mathrm{Vol}(P(\tilde r))$ as price volatility, because it measures the amplitude of the price variation due to changes in the
short-term rate in the future, i.e. the price-sensitivity, where this price sensitivity is
$$\mathrm{Vol}(P(\tilde r)) = P(r^+) - P(r^-).$$
Once we assume $r$ is solution to a stochastic differential equation, we can use Itô's lemma to
turn the previous equation into a partial differential equation subject to a boundary condition
that states that the bond price is one at expiration. Chapter 12 contains a rigorous discussion
of these topics.
Eq. (11.17) can be cast in an alternative format that is equally easy to interpret. Rearranging
terms leaves:
$$P(r) = \frac{(p-\lambda)\,P(r^+) + [1-(p-\lambda)]\,P(r^-)}{1+r} = \frac{E_q[P(\tilde r)]}{1+r} \qquad (11.18)$$
where $q = p - \lambda$ is the risk-neutral probability.
Let us add a few considerations. We expect that $\lambda < 0$ because bond prices are decreasing
in the short-term rate here. Then, $q = p - \lambda > p$. Hence, the risk-neutral probability of an
upward movement of the short-term rate, $q$, is higher than the true probability, $p$. An investor
who goes long a bond is concerned by an increase of the short-term rate in the future and,
hence, corrects the true probability by assigning a higher risk-adjusted probability to the
upward state.
It is the FTAP again. There is no arbitrage if and only if there is a risk-neutral probability
such that prices earn an expected return equal to the short-term rate. As explained in the
first part of these lectures, it does not mean that the market is risk-neutral. The market,
instead, prices assets by discounting the asset's future payoffs in a generous way, through $r$. To
compensate for this, the expectation in the numerator is taken under a probability which overweights
the bad states of nature, so as to lower back asset prices. It is as if we wished to price
assets on Planet Earth, a planet full of frictions, risk-aversion and the like, which are reflected
in relatively poor asset evaluation. Then, we search for another planet, which has no frictions
or risk-aversion, where the asset is evaluated through actuarial methods. To reconcile prices in
these two planets, we need the idealized and peaceful planet to display a distorted probability,
which assigns relatively higher chances to bad events than Planet Earth does.
11.4.1.2 One example
Assume the current short-term rate equals 10%. We know that in one year, and with (physical)
probability $p$, $r$ will increase by 2 percentage points, and with probability $1-p$, it will decrease
by 2 percentage points. Finally, with the same probability $p$, the short-term rate prevailing from
the next year to two years' time will increase by 2 further percentage points from its previous
value in one year's time. We take the probability of an upward movement to be 20% and the
absolute value of the Sharpe ratio to be 30%. Given these data, we use the formula, and obtain
an estimate of the risk-neutral probability of an upward movement of the short-term rate, equal
to $q = p - \lambda = 20\% - (-30\%) = 50\%$.
Pricing zeros
Let us price a zero maturing in two years, hinging upon the following tree:
[Diagram: short-term rate tree with risk-neutral probability $q = \frac{1}{2}$. Today: $r = 10\%$. In 1 year: $r = 12\%$ or $r = 8\%$. In 2 years: $r = 14\%$, $r = 10\%$ or $r = 6\%$.]
We can use Eq. (11.18) to fill in each node of the tree. We start from the end of the tree,
where the price of the two year zero is $1, and then use Eq. (11.18) to fill every node, as
illustrated in Figure 11.5. In one year's time, the price of the zero is simply one divided by
one plus the short-term rate prevailing at the beginning of the next year. The price we are looking for is
obtained by applying Eq. (11.18), yielding
$$\frac{E_q[P(\tilde r, 2)]}{1+r} = \frac{q\,P(r^+, 2) + (1-q)\,P(r^-, 2)}{1+r} = \frac{\frac{1}{2}(0.8928) + \frac{1}{2}(0.9259)}{1.10} = 0.8267.$$
[Diagram: with $q = \frac{1}{2}$ at each node, the two year zero is worth 0.8267 today; $\frac{1}{1.12} = 0.8928$ in the up state and $\frac{1}{1.08} = 0.9259$ in the down state in 1 year; and $P = 1$ in all states in 2 years.]
FIGURE 11.5.
Convexity effects
Does the two-year spot rate equal 10%? It is a natural and, as we shall see, interesting
question, as the short-term rate is a martingale under $q$, in this particular example. However,
the answer to the previous question is in the negative. Let us elaborate. The two-year spot rate,
$R(0,2)$, satisfies $0.8267 = (1 + R(0,2))^{-2}$, or $R(0,2) = 9.98\%$. That is,
$$0.8267 = \frac{1}{1+r}\,E_q\!\left[\frac{1}{1+\tilde r}\right] > \frac{1}{1+r}\,\frac{1}{1+E_q(\tilde r)} = 0.8264.$$
In other words, suppose the interest rate is known with certainty to be 10% in the second period.
Then, the price should be 0.8264, because $1 + E_q(\tilde r) = 1.10$. Prices, then, increase once
uncertainty is switched on. It is a convexity effect, which we shall explain in deeper detail in the next chapter
(Section 12.4.5.1).
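The numbers in this example can be checked with a few lines of code. The following is a minimal sketch using the rates and probabilities of the example; the variable names are hypothetical.

```python
# Two-period example: risk-neutral probability, two-year zero price, convexity effect.

p, lam = 0.20, -0.30           # physical probability and lambda (absolute value 30%)
q = p - lam                    # risk-neutral probability of an upward movement

r0, r_up, r_down = 0.10, 0.12, 0.08

# One-year prices of the zero in the up and down states.
p_up, p_down = 1 / (1 + r_up), 1 / (1 + r_down)

# Eq. (11.18): discounted risk-neutral expectation.
price = (q * p_up + (1 - q) * p_down) / (1 + r0)

# Two-year spot rate implied by the price.
spot_2y = price ** (-0.5) - 1

# Price if the second-period rate were known to equal its risk-neutral mean.
expected_rate = q * r_up + (1 - q) * r_down
price_certain = 1 / ((1 + r0) * (1 + expected_rate))

print(f"q = {q:.2f}, price = {price:.4f}, spot rate = {spot_2y:.2%}")
print(f"price under certainty = {price_certain:.4f}")
```

The price under uncertainty exceeds the price under certainty because 1/(1+r) is convex in r, which is Jensen's inequality at work.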
11.4.2 Tree pricing
We can generalize the tree to a multiperiod case. We use Eq. (11.18) to evaluate zeros at all nodes
of the tree and maturities. Given $q$, which can be estimated once we estimate $p$ and $\lambda$, we use
recursively Eq. (11.18), and calculate prices of zeros. We can, then, price any derivative written
on these zeros. The drawback of this approach is that the initial term structure is predicted
with error! Let us illustrate with a concrete numerical example. Consider the tree in Figure
11.6, where the current short-term rate for one year is $r = 4\%$. Also shown in this tree is the
price of a hypothetical 3 Year zero at the expiration date, $t = 3$, and at $t = 2$. At $t = 3$, $P = 1$
in all states of nature. At $t = 2$, the price is $P(r) = E_q[P(\tilde r)] / (1+r) = 1/(1+r)$, for
$r = 6\%$, 4% and 2%. The issue, now, is how to determine the price of the zero in correspondence
of the remaining nodes. We should use the formula, $P(r) = E_q[P(\tilde r)] / (1+r)$, to populate
the tree, but we do not know $p$, $\lambda$, and $q$. Suppose we estimate $p$ and $\lambda$. In this case, we
determine $q$ simply as $q = p - \lambda$, as in Eq. (11.18), and we come up with $q = \frac{1}{2}$. The following
diagram gives the price of the zero in all the nodes at time $t = 1$, and at the evaluation time
$t = 0$, yielding a price of the 3 Year zero equal to 0.8893.
Next, consider a European call option written on the 3 Year zero, with expiration date equal
to 2 and strike price $K = 0.95$. The following diagram gives the value of the option predicted
by the model at each node of the tree. The model predicts that the current price of the call
option is 0.0124.
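The recursive use of Eq. (11.18) amounts to backward induction on the tree. The sketch below is one possible way to organize the calculation for the example above; the function name and the node ordering (index j counts down moves) are choices made here for illustration.

```python
# Backward induction on a recombining short-rate tree (illustrative sketch).

def backward_induction(terminal_values, rates_by_date, q):
    """Roll terminal values back to date 0.

    rates_by_date[t][j] is the short-term rate at date t after j down moves;
    terminal_values[j] is the value one date later, again indexed by down moves.
    """
    values = list(terminal_values)
    for rates in reversed(rates_by_date):
        values = [
            (q * values[j] + (1 - q) * values[j + 1]) / (1 + rates[j])
            for j in range(len(rates))
        ]
    return values[0]

# Short-term rates at t = 0, 1, 2, as in the example.
rates = [[0.04], [0.05, 0.03], [0.06, 0.04, 0.02]]
q = 0.5

# 3 Year zero: pays 1 in all four states at t = 3.
zero_3y = backward_induction([1.0] * 4, rates, q)

# European call expiring at t = 2, struck at 0.95, on the 3 Year zero.
zero_at_2 = [1 / (1 + r) for r in rates[2]]
payoff = [max(price - 0.95, 0.0) for price in zero_at_2]
call = backward_induction(payoff, rates[:2], q)

print(f"3 Year zero: {zero_3y:.4f}, call: {call:.4f}")
```

Any other payoff written on the zero can be priced by changing the `payoff` list and the number of dates passed to the function.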
11.4.3 Introduction to calibration
Calibration is a procedure by which we search for a given model's parameter values, such that
the model's predictions coincide with selected empirical counterparts. For example, in the real
business cycle literature reviewed in Chapter 3, we might calibrate a model's parameters to
ensure that the correlation of asset returns with output is the same as that in the data. In
this chapter, calibration is a procedure by which we search for a given model's parameters that
make the model price equal to the market price.
11.4.3.1 Motivation
The model we are dealing with in the previous section predicts that the price of the 3 Year
zero is equal to 0.8893. However, there is no guarantee that this model-implied price equals the
market price of the 3 Year zero. Suppose, instead, that the market price of the 3 Year zero,
$P_\$$ say, equals 0.8700. What should we do to make the model-implied price of the 3 Year zero
equal to the market price? The question is important: how can we trust an option pricing model
that is not even able to pin down the initial market value of the asset underlying the option
contract? Alternatively, suppose the option is a bespoke product of a bank. The bank's client
might question why the bank's evaluation model predicts a price of the bond to be so high
(0.8893), compared to the observable and cheap market price (0.8700).
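[Diagram: tree of the 3 Year zero price with $q = \frac{1}{2}$ at each node. At $t = 3$: $P = 1$ in all states. At $t = 2$: $P_{uu} = \frac{1}{1.06} = 0.9433$ ($r = 6\%$), $P_{ud} = \frac{1}{1.04} = 0.9615$ ($r = 4\%$), $P_{dd} = \frac{1}{1.02} = 0.9804$ ($r = 2\%$). At $t = 1$: $P_u = \frac{qP_{uu} + (1-q)P_{ud}}{1.05} = 0.9070$ ($r = 5\%$), $P_d = \frac{qP_{ud} + (1-q)P_{dd}}{1.03} = 0.9427$ ($r = 3\%$). At $t = 0$: $P = \frac{qP_u + (1-q)P_d}{1.04} = 0.8893$ ($r = 4\%$).]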
We look for perfectly fitting trees, that is, those trees where risk-neutral probabilities and/or
the values of the short-term rate in each node are not given in advance but, rather, are such that
the initial yield curve is fitted without error. These trees are called implied binomial trees, i.e.
implied by the market prices. Let us consider the example in the previous section. To make
the model-implied price of the 3 Year zero equal to the market price, $P_\$ = 0.8700$, we cannot
take the risk-neutral probability as given, i.e. independent of the observed price $P_\$ = 0.8700$,
as we did before. Rather, we should calibrate the probability $q$, as follows,
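$$P_\$ = 0.8700 = \frac{1}{1.04}\left[q\,P_1(5\%) + (1-q)\,P_1(3\%)\right] \qquad (11.19)$$
where $P_1(5\%)$ and $P_1(3\%)$ are the prices of the zero at time $t = 1$, in the events that the
short-term rate is up to 5% or down to 3%.
[Diagram: value of the call option with $q = \frac{1}{2}$ at each node. At $t = 1$: $C_u = \frac{qC_{uu} + (1-q)C_{ud}}{1.05} = 0.0055$ ($r = 5\%$), $C_d = \frac{qC_{ud} + (1-q)C_{dd}}{1.03} = 0.0203$ ($r = 3\%$). At $t = 0$: $C = \frac{qC_u + (1-q)C_d}{1.04} = 0.0124$ ($r = 4\%$).]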
The previous equation follows, again, by Eq. (11.18). Note, now, that the unknown is not
the price, which is instead given by the market price. Rather, we are looking for, or calibrating,
the probability $q$ that makes the RHS of Eq. (11.19) equal to its LHS. Naturally, we need to
calculate the prices of the zeros $P_1(5\%)$ and $P_1(3\%)$. These prices can be found by another
application of Eq. (11.18), as follows,
$$P_1(5\%) = \frac{q \cdot 0.9433 + (1-q) \cdot 0.9615}{1.05}, \qquad P_1(3\%) = \frac{q \cdot 0.9615 + (1-q) \cdot 0.9804}{1.03}.$$
Replacing $P_1(5\%)$ and $P_1(3\%)$ into Eq. (11.19) leaves
$$P_\$ = 0.8700 = \frac{1}{1.04}\left[q\,\frac{q \cdot 0.9433 + (1-q) \cdot 0.9615}{1.05} + (1-q)\,\frac{q \cdot 0.9615 + (1-q) \cdot 0.9804}{1.03}\right],$$
which is solved by $q = 0.8779$.
The next diagram depicts the implied binomial tree, i.e. the tree that we obtain after we
match the model-implied price of the 3 Year zero to the market price, $P_\$ = 0.8700$.
[Diagram: implied binomial tree, with $q = 0.8779$ at each node. At $t = 3$: $P = 1$ in all states. At $t = 2$: $P_{uu} = 0.9433$ ($r = 6\%$), $P_{ud} = 0.9615$ ($r = 4\%$), $P_{dd} = 0.9804$ ($r = 2\%$). At $t = 1$: $P_u = \frac{qP_{uu} + (1-q)P_{ud}}{1.05} = 0.9005$ ($r = 5\%$), $P_d = \frac{qP_{ud} + (1-q)P_{dd}}{1.03} = 0.9357$ ($r = 3\%$). At $t = 0$: $P = \frac{qP_u + (1-q)P_d}{1.04} = 0.8700$ ($r = 4\%$).]
Note that $P_1(5\%)$ and $P_1(3\%)$ are quite far from the values we found earlier whilst imposing
that $q = \frac{1}{2}$. In the implied tree, they are smaller than those obtained with $q = \frac{1}{2}$, state by
state. This is because in the implied tree, $q = 0.8779$, such that the model can match a lower
initial price, 0.8700. The implied tree puts more weight on those states of nature where the
short-term rate is high or, equivalently, bond prices are low. We expect the price of the option
in the implied binomial tree to be lower than that we found earlier, because call option prices
increase with the price of the underlying, which is now lower. Let us perform the calculations,
by relying on the implied binomial tree depicted in the next diagram.
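[Diagram: the call option in the implied binomial tree, with $q = 0.8779$ at each node and strike $K = 0.9500$. At $t = 2$: $C_{uu} = (P_{uu} - K)^+ = 0.0000$; $C_{ud} = (P_{ud} - K)^+ = 0.0115$, with $P_{ud} = 0.9615$; $C_{dd} = (P_{dd} - K)^+ = 0.0304$, with $P_{dd} = 0.9804$. At $t = 1$: $C_u = \frac{qC_{uu} + (1-q)C_{ud}}{1.05} = 0.0013$ ($r = 5\%$), $C_d = \frac{qC_{ud} + (1-q)C_{dd}}{1.03} = 0.0134$ ($r = 3\%$). At $t = 0$: $C = \frac{qC_u + (1-q)C_d}{1.04} = 0.0026$ ($r = 4\%$).]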
The calculations in the previous diagram reveal indeed that the option price predicted by the
implied binomial tree is 0.0026, which is roughly five times smaller than the option price we
found earlier, 0.0124! The interpretation of this result relates, again, to the implied risk-neutral
probability, which is much larger than $q = \frac{1}{2}$. The implied tree puts a relatively large weight on
events where the short-term rate is high or bond prices are low, which reduces the likelihood
the option will be exercised, leading then to a small option price.
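The calibration of the risk-neutral probability to the market price of the 3 Year zero, and the repricing of the option, can be sketched as follows. The snippet finds the probability by bisection, which is a choice made here for illustration, not necessarily the method used to produce the figures in the text; because it works with unrounded node prices, its output differs from the reported values in the third or fourth decimal. Names are hypothetical.

```python
# Calibrating a constant risk-neutral probability to one zero price (illustrative sketch).

def backward_induction(terminal_values, rates_by_date, q):
    """Roll terminal values back to date 0; index j counts down moves."""
    values = list(terminal_values)
    for rates in reversed(rates_by_date):
        values = [
            (q * values[j] + (1 - q) * values[j + 1]) / (1 + rates[j])
            for j in range(len(rates))
        ]
    return values[0]

RATES = [[0.04], [0.05, 0.03], [0.06, 0.04, 0.02]]
MARKET_PRICE = 0.8700   # market price of the 3 Year zero
STRIKE = 0.95

def zero_price(q):
    return backward_induction([1.0] * 4, RATES, q)

def call_price(q):
    zero_at_2 = [1 / (1 + r) for r in RATES[2]]
    payoff = [max(price - STRIKE, 0.0) for price in zero_at_2]
    return backward_induction(payoff, RATES[:2], q)

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

q_implied = bisect(lambda q: zero_price(q) - MARKET_PRICE, 0.0, 1.0)
call_implied = call_price(q_implied)
call_half = call_price(0.5)

print(f"implied q = {q_implied:.4f}")
print(f"call with implied q = {call_implied:.4f}, call with q = 1/2 = {call_half:.4f}")
```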
11.4.3.3 Another zero
We might not be done yet. Let us go back to the zero pricing problem, and assume we observe
the price of a 2 Year zero, and that this price equals 0.9200, a reasonable figure. Is there any
chance that the inputs to the pricing problem for the 3 Year zero could also lead to fit the 2
Year zero without errors? Of course there isn't. Indeed, in the next diagram, we use the inputs
to the 3 Year zero, and Eq. (11.18), and find that the price of the 2 Year zero implied by the
price of the 3 Year zero is equal to 0.9178. Unless the market price happens, by chance, to equal
0.9178, we cannot simultaneously fit the price of the 3 Year and the 2 Year zeros.
To simultaneously fit the price of the 3 Year and the 2 Year zeros, we should implement at
least one of the two strategies: (i) to make the probabilities time-varying; (ii) to calibrate the
entire structure of the short-term rate movements in Figure 11.6. We implement the first of these
two strategies in the next subsection. We develop the second strategy in Section 11.5.
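[Diagram: the 2 Year zero in the implied tree, with $q = 0.8779$. At $t = 2$: $P = 1$ in all states. At $t = 1$: $P_u = \frac{1}{1.05} = 0.9523$ ($r = 5\%$), $P_d = \frac{1}{1.03} = 0.9709$ ($r = 3\%$). At $t = 0$: $P = \frac{qP_u + (1-q)P_d}{1.04} = 0.9178$ ($r = 4\%$).]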
We build up implied binomial trees in more general cases, arising in the presence of several
bond prices to be matched. Suppose the time interval is six months, such that the short-term
rate is for six months. The current short-term rate is 3.99%, annualized. It can change to either
4.50% or to 4.00%, with equal (physical) probability. Suppose that two zeros are available for
trading: a 6M zero and a 1Y zero, where the current price of the 1Y zero is 0.95974. What is the
risk-neutral probability implied by this tree? This probability must be such that the prices of
all the zeros are matched exactly. Figure 11.7 depicts the tree corresponding to this situation.
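[Diagram: the six-month short-term rate, annualized, starts at $r = 3.99\%$ at $t = 0$ and, at $t = 0.5$, moves up to $r = 4.50\%$ or down to $r = 4.00\%$, each with physical probability $p = \frac{1}{2}$; the rate applied over each six-month step is $\frac{r}{2}$.]
FIGURE 11.7. The dynamics of the short-term rate: high interest rate scenario
The price of the 6M zero at time $t = 0$ equals $P_\$(0, 0.5) = 1 / \left(1 + \frac{0.0399}{2}\right) = 0.9804$. This price is actually observed. That is, the
current short-term rate, 3.99%, is a mere definition. Next, we proceed to find the no-arbitrage
movements of the 1Y zero, which are displayed below.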
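[Diagram: no-arbitrage movements of the 1Y zero. At $t = 0$: $P_\$(0,1) = 0.95974$, with $r = 3.99\%$. At $t = 0.5$: $P_u(0.5,1) = 1/\left(1 + \frac{0.045}{2}\right) = 0.9779$ in the up state ($r = 4.50\%$) and $P_d(0.5,1) = 1/\left(1 + \frac{0.040}{2}\right) = 0.9804$ in the down state ($r = 4.00\%$), each with physical probability $p = \frac{1}{2}$. At $t = 1$: the zero pays 1.]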
Note, the current market price, $P_\$(0,1) = 0.95974$, is less than the expected price to prevail
tomorrow, discounted at the current interest rate,
$$\frac{1}{1+\frac{r}{2}}\,E_p[P(0.5,1)] = \frac{1}{1+\frac{0.0399}{2}}\left(\frac{1}{2}\,0.9779 + \frac{1}{2}\,0.9804\right) = 0.9600.$$
Hence, $p = \frac{1}{2}$ cannot be the risk-neutral probability. To find out the risk-neutral probability,
we proceed as follows. In the absence of arbitrage,
$$P_\$(0,1) = 0.95974 = \frac{1}{1+\frac{0.0399}{2}}\left[q \cdot 0.9779 + (1-q) \cdot 0.9804\right]$$
with obvious notation. This is one equation with one unknown, $q$, which is solved by $q = 0.605$.
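We may now proceed with pricing derivatives. Consider a European call option on the 1Y
zero, with expiration date in six months and exercise price equal to $K = 0.9785$. Its payoff is as
depicted below:
[Diagram: with risk-neutral probability $q = 0.605$ and $r = 3.99\%$ at $t = 0$, the 1Y zero at $t = 0.5$ is worth $P(0.5,1) = 0.9779$ in the up state, where $C_u = (P(0.5,1) - K)^+ = 0$, and $P(0.5,1) = 0.9804$ in the down state, where $C_d = (P(0.5,1) - K)^+ = 0.0019$; the price at $t = 0$, $C$, is to be determined.]
The no-arbitrage price of the option is
$$C = \frac{1}{1+\frac{0.0399}{2}}\left[q \cdot 0 + (1-q) \cdot 0.0019\right] \approx 7.4 \times 10^{-4}. \qquad (11.20)$$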
What happens when the short-term rate does not evolve as in the diagram of Figure 11.7
but, instead, as in Figure 11.8?
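[Diagram: the six-month short-term rate, annualized, starts at $r = 3.99\%$ at $t = 0$ and, at $t = 0.5$, moves up to $r = 4.4154\%$ or down to $r = 4.00\%$.]
FIGURE 11.8. The dynamics of the short-term rate: low interest rate scenario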
The previous tree is one where the short-term rate in the upper state of the world equals
$r_{up} = 4.4154\%$, not 4.50%, as in Figure 11.7. It implies that the price of the 1Y bond in 6 months
in this state is:
$$P_{up}(0.5,1) = \frac{1}{1+\frac{r_{up}}{2}} = \frac{1}{1+\frac{4.4154\%}{2}} = 0.9784.$$
The risk-neutral probability, $q$, solves:
$$P_\$(0,1) = 0.95974 = \frac{1}{1+\frac{r}{2}}\left[q\,P_{up}(0.5,1) + (1-q)\,P_{down}(0.5,1)\right] = \frac{1}{1+\frac{0.0399}{2}}\left[q \cdot 0.9784 + (1-q) \cdot 0.9804\right].$$
The solution is $q = 0.756$, which is higher than the solution we found earlier using the tree in
Figure 11.7 (i.e., $q = 0.605$). The option price is, now,
$$C = \frac{1}{1+\frac{0.0399}{2}}\left[q \cdot 0 + (1-q) \cdot 0.0019\right] \approx 4.5 \times 10^{-4}. \qquad (11.21)$$
The up-state of the world in Figure 11.8 is less severe than that in Figure 11.7. Why then is
the price in Eq. (11.21) smaller than that in Eq. (11.20)? To match the initial price $P_\$(0,1) =
0.95974$, the model in Figure 11.8 must put more weight on the up-state of the world, i.e. a
larger implied risk-neutral probability. This implies a larger risk-neutral probability that low
bond prices will arise in the future and, hence, a lower option price.6
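The comparison between the two scenarios can be reproduced with the closed-form solution for the implied probability. The sketch below takes the rounded bond prices reported in the text (0.9779, 0.9784, 0.9804) as inputs, since the implied probabilities are sensitive to rounding; function names are hypothetical.

```python
# Implied risk-neutral probability and option price in the two scenarios (illustrative sketch).

def implied_q(market_price, discount, p_up, p_down):
    """Solve market_price = discount * (q * p_up + (1 - q) * p_down) for q."""
    return (p_down - market_price / discount) / (p_down - p_up)

def call_price(q, discount, p_up, p_down, strike):
    """One-period call on the zero: discounted risk-neutral expected payoff."""
    payoff_up = max(p_up - strike, 0.0)
    payoff_down = max(p_down - strike, 0.0)
    return discount * (q * payoff_up + (1 - q) * payoff_down)

discount = 1 / (1 + 0.0399 / 2)          # six-month discount factor at t = 0
market_price, strike, p_down = 0.95974, 0.9785, 0.9804

results = {}
for label, p_up in (("Figure 11.7", 0.9779), ("Figure 11.8", 0.9784)):
    q = implied_q(market_price, discount, p_up, p_down)
    results[label] = (q, call_price(q, discount, p_up, p_down, strike))
    print(f"{label}: q = {q:.3f}, call = {results[label][1]:.6f}")
```

A smaller gap between the up and down bond prices forces a larger implied probability of the up state, and therefore a lower option price.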
In a segmented market, two investment banks might have different views about developments
in the short-term rate: the view in Figure 11.7 and that in Figure 11.8. The first bank favours
a high interest rate scenario, but it is not too risk-averse to that scenario ($r_{up} = 4.5\%$,
$q = 0.605$). The second bank favours a mild interest rate scenario, although it assigns a
greater chance of this scenario to arise ($r_{up} = 4.4154\%$, $q = 0.756$). But then, naturally, both
institutions need to agree on the initial bond price, $P_\$(0,1) = 0.95974$. (The first bank might
have, then, a quite conservative risk-management system although then its option prices are
higher than those of the second bank.) The segmentation could arise, for example, because the clientele
of the first bank and that of the second bank are unlikely to meet and the prices for the
option charged by the banks are not publicly known. In the absence of market imperfections
(and arbitrage), however, the investment banks should agree on the option price too. Note,
finally, that the price in Eq. (11.21) is almost half of that in Eq. (11.20). Derivatives can be
quite nonlinear objects, due to their optionality. A small deviation in the assumptions on the
short-term rate developments can lead to dramatic option pricing implications.
6 Mathematically, we have that $P_\$(0,1) = \frac{1}{1+\frac{r}{2}}\left(P_{down} - q\,(P_{down} - P_{up})\right)$, where $P_{down} - P_{up} > 0$. While $P_{down}$ predicted by Figure
11.7 is the same as that in Figure 11.8, the bond price volatility, i.e. the difference $P_{down} - P_{up}$, is lower in Figure 11.8 than in Figure
11.7. Therefore, the tree in Figure 11.8 is consistent with the given market price, $P_\$(0,1)$, only when $q$ increases from 0.605 to
0.756.
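Let us add a period in the tree of Figure 11.7, assuming that the short-term rate is as in the
following diagram:
[Diagram: short-term rate tree, annualized rates applied over six-month steps. At $t = 0$: $r = 3.99\%$. At $t = 0.5$: $r = 4.50\%$ or $r = 4.00\%$, with $q_0 = 0.605$. At $t = 1$: $r = 4.90\%$, $r = 4.30\%$ or $r = 3.90\%$, with $q_1$ to be determined.]
FIGURE 11.9.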
In this tree, $q_0$ is the risk-neutral probability for the first period, and $q_1$ is the risk-neutral
probability for the second period. We already know that $q_0 = 0.605$. The probability $q_1$ is
the risk-neutral probability for the time-period $(0.5, 1)$, and can differ from $q_0$. Suppose, also,
that an additional zero is available for trading, a 1.5Y zero. The current price of this zero is
$P_\$(0, 1.5) = 0.9382$. To derive the risk-neutral probability $q_1$, we calibrate the implied tree for
the 1.5Y zero, as follows.
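[Diagram: calibration tree for the 1.5Y zero. At $t = 1.5$: the zero pays 1. At $t = 1$: the zero is worth $1/\left(1+\frac{0.049}{2}\right) = 0.9761$ if $r = 4.90\%$, $1/\left(1+\frac{0.043}{2}\right) = 0.9789$ if $r = 4.30\%$, and $1/\left(1+\frac{0.039}{2}\right) = 0.9808$ if $r = 3.90\%$. At $t = 0.5$: the prices $P_u(0.5,1.5)$ ($r = 4.50\%$) and $P_d(0.5,1.5)$ ($r = 4.00\%$) and the probability $q_1$ are to be determined. At $t = 0$: $r = 3.99\%$ and $q_0 = 0.605$.]
By Eq. (11.18), the prices of the 1.5Y zero at $t = 0.5$ are
$$P_u(0.5,1.5) = \frac{1}{1+\frac{0.045}{2}}\left[q_1 \cdot 0.9761 + (1-q_1) \cdot 0.9789\right] \qquad (11.22)$$
$$P_d(0.5,1.5) = \frac{1}{1+\frac{0.040}{2}}\left[q_1 \cdot 0.9789 + (1-q_1) \cdot 0.9808\right] \qquad (11.23)$$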
The problem is that $q_1$ is not known. Therefore, Eqs. (11.22)-(11.23) do not allow us to pin down
the prices $P_u(0.5,1.5)$ and $P_d(0.5,1.5)$. But here is where calibration comes in. We know the
current price of the 1.5Y zero, which is $P_\$(0,1.5) = 0.9382$. In the absence of arbitrage,
$$P_\$(0,1.5) = 0.9382 = \frac{1}{1+\frac{0.0399}{2}}\left[q_0\,P_u(0.5,1.5) + (1-q_0)\,P_d(0.5,1.5)\right].$$
We already know that $q_0 = 0.605$. So we have
$$0.9382 = \frac{1}{1+\frac{0.0399}{2}}\left[0.605\,P_u(0.5,1.5) + 0.395\,P_d(0.5,1.5)\right] \qquad (11.24)$$
where $P_u(0.5,1.5)$ and $P_d(0.5,1.5)$ are as in Eqs. (11.22)-(11.23). Hence, replacing Eqs.
(11.22)-(11.23) into Eq. (11.24) leaves one equation with one unknown, $q_1$. Solving yields
$q_1 = 0.8412$, which implies that $P_u(0.5,1.5) = 0.9549$ and $P_d(0.5,1.5) = 0.9600$.
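To sum up, we have the tree below.
[Diagram: the tree of Figure 11.9 with the calibrated probabilities, $q_0 = 0.605$ for the first period and $q_1 \approx 0.84$ for the second. At $t = 0$: $r = 3.99\%$. At $t = 0.5$: $r = 4.50\%$ or $r = 4.00\%$. At $t = 1$: $r = 4.90\%$, $r = 4.30\%$ or $r = 3.90\%$.]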
We are ready to evaluate derivatives written on these zeros. Consider, for example, a call
option on the 1.5Y zero, with expiration date in 1Y and exercise price equal to $K = 0.9800$. The
price of the option at time $t = 0.5$ is either zero or $C = 0.00012$, as illustrated below.
[Diagram: with $q_1 = 0.841$, the option at $t = 0.5$ is worth $C = 0.00000$ in the up state and $C = 0.00012$ in the down state.]
The no-arbitrage price of the 1Y call option on the 1.5Y zero, struck at $K = 0.9800$, is:
$$C = \frac{1}{1+\frac{0.0399}{2}}\left[0 \cdot q_0 + 0.00012\,(1-q_0)\right] = 0.9804\,[0.00012\,(1-0.605)] = 4.647 \times 10^{-5}.$$
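The second-period calibration can be sketched in the same way. The implied $q_1$ is quite sensitive to rounding, so the snippet below takes as inputs the rounded discount factors (0.9804, 0.9779, 0.9804) and the rounded bond prices (0.9761, 0.9789, 0.9808) that appear in the text; this is an assumption made here to stay close to the reported figures, and unrounded inputs would lead to a noticeably different $q_1$. Names are hypothetical.

```python
# Calibrating the second-period probability q1 to the 1.5Y zero (illustrative sketch).

# Rounded inputs as reported in the text (assumption: rounded values are used throughout).
d0, d_u, d_d = 0.9804, 0.9779, 0.9804        # six-month discount factors at 3.99%, 4.50%, 4.00%
p_uu, p_ud, p_dd = 0.9761, 0.9789, 0.9808    # 1.5Y zero prices at t = 1
q0 = 0.605
market_price = 0.9382                        # current price of the 1.5Y zero

def zero_at_half(q1):
    """Prices of the 1.5Y zero at t = 0.5, as in Eqs. (11.22)-(11.23)."""
    p_u = d_u * (q1 * p_uu + (1 - q1) * p_ud)
    p_d = d_d * (q1 * p_ud + (1 - q1) * p_dd)
    return p_u, p_d

def model_price(q1):
    """Model price at t = 0, as in Eq. (11.24)."""
    p_u, p_d = zero_at_half(q1)
    return d0 * (q0 * p_u + (1 - q0) * p_d)

# model_price is linear in q1, so two evaluations pin down the solution.
f0, f1 = model_price(0.0), model_price(1.0)
q1 = (f0 - market_price) / (f0 - f1)
p_u, p_d = zero_at_half(q1)

# Call on the 1.5Y zero, expiring at t = 1, struck at 0.98.
strike = 0.98
c_u = d_u * (q1 * max(p_uu - strike, 0.0) + (1 - q1) * max(p_ud - strike, 0.0))
c_d = d_d * (q1 * max(p_ud - strike, 0.0) + (1 - q1) * max(p_dd - strike, 0.0))
call = d0 * (q0 * c_u + (1 - q0) * c_d)

print(f"q1 = {q1:.4f}, P_u = {p_u:.4f}, P_d = {p_d:.4f}")
print(f"C_u = {c_u:.5f}, C_d = {c_d:.5f}, call = {call:.3e}")
```

The option price computed this way is slightly above the 4.647e-05 in the text, which rounds the intermediate value of the option to 0.00012 before discounting.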
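We can use the tree in Figure 11.9 to price additional derivatives, such as, say, a call option
on the 1.5Y zero, with expiration date in six months, and exercise price equal to 0.9580. We
have the following tree.
[Diagram: with $q_0 = 0.605$ and $r = 3.99\%$ at $t = 0$, the option price at $t = 0$, $C$, is to be determined from its payoffs at $t = 0.5$.]
$$C = \frac{1}{1+\frac{0.0399}{2}}\left[q_0 \cdot 0 + (1-q_0) \cdot 0.0020\right] \approx 7.7 \times 10^{-4}.$$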
11.4.3.5 Summing up
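What have we done? Our starting point is the tree in Figure 11.9, which we use to recover the
two risk-neutral probabilities $q_0$ (for the time span $(0, 0.5)$) and $q_1$ (for the time span $(0.5, 1)$),
using the information about the market price of two zeros, the 1Y and the 1.5Y. Precisely, given
$P_\$(0,1)$, the price of the 1Y zero, we recover $q_0$, as illustrated below:
[Diagram: from $P_\$(0,1)$ at $t = 0$, the 1Y zero moves, with probability $q_0$, to $P_u(0.5,1)$ or, otherwise, to $P_d(0.5,1)$ at $t = 0.5$, and pays 1 at $t = 1$.]
This is possible as $P_u(0.5,1)$ and $P_d(0.5,1)$ do not depend on $q_0$, and are determined
in a straightforward manner. Given $q_0$, we determine $q_1$, using
[Diagram: from $P_\$(0,1.5)$ at $t = 0$, the 1.5Y zero moves, with probability $q_0$, to $P_u(0.5,1.5)$ or $P_d(0.5,1.5)$ at $t = 0.5$; then, with probability $q_1$, to its prices at $t = 1$, such as $P_{dd}(1,1.5)$; it pays 1 at $t = 1.5$.]
[Diagram: the same procedure extended by one period, over $t = 0, 0.5, 1, 1.5, 2$, with probabilities $q_0$, $q_1$ and $q_2$, and the market price $P_\$(0,2)$ at $t = 0$.]
FIGURE 11.10.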
Suppose that a two year zero coupon bond is traded for a price equal to $P_\$(0,2) = 0.95500$.
We assume that the short-term rate evolves over time according to the tree described in the
following diagram.
[Diagram: binomial tree for the short-term rate over $t = 0, 1, 2, 3$, starting from $r = 2\%$ at $t = 0$.]
Suppose that a European call option written on a three year zero coupon bond is traded.
This option has a strike price equal to 0.97000, expires in two years, and quotes for $C_\$(0,2) =
1.0141 \times 10^{-3}$. We can use the price of this derivative to find the no-arbitrage price of a three
year bond which, every year, pays off 3% of the principal of 1.00. Precisely, we use the price
of the two year zero coupon bond to recover the risk-neutral probability applying to the first
year, and the price of the option to recover the risk-neutral probability applying to the second
year. With these probabilities, we determine the no-arbitrage price of the three year bond. (We
assume the two probabilities are state-independent, for otherwise we would need the price of
additional assets to reverse-engineer state-dependent risk-neutral probabilities.)
So we know that $P_\$(0,2) = 0.95500$. Moreover, as illustrated below, we can extract the price
of the 2Y bond in the up- and down- states of the world at time $t = 1$.
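[Diagram: the 2Y zero pays 1 at $t = 2$; at $t = 1$, it is worth $P_u(1,2) = \frac{1}{1.03}$ in the up state ($r = 3\%$) and $P_d(1,2) = \frac{1}{1.02}$ in the down state ($r = 2\%$); at $t = 0$, $r = 2\%$ and the probability $q_0$ is to be determined.]
That is, $P_u(1,2) = \frac{1}{1.03}$ and $P_d(1,2) = \frac{1}{1.02}$. Therefore, $q_0$ solves:
$$P_\$(0,2) = 0.95500 = \frac{1}{1.02}\left(q_0\,P_u(1,2) + (1-q_0)\,P_d(1,2)\right) = \frac{1}{1.02}\left(q_0 \cdot 0.97087 + (1-q_0) \cdot 0.98039\right).$$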
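Solving for $q_0$ yields $q_0 = 0.6607$. We use this probability, and the price of the option, $C_\$(0,2)$,
to solve for the risk-neutral probability relevant to the second period, as illustrated below.
[Diagram: the option on the 3Y zero, with strike $K = 0.97$. At $t = 2$: $P_{uu}(2,3) = \frac{1}{1.035} = 0.96618$ ($r = 3.5\%$), $P_{ud}(2,3) = \frac{1}{1.03} = 0.97087$ ($r = 3\%$), $P_{dd}(2,3) = \frac{1}{1.02} = 0.98039$ ($r = 2\%$); the zero pays 1 at $t = 3$. At $t = 1$: the option values $C_u$ ($r = 3\%$) and $C_d$ ($r = 2\%$) and the probability $q_1$ are to be determined. At $t = 0$: $r = 2\%$ and $q_0 = 0.6607$.]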
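In this tree, $K = 0.97000$ is the strike price of the option. The option price at time $t = 1$, in
the two states, can be either $C_u$ or $C_d$, where:
$$C_u = \frac{1}{1.03}\left[q_1 \max\{P_{uu}(2,3) - K, 0\} + (1-q_1)\max\{P_{ud}(2,3) - K, 0\}\right] = \frac{1}{1.03}\,(1-q_1) \cdot 0.00087$$
$$C_d = \frac{1}{1.02}\left[q_1 \max\{P_{ud}(2,3) - K, 0\} + (1-q_1)\max\{P_{dd}(2,3) - K, 0\}\right] = \frac{1}{1.02}\left[q_1 \cdot 0.00087 + (1-q_1) \cdot 0.01039\right]$$
Therefore, $q_1$ solves:
$$C_\$(0,2) = 1.0141 \times 10^{-3} = \frac{1}{1.02}\left(q_0\,C_u + (1-q_0)\,C_d\right) = \frac{1}{1.02}\left(0.6607\,C_u + 0.3393\,C_d\right)$$
$$= \frac{1}{1.02}\left\{0.6607\,\frac{1}{1.03}\,(1-q_1) \cdot 0.00087 + 0.3393\,\frac{1}{1.02}\left[q_1 \cdot 0.00087 + (1-q_1) \cdot 0.01039\right]\right\}$$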
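[Diagram: with $q_0 = 0.6607$ and $q_1 = 0.8000$, the prices of the 3Y zero at $t = 1$, $P_u(1,3)$ (where $r = 3\%$) and $P_d(1,3)$ (where $r = 2\%$), are to be determined.]
FIGURE 11.11.
The price of the 3Y zero implied by $P_\$(0,2)$ and $C_\$(0,2)$ is
$$P(0,3) = \frac{1}{1.02}\left[q_0\,P_u(1,3) + (1-q_0)\,P_d(1,3)\right] = \frac{1}{1.02}\left(0.6607 \cdot 0.93895 + 0.3393 \cdot 0.95370\right) = 0.92545.$$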
We are now ready to evaluate the 3Y bond with 3% coupon rate. It is,
$$P_{\text{coupon}=3\%}(0,3) = 0.03\,[P_\$(0,1) + P_\$(0,2) + P(0,3)] + P(0,3) = 0.03\,(0.98039 + 0.95500 + 0.92545) + 0.92545 = 1.0113.$$
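The whole procedure, from the two market prices to the coupon bond, can be sketched as follows. The snippet uses the rates shown in the diagrams above (2% today; 3% or 2% at time 1; 3.5%, 3% or 2% at time 2) and unrounded intermediate prices, so its output differs slightly from the rounded figures in the text. Names are hypothetical.

```python
# Recovering q0 from a zero and q1 from an option, then pricing a coupon bond (illustrative sketch).

r0, r_u, r_d = 0.02, 0.03, 0.02                      # short-term rates at t = 0 and t = 1
p_uu, p_ud, p_dd = 1 / 1.035, 1 / 1.03, 1 / 1.02     # 3Y zero prices at t = 2
strike = 0.97
p2_market, call_market = 0.95500, 1.0141e-3          # 2Y zero price and option price

# Step 1: q0 from the 2Y zero, whose prices at t = 1 do not depend on any probability.
p_u12, p_d12 = 1 / (1 + r_u), 1 / (1 + r_d)
q0 = (p_d12 - p2_market * (1 + r0)) / (p_d12 - p_u12)

def roll_back(q1, v_uu, v_ud, v_dd):
    """Value at t = 0 of a claim worth v_uu, v_ud or v_dd at t = 2."""
    v_u = (q1 * v_uu + (1 - q1) * v_ud) / (1 + r_u)
    v_d = (q1 * v_ud + (1 - q1) * v_dd) / (1 + r_d)
    return (q0 * v_u + (1 - q0) * v_d) / (1 + r0)

# Step 2: q1 from the option; the model price is linear in q1.
payoffs = [max(p - strike, 0.0) for p in (p_uu, p_ud, p_dd)]
c0, c1 = roll_back(0.0, *payoffs), roll_back(1.0, *payoffs)
q1 = (c0 - call_market) / (c0 - c1)

# Step 3: 3Y zero and 3% coupon bond with principal 1.
p3 = roll_back(q1, p_uu, p_ud, p_dd)
p1 = 1 / (1 + r0)
coupon_bond = 0.03 * (p1 + p2_market + p3) + p3

print(f"q0 = {q0:.4f}, q1 = {q1:.4f}")
print(f"P(0,3) = {p3:.5f}, coupon bond = {coupon_bond:.4f}")
```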
The discretely compounded yield curve implied by the previous calculations is given by
$R_{0,1} = 2.00\%$ (1Y); $R_{0,2}$: $0.95500 = (1 + R_{0,2})^{-2}$, or $R_{0,2} = 2.328\%$ (2Y); and $R_{0,3}$: $0.92545 =
(1 + R_{0,3})^{-3}$, or $R_{0,3} = 2.616\%$ (3Y). Note, we are capable of computing the yield curve without
knowing all bond data, but inverting some of them from the price of an option! We can go
further. Suppose the price of the missing bond becomes available, so to speak. We want to
make sure this price is consistent with absence of arbitrage. Suppose, for example, that the
market price is $P_\$(0,3) > P(0,3) = 0.92545$, say. Then, we can sell short the 3Y zero, and set
up a dynamic, self-financing strategy aiming to replicate the 3Y zero, i.e. capable of delivering
$1 at maturity.
We would proceed as follows. Consider the tree in Figure 11.11. We build up a portfolio, which is long the option and a MMA. We assume the 3Y bond converges to the values $P_{uu}(2,3)$, $P_{ud}(2,3)$ and $P_{dd}(2,3)$ in Figure 11.11, for otherwise we might implement a trivial arbitrage from time $t=2$ to $t=3$. At time $t=0$, we go long $\Delta_0$ options and $M_0$ units of the MMA, so as to make sure that the portfolio delivers $P_u(1,3)$ in the upstate or $P_d(1,3)$ in the downstate, thereby ensuring that the price of the bond is replicated at time $t=1$. The value of this replicating strategy is, of course, $P(0,3)$, so by short-selling the 3Y bond at $t=0$, we obtain an initial profit, equal to $P^{\$}(0,3) - P(0,3)$. Suppose, then, that at time $t=1$, we are in the up-node, such that the bond price is $P_u(1,3)$. In this node, we can build up another portfolio long $\Delta_1$ options and $M_1$ units of the MMA, aiming to replicate the price of the bond at time $t=2$, either $P_{uu}(2,3)$ or $P_{ud}(2,3)$. The value of this replicating portfolio would be just $P_u(1,3)$, which is what is obtained by the replicating strategy implemented at time $t=0$. The strategy is clearly self-financed, as the following calculations reveal. By construction, $\Delta_0 C_u(1) + M_0(1+r) = P_u(1,3) = \Delta_1 C_u(1) + M_1$ (with $r = 2\%$), and $\Delta_1 C(2) + M_1(1+r_u) = P(2,3)$, where $r_u = 3\%$, and: $C(2)$, $P(2,3)$ are either $C_{uu}(2)$, $P_{uu}(2,3)$, or $C_{ud}(2)$, $P_{ud}(2,3)$, at time $t=2$,
(2 3)
(1 3) =
(2)
(1)
+
1
(2)
=
(1)
1
(1) +
(1 3)
Likewise, if, instead, at time $t=1$, we end up in the down-node, where the bond price (and the value of the strategy implemented at time $t=0$) is $P_d(1,3)$, we can invest $P_d(1,3)$ in options and MMA so as to replicate the price of the bond at time $t=2$, either $P_{ud}(2,3)$ or $P_{dd}(2,3)$. The presence of dynamically complete markets allows us to implement an arbitrage.
11.4.4.2 Arrow-Debreu securities, and the pricing of interest rate derivatives
Arrow-Debreu securities are assets that only pay off in a specific state of the world, as explained in Chapter 2. We shall deal with these securities in more detail in Section 11.7, because they allow us to implement perfectly fitting models quite elegantly. This section is a first introduction to them. We make use of Arrow-Debreu securities to, firstly, extract risk-neutral probabilities, and, secondly, to price quite basic interest rate derivatives, such as a caplet or a forward rate agreement. Likewise, the pricing of interest rate derivatives covered in this section is a preliminary introduction, as the next chapter will systematically deal with it, within a continuous time setting.
Extracting risk-neutral probabilities from Arrow-Debreu securities
Assume that the discretely compounded one-year rate, or the short-term rate, evolves over time as described by the following tree:

t = 0        t = 1        t = 2         t = 3
                                        uuu: r = 7.5%
                          uu: r = 7%
             u: r = 6%                  uud: r = 6.0%
r = 5%                    ud: r = 5%
             d: r = 4%                  udd: r = 4.5%
                          dd: r = 4%
                                        ddd: r = 3.0%
q0           q1           q2
Assume three securities are available for trading: (i) a zero coupon bond expiring in two years, quoting for 0.91000; (ii) a zero coupon bond expiring in three years, quoting for 0.86500; (iii) an Arrow-Debreu security, paying off $1 only at time 3 in the state uuu of the previous diagram, where the short-term rate equals 7.5%, quoting for 0.10000.
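Assume that the risk-neutral probabilities of upward movements in the short-term rate change over time and take three values, $q_0$, $q_1$ and $q_2$, but are independent of the state of nature, as illustrated in the previous diagram. We can calibrate these probabilities, through the previously given available market data. First, we derive $q_0$ using the price of the two year bond, $P^{\$}(0,2)$ say, as follows:
$$P^{\$}(0,2) = 0.91000 = \frac{1}{1.05}\left(q_0 P_u(1,2) + (1-q_0) P_d(1,2)\right),$$
where
$$P_u(1,2) = \frac{1}{1.06} = 0.94340, \qquad P_d(1,2) = \frac{1}{1.04} = 0.96154.$$
Solving for $q_0$ yields $q_0 = 0.33284$. Next, we calibrate $q_1$ to match the price of the three year bond, $P^{\$}(0,3)$ say. We have,
$$P^{\$}(0,3) = 0.86500 = \frac{1}{1.05}\left(q_0 P_u(1,3) + (1-q_0) P_d(1,3)\right) = \frac{1}{1.05}\left(0.33284\, P_u(1,3) + 0.66716\, P_d(1,3)\right),$$
where
$$P_u(1,3) = \frac{1}{1.06}\left(q_1 P_{uu}(2,3) + (1-q_1) P_{ud}(2,3)\right), \qquad P_d(1,3) = \frac{1}{1.04}\left(q_1 P_{ud}(2,3) + (1-q_1) P_{dd}(2,3)\right),$$
and
$$P_{uu}(2,3) = \frac{1}{1.07} = 0.93458, \qquad P_{ud}(2,3) = \frac{1}{1.05} = 0.95238, \qquad P_{dd}(2,3) = \frac{1}{1.04} = 0.96154.$$
Solving for $q_1$ yields $q_1 = 0.66507$. Finally, we calibrate $q_2$ through the price of the Arrow-Debreu security, $AD_{uuu}(3)$ say,
$$AD_{uuu}(3) = 0.10000 = \frac{1}{1.05}\,\frac{1}{1.06}\,\frac{1}{1.07}\; q_0\, q_1\, q_2,$$
i.e.
$$q_2 = \frac{0.10000}{\frac{1}{1.05}\,\frac{1}{1.06}\,\frac{1}{1.07} \times 0.33284 \times 0.66507} = 0.53798.$$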
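The three calibration steps can be reproduced numerically. The sketch below relies on the fact that each model price is affine in the unknown probability, so one linear solve per step suffices; the helper names (`solve_affine`, `p03_model`) are illustrative and not part of the text.

```python
def solve_affine(f, target):
    """Solve f(q) = target when f is affine in q, using two evaluations."""
    f0, f1 = f(0.0), f(1.0)
    return (target - f0) / (f1 - f0)


def p02_model(q0):
    """Two-year zero: discount the time-1 prices 1/1.06 (up) and 1/1.04 (down)."""
    return (q0 / 1.06 + (1.0 - q0) / 1.04) / 1.05


def p03_model(q1, q0):
    """Three-year zero, given q0 and a candidate q1."""
    p_uu, p_ud, p_dd = 1 / 1.07, 1 / 1.05, 1 / 1.04
    p_u = (q1 * p_uu + (1.0 - q1) * p_ud) / 1.06
    p_d = (q1 * p_ud + (1.0 - q1) * p_dd) / 1.04
    return (q0 * p_u + (1.0 - q0) * p_d) / 1.05


q0 = solve_affine(p02_model, 0.91000)
q1 = solve_affine(lambda q: p03_model(q, q0), 0.86500)
# The uuu Arrow-Debreu price equals q0*q1*q2 discounted along the path u, u, u.
q2 = 0.10000 * (1.05 * 1.06 * 1.07) / (q0 * q1)
```

The sequential structure matters: $q_1$ can only be backed out once $q_0$ is known, and $q_2$ once both are.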
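Next, we use the previously calibrated probabilities to price some interest rate derivatives. First, consider a caplet contingent on the rates prevailing at time $t=3$, paying off at time $t=4$, with strike rate equal to 5%, and notional value equal to $100. The payoff of this derivative instrument at $t=4$ is $\max\{r - 5\%, 0\}$ per unit of notional, where $r$ is the rate at $t=3$. Therefore, the discounted payoffs at time $t=3$ are:

uuu: $\frac{1}{1.075}\max\{7.5-5,\,0\} = 2.32560$
uud: $\frac{1}{1.06}\max\{6-5,\,0\} = 0.94340$
udd: $\frac{1}{1.045}\max\{4.5-5,\,0\} = 0$
ddd: $\frac{1}{1.03}\max\{3-5,\,0\} = 0$

Moving backward along the tree, the values of the caplet at $t=2$ are:

uu: $\frac{1}{1.07}\left(q_2 \times 2.32560 + (1-q_2) \times 0.94340\right) = 1.5766$
ud: $\frac{1}{1.05}\left(q_2 \times 0.94340 + (1-q_2) \times 0\right) = 0.48336$
dd: $0$

at $t=1$:

u: $\frac{1}{1.06}\left(q_1 \times 1.5766 + (1-q_1) \times 0.48336\right) = 1.1419$
d: $\frac{1}{1.04}\left(q_1 \times 0.48336 + (1-q_1) \times 0\right) = 0.30910$

and, finally, at $t=0$:

$\frac{1}{1.05}\left(q_0 \times 1.1419 + (1-q_0) \times 0.30910\right) = 0.55837$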
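The backward recursion used for the caplet is generic: it applies to any payoff known at $t=3$. A minimal sketch follows, with the tree stored as lists ordered from the highest to the lowest rate at each date; the names `RATES`, `PROBS` and `backward_price` are illustrative.

```python
# Short-term rate tree of the text: RATES[t][k], k = 0 is the highest-rate node.
RATES = [[0.05], [0.06, 0.04], [0.07, 0.05, 0.04], [0.075, 0.06, 0.045, 0.03]]
# Calibrated risk-neutral probabilities of an upward rate move at t = 0, 1, 2.
PROBS = [0.33284, 0.66507, 0.53798]


def backward_price(values_t3, rates=RATES, probs=PROBS):
    """Discount values known at t = 3 back to t = 0 along the recombining tree."""
    values = list(values_t3)
    for t in range(len(probs) - 1, -1, -1):
        q = probs[t]
        values = [
            (q * values[k] + (1.0 - q) * values[k + 1]) / (1.0 + rates[t][k])
            for k in range(t + 1)
        ]
    return values[0]


NOTIONAL, STRIKE = 100.0, 0.05
# Payoff is set at t = 3 and paid at t = 4, hence the extra one-period discount.
caplet_t3 = [NOTIONAL * max(r - STRIKE, 0.0) / (1.0 + r) for r in RATES[3]]
caplet = backward_price(caplet_t3)
```

Only the list of terminal values changes from one derivative to the next; the recursion itself is unchanged.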
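Next, consider a forward rate agreement, whereby at time $t=0$, two counterparties agree that at time $t=4$, they will exchange with each other the variable short-term rate prevailing at time $t=3$, against a fixed interest rate equal to $K$. We can use the previously calibrated probabilities to determine the forward rate, i.e. the level of $K$ that makes the value of this agreement equal to zero at time $t=0$. Take the case of a payer forward agreement, one for which the discounted payoffs at time $t=3$ are:

uuu: $\frac{1}{1.075}(7.5 - K)$
uud: $\frac{1}{1.06}(6 - K)$
udd: $\frac{1}{1.045}(4.5 - K)$
ddd: $\frac{1}{1.03}(3 - K)$

Moving backward along the tree, the values of the agreement at $t=2$ are:

uu: $\frac{1}{1.07}\left(q_2 \frac{7.5-K}{1.075} + (1-q_2)\frac{6-K}{1.06}\right) = 5.9519 - 0.87506\,K$
ud: $\frac{1}{1.05}\left(q_2 \frac{6-K}{1.06} + (1-q_2)\frac{4.5-K}{1.045}\right) = 4.7950 - 0.90443\,K$
dd: $\frac{1}{1.04}\left(q_2 \frac{4.5-K}{1.045} + (1-q_2)\frac{3-K}{1.03}\right) = 3.5215 - 0.92632\,K$

and, at $t=1$, with the $t=2$ values just obtained:

u: $\frac{1}{1.06}\left(q_1 (5.9519 - 0.87506\,K) + (1-q_1)(4.7950 - 0.90443\,K)\right) = 5.2495 - 0.83481\,K$
d: $\frac{1}{1.04}\left(q_1 (4.7950 - 0.90443\,K) + (1-q_1)(3.5215 - 0.92632\,K)\right) = 4.2004 - 0.87669\,K$

We can now express the value of the contract as a function of the fixed rate $K$, as follows:
$$\text{Fwd}(K) = \frac{1}{1.05}\left(q_0 (5.2495 - 0.83481\,K) + (1-q_0)(4.2004 - 0.87669\,K)\right) = 4.3329 - 0.82167\,K \qquad (11.25)$$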
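The coefficients multiplying $K$ are the prices of the zero expiring at time 4 at the corresponding nodes, which are obtained as follows. At $t=3$:

uuu: $P_{uuu}(3,4) = \frac{1}{1.075} = 0.93023$
uud: $P_{uud}(3,4) = \frac{1}{1.06} = 0.94340$
udd: $P_{udd}(3,4) = \frac{1}{1.045} = 0.95694$
ddd: $P_{ddd}(3,4) = \frac{1}{1.03} = 0.97087$

At $t=2$:

uu: $P_{uu}(2,4) = \frac{1}{1.07}\left(q_2 P_{uuu}(3,4) + (1-q_2) P_{uud}(3,4)\right) = 0.87506$
ud: $P_{ud}(2,4) = \frac{1}{1.05}\left(q_2 P_{uud}(3,4) + (1-q_2) P_{udd}(3,4)\right) = 0.90443$
dd: $P_{dd}(2,4) = \frac{1}{1.04}\left(q_2 P_{udd}(3,4) + (1-q_2) P_{ddd}(3,4)\right) = 0.92632$

At $t=1$:

u: $P_u(1,4) = \frac{1}{1.06}\left(q_1 P_{uu}(2,4) + (1-q_1) P_{ud}(2,4)\right) = 0.83481$
d: $P_d(1,4) = \frac{1}{1.04}\left(q_1 P_{ud}(2,4) + (1-q_1) P_{dd}(2,4)\right) = 0.87669$

Finally, at $t=0$:

$$P(0,4) = \frac{1}{1.05}\left(q_0 P_u(1,4) + (1-q_0) P_d(1,4)\right) = 0.82167$$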
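Because the value of the agreement is affine in the fixed rate, $\text{Fwd}(K) = A - B\,K$, it can be recovered from two backward passes: one for the floating payoff ($A$) and one for the zero expiring at time 4 ($B$). The sketch below does this with rates quoted in percent, as in the text; the names are illustrative, and the break-even level `k_star = A / B` is computed here as an illustration of how Eq. (11.25) would be solved for $K$.

```python
RATES = [[0.05], [0.06, 0.04], [0.07, 0.05, 0.04], [0.075, 0.06, 0.045, 0.03]]
PROBS = [0.33284, 0.66507, 0.53798]


def backward_price(values_t3, rates=RATES, probs=PROBS):
    """Discount values known at t = 3 back to t = 0 along the recombining tree."""
    values = list(values_t3)
    for t in range(len(probs) - 1, -1, -1):
        q = probs[t]
        values = [
            (q * values[k] + (1.0 - q) * values[k + 1]) / (1.0 + rates[t][k])
            for k in range(t + 1)
        ]
    return values[0]


# A: present value of receiving the floating rate (in percent) at t = 4.
a_coef = backward_price([100.0 * r / (1.0 + r) for r in RATES[3]])
# B: present value of one unit paid at t = 4, i.e. the four-year zero P(0,4).
b_coef = backward_price([1.0 / (1.0 + r) for r in RATES[3]])


def fwd_value(k_percent):
    """Value at t = 0 of the payer agreement for a fixed rate quoted in percent."""
    return a_coef - b_coef * k_percent


k_star = a_coef / b_coef  # fixed rate (in percent) at which the agreement is worth zero
```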
that the underlying price satisfies the martingale condition. Instead, interest rate derivatives generally depend on non-traded risks, which are not martingales. Moreover, the mere presence of boundary conditions induces bond return volatility to be time-varying.
Ho and Lee address these issues by modeling the movements of the entire collection of bond prices. We have three ways to achieve this task: (i) by making risk-neutral probabilities time-varying, for a given tree with predetermined values of the short-term rate (as in the previous sections); (ii) by assuming a constant risk-neutral probability, and searching for the values of the short-term rate on the tree; (iii) by a combination of (i) and (ii) as in the implied binomial tree of Chapter 10. The Ho and Lee model relies on the second way. The key element of this model is the determination of the no-arbitrage ups and downs of the entire yield curve, through modeling bond prices.
11.5.1 The tree
The key element of the model is the determination of the no-arbitrage ups and down of the
entire yield curve, obtained by directly modeling bond prices of arbitrary expiration. Note that
once bond prices are obtained, forward rates are obtained as a result. Therefore, the Ho and
Lee model is a model of forward rate movements. It is a simple but powerful remark, because
the key point of the model is, then, to re-express bond prices again, as a function of future
forward rates. In this sense, the Ho and Lee model is a representation of current bond prices
(in terms of forward rates), rather than a model of current bond prices.
Assume that the price of any zero evolves according to a binomial tree. Let $P_j(t,T)$ be the price of a pure discount bond as of time $t$, with time to maturity $T-t$, after $j$ upstate price movements of the bond price. Let $j \sim B(t, 1-q)$, a binomial random variable, meaning that
$$\Pr(j) = \binom{t}{j}(1-q)^j q^{t-j},$$
where $1-q$ is the risk-neutral probability of an upstate price movement. We have,

$P_j(t,T)$ moves to $P_{j+1}(t+1,T)$ with probability $1-q$, or to $P_j(t+1,T)$ with probability $q$.

That is, if at time $t$, the number of upstate movements is equal to $j$ then, at time $t+1$, the number of upstate movements can either jump to $j+1$, with probability $1-q$, or stay at $j$, with probability $q$. (Therefore, we are now following the convention to have high values of bond prices in the upper parts of the tree.) Note, further, that after one period, any zero is one period closer to maturity. At maturity, $t=T$, the price of any zero is worth one unit of numeraire, viz
$$P_j(T,T) = 1, \quad \text{for all } j \text{ and } T.$$
In the absence of arbitrage, bond prices satisfy,
$$P_j(t,T) = P_j(t,t+1)\left[(1-q)\,P_{j+1}(t+1,T) + q\,P_j(t+1,T)\right] \qquad (11.26)$$
where $P_j(t,t+1) = e^{-r_j(t)}$, and $r_j(t)$ is the continuously compounded short-term rate at time $t$ after $j$ upward movements. We call this condition the martingale restriction.
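Let us introduce notation for the movements of the price of any zero along the tree,
$$P_{j+1}(t+1,T) = \underbrace{\frac{P_j(t,T)}{P_j(t,t+1)}\,u(T-(t+1))}_{\text{up at } t+1} \quad \text{and} \quad P_j(t+1,T) = \underbrace{\frac{P_j(t,T)}{P_j(t,t+1)}\,d(T-(t+1))}_{\text{down at } t+1} \qquad (11.27)$$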
The two functions, $u(\cdot)$ and $d(\cdot)$, also called perturbation functions, are taken to be state-independent. They capture the fact that in the case of uncertainty, the price of the zero can either go up or down with respect to the risk-free rate of return. In other words, Eqs. (11.27) tell us that the discounted gross return from going long a bond is:
$$\underbrace{P_j(t,t+1)}_{\text{Discount}} \times \underbrace{\frac{P(t+1,T)}{P_j(t,T)}}_{\text{Gross return}} = \begin{cases} u(T-(t+1)) & \text{with probability } 1-q \\ d(T-(t+1)) & \text{with probability } q \end{cases} \qquad (11.28)$$
By the martingale restriction in Eq. (11.26), the expectation of this discounted gross return equals one, which requires that, for any time to maturity $T-t$,
$$(1-q)\,u(T-t) + q\,d(T-t) = 1 \qquad (11.29)$$
This relation matches the martingale restriction introduced in Chapter 10, and applying to stocks, the current values of which are tied down to their future up and down movements through the risk-neutral probability. The difference in this context is that the up and down movements of the prices depend on the asset maturity, through the two functions $u(T-t)$ and $d(T-t)$, which are endogenous, which makes the evaluation problem more intricate.
11.5.3 The recombining condition and interest rate volatility
Ho and Lee consider a recombining tree: the price $P_j(t,T)$ we are looking for depends only on $j$, not on the exact sequence of up and down movements leading to $j$ upstate movements. To summarize, we are looking for two functions $u(T-t)$ and $d(T-t)$ such that (i) the no-arbitrage condition in Eq. (11.29) holds true and (ii) the tree is recombining. We now elaborate
+1
( +1
( +2
+1
( +2
&
&
( +1
+2
&
( +2
The recombining property of the tree implies that the bond price at time + 2 in the event
of + 1 jumps, i.e. +1 ( + 2 ), can be generated by one of the two paths, which we track
by using the two Eqs. (11.27):
(i) The up & down path,
+1
( +2
)=
)=
+1
}|
+1
( +1
{
)
+ 1)
(
(
(
{z
( +1
( +1
( +2
1)
), where,
1
+1 ( + 1
+1
( +2
1)
1)
(11.30)
), where
(
(
{
)
+ 1)
(
{z
1)
1
( +1
up at +1
(
(
+ 2)
}
), down at
}|
+1
), up at
(
(
down at +1
( +2
+1 (
+1
( + 1 + 2)
( + 1 + 2)
+1
(1
+ 2)
}
(11.31)
(11.32)
where we take the ratio to be constant, and return to its interpretation in a moment. Eq.
(11.32) is a nite-di erence equation for the ratio (
) (
). Given the boundary
conditions in Eqs. (11.28), the solution for this ratio is:
(
(
)
=
)
1)
(11.33)
We claim that $\delta$ in Eq. (11.32) is a constant relating to the volatility of the short-term rate. Indeed, by taking logs, we obtain:
ln
+1
( + 1)
( + 1)]
(11.34)
[ ( + 1)]
1
1)
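$$d(T-t) = \delta^{T-t}\,u(T-t), \qquad (1-q)\,u(T-t) + q\,d(T-t) = 1.$$
The solution to this system is,
$$u(T-t) = \frac{1}{(1-q) + q\,\delta^{T-t}}, \qquad d(T-t) = \frac{\delta^{T-t}}{(1-q) + q\,\delta^{T-t}} \qquad (11.35)$$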
So the problem is now solved. Once we assign values to $\delta$ and $q$, and an initial bond price $P_0(0,T)$, we plug Eqs. (11.35) into Eqs. (11.27), and populate the tree, i.e. we place bond prices on each node of the tree. Once this is done, we can price interest rate derivatives, i.e. assets with payoffs indexed to bond prices or interest rates over a given set of nodes. Note that Eqs. (11.27) dictate the law of motion of the entire structure of bond prices along the trees, and they are silent regarding the initial condition, i.e. at time $t=0$. The natural initial condition is the market price of the bond for each maturity, say $P_0(0,T) = P^{\$}(0,T)$ for all $T$.
We can actually do more, and develop a closed-form solution for the bond price. Let
()
be the forward rate as of time after the occurrence of upward price movements, and let the
continuously compounded forward rate ( ) be dened as,
( ) ln 1 +
()
In Appendix 2, we show that,
)=
(0 )
(0 )
1
=
()
(0))
(11.36)
That is, we can express the price in closed-form, once we are able to do the same with the
forward rate changes, ( ) (0). In Appendix 2, we show that,
)
( ) = (0) + ln ( + 1
( + 1)
) ln
(11.37)
We replace Eq. (11.37) into Eq. (11.36), use the solution for the perturbation function $u(\cdot)$ in Eqs. (11.35), and using the condition that the initial node of the tree is the same as the market price for each $T$, we find that:
$$P_j(t,T) = \frac{P^{\$}(0,T)}{P^{\$}(0,t)}\;\delta^{(T-t)(t-j)} \prod_{i=0}^{t-1} \frac{(1-q) + q\,\delta^{i}}{(1-q) + q\,\delta^{T-t+i}} \qquad (11.38)$$
From the perspective of time 0, the price of the zero at $t$, in each state $j$, is only a function of the initial yield curve, the volatility parameter $\delta$, and the risk-neutral probability $q$. Note in particular how important this volatility is. It is, in practice, the only parameter to be determined once we fix $q$, which leads to an interesting parallel between this model and that of Black and Scholes (1973). In both models, the inputs are the volatilities of the fundamentals (the short-term interest rate, in Ho & Lee, and stock returns, in Black & Scholes). However, in Ho & Lee, the input does not link to the short-term rate, but the entire yield curve, and the output relates to future bond price movements. In Black & Scholes, the input is that of the fundamental, the stock price, and the output is the initial option price.
Finally, the model displays the properties we would require from a perfectly fitting one, as illustrated by the next picture. First, it matches each price at time zero, as can be verified by collapsing $t$ to 0. Second, it bridges $P_j(t,T)$ to one when $t=T$. Third, it is random at any time $0<t<T$, taking the value $P_j(t,T)$ with probability $\binom{t}{j}(1-q)^j q^{t-j}$. Naturally, the values taken by $P_j(t,T)$ are determined by the equilibrium of the model, i.e. Eq. (11.38). All in all, the model predicts random outcomes at time $t$, ensuring at the same time that the initial yield curve is fitted without errors. The reason we might be interested in random outcomes at time $t$, is that we might have to evaluate options expiring at that date, through a model that predicts the underlying to be pinned down without errors as explained.
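[Figure: bond price paths starting at $P_0(0,T)$ at time 0, fanning out to the random values $P_j(t,T)$, $j = 0, \ldots, t$, at time $t$, and converging to $P_j(T,T) = 1$ at maturity.]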
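We need to estimate the value of $\delta$. We can proceed as follows. Consider Eq. (11.38), and let $T = t+1$. We have,
$$P_j(t,t+1) = \frac{P^{\$}(0,t+1)}{P^{\$}(0,t)}\,\frac{\delta^{t-j}}{(1-q) + q\,\delta^{t}}$$
The continuously compounded short-term rate predicted by the model is,
$$r_j(t) \equiv -\ln P_j(t,t+1) = f_t(0) + \ln\left((1-q) + q\,\delta^{t}\right) - (t-j)\ln\delta \qquad (11.39)$$
where $f_t(0) \equiv -\ln\frac{P^{\$}(0,t+1)}{P^{\$}(0,t)}$.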
(0) = 1 (0)
0 (0) + ln ((1
)+
) + ln
(1
Hence, the parameter $\delta$ can be chosen such that the volatility of the short-term rate predicted by the model matches exactly the volatility of the short-term rate that we see in the data. Concretely, we can take $\delta = \exp\left(-\mathrm{Std}(r)/\sqrt{q(1-q)}\right)$, where $\mathrm{Std}(r)$ is the standard deviation of the short-term rate in the data.
Note, then, the interesting feature of the model. The Ho and Lee model doesn't take any a priori stance on the dynamics of the short-term rate. Rather, it imposes: (i) the martingale restriction on bond prices, an economic restriction, Eq. (11.29); and (ii) the simplifying assumption that the tree is recombining, a technical condition, Eq. (11.27). These two conditions suffice to tell what to expect from the dynamics of the short-term rate. While deliberately simple, the Ho and Lee model is quite powerful. The modern approach to interest rate modeling simply aims to make the Ho and Lee methodology more accurate for practical purposes.
11.5.6 An example
Assume that three zero coupon bonds are available for trading, with current market prices: (i) $P^{\$}(0,1) = 0.9851$ (the price of a 6M zero), (ii) $P^{\$}(0,2) = 0.9685$ (the price of a 1Y zero), and (iii) $P^{\$}(0,3) = 0.9445$ (the price of the 1.5Y zero). We know that the price of a one-period zero at time $t$, in the event of $j$ upward price-jumps from the current date to $t$, is:
$$P_j(t,t+1) = \frac{P^{\$}(0,t+1)}{P^{\$}(0,t)}\,\frac{\delta^{t-j}}{(1-q) + q\,\delta^{t}} \qquad (11.40)$$
where $P^{\$}(0,t)$ is the current market price of a zero expiring at time $t$, with $t$ equal to six months, one year and eighteen months, in this example. We assume that $q = \frac{1}{2}$ and $\delta = 0.9802$.
11.5.6.1 The dynamics of the short-term rate
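We want to determine the developments of the short-term rate on a recombining tree for as many periods as we can, given the market price of the zeros we observe. We use Eq. (11.40) to find the one-period zeros in each node.

$t = 0$. We have, trivially, $P(0,1) = 0.9851$.

$t = 1$.
$j=0$: $P_0(1,2) = 2\,\frac{P^{\$}(0,2)}{P^{\$}(0,1)}\,\frac{\delta}{1+\delta} = 0.9733$
$j=1$: $P_1(1,2) = 2\,\frac{P^{\$}(0,2)}{P^{\$}(0,1)}\,\frac{1}{1+\delta} = 0.9930$

$t = 2$.
$j=0$: $P_0(2,3) = 2\,\frac{P^{\$}(0,3)}{P^{\$}(0,2)}\,\frac{\delta^2}{1+\delta^2} = 0.9557$
$j=1$: $P_1(2,3) = 2\,\frac{P^{\$}(0,3)}{P^{\$}(0,2)}\,\frac{\delta}{1+\delta^2} = 0.9750$
$j=2$: $P_2(2,3) = 2\,\frac{P^{\$}(0,3)}{P^{\$}(0,2)}\,\frac{1}{1+\delta^2} = 0.9947$

[Figure: recombining tree of one-period zero prices, with $q = \frac{1}{2}$ at each node. t = 0: P(0,1) = 0.9851. t = 1: P(1,2) = 0.9733 and P(1,2) = 0.9930. t = 2: P(2,3) = 0.9557, P(2,3) = 0.9750 and P(2,3) = 0.9947.]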
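The node values above follow mechanically from Eq. (11.40). A minimal sketch, assuming the zeros are stored in a list indexed by maturity in periods (with a leading 1.0 for maturity zero); the function name is illustrative.

```python
def one_period_zero(t, j, zeros, q, delta):
    """P_j(t, t+1) from Eq. (11.40); zeros[k] is the market price of the k-period zero."""
    return zeros[t + 1] / zeros[t] * delta ** (t - j) / ((1.0 - q) + q * delta ** t)


ZEROS = [1.0, 0.9851, 0.9685, 0.9445]  # maturities: 0, 6M, 1Y, 1.5Y
Q, DELTA = 0.5, 0.9802

# tree[t][j]: one-period zero at date t after j upward price movements.
tree = [
    [one_period_zero(t, j, ZEROS, Q, DELTA) for j in range(t + 1)]
    for t in range(len(ZEROS) - 1)
]
# The implied continuously compounded short-term rate at each node is -ln(tree[t][j]).
```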
Suppose, now, that we want to find the price of some additional bond, e.g., a 1.5Y bond which pays (semiannually) coupons at 3% of the principal of $1. First, we need to find the value of this bond in each node of the tree. Note, at each node, the price equals (i) the discounted expectation of its future value (including coupons), and (ii) the current coupons, as illustrated in the tree below. That is, the convention, here, is that the bond purchased at time $t$ doesn't give the owner the right to receive any coupon at time $t$, only from time $t+1$ onwards.

[Figure: tree for the coupon bearing bond over t = 0, 1, 2, 3, with $q_0 = q_1 = \frac{1}{2}$. The bond pays 1.03 in every node at t = 3. Discount factors: P(0,1) = 0.9851 at t = 0; P_u(1,2) = 0.9733 and P_d(1,2) = 0.9930 at t = 1.]

Since the bond does not pay coupons at time zero, its current price is,
$$P(0,1)\left(\tfrac{1}{2} \times 1.0267 + \tfrac{1}{2} \times 1.0667\right) = 0.9851\left(\tfrac{1}{2} \times 1.0267 + \tfrac{1}{2} \times 1.0667\right) = 1.0311$$
Naturally, this price could have been obtained by simply adding $[P^{\$}(0,1) + P^{\$}(0,2) + P^{\$}(0,3)] \times 0.03 + P^{\$}(0,3)$, although the results in the tree above are going to matter while pricing derivatives written on the coupon bearing bond.
11.5.6.3 Pricing European options
Next, we wish to find the price of options, say the price of two call options on the 1.5Y bond considered in the previous subsection, when the strike price is $1 and the maturities of the options are 6 months and 1 year. Again, we need to figure out the no-arbitrage movements of the ex-coupons bond price. (This is because if we purchase the bond today, we are not entitled to receive any coupon, today. The flow of coupons we are entitled to receive starts from the next period.) We easily obtain the tree below. We must just subtract the coupon, 0.03, from each cum-coupons price in each node of the tree. Then, we obtain:

[Figure: tree of ex-coupon bond prices over t = 0, 1, 2, with $q = \frac{1}{2}$ at each node and P(0,1) = 0.9851. Ex-coupon prices: 0.997 and 1.0367 at t = 1; 0.984, 1.004 and 1.024 at t = 2.]
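We are ready to price the two options. As for the call option on the 1.5Y bond, with 6 months maturity, and strike price $K = \$1$, we have the following tree:

[Figure: one-period tree with $q = \frac{1}{2}$ and P(0,1) = 0.9851. At t = 1: P = 0.997, C = (P - K)^+ = 0; and P = 1.0367, C = (P - K)^+ = 0.0367. At t = 0: C = ?]

Therefore,
$$C = P(0,1)\left(\tfrac{1}{2} \times 0 + \tfrac{1}{2} \times 0.0367\right) = 0.9851\left(\tfrac{1}{2} \times 0 + \tfrac{1}{2} \times 0.0367\right) = 1.808 \times 10^{-2}$$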
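The call option on the 1.5Y bond with 1 year maturity, and strike price $K = \$1$, is dealt with similarly. We have the following tree:

[Figure: two-period tree with $q = \frac{1}{2}$ at each node. At t = 2: P = 0.984, C = (P - K)^+ = 0.000; P = 1.004, C = (P - K)^+ = 0.004; P = 1.024, C = (P - K)^+ = 0.024. At t = 1: P(1,2) = 0.9733, C = P(1,2)(1/2 x 0 + 1/2 x 0.004) = 0.0019; and P(1,2) = 0.9930, C = P(1,2)(1/2 x 0.004 + 1/2 x 0.024) = 0.014. At t = 0: P(0,1) = 0.9851, C = ?]

Therefore,
$$C = P(0,1)\left(\tfrac{1}{2} \times 0.0019 + \tfrac{1}{2} \times 0.014\right) = 0.9851\left(\tfrac{1}{2} \times 0.0019 + \tfrac{1}{2} \times 0.014\right) = 7.831 \times 10^{-3}$$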
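The coupon bond and the two call options can be priced with the same backward recursion on the Ho and Lee tree. The sketch below rebuilds the one-period zeros from Eq. (11.40) and works with unrounded node values, so its outputs differ from the figures in the text by a few units in the fourth decimal place; all function names are illustrative.

```python
ZEROS = [1.0, 0.9851, 0.9685, 0.9445]  # market zeros for 0, 6M, 1Y, 1.5Y
Q, DELTA, COUPON, STRIKE = 0.5, 0.9802, 0.03, 1.0


def discount_tree(zeros, q, delta):
    """disc[t][j] = P_j(t, t+1), with j the number of upward price movements."""
    return [
        [zeros[t + 1] / zeros[t] * delta ** (t - j) / ((1 - q) + q * delta ** t)
         for j in range(t + 1)]
        for t in range(len(zeros) - 1)
    ]


def step_back(values_next, disc_t, q):
    """One backward step: an upward price move (j -> j+1) has probability 1 - q."""
    return [d * ((1 - q) * values_next[j + 1] + q * values_next[j])
            for j, d in enumerate(disc_t)]


def ex_coupon_prices(disc, q, coupon):
    """ex[t][j]: bond price at (t, j) excluding the coupon paid at t."""
    n = len(disc)
    ex = [None] * (n + 1)
    ex[n] = [1.0] * (n + 1)  # principal at maturity
    for t in range(n - 1, -1, -1):
        cum_next = [p + coupon for p in ex[t + 1]]
        ex[t] = step_back(cum_next, disc[t], q)
    return ex


def call_price(ex, disc, q, strike, expiry):
    """European call on the ex-coupon price, expiring at date `expiry`."""
    values = [max(p - strike, 0.0) for p in ex[expiry]]
    for t in range(expiry - 1, -1, -1):
        values = step_back(values, disc[t], q)
    return values[0]


disc = discount_tree(ZEROS, Q, DELTA)
ex = ex_coupon_prices(disc, Q, COUPON)
bond_price = ex[0][0]
call_6m = call_price(ex, disc, Q, STRIKE, expiry=1)
call_1y = call_price(ex, disc, Q, STRIKE, expiry=2)
```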
11.5.7 Continuous-time approximations, with an application to barbell trading
11.5.7.1 The approximation
( (1
= 0.024
ln , and
()
(0) + ln
) ln
()
, such that:
(1
)+
(1
() =
and
V0
( ) = ln
1 2
(1
(1
), and, then, =
. Replacing this into the denition
such that we may dene
of , yields, after expanding terms up to the second order,
= (0) + ln
(1
) + (1 )
2
(0) + ln 1 + 1 2 (1
)
2
(0) + 1 2 (1
) 2
2
1
= (0) + 2 2
2
Note, this expansion is accurate when is small, which empirically is indeed, as we have that,
typically,
10 2 , which is reasonable small for values of up to at least 50 years! However,
these calculations might also be considered as the starting point for the initial drift of the
short-term rate from zero to time . So, we have, approximately, that,
E0
1
( ) = (0) +
2
2 2
and V0
() =
(11.41)
In the next chapter (Section 12.4.2), we shall show, consistently with the previous calculations,
that in continuous time, the Ho and Lee model predicts the short-term rate to be the solution
to:
2
()
()=
+
(11.42)
$ (0 ) +
0
2
( 0) ( )
( )=
(
) +
(11.43)
( )
( )
),
)=
( 0)
( )
(11.44)
the continuous time counterpart to the two conditions in Eqs. (11.41). By combining Eqs.
(11.43)-(11.44), we obtain, after simple computations, that:
(
)=
)(
) + ( 0)
(11.45)
As shown in the next chapter (see Section 12.6.1), we have that for any model, including Ho &
Lee,7 the following representation holds true:
(
7 For
)=
(0 )
(0 )
[ (
(0 )]
example, Eq. (11.36) in the Appendix provides the discrete time counterpart to Eq. (11.46).
(11.46)
[ (
(0 )]
(0 )
1 2
( )=
exp
(
)2
(0 )
2
)2 + (
( ()
)( ( )
(0 )) (
(0 ))
(11.47)
It is a neat expression, which we may use, for a variety of purposes, such as option pricing. The
next section develops an example relating to barbell trading.
11.5.7.2 Application to barbell trading
We revisit the barbell trading strategy of Section 11.4.3.4, where we argued that this strategy leads to positive profits due to convexity, as summarized by Figure 11.4. The key point in this argument is that it abstracts from passage of time, and may, in fact, lead us to misinterpret what is a merely static analysis. We may use the Ho and Lee model to analyze the profit and losses of a barbell trade, in a dynamic context free from arbitrage. We consider two situations: one, where the initial yield curve is flat, and a second, where the initial yield curve is upward sloping.
As for the flat yield curve, we use the continuously compounded rate corresponding to the flat 5% of Section 11.4.3.5, delivering $r = \ln 1.05 = 0.04879$. The numbers of assets to include into the portfolio, $\pi_1$ and $\pi_3$, are as in Eq. (11.16), i.e. $\pi_1 = 0.45706$ and $\pi_3 = 0.56724$. Instantaneous forward rates, the limit of forward rates as the length of the reference period shrinks to zero, are $f(0,T) = -\frac{\partial \ln P(0,T)}{\partial T} = r$. Using Eq. (11.47), with volatility parameter $\sigma = 0.03$, we calculate the value of the strategy a few months later, as follows:
$$\text{Barb}(\tau) = 100\,\left(\pi_1 P(\tau,1) + \pi_3 P(\tau,10) - P(\tau,5)\right) \qquad (11.48)$$
Figure 11.12 depicts the value of the barbell, $\text{Barb}(\tau)$, for investment horizons equal to 1 month, 3 months, 6 months and one year.
FIGURE 11.12. Profit and losses arising from barbell trading, $\text{Barb}(\tau)$ in Eq. (11.48), under the assumption the yield curve is driven by the Ho and Lee model, Eq. (11.47). The initial yield curve is assumed to be flat at $r = 4.8790\%$. Investment horizons are $\tau = 1/12$ (NW quadrant), $\tau = 3/12$ (SW quadrant), $\tau = 6/12$ (NE quadrant) and $\tau = 1$ (SE quadrant). Each panel plots the value of the trade against the short-term rate prevailing at the investment horizon. The vertical dashed lines pass through $r = 4.8790\%$, and the horizontal dashed lines pass through zero.
This trade is quite risky. For long investment horizons, it pays off when the short-term rate fluctuates significantly away from the initial value, $r = 4.8790\%$. The amount of fluctuations in the short-term rate diminishes as we shrink the investment horizon. Nevertheless, this amount appears to be considerable: for example, at one-month horizon, we should require the short-term rate to move from $r = 4.8790\%$ to either values larger than $r = 6\%$ or lower than $r = 4\%$, in order to obtain positive profits. Actually, these results suggest that a short position in the barbell trade (i.e., sell the barbell portfolio and go long the 5Y bond) should be an interesting strategy to implement in periods where we do not expect high volatility of interest rates. For example, for investment horizons of 6 months, the profits from a short position in the barbell trade are positive within a quite significant range of variation of the short-term rate, [2.5%, 6.8%].
Finally, we consider a scenario where the initial yield curve is upward sloping, and generate prices as $P(0,T) = e^{-y(T)\,T}$, where $y(T) \equiv 0.01(1 + \ln T)$. We still determine the value of the portfolio according to Eq. (11.16), i.e., we rely on the self-financing condition in Eq. (11.14) and both (i) the locally riskless condition in Eq. (11.15),
2 (2 ) = 1
1 (1 ) + 3
3 (3 ),
and (ii) the (generically incorrect) assumption of parallel shifts in the yield curve,
1 (1 )
1
1
3 (3 )
3.
3
2 (2 )
Figure 11.13 depicts the profit and losses arising from the trade.

FIGURE 11.13. Profit and losses arising from barbell trading, $\text{Barb}(\tau)$ in Eq. (11.48), under the assumption the yield curve is driven by the Ho and Lee model, Eq. (11.47). The initial yield curve is assumed to be upward sloping, generated by the equation, $y(T) = 0.01(1 + \ln T)$, with prices given by $P(0,T) = e^{-y(T)\,T}$. Investment horizons are $\tau = 1/12$ (NW quadrant), $\tau = 3/12$ (SW quadrant), $\tau = 6/12$ (NE quadrant) and $\tau = 1$ (SE quadrant). Each panel plots the value of the trade against the short-term rate prevailing at the investment horizon. The vertical dashed lines pass through the current short-term rate, $r = 1.0\%$, and the horizontal dashed lines pass through zero.
Similarly to the profit and losses summarized in Figure 11.12, the trade leads to profits only when the short-term rate increases, and significantly, from the initial value $r = 1\%$. In particular, when $r$ moves around 1%, profits increase as $r$ lowers, and decrease as $r$ goes up. This effect relates to that arising within the static exercise described in Table 11.4: long term bonds benefit from a decrease in $r$ more than short-term bonds, and lose their value more than short-term bonds as $r$ increases. However, as the interest rate increases significantly, the barbell generates profits because the convexity of 10 year bonds dominates overall.
implement solutions for models, without requiring to solve them in closed-form. For example, the Ho and Lee (1986) model relies on a number of assumptions, which might be unrealistic in practice: the short-term rate can take on negative values in this model. It is quite unusual that a model displaying realistic features also has a closed-form solution.
The approach in this section relies on the Arrow-Debreu securities introduced in Chapter 2, and parallels the work of Derman and Kani (1994), Rubinstein (1994) and Dupire (1994) on equity options reviewed in Chapter 10. Arrow-Debreu securities do not exist, in practice, for the reasons put forward in Chapter 2, relating to the simple circumstance we may disagree on models and, hence, on the relevant events and states of nature. However, we can extract the shadow price of these securities from traded securities, and use them to price interest rate derivatives. Naturally, we do so, by relying on a given model. The rationale of all this is that we do need models to evaluate interest rate derivatives, and Arrow-Debreu security prices are, in fact, the tip of the iceberg of a given model, so to speak. Note that the emphasis in the first and second parts of these Lectures is on the determination of Arrow-Debreu security prices in given economies, by relying on assumptions such as production possibilities, preferences or markets. The approach in this section is to extract the price of Arrow-Debreu securities from that of already traded assets. We illustrate this approach, by elaborating on three points.
First, we show how Arrow-Debreu securities can be used in the specific context of fixed income security evaluation; in particular, we illustrate how to exploit the prices of these assets to turn the martingale restrictions of the previous sections into a set of equivalent conditions that are directly usable for practical purposes.
Second, we use the previously extracted Arrow-Debreu prices, and develop algorithms to populate the short-term rate tree, while ensuring that the initial yield curve is fitted without errors.
Third, we illustrate this procedure and solve two models: (i) the Ho and Lee model, and (ii) a model developed by Black, Derman and Toy (1990). While we know the solution to the first model, it is useful to review how it could alternatively be solved, as this would naturally give us insights into the mechanisms underlying the calibration algorithms of this section. Note that, in any event, retrieving Arrow-Debreu security prices is crucial even in the context of the Ho & Lee model, as they can help determine the price of complex derivatives without a closed-form expression. The second model is one where even the bond price does not have a closed-form solution.
11.6.1 Extracting Arrow-Debreu securities from the yield curve
We know, from Chapter 2, that an Arrow-Debreu security is an asset that pays off $1 in some prespecified state of nature, and zero otherwise. Consider, for example, the diagram in Figure 11.14.
FIGURE 11.14. In the binomial tree of this section, an Arrow-Debreu security for state $s$ at time $t+1$ is a security that pays $1 at time $t+1$ in state $s$, and zero otherwise. The diagram spans t = 0, 0.5, 1, 1.5, 2, and marks the node $(s,t+1)$ where the security pays off (filled circle), together with the two nodes $(s,t)$ and $(s-1,t)$ from which it can be reached (empty circles). This section aims to show how to recover Arrow-Debreu prices from the price of fixed income securities.
(0
)=
( )
(11.49)
=0
Our objective is to make use of the initial yield curve, and retrieve the price of all ArrowDebreu securities, i.e. ( ) for all and , where
{1 }, from the observation of the
initial term-structure of interest rates. Consider the Arrow-Debreu security that pays $1 in
node (
+ 1) (see Figure 11.14). Denote with
[
+ 1] its value at time , and in state
,
. What is this value at time in all states? A key observation is that in this tree,
the node (
+ 1) (the lled circle) can only be accessed to through the nodes ( ) and
(
1 ) occurring at time (the two empty circles in Figure 11.14). At time then, the value
[
+ 1] is zero in all the nodes ( ) except the empty circles ( ) and (
1 ). Indeed,
if we do not happen to be at one of those nodes denoted with empty circles, we could not reach
the node (
+ 1) (the lled circle), where the Arrow-Debreu security pays o .
So, we are left with nding the values
[
+ 1] in the nodes corresponding to the empty
circles ( ) and (
1 ), i.e.
[
+ 1] and
+ 1]. Let ( ) be the continuously
1 [
compounded short-term rate in node ( ). Consider the upper node ( ). We have,
[
( )
+ 1] =
[0 + 1 (1
( )
)] =
(1
1 ),
1(
+ 1] =
[1 + 0 (1
+ 1] =
+ 1] =
1 [
[
+ 1] = 0
( )
1(
)] =
(1
1(
)
)
(11.51)
for all
These payo s are simply the market value of the Arrow-Debreu security for (
+ 1), in the
various states occurring at time , i.e. the money the holder can make by selling the asset at
time , in the various states. Therefore, we can apply Eq. (11.50), and obtain,
( + 1) =
( )
+ 1]
=0
( )
+ 1] +
( )
+ 1]
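By replacing the Arrow-Debreu prices in (11.51) into the previous equation, we obtain the so-called forward equation for the Arrow-Debreu prices,
$$AD_s(t+1) = AD_s(t)\,e^{-r_s(t)}\,(1-q) + AD_{s-1}(t)\,e^{-r_{s-1}(t)}\,q \qquad (11.52)$$
where $AD_s(t)$ denotes the price at time zero of the Arrow-Debreu security paying $1 in node $(s,t)$.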
Eq. (11.52) is the counterpart to the forward equation used in Chapter 10 to fit European option prices. The approach in this section differs from that in Chapter 10, because we take the risk-neutral probability to be constant, and interest rates to be time-varying and state-dependent, whereas in Chapter 10, interest rates are exogenously given, and the risk-neutral probability is time-varying and state-dependent. Naturally, the approach in this section can be generalized to the case of stochastic risk-neutral probabilities, once we also wish to fit European options, on top of bond prices. In the next section, we explain how to fit bond prices.
To implement an algorithm, we make a repeated use of the forward equation (11.52) and the following zero pricing equation,
$$P^{\$}(0,t+1) = \sum_{s=0}^{t} AD_s(t)\,e^{-r_s(t)} \qquad (11.53)$$
with $AD_s(t)$ the Arrow-Debreu price for node $(s,t)$. The inputs to the algorithm are a number of zeros equal to the largest maturity date the tree extends to. Note an important feature of the calibration procedure. While we extract Arrow-Debreu security prices, we need to make reference to a given model for the underlying short-term rate movements, $r_s(t)$ and, indeed, we shall illustrate the algorithm by hinging upon two examples in the following sections. Instead, in Chapter 10, we have illustrated that in the equity case, the cross section of option prices is enough to uniquely pin down the underlying stock price movements.
11.6.2 Two model examples
We begin with Ho and Lee, assuming continuous compounding. By Eq. (11.39), the short-term
rate predicted by the Ho and Lee model is:
( ) = (0) + ln ((1
)+
) ln
(11.54)
where $f_t(0)$ is the continuously compounded forward rate at time zero for maturity $[t, t+1]$, and $j$ is the number of upward movements of the entire set of bond prices. Define $s \equiv (t - j)$, which is the number of downward movements of the bond prices or, equivalently, the number of upward movements of the short-term rate. Hence, we can equivalently index the short-term rate by $s$, instead of $j$, and rewrite Eq. (11.54) with a slight abuse in notation, as follows:
( ) = (0) + ln ((1
|
{z
0(
)+
)
) + ln
}
(11.55)
such that $r_0(t)$ is the short-term rate at time $t$, in the event of zero upward movements in this rate, and $\delta$ is the usual volatility parameter, which can be calibrated through $\ln\frac{1}{\delta} = \frac{\mathrm{Std}(r)\sqrt{\Delta t}}{\sqrt{q(1-q)}}$, with straightforward notation. Note, incidentally, that the short-term rate movements
\[
P^{\$}(0,t+1) = \sum_{s=0}^{t}\pi_s(t)\,e^{-r_s(t)} = e^{-r_0(t)}\sum_{s=0}^{t}\pi_s(t)\,\delta^{s},
\]
where the second equality follows by the assumption that the short-term rate is solution to Eq. (11.55).
By rearranging terms in the previous equation, we obtain a closed-form expression for the future short-term rate at time $t$, in the event of zero upward movements,
\[
r_0(t) = \ln\frac{\sum_{s=0}^{t}\pi_s(t)\,\delta^{s}}{P^{\$}(0,t+1)}.
\tag{11.56}
\]
This is the counterpart to zero pricing equation (11.53).
We use Eq. (11.56) and the forward equation (11.52) to populate the interest rate tree, under the assumption that $q = \frac12$. Precisely, the algorithm proceeds as follows:
(i) Given the boundary condition for the Arrow-Debreu price, $\pi_0(0) = 1$, determine the initial value of the short-term rate, $r_0(0)$, using Eq. (11.56), as $r_0(0) = \ln(1/P^{\$}(0,1))$.
(ii) Suppose we know the future value of the short-term rate at time $t-1$, in the event of no upward movements, i.e. $r_0(t-1)$. Then, given the value of $r_0(t-1)$, and the price of the Arrow-Debreu securities $\pi_s(t-1)$ for $s \le t-1$, determine $\pi_s(t)$ for $s \le t$, through the forward equation (11.52),
\[
\pi_s(t) = \pi_s(t-1)\,e^{-r_s(t-1)}(1-q) + \pi_{s-1}(t-1)\,e^{-r_{s-1}(t-1)}\,q = \frac12\,e^{-r_0(t-1)}\left[\delta^{s}\pi_s(t-1) + \delta^{s-1}\pi_{s-1}(t-1)\right],
\]
where the last equation follows by plugging Eq. (11.55) into Eq. (11.52).
(iii) Given the Arrow-Debreu prices $\pi_s(t)$ for $s \le t$, use Eq. (11.56) to determine the future value of the short-term rate at time $t$, in the event of no upward movements, i.e. $r_0(t)$.
(iv) If
As a second example, consider the Black, Derman and Toy (1990) model. In this model, the
short-term rate is solution to,
( )=
(11.57)
0( ),
where $\delta$ is, once again, a volatility parameter.8 For computational convenience, this model assumes that the short-term rate in Eq. (11.57) is discretely compounded. Accordingly, we rewrite the forward equation (11.52) in terms of discretely compounded rates,
\[
\pi_s(t+1) = \pi_s(t)\,\frac{1}{1+r_s(t)}\,(1-q) + \pi_{s-1}(t)\,\frac{1}{1+r_{s-1}(t)}\,q.
\tag{11.58}
\]
( )
(0 1) =
1
1+ 0 (0)
(ii) Suppose we know the future value of the short-term rate at time $t-1$, in the event of no upward movements, i.e. $r_0(t-1)$. Then, given the value of $r_0(t-1)$, and the price of the Arrow-Debreu securities $\pi_s(t-1)$ for $s \le t-1$, determine $\pi_s(t)$ for $s \le t$, through the forward equation (11.58),
( )=
1)
1
1+
1)
(1
)+
1)
1
1+
1)
1
2
where the last equation follows by plugging Eq. (11.57) into Eq. (11.58).
8 In its most general form, this model assumes that the volatility parameter in Eq. (11.57) varies deterministically over time. This more general formulation leads to more flexibility, which is useful to fit the term structure of volatility.
(iii) Given the Arrow-Debreu prices $\pi_s(t)$ for $s \le t$, use
\[
P^{\$}(0,t+1) = \sum_{s=0}^{t}\pi_s(t)\,\frac{1}{1+r_s(t)}
\]
to solve, numerically, for the future value of the short-term rate at time $t$, in the event of no upward movements, i.e. $r_0(t)$. Note, we did not need this additional step for the solution of the Ho and Lee model, as the short-term rate $r_0(t)$ is known in closed form in the Ho and Lee model (see Eq. (11.56)). Note, since $r_0(t) > 0$, then, we also have that $r_s(t) > 0$ by Eq. (11.57).
(iv) If
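The numerical step in the Black, Derman and Toy calibration can be sketched as follows. This is only an illustration under stated assumptions: the short-term rate is taken to be $r_s(t) = r_0(t)\,m^s$ for a hypothetical constant multiplier `m > 1` (the exact parametrization of Eq. (11.57) is not reproduced here), rates are discretely compounded, $q = \frac12$, and the root is found by plain bisection. The function name and the input zero prices are made up for the example.

```python
def bdt_calibrate(zeros, m, q=0.5):
    """Fit r_0(t) date by date so that model zero prices match zeros[t] = P(0, t+1).

    Assumption (illustrative): r_s(t) = r_0(t) * m**s, discretely compounded.
    Returns the list of r_0(t) and the list of Arrow-Debreu price vectors.
    """
    pi = [1.0]                      # boundary condition pi_0(0) = 1
    r0_path, pi_path = [], []
    for t, price in enumerate(zeros):
        def model_price(r0):
            return sum(pi[s] / (1 + r0 * m**s) for s in range(t + 1))
        lo, hi = 0.0, 10.0          # model_price is decreasing in r0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if model_price(mid) > price:
                lo = mid
            else:
                hi = mid
        r0 = 0.5 * (lo + hi)
        r0_path.append(r0)
        pi_path.append(pi)
        # Forward equation (11.58) to obtain the next date's Arrow-Debreu prices
        nxt = [0.0] * (t + 2)
        for s in range(t + 1):
            disc = pi[s] / (1 + r0 * m**s)
            nxt[s] += disc * (1 - q)
            nxt[s + 1] += disc * q
        pi = nxt
    return r0_path, pi_path

# Hypothetical zero prices, for illustration only
zeros_in = [0.9709, 0.9400, 0.9050]
r0_path, pi_path = bdt_calibrate(zeros_in, m=1.2)
```

Bisection is used here only because the model price is monotonically decreasing in $r_0(t)$; any one-dimensional root finder would do.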
Consider, again, the Ho and Lee model example in Section 11.5.5, where three zeros were traded: (i) one zero maturing in 6 months, (ii) one zero maturing in 1 year, and (iii) one zero maturing in 1.5 years, with market prices $P^{\$}(0,1) = 0.9851$, $P^{\$}(0,2) = 0.9685$, $P^{\$}(0,3) = 0.9445$. By Eq. (11.55), the Ho and Lee model assumes that,
\[
r_s(t) = r_0(t) + s\,\ln\frac{1}{\delta}.
\tag{11.59}
\]
We use Eq. (11.59) and find the values of the short-term rate $r_s(t)$ in each node, under the assumption that $q = \frac12$, and that the standard deviation of the short-term rate is 0.014, annualized. To find $\delta$, we may use the relation $\ln\frac{1}{\delta} = \frac{\mathrm{Std}(r)\sqrt{\Delta t}}{\sqrt{q(1-q)}}$, where $\Delta t = \frac12$ and $\mathrm{Std}(r) = 0.014$ is the annualized standard deviation of the short-term rate. Therefore $\ln\frac{1}{\delta} = 0.014\,\frac{\sqrt{1/2}}{\sqrt{\frac12\cdot\frac12}} = 0.02$, or $\delta = 0.9802$.
For the Ho & Lee model, we know the closed-form expression for $r_0(t)$,
\[
r_0(t) = \ln\frac{\sum_{s=0}^{t}\pi_s(t)\,\delta^{s}}{P^{\$}(0,t+1)},
\tag{11.60}
\]
where $\pi_s(t)$ denotes the price of an Arrow-Debreu security which pays off \$1 in state $s$ at time $t$, and zero otherwise. Given the term-structure of prices $P^{\$}(0,t+1)$, $t = 0, 1, 2$, we populate the tree using Eq. (11.60) and the forward Arrow-Debreu prices equation (11.52),
\[
\pi_s(t) = \frac12\,e^{-r_0(t-1)}\left[\delta^{s}\pi_s(t-1) + \delta^{s-1}\pi_{s-1}(t-1)\right], \quad t = 0, 1, 2.
\tag{11.61}
\]
$t = 0$. Eq. (11.60) is trivial. It leads to $r_0(0) = \ln\frac{1}{P^{\$}(0,1)} = 0.015$. The forward equation for the Arrow-Debreu prices, Eq. (11.61), is also trivial, $\pi_0(0) = 1$.
$t = 1$. Let us use Eq. (11.61), the forward equation for the Arrow-Debreu prices, to find $\pi_0(1)$ and $\pi_1(1)$. We have two cases:
$s = 0$. We have:
\[
\pi_0(1) = \frac12\,e^{-r_0(0)}\,[\pi_0(0) + 0] = \frac12\,e^{-r_0(0)} = 0.4925.
\]
The previous relation holds because $\pi_0(1)$ is the current price of the Arrow-Debreu security which pays off \$1 in state 0 at time 1, as illustrated by the tree in Figure 1 below.
[Figure 1: one-period tree with $q = \frac12$; state $s = 0$ at $t = 0$, states $s = 0$ and $s = 1$ at $t = 1$.]
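$s = 1$. By a similar reasoning,
\[
\pi_1(1) = \frac12\,e^{-r_0(0)}\,[0 + \pi_0(0)] = \frac12\,e^{-r_0(0)} = 0.4925.
\]
Eq. (11.60) and Eq. (11.59) then give
\[
r_0(1) = \ln\frac{\pi_0(1) + \delta\,\pi_1(1)}{P^{\$}(0,2)} = \ln\frac{0.4925\,(1 + 0.9802)}{0.9685} = 0.0069,
\]
\[
r_1(1) = r_0(1) + \ln\frac{1}{\delta} = 0.0069 + 0.02 = 0.0270.
\]
[Figure: short-term rate tree after one period, with $q = \frac12$: $r_0(0) = 0.015$ at $t = 0$; $r_1(1) = 0.027$ and $r_0(1) = 0.0069$ at $t = 1$.]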
We can now calculate the values of the short-term rate for one further period.
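$t = 2$. By Eq. (11.61), the forward equation for the Arrow-Debreu prices, we have the following three cases:
\[
\begin{aligned}
(s = 0)\quad \pi_0(2) &= \tfrac12\,e^{-r_0(1)}\,[\pi_0(1) + 0] = 0.2446,\\
(s = 1)\quad \pi_1(2) &= \tfrac12\,e^{-r_0(1)}\,[\delta\,\pi_1(1) + \pi_0(1)] = 0.4843,\\
(s = 2)\quad \pi_2(2) &= \tfrac12\,e^{-r_0(1)}\,[0 + \delta\,\pi_1(1)] = 0.2397.
\end{aligned}
\]
[Figure: two-period tree with $q = \frac12$; state $s = 0$ at $t = 0$, states $s = 0, 1$ at $t = 1$, and states $s = 0, 1, 2$ at $t = 2$.]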
Consider, for example, $\pi_0(2)$. It is the price of the Arrow-Debreu security for time 2, under two consecutive downward movements of the short-term rate. This state can only be reached through the state $s = 0$ at time $t = 1$. But at state $s = 0$ at time $t = 1$, the value of the Arrow-Debreu asset is $\frac12 e^{-r_0(1)}$. Hence, $\pi_0(2) = \pi_0(1)\,\frac12 e^{-r_0(1)}$. By a similar reasoning, we have that $\pi_2(2) = \pi_1(1)\,\frac12 e^{-r_1(1)} = \pi_1(1)\,\frac12\,\delta\,e^{-r_0(1)}$. Note, there is some symmetry in the distribution of the Arrow-Debreu prices, with $\pi_1(2)$ being the largest, being the price of the security that pays off with the highest likelihood. However, $\pi_0(2) > \pi_2(2)$, even if $q$ is constant and equal to 50%, because discounting is more severe whilst crossing the nodes leading to $s = 2$, compared to the nodes leading to $s = 0$.
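We can now calculate the values of the short-term rate for each node. Eq. (11.60) is, now,
\[
r_0(2) = \ln\frac{\pi_0(2) + \delta\,\pi_1(2) + \delta^{2}\pi_2(2)}{P^{\$}(0,3)} = 0.0054,
\]
and, by Eq. (11.59), $r_s(2) = r_0(2) + s\ln\frac{1}{\delta} = 0.0054 + 0.02\,s$ for $s = 0, 1, 2$, so that, up to rounding, $r_0(2) = 0.0054$, $r_1(2) = 0.0253$ and $r_2(2) = 0.0452$.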
The diagram below summarizes the implied tree for the short-term rate in this model.
[Figure: implied short-term rate tree, with $q = \frac12$ at each step. $t = 0$: $r_0(0) = 0.015$. $t = 1$: $r_1(1) = 0.027$, $r_0(1) = 0.0069$. $t = 2$: $r_2(2) = 0.0452$, $r_1(2) = 0.0253$, $r_0(2) = 0.0054$.]
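The whole Ho and Lee calibration of this example can be reproduced with a few lines of Python. The sketch below alternates the closed form for $r_0(t)$ in Eq. (11.60) with the forward equation (11.52); the function name `ho_lee_calibrate` and the data layout are assumptions of this illustration. Because the text rounds intermediate values (for instance $\ln\frac{1}{\delta}$ to 0.02), the computed figures agree with those above only up to small rounding differences.

```python
import math

def ho_lee_calibrate(zeros, delta, q=0.5):
    """zeros[t] = P(0, t+1). Returns (rates, prices), where rates[t][s] = r_s(t)
    and prices[t][s] = pi_s(t), the Arrow-Debreu price of state s at time t."""
    pi = [1.0]                                   # pi_0(0) = 1
    rates, prices = [], []
    for t, price in enumerate(zeros):
        # Closed form for r_0(t), Eq. (11.60)
        r0 = math.log(sum(pi[s] * delta**s for s in range(t + 1)) / price)
        # r_s(t) = r_0(t) + s ln(1/delta), Eq. (11.59)
        r = [r0 + s * math.log(1 / delta) for s in range(t + 1)]
        rates.append(r)
        prices.append(pi)
        # Forward equation (11.52)
        nxt = [0.0] * (t + 2)
        for s in range(t + 1):
            disc = pi[s] * math.exp(-r[s])
            nxt[s] += disc * (1 - q)
            nxt[s + 1] += disc * q
        pi = nxt
    return rates, prices

rates, prices = ho_lee_calibrate([0.9851, 0.9685, 0.9445], delta=math.exp(-0.02))
```

Here `rates[2]` holds the three short-term rates at $t = 2$, and `prices[2]` the corresponding Arrow-Debreu prices.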
Assume that the spot yield curve is 2.5% for $T = 1$ year, 4.5% for $T = 2$ years, and 6% for $T = 3$ years, continuously compounded and annualized. Consider the following model:
\[
r_s(t) = r_0(t) + s\,\sigma,
\tag{11.62}
\]
where $\sigma$ is a constant and equal to 0.01, $r_s(t)$ is the continuously compounded short-term rate as of time $t$, after $s$ upward movements and, finally, the unit period of time is taken to be one
year. As we know, the Ho & Lee model predicts that the price as of time zero of an Arrow-Debreu security paying off in state $s$ at time $t$, denoted as $\pi_s(t)$, satisfies the following forward equation, for $t \ge 1$, $s \ge 0$ and $s \le t$:
\[
\pi_s(t) = (1-q)\,e^{-r_s(t-1)}\,\pi_s(t-1) + q\,e^{-r_{s-1}(t-1)}\,\pi_{s-1}(t-1),
\tag{11.63}
\]
where $q$ is the risk-neutral probability of an upward movement in the short-term rate. Furthermore, according to this model, the price of a zero coupon bond, paying \$1 at time $t$, $P^{\$}(0,t)$, equals,
\[
P^{\$}(0,t) = \sum_{s=0}^{t-1} e^{-r_s(t-1)}\,\pi_s(t-1).
\tag{11.64}
\]
Suppose, next, that the risk-neutral probability of an upward movement at any time is not a constant $q$, but a function of calendar time, say $q_t$: $q_t$ is, then, the probability of an upward movement in the short-term rate from time $t$ to time $t+1$. Naturally, the assumption that $q_t$ is time-varying makes this model markedly distinct from the Ho & Lee model. To calibrate this model, we consider the recursive equation for the Arrow-Debreu security prices:
\[
\pi_s(t) = (1-q_{t-1})\,e^{-r_s(t-1)}\,\pi_s(t-1) + q_{t-1}\,e^{-r_{s-1}(t-1)}\,\pi_{s-1}(t-1),
\tag{11.65}
\]
where $q_{t-1}$ denotes the risk-neutral probability of an upward movement in the short-term rate from time $t-1$ to time $t$. The boundary conditions are the usual ones: $\pi_0(0) = 1$, $\pi_s(t) = 0$, for $s > t$ and $s < 0$. Eq. (11.65) can be derived through the same arguments in Section 11.7.1.
Next, suppose the risk-neutral probability of an upward movement in the short-term rate in the first period equals $\frac12$. Suppose, further, that available for trading is a derivative, which pays off an amount of \$1 in state $s = 2$ and an amount of \$1 in state $s = 0$, both at time $t = 2$. The current price of this derivative equals 0.45514. The interpretation of the derivative is that of a contract that pays off when the interest rate experiences extreme movements (up-up or down-down): a very basic interest rate volatility contract. Its price can be expressed as the sum of the two Arrow-Debreu securities for these extreme interest rate movements. Let us set the nominal values of the zero coupon bonds to \$1. To populate the interest rate tree, we need to determine the three zero prices, which are:
\[
P^{\$}(0,1) = e^{-0.025} = 0.97531, \qquad P^{\$}(0,2) = e^{-0.045\times 2} = 0.91393, \qquad P^{\$}(0,3) = e^{-0.06\times 3} = 0.83527.
\]
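We can start populating the tree. Eq. (11.64) can be rewritten as:
\[
r_0(t) = \ln\frac{\sum_{s=0}^{t} e^{-\sigma s}\,\pi_s(t)}{P^{\$}(0,t+1)},
\tag{11.66}
\]
where $\sigma = 0.01$ is the constant in Eq. (11.62). We have,
\[
r_0(0) = \ln\frac{1}{0.97531} = 0.025.
\]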
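$t = 1$. We use Eq. (11.65), with $q_0 = \frac12$.
$s = 0$: We have,
\[
\pi_0(1) = (1-q_0)\,e^{-r_0(0)}\,\pi_0(0) = 0.48766.
\]
$s = 1$: We have,
\[
\pi_1(1) = q_0\,e^{-r_0(0)}\,\pi_0(0) = 0.48766.
\]
By Eq. (11.66),
\[
r_0(1) = \ln\frac{\pi_0(1) + e^{-0.01}\,\pi_1(1)}{P^{\$}(0,2)} = \ln\frac{0.48766\,(1 + e^{-0.01})}{0.91393} = 0.06, \qquad r_1(1) = r_0(1) + 0.01 = 0.07.
\]
$t = 2$. We use Eq. (11.65) again, now with the unknown $q_1$.
$s = 0$: We have,
\[
\pi_0(2) = (1-q_1)\,e^{-r_0(1)}\,\pi_0(1) = e^{-0.06}\,(1-q_1)\,0.48766.
\]
$s = 1$: We have,
\[
\pi_1(2) = (1-q_1)\,e^{-r_1(1)}\,\pi_1(1) + q_1\,e^{-r_0(1)}\,\pi_0(1) = e^{-0.06}\left((1-q_1)\,e^{-0.01} + q_1\right)0.48766.
\]
$s = 2$: We have,
\[
\pi_2(2) = q_1\,e^{-r_1(1)}\,\pi_1(1) = q_1\,e^{-0.06}\,e^{-0.01}\,0.48766.
\]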
We do not know $q_1$ yet. Yet the rate volatility asset, which quotes for 0.45514, can be used to extract $q_1$. At time $t = 1$, its price is either $e^{-0.07}\,q_1$ (in the up state of the world), or $e^{-0.06}\,(1-q_1)$ (in the down state of the world). So by no-arbitrage, its current price satisfies
\[
0.45514 = \frac12\,e^{-0.025}\left(e^{-0.07}\,q_1 + e^{-0.06}\,(1-q_1)\right).
\]
Solving for $q_1$ yields $q_1 = 0.90$. Naturally, the same result is obtained by calibrating $q_1$ so as to make the price of the derivative, 0.45514, match the sum of the prices of the Arrow-Debreu securities paying off in states 0 and 2 at $t = 2$, viz $q_1$: $0.45514 = \pi_0(2) + \pi_2(2) = e^{-0.06}\,(1-q_1)\,0.48766 + q_1\,e^{-0.06}\,e^{-0.01}\,0.48766$. So now, we can use $q_1 = 90\%$ and calculate the Arrow-Debreu prices, obtaining:
\[
\begin{aligned}
\pi_0(2) &= e^{-0.06}\,(1-0.9)\,0.48766 = 0.04592,\\
\pi_1(2) &= e^{-0.06}\left((1-0.9)\,e^{-0.01} + 0.9\right)0.48766 = 0.4588,\\
\pi_2(2) &= 0.9\,e^{-0.06}\,e^{-0.01}\,0.48766 = 0.40922.
\end{aligned}
\]
Note, there is no symmetry at all in the distribution of these Arrow-Debreu security prices. The price $\pi_0(2)$ is very low, due to the fact that $q_1$ is very high, such that the probability of reaching the lowest node of the tree at time $t = 2$ is quite low.
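The extraction of $q_1$ from the volatility contract is linear in $q_1$, so it can be solved in closed form. The sketch below redoes the calculation with unrounded intermediate values; variable names are illustrative, and the results match the figures in the text up to rounding.

```python
import math

step = 0.01                                  # rate increment per upward move, Eq. (11.62)
zeros = [math.exp(-0.025), math.exp(-0.045 * 2), math.exp(-0.06 * 3)]
q0 = 0.5                                     # first-period probability, given
vol_price = 0.45514                          # pays $1 in states 0 and 2 at t = 2

# t = 1: Arrow-Debreu prices and short-term rates
r0_0 = math.log(1 / zeros[0])
pi0_1 = (1 - q0) * math.exp(-r0_0)
pi1_1 = q0 * math.exp(-r0_0)
r0_1 = math.log((pi0_1 + math.exp(-step) * pi1_1) / zeros[1])
r1_1 = r0_1 + step

# vol_price = pi_0(2) + pi_2(2) = (1 - q1) * a + q1 * b, which is linear in q1
a = math.exp(-r0_1) * pi0_1
b = math.exp(-r1_1) * pi1_1
q1 = (a - vol_price) / (a - b)

# Arrow-Debreu prices at t = 2, Eq. (11.65)
pi_2 = [(1 - q1) * a, (1 - q1) * b + q1 * a, q1 * b]
```

By construction, `pi_2[0] + pi_2[2]` reproduces the quoted price of the volatility contract, and the three prices sum to the two-year zero price.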
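\[
r_0(2) = \ln\frac{\pi_0(2) + e^{-0.01}\,\pi_1(2) + e^{-0.02}\,\pi_2(2)}{P^{\$}(0,3)} = 0.07605,
\]
and,
\[
r_1(2) = r_0(2) + 0.01 = 0.08605, \qquad r_2(2) = r_0(2) + 0.02 = 0.09605.
\]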
Finally, we wish to evaluate a European call option at time zero, written on the three year zero coupon bond with nominal value equal to \$1. This option expires at $t = 2$ and has a strike price equal to \$0.91000. At expiry, the option pays off:
\[
\begin{aligned}
C_2(2) &= \left(e^{-r_2(2)} - 0.91\right)^+ = \left(e^{-0.09605} - 0.91\right)^+ = 0,\\
C_1(2) &= \left(e^{-r_1(2)} - 0.91\right)^+ = \left(e^{-0.08605} - 0.91\right)^+ = 0.00755,\\
C_0(2) &= \left(e^{-r_0(2)} - 0.91\right)^+ = \left(e^{-0.07605} - 0.91\right)^+ = 0.01677.
\end{aligned}
\]
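Then,
\[
\begin{aligned}
C_1(1) &= e^{-r_1(1)}\left(q_1 C_2(2) + (1-q_1)\,C_1(2)\right) = e^{-0.07}\,(0.9\cdot 0 + 0.1\cdot 0.00755) = 7.0396\times 10^{-4},\\
C_0(1) &= e^{-r_0(1)}\left(q_1 C_1(2) + (1-q_1)\,C_0(2)\right) = e^{-0.06}\,(0.9\cdot 0.00755 + 0.1\cdot 0.01677) = 7.9786\times 10^{-3}.
\end{aligned}
\]
For an option on the same bond expiring at $t = 1$ and struck at 0.85, the price is
\[
C = e^{-r_0(0)}\,\frac12\left[\left(P_1(1,3) - 0.85\right)^+ + \left(P_0(1,3) - 0.85\right)^+\right],
\]
where
\[
\begin{aligned}
P_1(1,3) &= e^{-r_1(1)}\left(q_1 e^{-r_2(2)} + (1-q_1)\,e^{-r_1(2)}\right) = e^{-0.07}\left(0.90\,e^{-0.09605} + 0.10\,e^{-0.08605}\right) = 0.84786,\\
P_0(1,3) &= e^{-r_0(1)}\left(q_1 e^{-r_1(2)} + (1-q_1)\,e^{-r_0(2)}\right) = e^{-0.06}\left(0.90\,e^{-0.08605} + 0.10\,e^{-0.07605}\right) = 0.86498.
\end{aligned}
\]
That is, $C = e^{-0.025}\,\frac12\,(0.01498) = 0.00730$. Suppose, now, that the market value of this option diverges from $C$, i.e. $C \neq C_{\$}$, where $C_{\$}$ is the market value of the option. For example, $C < C_{\$}$. To implement this arbitrage opportunity, we can sell the option, and use the proceeds to build up a portfolio comprising the bond expiring in three years and a money market account, with initial value:
\[
V_0 = \Delta\,P^{\$}(0,3) + M,
\]
and
\[
\Delta\,P_1(1,3) + M e^{r_0(0)} = C_u, \qquad \Delta\,P_0(1,3) + M e^{r_0(0)} = C_d,
\]
where $C_u$ and $C_d$ are the payoffs of the one year option. The solution is,
\[
\Delta = \frac{C_u - C_d}{P_1(1,3) - P_0(1,3)} = \frac{-0.01498}{0.84786 - 0.86498} = 0.875, \qquad
M = e^{-0.025}\,\frac{C_d\,P_1(1,3) - C_u\,P_0(1,3)}{P_1(1,3) - P_0(1,3)} = e^{-0.025}\,\frac{0.01498\times 0.84786}{0.84786 - 0.86498} = -0.72356,
\]
where $C_u = 0$, $C_d = 0.01498$, $P_1(1,3) = 0.84786$ and $P_0(1,3) = 0.86498$. Therefore,
\[
V_0 = \Delta\,P^{\$}(0,3) + M = 0.875\times 0.83527 - 0.72356 = 0.00730 = C.
\]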
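The option values and the replicating portfolio can be checked numerically. The sketch below takes the calibrated tree as given (rounded rates as in the text, $q_0 = \frac12$, $q_1 = 0.9$); the variable names, and the labels `hedge_ratio` and `money_market` for the two portfolio holdings, are assumptions of this illustration.

```python
import math

q1 = 0.9
r0_0 = 0.025
r_1 = [0.06, 0.07]                    # r_0(1), r_1(1)
r_2 = [0.07605, 0.08605, 0.09605]     # r_0(2), r_1(2), r_2(2)

# Price at t = 2 of the zero maturing at t = 3, in each state
bond_2 = [math.exp(-r) for r in r_2]

# Call expiring at t = 2, struck at 0.91: payoffs, then values at t = 1
payoff_2 = [max(p - 0.91, 0.0) for p in bond_2]
call_1 = [math.exp(-r_1[s]) * (q1 * payoff_2[s + 1] + (1 - q1) * payoff_2[s])
          for s in range(2)]

# Bond prices at t = 1, and the one year call struck at 0.85
bond_1 = [math.exp(-r_1[s]) * (q1 * bond_2[s + 1] + (1 - q1) * bond_2[s])
          for s in range(2)]
payoff_1 = [max(p - 0.85, 0.0) for p in bond_1]
call_0 = math.exp(-r0_0) * 0.5 * (payoff_1[0] + payoff_1[1])

# Replicating portfolio: hedge_ratio units of the bond plus a money market position
hedge_ratio = (payoff_1[1] - payoff_1[0]) / (bond_1[1] - bond_1[0])
money_market = math.exp(-r0_0) * (payoff_1[1] - hedge_ratio * bond_1[1])
portfolio_0 = hedge_ratio * math.exp(-0.06 * 3) + money_market
```

The initial value of the portfolio coincides with the model price of the one year option, which is what makes any market quote away from that price an arbitrage opportunity within this model.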
value of the) callable bond expected to prevail in the next period. The firm, then, would exercise at $t$, should it expect its cost of capital will decrease at $t+1$, which would boost the market evaluation of its debt. In this case, the price of a callable bond is clearly just $K$. Otherwise, the value of the callable bond is its discounted expected value over the next period. To summarize,
\[
B^c_t = \min\left\{K,\ e^{-r_t}E_t(B^c_{t+1})\right\} = e^{-r_t}E_t(B^c_{t+1}) - \max\left\{e^{-r_t}E_t(B^c_{t+1}) - K,\ 0\right\}.
\tag{11.67}
\]
We can view this problem under a slightly different angle, one where the firm may decide to issue non-callable zero coupon bonds just upon conversion, such that the price of any callable bond could be neatly decomposed as the price of a straight, non-callable bond, minus the option to call the bond.
Let $P(t,T)$ denote the price of a non-callable zero coupon bond as usual and suppose that at some point in time $\tau$, interest rates have decreased to an extent to have made $P(\tau,T)$ sufficiently large, in a sense to be explained in a moment. The problem we want to study is actually one where the issuer is seeking an optimal stopping time $\tau$ at which it can redeem the bonds for $K$, and issue new non-callable debt at $t = \tau$, priced at $P(\tau,T)$. This would allow the issuer to cash in a difference equal to $P(\tau,T) - K$. Note that by doing so, the bond-issuer is left with the same optionalities it would have by not exercising the option to call, but with the additional money-shower, $P(\tau,T) - K$. It is, therefore, in the interest of the bond-issuer to exercise at $\tau$, whenever the difference $P(\tau,T) - K$ is positive and sufficiently large, and it is obviously not otherwise. Naturally, we consider re-issuance of non-callable debt because we wish to achieve a neat decomposition of the value of callable debt in terms of non-callable debt and an option to call.
How large does the difference $P(\tau,T) - K$ have to be? It is a real option problem of the kind studied in Chapters 4, 8 and 10 of these Lectures. The objective of the firm is to maximize the present value of the money shower at some optimal stopping time $\tau$, viz
\[
\mathcal{C}_t = \sup_{\tau} E_t\left[e^{-\sum_{u=t}^{\tau-1} r_u}\left(P(\tau,T) - K\right)^+\right],
\tag{11.68}
\]
= inf [ ] { : ( ) = }.
which gives rise to a free-exercise boundary,
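We conjecture, accordingly, that the value of callable debt can be decomposed at any time $t$ as the value of a non-callable debt minus the American option price $\mathcal{C}_t$ in Eq. (11.68),
\[
B^c_t = P(t,T) - \mathcal{C}_t.
\tag{11.69}
\]
Indeed, substituting Eq. (11.69) into Eq. (11.67) yields
\[
\begin{aligned}
\mathcal{C}_t &= P(t,T) - e^{-r_t}E_t\left(P(t+1,T) - \mathcal{C}_{t+1}\right) + \max\left\{e^{-r_t}E_t\left(P(t+1,T) - \mathcal{C}_{t+1}\right) - K,\ 0\right\}\\
&= \max\left\{P(t,T) - K,\ e^{-r_t}E_t(\mathcal{C}_{t+1})\right\},
\end{aligned}
\]
where the second equality follows by the martingale property of the bond price rescaled by the money market account, and by rearranging terms. That is,
\[
\mathcal{C}_t = \max\left\{\left(P(t,T) - K\right)^+,\ e^{-r_t}E_t(\mathcal{C}_{t+1})\right\},
\tag{11.70}
\]
confirming Eq. (11.69). In other words, the optimal stopping time for the problem in Eq. (11.67) collapses to that in Eq. (11.68).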
Puttable bonds, instead, are assets that give the holder the right to sell the bonds back to the issuer at some exercise price, either at a fixed maturity date or any fixed date before the expiration. The bondholders would exercise their option to tender the bonds to the issuer when market conditions improve from their perspective, i.e. when interest rates are high enough, so as to make bond prices lower than the exercise price. Therefore, issuing puttable bonds leads to a lower cost of capital, to the extent of the value of the American put option given to the bondholders to tender the bonds at the strike $K$, in analogy with the pricing of callable bonds.
Suppose for example that the price of a non-puttable bond, $P(t,T)$ as usual, lowers to a level sufficiently lower than the strike price $K$, in a sense to be determined in a moment. The bondholders, then, will find it convenient to tender the bonds at $K$, buying conventional bonds at $P(t,T)$, thereby cashing in $K - P(t,T)$, and then wait until maturity. This trade would provide bondholders with a money-shower equal to $K - P(t,T)$, at the exercise date. Alternatively, the bondholders would not exercise, and wait until maturity, in which case they would not receive the profit $K - P(t,T)$, at the exercise date. Therefore, it is optimal to exercise when $K > P(t,T)$, with the price being sufficiently low for some $t$, and it is obviously not otherwise. Therefore, the value of a puttable zero coupon bond, $B^p_t$ say, satisfies:
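\[
B^p_t = \max\left\{K,\ e^{-r_t}E_t(B^p_{t+1})\right\} = e^{-r_t}E_t(B^p_{t+1}) + \max\left\{K - e^{-r_t}E_t(B^p_{t+1}),\ 0\right\}.
\tag{11.71}
\]
Conjecture that,
\[
B^p_t = P(t,T) + \mathcal{P}_t,
\tag{11.72}
\]
where $\mathcal{P}_t$ is the value of an American put on the zero-coupon bond, solution to,
\[
\mathcal{P}_t = \max\left\{\left(K - P(t,T)\right)^+,\ e^{-r_t}E_t(\mathcal{P}_{t+1})\right\}.
\tag{11.73}
\]
Substituting Eq. (11.72) into Eq. (11.71) leaves Eq. (11.73) by arguments similar to those leading to Eq. (11.70).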
Convertible bonds are assets that give the holder the right to convert them into a prespecified number of shares of the firm. Their value at each date when the conversion can take place is $\max\{CV, B\} = B + \max\{CV - B, 0\}$, where $CV$ denotes the conversion value of the bonds, expressed in terms of the value of the firm's shares: issuing convertible bonds now lowers the cost of capital to the extent of the option given to the bondholders to convert the bonds into shares. Convertible bonds can be made callable by the bond-issuers, at a strike $K$. Usually, if the bonds are called, the convertible bondholders have the option to either tender the bonds to the firm, or to convert them. On the other hand, the only reason the bond-issuers might call is that the price of the convertibles is up, compared to the strike price. Therefore, the option to make convertible bonds also callable puts a ceiling to the price of the convertible bonds, given by the exercise price, $K$. Mathematically, in the presence of callability, the value of a convertible bond at each potential conversion date is $\max\{CV, \min\{B, K\}\}$: the option to call back takes away some of the optionality from the bondholders, who are, in effect, forced to convert, as soon as the price increases to a level beyond $K$.
For the previous mechanism to work, the conversion value and the bond price cannot be both continuous processes. Alternatively, we need to think in terms of a discrete time context. Consider the following events triggering the convertible bond-holder to convert after the issuer decides to call. First, the issuer calls as soon as $B > K$, such that $\max\{CV, \min\{B, K\}\} = \max\{CV, K\}$. Then, the convertible bond-holder decides to convert when $CV > K$, regardless of whether $CV$ is larger than $B$. For example, it may be that $CV < B$, in which case conversion would not have taken place without the issuer option to call at $K$. Note that with this option to call, we could well have that:
\[
K < CV < B.
\]
If CV0
are continuous, these inequalities could not hold. If some
0 , and both CV and
point
, it so occurs that
, the issuer would call the bond, and give the bondholder
the option to convert. But the bondholder would not convert as due to continuity, CV would
still be below .
11.7.2 Callable bonds
11.7.2.1 Coping with credit risk
We can price callable bonds through trees in a way that the initial yield curve is fitted without errors, relying on the methodology in this chapter. One issue to take into account is the presence of credit risk. We may proceed as follows.
(i) First, we populate a short-term rate tree through one of the models described in this chapter, say, for example, through the Black, Derman and Toy (1990) model.
(ii) Second, we rely on the implied short-term rate process of the previous step and price a callable coupon bearing bond without default risk. In each node, we compare the strike price $K$ with the rolled-back (ex-coupon) bond value, take the minimum of the two, add the coupon to this minimum and find, then, the market value of the callable bond at the relevant node,
\[
B_s(t) = \min\left\{K,\ e^{-r_s(t)}E(B_{t+1}\,|\,s)\right\} + \text{coupon},
\]
where $B_s(t)$ denotes the value of the callable coupon bearing bond at time $t$ and state $s$, $r_s(t)$ is the short-term rate process at $t$ and state $s$, and $e^{-r_s(t)}E(B_{t+1}\,|\,s)$ is the rolled-back ex-coupon value of the bond at $t$ and state $s$. As usual, this rolled-back value is found recursively, i.e., by discounting the risk-neutral expectation of the future values of the coupon-bearing callable bond value.
(iii) Third, we correct for credit risk. The price of the callable bond in the previous step is likely to exceed the market price, due to credit risk. One then proceeds with adding a constant spread to the short-term rate process in step one. The resulting credit risk-adjusted interest rate tree is used to implement step (ii). If the model-based callable bond is valued less than the market, one re-calibrates the interest rate tree with a lower credit spread, until convergence is achieved by which market and model prices of the callable bonds are the same. The resulting spread is usually referred to as the option-adjusted spread to emphasize it is determined while taking into account the optionalities regarding the callable bonds.
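The iteration in step (iii) amounts to a one-dimensional root search over the spread. A minimal sketch follows, assuming a user-supplied pricing function `price_fn(spread)` that reprices the callable bond on the spread-adjusted tree and is decreasing in the spread. The function name `option_adjusted_spread`, the bisection bounds and the toy one-period pricer are all hypothetical and serve only as an illustration.

```python
def option_adjusted_spread(price_fn, market_price, lo=0.0, hi=0.10, tol=1e-12):
    """Bisection on the constant spread so that price_fn(spread) = market_price.

    Assumes price_fn is decreasing in the spread and that the solution lies
    between lo and hi.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if price_fn(mid) > market_price:
            lo = mid          # model price too high: a larger spread is needed
        else:
            hi = mid          # model price too low: a smaller spread is needed
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Toy pricer: one-period callable bond, coupon 3%, strike 1, risk-free rate 4%
def toy_price(spread):
    return min(1.0, 1.03 / (1 + 0.04 + spread))

oas = option_adjusted_spread(toy_price, market_price=0.98)
```

In practice `price_fn` would wrap the backward induction of step (ii), run on the tree of step (i) shifted by the candidate spread.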
At this point, we may price derivatives written on callable defaultable coupon bearing bonds, for example, options.9 We now illustrate a simple case, relating to the pricing of a callable bond without credit risk.
11.7.2.2 A numerical example: without credit
Assume that the discretely compounded six-month rate, or the short-term rate, evolves over
time according to the tree described in Figure 11.15.
9 Ho and Lee (2004) (Chapter 8, Section 8.3 p. 274-278) contain exercises on the pricing of options on callable bonds relying on
trees such as those described in this section.
t = 0:    r = 3%
t = 0.5:  u: r = 3.5%;  d: r = 2%
t = 1:    uu: r = 4%;  ud: r = 3%;  dd: r = 1.5%
t = 1.5:  uuu: r = 4.75%;  uud: r = 3.5%;  udd: r = 2%;  ddd: r = 1.25%
FIGURE 11.15.
Next, consider a bond expiring in two years, paying off coupon rates of 3% of the principal of \$1 every six months, and callable at any time by the issuer, at par value. Let this bond be labeled BCX. Suppose that the prices of three zero coupon bonds expiring in one year, eighteen months and two years are, respectively, 0.94632, 0.91876 and 0.89166. We can use these market data to calibrate the risk-neutral probabilities of upward movements in the short-term rate implied by the binomial tree in Figure 11.15, provided these risk-neutral probabilities depend only on calendar time $t$, not on the specific state of nature at time $t$.
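We assume that available for trading is also a conventional (i.e. non-callable) bond maturing in two years and paying coupons semiannually, at 3% of the principal of \$1. We wish to calculate the price movements of the non-callable coupon-bearing two year bond. We have, $P^{\$}(0,0.5) = \frac{1}{1.03} = 0.97087$. Furthermore, as regards the zero expiring in one year:
\[
P^{\$}(0,1) = 0.94632 = P^{\$}(0,0.5)\left(q_0\,\frac{1}{1.035} + (1-q_0)\,\frac{1}{1.02}\right),
\]
which solved for $q_0$ delivers $q_0 = 40\%$. As regards the zero expiring in eighteen months,
\[
\begin{aligned}
P^{\$}(0,1.5) = 0.91876 &= P^{\$}(0,0.5)\left(q_0\,P_u(0.5,1.5) + (1-q_0)\,P_d(0.5,1.5)\right)\\
&= 0.97087\left(0.40\,P_u(0.5,1.5) + 0.60\,P_d(0.5,1.5)\right),
\end{aligned}
\]
where
\[
P_u(0.5,1.5) = \frac{1}{1.035}\left(q_1\,\frac{1}{1.04} + (1-q_1)\,\frac{1}{1.03}\right), \qquad
P_d(0.5,1.5) = \frac{1}{1.02}\left(q_1\,\frac{1}{1.03} + (1-q_1)\,\frac{1}{1.015}\right).
\]
Solving for $q_1$ leaves $q_1 = 70\%$. Finally, as regards the zero expiring in two years,
\[
\begin{aligned}
P^{\$}(0,2) = 0.89166 &= P^{\$}(0,0.5)\left(q_0\,P_u(0.5,2) + (1-q_0)\,P_d(0.5,2)\right)\\
&= 0.97087\left(0.40\,P_u(0.5,2) + 0.60\,P_d(0.5,2)\right),
\end{aligned}
\]
where
\[
P_u(0.5,2) = \frac{1}{1.035}\left(q_1\,P_{uu}(1,2) + (1-q_1)\,P_{ud}(1,2)\right), \qquad
P_d(0.5,2) = \frac{1}{1.02}\left(q_1\,P_{ud}(1,2) + (1-q_1)\,P_{dd}(1,2)\right),
\]
and:
\[
\begin{aligned}
P_{uu}(1,2) &= \frac{1}{1.04}\left(q_2\,\frac{1}{1.0475} + (1-q_2)\,\frac{1}{1.035}\right),\\
P_{ud}(1,2) &= \frac{1}{1.03}\left(q_2\,\frac{1}{1.035} + (1-q_2)\,\frac{1}{1.02}\right),\\
P_{dd}(1,2) &= \frac{1}{1.015}\left(q_2\,\frac{1}{1.02} + (1-q_2)\,\frac{1}{1.0125}\right),
\end{aligned}
\tag{11.74}
\]
which solved for $q_2$ delivers $q_2 = 60\%$.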
Given the market data and the previously calibrated risk-neutral probabilities, we now proceed with the calculation of the price of the callable coupon bearing bond. We discount the expected cash flows, through the evaluation formula, $\min\{B, 1\} + 0.03$, where $B$ is the present value of the future expected discounted cash flows promised at each node by a callable bond with the same strike price $K$. We have:
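(i) At $t = 1.5$ years,
uuu: $\frac{1.03}{1.0475} = 0.98329$ vs 1: wait, and the value of the callable bond is 0.98329.
uud: $\frac{1.03}{1.035} = 0.99517$ vs 1: wait, and the value of the callable bond is 0.99517.
udd: $\frac{1.03}{1.02}$ vs 1: exercise, and the value of the callable bond is 1.
ddd: $\frac{1.03}{1.0125}$ vs 1: exercise, and the value of the callable bond is 1.
(ii) At $t = 1$ year,
uu: $\frac{1}{1.04}\left[0.6\left(\frac{1.03}{1.0475} + 0.03\right) + 0.4\left(\frac{1.03}{1.035} + 0.03\right)\right] = 0.97889$ vs 1: wait, and the value of the callable bond is 0.97889.
ud: $\frac{1}{1.03}\left[0.6\left(\frac{1.03}{1.035} + 0.03\right) + 0.4\,(1 + 0.03)\right] = 0.99719$ vs 1: wait, and the value of the callable bond is 0.99719.
dd: $\frac{1}{1.015}\left[0.6\,(1 + 0.03) + 0.4\,(1 + 0.03)\right] = 1.0148$ vs 1: exercise, and the value of the callable bond is 1.
(iii) At $t = 0.5$ years,
u: $\frac{1}{1.035}\left[(0.7\times 0.97889 + 0.3\times 0.99719) + 0.03\right] = 0.98008$ vs 1: wait, and the value of the callable bond is 0.98008.
d: $\frac{1}{1.02}\left[(0.7\times 0.99719 + 0.3\times 1) + 0.03\right] = 1.0079$ vs 1: exercise, and the value of the callable bond is 1.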
Finally, at the time of evaluation, we have that $q_0 = 40\%$, and, then, the price of the callable bond is:
\[
B = \frac{1}{1.03}\left(0.40\times 0.98008 + 0.60\times 1 + 0.03\right) = 0.99226.
\]
Naturally, the callable bond is valued less than the conventional bond in Eq. (11.74): the difference is the value of the option given to the issuer to redeem these bonds, and arises when the interest rates go sufficiently down: negative convexity.
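The backward induction for the BCX bond can be reproduced as follows. This is a sketch under the conventions of the example: six-month discretely compounded rates as in Figure 11.15 (with 2% at node udd, as used in the calculations above), nodes indexed by the number of upward movements, and the evaluation formula $\min\{B, 1\} + 0.03$ at every intermediate date. The function name and argument layout are assumptions.

```python
def callable_bond_price(rates, q, coupon=0.03, strike=1.0, face=1.0):
    """rates[n][j]: six-month rate at step n after j upward movements.
    q[n]: risk-neutral probability of an upward movement from step n to n+1.
    Returns the time-zero price of the callable coupon bearing bond."""
    last = len(rates) - 1
    # Rolled-back (ex-coupon) values one period before maturity
    rolled = [(face + coupon) / (1 + r) for r in rates[last]]
    for n in range(last - 1, -1, -1):
        # Value at step n+1: called at the strike if cheaper for the issuer, plus coupon
        cum = [min(b, strike) + coupon for b in rolled]
        rolled = [(q[n] * cum[j + 1] + (1 - q[n]) * cum[j]) / (1 + rates[n][j])
                  for j in range(n + 1)]
    return rolled[0]

rates = [[0.03],
         [0.02, 0.035],
         [0.015, 0.03, 0.04],
         [0.0125, 0.02, 0.035, 0.0475]]
price_40 = callable_bond_price(rates, q=[0.4, 0.7, 0.6])
price_80 = callable_bond_price(rates, q=[0.8, 0.7, 0.6])
```

With a first-period probability of 40% the sketch returns the price computed above; replacing it with 80% lowers the price, since more weight is given to the state where rates go up.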
How would we proceed to price the BCX bond if the previous market data were unavailable? In particular, suppose that (i) the risk-neutral probabilities of upward movements in the short-term rate are: (i.a) unknown from time zero to 0.5 years; (i.b) 70%, from 0.5 to one year; and (i.c) 60%, from one to 1.5 years; (ii) available for trading is a European call option written on the BCX bond; (iii) this option, which quotes for $\$1.7226\times 10^{-3}$, expires in 1.5 years, is struck at \$0.99000, and becomes worthless as soon as the underlying callable bond is called back by the issuer. First, note that at the expiration, $t = 1.5$ years, the payoffs of the option are:
\[
C_{uuu} = 0, \qquad C_{uud} = 0.00517, \qquad C_{udd} = C_{ddd} = 0.00000.
\]
At $t = 1$ year,
\[
C_{uu} = \frac{1}{1.04}\,(0.6\times 0 + 0.4\times 0.00517) = 1.9885\times 10^{-3}, \qquad
C_{ud} = \frac{1}{1.03}\,(0.6\times 0.00517 + 0.4\times 0) = 3.0117\times 10^{-3}, \qquad
C_{dd} = 0.
\]
At $t = 0.5$ years,
\[
C_u = \frac{1}{1.035}\,(0.70\,C_{uu} + 0.30\,C_{ud}) = \frac{1}{1.035}\left(0.70\times 1.9885\times 10^{-3} + 0.30\times 3.0117\times 10^{-3}\right) = 2.2178\times 10^{-3},
\]
and $C_d = 0$, by the sudden death assumption. Therefore,
\[
1.7226\times 10^{-3} = \frac{1}{1.03}\left(q\,C_u + (1-q)\,C_d\right) = \frac{1}{1.03}\,q\times 2.2178\times 10^{-3},
\]
where $q$ is the risk-neutral probability of an upward movement in the short-term rate during the first six months. We can solve for this $q$, obtaining $q = 80\%$. Finally, given this probability, we can calculate the price of the callable bond. We have:
\[
B = \frac{1}{1.03}\left(0.80\times 0.98008 + 0.20\times 1 + 0.03\right) = 0.98453.
\]
It is lower than the price calculated earlier, because the price of the option is giving more weight (80%) than before (40%) to the occurrence of the state of the world where the interest rate goes up.
11.7.3 Convertible bonds
11.7.3.1 Evaluation issues
Consider a three year convertible bond, which can be converted at any time into one share of the underlying firm's stock. The bond has a face value equal to 1, it is default-free, and pays off a coupon of 3% of the face value every year, except the time at which it is issued. Moreover, in each period, it pays off the coupon, regardless of whether it will be converted or not. The price of the share is assumed to be unaffected by any decision relating to the conversion of the bond, and evolves over time as described by the following tree:
t = 0:  S = 1.00
t = 1:  u: S = 1.10;  d: S = 0.90
t = 2:  uu: S = 1.20;  ud: S = 1.00;  dd: S = 0.80
t = 3:  uuu: S = 1.30;  uud: S = 1.10;  udd: S = 0.90;  ddd: S = 0.70
In the previous diagram, each period corresponds to one year, $S$ denotes the price of the share, and $q = \frac12$ is the constant risk-neutral probability of price movements. Assume, finally, that the yield curve is flat at 3%, discretely compounded, and that it will remain such over the next three periods, and in each state of the world.
We proceed to calculate the conversion value of the convertible bond at each node of the tree. We shall identify, then, the nodes where it is optimal for the bond-holder to convert. Finally, we shall determine the value of the convertible bond at time $t = 0$, as well as the value of the option to convert. As for the conversion value, we know this is simply the product of the conversion ratio times the current value of the outstanding stock, and equals $CV = CR\times S = S$, as the conversion ratio is one. To find the current value of the convertible bond, we proceed recursively, as explained earlier, and calculate, for each date and each node, $\max\{CV, B\}$, where $B$ denotes the present value of the future cash flows of the convertible, in case of no conversion at time $t$. The payoffs at time $t = 3$ are:
uuu: $CV = S = 1.30$, $\max\{CV, 1\} + 0.03 = 1.33$
uud: $CV = S = 1.10$, $\max\{CV, 1\} + 0.03 = 1.13$
udd: $CV = S = 0.90$, $\max\{CV, 1\} + 0.03 = 1.03$
ddd: $CV = S = 0.70$, $\max\{CV, 1\} + 0.03 = 1.03$
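We have:
$t = 2$
uu: $CV = S = 1.20$, $\max\{CV, B_1\} + 0.03$, $B_1 = \frac{1}{1.03}\,\frac12\,(1.33 + 1.13) = 1.19420$. Hence the value is 1.23000: convert.
ud: $CV = S = 1.00$, $\max\{CV, B_2\} + 0.03$, $B_2 = \frac{1}{1.03}\,\frac12\,(1.13 + 1.03) = 1.04850$. Hence the value is 1.07850.
dd: $CV = S = 0.80$, $\max\{CV, B_3\} + 0.03$, $B_3 = \frac{1}{1.03}\,\frac12\,(1.03 + 1.03) = 1$. Hence the value is 1.03000.
$t = 1$
u: $CV = S = 1.10$, $\max\{CV, B_1\} + 0.03$, $B_1 = \frac{1}{1.03}\,\frac12\,(1.23000 + 1.07850) = 1.1206$. Hence the value is 1.15060.
d: $CV = S = 0.90$, $\max\{CV, B_2\} + 0.03$, $B_2 = \frac{1}{1.03}\,\frac12\,(1.07850 + 1.03000) = 1.0235$. Hence the value is 1.0535.
$t = 0$
\[
B = \frac{1}{1.03}\,\frac12\,(1.15060 + 1.0535) = 1.0700.
\]
The value of the straight, non-convertible bond is
\[
0.03\left[\frac{1}{1.03} + \left(\frac{1}{1.03}\right)^{2} + \left(\frac{1}{1.03}\right)^{3}\right] + \left(\frac{1}{1.03}\right)^{3} = 1.00000.
\]
Therefore, the option to convert is worth 0.07000.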
Next, assume that the convertible bond is also callable by the issuer, at any time, and at a strike value of 1.02000, and if it is called, the bond-holder has the option to tender the bond or to convert it into one share. This convertible, and callable, bond can be evaluated as in the previous calculations, although the formula to use in each node is, now, $\max\{CV, \min\{B, K\}\}$, with $K = 1.02000$. The payoffs at time $t = 3$ are, now:
uuu: $CV = S = 1.30$, $\max\{CV, \min\{B = 1, 1.02\}\} + 0.03 = 1.33$
uud: $CV = S = 1.10$, $\max\{CV, \min\{B = 1, 1.02\}\} + 0.03 = 1.13$
udd: $CV = S = 0.90$, $\max\{CV, \min\{B = 1, 1.02\}\} + 0.03 = 1.03$
ddd: $CV = S = 0.70$, $\max\{CV, \min\{B = 1, 1.02\}\} + 0.03 = 1.03$
We have:
$t = 2$
uu: $CV = S = 1.20$, $\max\{CV, \min\{B_1, 1.02\}\} + 0.03$, $B_1 = \frac{1}{1.03}\,\frac12\,(1.33 + 1.13) = 1.19420$. Hence the value is $\max\{CV, 1.02\} + 0.03 = 1.23$. The bond is called, and then converted.
ud: $CV = S = 1.00$, $\max\{CV, \min\{B_2, 1.02\}\} + 0.03$, $B_2 = \frac{1}{1.03}\,\frac12\,(1.13 + 1.03) = 1.04850$. Hence the value is $\max\{CV, 1.02\} + 0.03 = 1.05$. The bond is called, but not converted.
dd: $CV = S = 0.80$, $\max\{CV, \min\{B_3, 1.02\}\} + 0.03$, $B_3 = \frac{1}{1.03}\,\frac12\,(1.03 + 1.03) = 1$. Hence the value is $\max\{CV, 1\} + 0.03 = 1.03000$.
$t = 1$
u: $CV = S = 1.10$, $\max\{CV, \min\{B_1, 1.02\}\} + 0.03$, $B_1 = \frac{1}{1.03}\,\frac12\,(1.23 + 1.05) = 1.1068$. Hence the value is $\max\{CV, 1.02\} + 0.03 = 1.13000$. The bond is called, and then converted.
d: $CV = S = 0.90$, $\max\{CV, \min\{B_2, 1.02\}\} + 0.03$, $B_2 = \frac{1}{1.03}\,\frac12\,(1.050 + 1.03000) = 1.0097$. Hence the value is $\max\{CV, 1.0097\} + 0.03 = 1.0397$. The bond is not called and is not converted.
$t = 0$
\[
B = \frac{1}{1.03}\,\frac12\,(1.13000 + 1.0397) = 1.0533.
\]
As expected, the value of a convertible callable is less than that of the convertible, due to the option given to the bond-issuers to call the bond.
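Both convertible valuations can be checked with a short backward induction. The sketch below follows the node-by-node rule used above, $\max\{CV, \min\{B, K\}\} + \text{coupon}$, with the call feature switched off when no strike is passed; the function name `convertible_value` and its arguments are assumptions made for illustration.

```python
def convertible_value(stock, q, rate, coupon, face=1.0, call_strike=None):
    """stock[n][j]: share price at year n after j upward movements (conversion ratio 1).
    Returns the time-zero value of the convertible (callable if call_strike is set)."""
    last = len(stock) - 1
    # At maturity: convert or receive the face value, plus the coupon
    values = [max(s, face) + coupon for s in stock[last]]
    for n in range(last - 1, 0, -1):
        step = []
        for j in range(n + 1):
            rolled = (q * values[j + 1] + (1 - q) * values[j]) / (1 + rate)
            if call_strike is not None:
                rolled = min(rolled, call_strike)   # the issuer calls when B > K
            step.append(max(stock[n][j], rolled) + coupon)
        values = step
    # No coupon is paid at the issue date
    return (q * values[1] + (1 - q) * values[0]) / (1 + rate)

stock = [[1.00],
         [0.90, 1.10],
         [0.80, 1.00, 1.20],
         [0.70, 0.90, 1.10, 1.30]]
plain = convertible_value(stock, q=0.5, rate=0.03, coupon=0.03)
callable_cv = convertible_value(stock, q=0.5, rate=0.03, coupon=0.03, call_strike=1.02)
conversion_option = plain - 1.0     # the straight 3% coupon bond is worth par at a flat 3% curve
```

The difference between `plain` and `callable_cv` is the value of the issuer's option to call.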
\[
\begin{pmatrix} B(T_1)\\ B(T_2)\\ B(T_3) \end{pmatrix}
=
\begin{pmatrix} c_{11}+1 & 0 & 0\\ c_{21} & c_{22}+1 & 0\\ c_{31} & c_{32} & c_{33}+1 \end{pmatrix}
\begin{pmatrix} P(T_1)\\ P(T_2)\\ P(T_3) \end{pmatrix}
\tag{11A.1}
\]
for some coupons $c_{ij}$. We can use Eq. (11A.1) to invert for the prices of the zeros, $P = C^{-1}B$. There is a mathematically equivalent inversion algorithm, a procedure known as bootstrapping, based on the observation that the prices of the zeros can be solved for recursively. Let $B_n$ be the price of a bond that pays off coupons $c_n$ on dates $T_1, T_2, \ldots, T_n$, and the principal of \$1 at $T_n$. Let $P_n$ be the price of the zero maturing at $T_n$. Then, $P_n$ can be estimated as follows:
\[
P_n = \frac{B_n - c_n\sum_{i=1}^{n-1}P_i}{1 + c_n}, \qquad n = 1, \ldots, N,
\tag{11A.2}
\]
where $N$ is the largest available maturity. It is straightforward to verify this formula using the example $N = 3$ in (11A.1).
To illustrate, suppose the bonds maturing at $T_n$ have fixed coupons, $c_{ni} = c_n$ for all $i$. We define the par yield as in Eq. (11.3), as the fixed sequence $c_n$ such that the price $B_n$ is forced to equal 100%. The following example shows how to use Eq. (11A.2) and extract zeros and, then, reconstruct a discretely compounded yield curve.
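Coupon | Maturity, i | Zero price, P_i | Sum of zeros, P_1 + ... + P_i | Yield curve
-------|-------------|-----------------|-------------------------------|------------
6.00%  | 1           | 0.9434          | 0.9434                        | 6.00%
7.00%  | 2           | 0.8728          | 1.8162                        | 7.04%
8.00%  | 3           | 0.7914          | 2.6076                        | 8.11%
9.50%  | 4           | 0.6870          | 3.2946                        | 9.84%
9.00%  | 5           | 0.6454          | 3.9400                        | 9.15%
10.50% | 6           | 0.5306          | 4.4706                        | 11.14%
11.00% | 7           | 0.4579          | 4.9285                        | 11.81%
11.25% | 8           | 0.4005          | 5.3290                        | 12.12%
11.50% | 9           | 0.3472          | 5.6762                        | 12.47%
11.75% | 10          | 0.2980          |                               | 12.87%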
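The recursion in Eq. (11A.2) is easy to check numerically. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, all bonds are assumed to quote at par (price equal to 1), and the yield curve is annually compounded, which is the reading that reproduces the table entries.

```python
# Illustrative bootstrapping sketch for Eq. (11A.2); names are hypothetical.

def bootstrap_zeros(coupons, prices=None):
    """Recursively extract zero prices P_i from coupon bonds with maturities 1, 2, ..., N.

    coupons[i-1] is the fixed coupon C_i of the bond maturing at i,
    prices[i-1] its price B_i (defaults to par, i.e. 1.0).
    """
    if prices is None:
        prices = [1.0] * len(coupons)
    zeros = []
    for coupon, price in zip(coupons, prices):
        # P_i = (B_i - C_i * sum_{j<i} P_j) / (1 + C_i)
        zeros.append((price - coupon * sum(zeros)) / (1.0 + coupon))
    return zeros


def discretely_compounded_yields(zeros):
    """Annually compounded yield for each maturity i: P_i = (1 + y_i)^(-i)."""
    return [p ** (-1.0 / (i + 1)) - 1.0 for i, p in enumerate(zeros)]


par_coupons = [0.06, 0.07, 0.08, 0.095, 0.09, 0.105, 0.11, 0.1125, 0.115, 0.1175]
zeros = bootstrap_zeros(par_coupons)
yields = discretely_compounded_yields(zeros)

for maturity, (p, y) in enumerate(zip(zeros, yields), start=1):
    print(f"{maturity:2d}  zero = {p:.4f}  yield = {100 * y:.2f}%")
```

Passing observed prices through the `prices` argument handles bonds that do not trade at par.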
Note that Eq. (11A.2) relies on the assumption that no maturities are missing. When some of the maturity dates are not available, the required coupon rate $C_i$ can be replaced with a linear interpolation between $C_{i-1}$ and $C_{i+1}$, as follows,
$$
C_i = \frac{T_{i+1} - T_i}{T_{i+1} - T_{i-1}}\, C_{i-1} + \frac{T_i - T_{i-1}}{T_{i+1} - T_{i-1}}\, C_{i+1},
$$
where $T_i$ denotes the $i$-th maturity date. The effects of the interpolation should be visible near the missing maturities.
B. Splines
Alternatives to bootstrapping are techniques that aim to cope with situations where the number of bonds is less than the number of maturity dates we want to fit. Suppose we observe $N$ bonds, where the $i$-th bond entitles
, for
(
= 1
)=
=1
)+ ,
= 1
1
1
+ 3
(11A.3)
( )= 1+ 2
and
and are parameters. We provide interpretations of this parametrization in the next chapter,
where we also explain how this has been used in practice to forecast the yield curve.
C. No-arbitrage restrictions
Bond prices need to satisfy restrictions that prevent arbitrage. We use figures taken from Tuckman (2002) (p. 8-12), and illustrate how an arbitrage opportunity can arise and be exploited in this context.
Data
Suppose that on some hypothetical date, say February 15, 2009, we observe Set I of bond prices in the left panel of the following table, and that bootstrapping leads to the implicit zeros in the corresponding right panel of the table. We also assume that we observe additional bond prices, those in Set II in the lower part of the table. Are these additional prices compatible with those in Set I, in terms of arbitrage opportunities?
Bootstrapped zeros

Time to maturity | Implicit zero
-----------------|---------------------
0.5              | P(0, 0.5) = 0.97557
1.0              | P(0, 1.0) = 0.95247
1.5              | P(0, 1.5) = 0.93045
2.0              | P(0, 2.0) = 0.90796
2.5              | P(0, 2.5) = 0.88630
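Let $C$ be the $N \times N$ matrix of coupons, where each row of $C$ gives the stream of the coupons promised by a given asset. We know that the $N \times 1$ vector of zeros, $P$, satisfies $B = CP$, where $B$ is the vector of the asset prices. That is, assuming that the matrix $C$ is invertible,
$$
P = C^{-1} B. \tag{11A.4}
$$
Next, suppose there exists some asset that: (i) promises to pay:
$$
c = [\, c_1 \;\; c_2 \;\; \cdots \;\; c_N + 100 \,]^\top,
$$
and (ii) has a price, $B^c$, such that:
$$
B^c < c^\top P. \tag{11A.5}
$$
This asset can be synthesized through a portfolio of the $N$ assets, which solves:
$$
C^\top \alpha = c, \tag{11A.6}
$$
where the vector of unknowns, $\alpha$, contains the number of assets in the synthesizing portfolio: by purchasing the portfolio $\alpha$, one is entitled to receive $C^\top \alpha$ in the future, which we want to equal $c$. The solution to Eq. (11A.6) is:
$$
\alpha = (C^\top)^{-1} c. \tag{11A.7}
$$
Accordingly, the value of this portfolio, $V$ say, is given by, $V = \alpha^\top B = c^\top C^{-1} B = c^\top P > B^c$, where the last equality follows by the zero pricing equation (11A.4), and the inequality holds by the inequality (11A.5).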
To summarize, we now have the following situation: (i) the asset we hold produces the cash flows that are needed to pay out the coupons of the synthesizing portfolio we sold, and (ii) the price of the asset we go long is less than the value of the portfolio we short. This situation is an arbitrage opportunity, as initially claimed. We now use these insights to check whether arbitrage opportunities exist and can be exploited, using the data in Tables 11.1 through 11.3.
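First, we determine no-arbitrage prices of the bonds in Set II, using the implicit zeros extracted from Set I. Denote these prices with $\hat{B}_1$ (for the six month 13.375%), $\hat{B}_2$ (for the two year 10.750%), $\hat{B}_3$ (for the 2.5 year 5.750%), and $\hat{B}_4$ (for the 2.5 year 11.125%). They are given by,
$$
\begin{aligned}
\hat{B}_1 &= \left(\tfrac{13.375}{2} + 100\right) P(0, 0.5) \\
\hat{B}_2 &= \tfrac{10.750}{2}\,[P(0, 0.5) + P(0, 1.0) + P(0, 1.5)] + \left(\tfrac{10.750}{2} + 100\right) P(0, 2.0) \\
\hat{B}_3 &= \tfrac{5.75}{2}\,[P(0, 0.5) + P(0, 1.0) + P(0, 1.5) + P(0, 2.0)] + \left(\tfrac{5.75}{2} + 100\right) P(0, 2.5) \\
\hat{B}_4 &= \tfrac{11.125}{2}\,[P(0, 0.5) + P(0, 1.0) + P(0, 1.5) + P(0, 2.0)] + \left(\tfrac{11.125}{2} + 100\right) P(0, 2.5)
\end{aligned}
$$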
The next table provides the numerical values of these theoretical prices, compared with their market counterparts:

Set II: Treasury Bond prices

Coupon  | Maturity | Market price | No-arb price
--------|----------|--------------|-------------
13.375% | 8/15/09  | 104.080      | 104.080
10.750% | 2/15/11  | 110.938      | 111.041
5.750%  | 8/15/11  | 102.020      | 102.007
11.125% | 8/15/11  | 114.375      | 114.511
While there are no arbitrage opportunities for the 13.375% bond expiring in six months, the price
of the 10.750% bond expiring in 2 years is less than its no-arbitrage price: this bond trades cheap.
In contrast, the 2.5 year 5.750% bond trades rich, although the resulting arbitrage does not seem
to be quite sensible.
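The no-arbitrage prices of the Set II bonds can be reproduced from the bootstrapped zeros with a short Python sketch. The names below are hypothetical, and the bonds are assumed to pay semiannual coupons of half the annual rate per 100 of face value, as in the pricing expressions for the four Set II bonds.

```python
# Illustrative sketch: pricing Set II bonds off the implicit zeros (names are hypothetical).

# Implicit zeros bootstrapped from Set I, keyed by time to maturity in years.
ZEROS = {0.5: 0.97557, 1.0: 0.95247, 1.5: 0.93045, 2.0: 0.90796, 2.5: 0.88630}


def no_arbitrage_price(annual_coupon, maturity):
    """Price per 100 face value: semiannual coupons plus principal, discounted with the zeros."""
    half_coupon = annual_coupon / 2.0
    coupon_dates = sorted(t for t in ZEROS if t <= maturity)
    coupons_value = sum(half_coupon * ZEROS[t] for t in coupon_dates)
    return coupons_value + 100.0 * ZEROS[maturity]


# (annual coupon, time to maturity, market price) for the Set II bonds.
set_ii = [(13.375, 0.5, 104.080), (10.750, 2.0, 110.938),
          (5.750, 2.5, 102.020), (11.125, 2.5, 114.375)]

theoretical = [no_arbitrage_price(c, m) for c, m, _ in set_ii]
for (coupon, maturity, market), model in zip(set_ii, theoretical):
    print(f"{coupon:7.3f}%  {maturity} yr  market = {market:.3f}  no-arb = {model:.3f}")
```

A market price below the model price indicates a bond that trades cheap; a market price above it indicates a bond that trades rich.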
Second step: implementing the arbitrage
We apply the previous insights. We have, $N = 4$, and $c = [\, \tfrac{1}{2}10.750 \;\; \tfrac{1}{2}10.750 \;\; \tfrac{1}{2}10.750 \;\; \tfrac{1}{2}10.750 + 100 \,]^\top$. We use the first four bonds in Set I to construct an arbitrage portfolio. In terms of the coupon matrix $C$, we have,
$$
C =
\begin{pmatrix}
\tfrac{1}{2}7.875 + 100 & 0 & 0 & 0 \\
\tfrac{1}{2}14.250 & \tfrac{1}{2}14.250 + 100 & 0 & 0 \\
\tfrac{1}{2}6.375 & \tfrac{1}{2}6.375 & \tfrac{1}{2}6.375 + 100 & 0 \\
\tfrac{1}{2}6.250 & \tfrac{1}{2}6.250 & \tfrac{1}{2}6.250 & \tfrac{1}{2}6.250 + 100
\end{pmatrix}
$$
We implement the following trade: (i) buy $M$ 10.750% bonds expiring in 2 years, which cost $110.938 \cdot M$; (ii) create $M$ portfolios satisfying Eq. (11A.7),
$$
\alpha^\top = c^\top C^{-1} = [\, 0.0189 \;\; 0.0197 \;\; 0.0212 \;\; 1.0218 \,].
$$
If we short $M$ of these portfolios, then, by construction, the coupons we need to pay are exactly matched by the coupons we receive from the $M$ 10.750% bonds expiring in 2 years. However, the market value of the portfolios we short equals,
$$
\alpha^\top B \cdot M = \alpha^\top [\, 101.40 \;\; 108.98 \;\; 102.16 \;\; 102.57 \,]^\top \cdot M = 111.041 \cdot M,
$$
where the vector of the market prices, $B$, is taken from Set I. Therefore, the gains from this trade are, $(111.041 - 110.938) \cdot M = 0.103 \cdot M$. For example, by trading \$1,000,000 at face value, i.e. $M = 10000$, arbitrage profits equal \$1030.
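The synthesizing portfolio of Eq. (11A.7) can be computed without a linear algebra library, because the coupon matrix is lower triangular. The sketch below is illustrative only: the names are hypothetical, and the Set I coupons (7.875%, 14.250%, 6.375%, 6.250%) and prices are those appearing in the coupon matrix and in the price vector used in this example.

```python
# Illustrative sketch of the synthesizing portfolio in Eq. (11A.7); names are hypothetical.

SET_I_COUPONS = [7.875, 14.250, 6.375, 6.250]   # bonds maturing at 0.5, 1.0, 1.5, 2.0 years
SET_I_PRICES = [101.40, 108.98, 102.16, 102.57]
TARGET_COUPON = 10.750                            # two year bond in Set II
TARGET_PRICE = 110.938


def coupon_matrix(coupons):
    """Row i holds the semiannual cash flows of bond i (principal of 100 at its maturity)."""
    n = len(coupons)
    return [[(c / 2.0 + (100.0 if j == i else 0.0)) if j <= i else 0.0
             for j in range(n)] for i, c in enumerate(coupons)]


def solve_transposed(C, c):
    """Solve C^T alpha = c by back substitution (C lower triangular, so C^T is upper triangular)."""
    n = len(c)
    alpha = [0.0] * n
    for j in reversed(range(n)):
        already_paid = sum(C[i][j] * alpha[i] for i in range(j + 1, n))
        alpha[j] = (c[j] - already_paid) / C[j][j]
    return alpha


C = coupon_matrix(SET_I_COUPONS)
c = [TARGET_COUPON / 2.0] * 3 + [TARGET_COUPON / 2.0 + 100.0]
alpha = solve_transposed(C, c)
portfolio_value = sum(a * b for a, b in zip(alpha, SET_I_PRICES))
gain_per_100_face = portfolio_value - TARGET_PRICE

print([round(a, 4) for a in alpha], round(portfolio_value, 3), round(gain_per_100_face, 3))
```

Small differences from the figures quoted in the text are to be expected, since the text works with rounded portfolio weights and prices.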
)=
2)
In the second period, the value of the portfolio is random, as it depends on the development of the short-term rate, $r$. Precisely, the value of the portfolio in the second period is
( + ) = ( + 2 ) + (1 + ) with probability
() =
(1 + ) with probability 1
( )= (
2) +
We also know that in the second period, the value of the second zero is,
( + 1 ) with probability
( 1 ) =
(
with probability 1
1)
Next, we select and
state of nature, viz
1)
1,
in each
in each state.
Mathematically, this is tantamount to solving the following system of two equations with two unknowns
( and ),
( + ) = ( + 2 ) + (1 + ) = ( + 1 )
(11A.8)
+ (1 + ) = (
( )= (
2)
1)
The solution is,
=
(
(
+
+
1)
(
(
2)
1)
2)
1)
[ (
(
+
2)
2)
(
2 )] (1 + )
1)
2)
By construction, the previous portfolio, ( ), replicates the value of the second zero in the second
period. But if two assets (the portfolio, and the second zero) yield the same payoffs in each state of
nature, they must be worth the same, in the absence of arbitrage. Therefore, we must have,
=
2)
+ =
(1 + ) = (1 + )
1)
(1 + )
0(
)|
1)
or,
(
2)
(11A.9)
Next, let us figure out the prediction of the model in terms of the expected return it generates for
the price of the bond maturing at 2 , when (
) = ( ). To do this, multiply the first equation in
(11A.8) by , and multiply the second equation in (11A.8) by 1
. Add the result for =
=
to obtain,
+ (1 + ) =
) (
( + 1 ) + (1
) (
( + 2 ) + (1
2)
1)
Replacing Eq. (11A.9) into the previous equation yields,
( + 2 ) + (1
) (
(1 + ) (
2)
+
=
(
) (
(1 + ) (
1 ) + (1
1)
2)
1)
2 ) + (1
) (
2)
2 )]
(1 + ) (
2)
2)
1 ) + (1
) (
1)
1 )]
(1 + )
1)
1)
The previous equation is easy to interpret. The numerators are the expected excess returns from holding the assets. They equal $E[P(\tilde{r}, T)] - (1 + r) P(r, T)$, where $E[P(\tilde{r}, T)]$ is what the investors expect to receive, the next period, by investing $P(r, T)$ today, in the bond; and $(1 + r) P(r, T)$ is what the investors expect to receive, the next period, by investing $P(r, T)$ today, in the MMA. The denominators constitute a measure of volatility related to holding the assets. The previous equation then tells us that the Sharpe ratios (or the unit risk premiums) on the two zeros agree.
Let the Sharpe ratio on any zero be equal to some function $\lambda$ of the short-term rate only (and possibly of calendar time). This function, $\lambda$, clearly does not depend on the maturity of the zeros.
Then, we have,
) (
(1 + ) ( 2 ) = ( + 2 )
(
( + 2 ) + (1
2)
2)
=
2)
+
2)
[(
) ] (11A.10)
).
)=
Y1
=
1
1+
( )
(
(
)
)
(
(
1
) Y
1
)
1+
( )
=
, we have that,
)=
(
(
1
) Y 1+
)
1+
=
()
( )
(11A.11)
Eq. (11A.11) is a convenient representation of the bond price at a future date : it is the ratio of
the two current prices (
) and ( ), and a factor relating to the development of forward rates
1+ ( )
from the current time to time , i.e. 1+
=
1. Hence, once we model forward
( ) , for
rates, we have implications for bond price movements.
We normalize the time-line and set = 0. Redefining = , Eq. (11A.11) reduces to,
)=
1
(0 ) Y 1 +
(0 )
1+
=
(0)
()
(11A.12)
+1
( + 1) = ln
= ln
= ln
= ln
+1 (
+1 )
+ 1)
+1 ( + 1
( )
ln
(
)
( + 1)
(
)
+ ()
( +1
)
( + 1 ( + 1))
+ (0)
( + 1)
(
(
+ 1)
+ 1)
( +1
[( + 1)
( + 1)] ln
where the first equality and the third follow by the definition of +1 ( ), the second equality
holds by the definition of the jump in Eq. (11.27), the fourth equality follows by Eq. (11.37).
Hence, Eq. (11.37) holds at time + 1 in the occurrence of a positive price jump between time
and time + 1.
( )
= ln (
)
( + 1)
(
)
= ln
+ ()
( +1
)
( + 1) = ln
= ln
)+1
( +1
( +1
)
(
)
+ (0)
= ln
( + 1)
ln
1
)+1
[( + 1)
( +1
+ (0) + ln
(
(
+ 1)
+ 1)
( +1
)
( + 1)
) ln
] ln
where the first four equalities follow by the same arguments produced in Case 1, the fifth
equality holds by the relation ( ) = ( ) ( 1) in Eq. (11.33) and the last equality follows
by rearranging terms. Hence, Eq. (11.37) holds at time + 1 in the occurrence of a negative price
jump between time and time + 1.
References
Bernanke, B. S. and A. Blinder (1992): The Federal Funds Rate and the Channels of Monetary
Transmission. American Economic Review 82, 901-921.
Black, F. and M. Scholes (1973): The Pricing of Options and Corporate Liabilities. Journal
of Political Economy 81, 637-659.
Black, F., E. Derman and W. Toy (1990): A One Factor Model of Interest Rates and its
Application to Treasury Bond Options. Financial Analysts Journal (January-February),
33-39.
Cox, J. C., S. A. Ross and M. Rubinstein (1979): Option Pricing: A Simplified Approach.
Journal of Financial Economics 7, 229-263.
Derman, E. and J. Kani (1994): Riding on a Smile. Risk 7, 32-39.
Dupire, B. (1994): Pricing with a Smile. Risk 7, 18-20.
Heath, D., R. Jarrow and A. Morton (1992): Bond Pricing and the Term-Structure of Interest
Rates: a New Methodology for Contingent Claim Valuation. Econometrica 60, 77-105.
Ho, T. S. Y. and S.-B. Lee (1986): Term Structure Movements and the Pricing of Interest
Rate Contingent Claims. Journal of Finance 41, 1011-1029.
Ho, T. S. Y. and S.-B. Lee (2004): The Oxford Guide to Financial Modeling. Oxford University
Press.
Hull, J. C. (2003): Options, Futures, and Other Derivatives. Prentice Hall. 5th edition (International Edition).
Hull, J. C. and A. White (1990): Pricing Interest Rate Derivative Securities. Review of
Financial Studies 3, 573-592.
McCulloch, J. (1971): Measuring the Term Structure of Interest Rates. Journal of Business
44, 19-31.
McCulloch, J. (1975): The Tax-Adjusted Yield Curve. Journal of Finance 30, 811-830.
Nelson, C.R. and A.F. Siegel (1987): Parsimonious Modeling of Yield Curves. Journal of
Business 60, 473-489.
Rubinstein, M. (1994): Implied Binomial Trees. Journal of Finance 49, 771-818.
Tuckman, B. (2002): Fixed Income Securities. Wiley Finance.
Vasicek, O. (1977): An Equilibrium Characterization of the Term Structure. Journal of
Financial Economics 5, 177-188.
12
Interest rates
12.1 Introduction
This chapter surveys empirical facts and models regarding the term-structure of interest rates
and derivatives based thereon. It differs from the previous introductory chapter, as we now
largely rely on continuous-time methods while providing a systematic approach to a variety
of important topics. These topics range from the stylized facts such as the factors driving the
yield curve, their business cycle components, or bond returns predictability and volatility, to
more conceptual aspects regarding how we would need to think about duration in a random
environment, or the pricing details of interest rate derivatives such as bond options, puttable
and callable bonds, swaps, caps, floors, or swaptions, to mention a few.
We know from previous chapters that an important objective arising while pricing derivatives
is that we make sure that the price of the underlying assets is pinned down without errors. When
it comes to interest rate derivatives, this task is more challenging, because the yield curve relies
on risks that are typically not traded. Consider, for example, a model where the price of a zero
coupon bond is only driven by random movements of the short-term rate, i.e. a one-factor model.
Let $P(r, \tau, T)$ be the price at time $\tau$ of a zero coupon bond expiring at time $T$, when the
short-term rate is $r$. The exact functional form of the pricing function $P(r, \tau, T)$ depends on
the assumed dynamics of the short-term rate and the market risk-appetite. Models of this kind,
and generalizations to multi-state variables, are known as models of the short-term rate, and
are discussed in Section 12.4.
Models of the short-term rate are very important because once they are made complex enough
to address the main facts we see in the data, they might perform a series of tasks. For example,
they can be used to forecast developments in fixed income markets. They could also be used for
trading purposes should they reasonably point to market inefficiencies. Note that these models
lead to pricing errors, and it is actually the presence of these errors that justifies their potential
use for trading purposes.
A second class of models that does not lead to pricing errors is that developed by Heath,
Jarrow and Morton (1992), which generalizes the Ho and Lee (1986) model described in the
previous chapter. In the previous chapter, we have seen tree-based instances of additional no-arbitrage models. This chapter provides a systematic continuous time treatment (in Sections
12.6 and 12.7). A principle underlying these models is that current bond prices need not be
modeled in the first place. Rather, current bond prices are taken as primitives, with the modeling
focus being shifted to the ongoing development of forward rates, i.e. interest rates prevailing
today for borrowing in the future. There is a relation linking bond prices to forward rates. No
arbitrage then restricts the joint behavior of future bond prices and forward rates. We shall
emphasize the use of these models to price derivatives.
In more detail, the plan of this chapter is as follows. The next section provides definitions of
interest rates and markets, and foundational issues regarding two basic representations of bond
prices: one in terms of the short-term rate and another in terms of forward rates. Section 12.3
contains an introduction to a number of very important empirical topics, such as bond return
predictability, or the relation between the yield curve and the business cycle. Sections 12.4
and 12.5 deal with models of the short-term rate, and Section 12.6 with their perfectly fitting
extensions, i.e. the extensions that make these models fit the initial yield curve without errors.
Section 12.7 contains a treatment of models that fit the yield curve, based on the Heath, Jarrow
and Morton (1992) framework. Section 12.8 is an introduction to the main interest rate derivatives, and describes how these assets can be priced relying on the models of the short-term rate
(and their perfectly fitting extensions). Section 12.9 provides an alternative pricing framework,
known as market model, whereby derivatives are evaluated through Black's (1976) pricers.
A number of appendixes provide technical details omitted from the main text. This chapter
does not assume default risk except in Section 12.4.6. Default risk is, instead, systematically
dealt with in the next chapter.
Let $P(\tau, T)$ be the price at $\tau$ of a zero coupon bond expiring at $T$, and consider the discretely
compounded interest rate for the time interval $[\tau, T]$ introduced in Section 11.2.2.1 of the
previous chapter, and defined as:
$$
P(\tau, T) = \frac{1}{1 + (T - \tau)\, L(\tau, T)}. \tag{12.1}
$$
Given $L(\tau, T)$, the short-term rate process $r$ is obtained as:
$$
r(\tau) \equiv \lim_{T \downarrow \tau} L(\tau, T).
$$
Next, let $Q$ be a risk-neutral probability, and $\mathbb{E}_\tau(\cdot)$ denote the time $\tau$ conditional expectation
under $Q$. By the FTAP, there is no arbitrage if and only if $P(\tau, T)$ satisfies, for all $\tau \in [t, T]$,
$$
P(\tau, T) = \mathbb{E}_\tau\left[ e^{-\int_\tau^T r(u)\,du} \right]. \tag{12.2}
$$
The proof of Eq. (12.2) relies on arguments that are now quite standard in these lectures, but
its "if" part is provided again in Appendix 1, because it highlights a few key hedging arguments
that underlie it.
Given a set of dates $\{T_i\}_{i=0}^{N}$, a fixed coupon bond pays off a known coupon stream, i.e. $C_i$ at $T_i$,
$i = 1, \ldots, N$, and \$1 at $T_N$; typically, the coupon paid at $T_i$ compensates for the time-interval
$T_i - T_{i-1}$. By the FTAP, the value of a fixed coupon bond is
fcb
)=
)+
=1
)=
(12.3)
, and where the second equality follows by Eq. (12.1). By the FTAP, the
as of time of a oating rate bond is:
+1
frb
frb
()=
)+
=1
)+
=1
)+
1)
=1
0)
=1
=1
(12.4)
where the second line follows by Eq. (12.3) and the third line will be justified in a moment
(see Eq. (12.5) below). That is, a floating rate bond would quote at par at its first reset date,
$P_{\text{frb}}(T_0) = P(T_0, T_0) = 100\%$.¹
Regarding the third line of Eq. (12.4),
$$
P(t, T_{i-1}) = \mathbb{E}_t\left[ \frac{e^{-\int_t^{T_i} r(u)\,du}}{P(T_{i-1}, T_i)} \right], \tag{12.5}
$$
consider the following economic interpretation. Suppose that at time $t$, \$$P(t, T_{i-1})$ are invested in
a bond maturing at time $T_{i-1}$. At time $T_{i-1}$, this investment will obviously pay off \$1. And at time
$T_{i-1}$, \$1 can be further rolled over another bond maturing at time $T_i$, thus yielding \$$1/P(T_{i-1}, T_i)$
at time $T_i$. In other words, an investment at $t$ equal to \$$P(t, T_{i-1})$ leads to a payoff at $T_i$ equal
to \$$1/P(T_{i-1}, T_i)$, whence Eq. (12.5).²
1 This
property also holds in a market where the oating rates continuously pay o the instantaneous short-term rate . Indeed,
= , and let frb is solution to the partial di erential equation (12.87) in Section 12.6, with ( ) = , and boundary condition
frb ( ) = 1. Then, it can be veried that frb = 1 is solution to Eq. (12.87).
2 Mathematically, we have, by the Law of Iterated Expectations, that
let
( )
( )
=E
( )
F( )
Forward rates are interest rates that make the value of a forward rate agreement (FRA, henceforth) equal to zero at origination. Section 11.2.2.3 of the previous chapter provides the definition
of a forward rate agreement, which we re-state below for reasons clarified in a moment. Forward rates as of time $t$, for a forward rate agreement relating to a future time-interval $[T, S]$, are
denoted with $F(t, T, S)$, and link to bond prices through a precise relation, derived in Section
11.2.2.3 of the previous chapter:
$$
\frac{P(t, T)}{P(t, S)} = 1 + (S - T)\, F(t, T, S). \tag{12.6}
$$
Clearly, the forward rate agreed at $T$ for the time-interval $[T, S]$ equals the discretely compounded interest rate relating
to the same period:
$$
F(T, T, S) = L(T, S). \tag{12.7}
$$
Consider, next, a more general FRA, where a first counterparty agrees: (i) to pay an interest
rate on a given principal at time $S$, fixed at some $K \neq F(t, T, S)$, and (ii) to receive, in
exchange, the future interest rate prevailing at time $T$ for the time interval $[T, S]$, $L(T, S)$,
from a second counterparty. The time $S$ payoff originated by this forward starting interest
rate swap is:
$$
(S - T)\,[L(T, S) - K]. \tag{12.8}
$$
It is the same as the P&L to a party who enters a FRA at time $t$ (with no costs) for the
time-interval $[T, S]$, as a future borrower. Come time $T$, the party shall honour the FRA by
borrowing \$1 for the time-interval $[T, S]$ at a cost of $K$. At the same time, the party can lend
this very same \$1 at the random interest rate $L(T, S)$. The time $S$ payoff deriving from this
trade is, of course, the same as that in Eq. (12.8).
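As a numerical illustration of Eqs. (12.6) and (12.8), the following Python sketch computes a forward rate from two zero prices and checks that a FRA struck at that rate is worth zero at origination. The function names are hypothetical. The present value formula used in `fra_value_today` is not taken from the text: it is the standard replication result that a claim paying $(S - T)L(T, S)$ at $S$ is worth $P(t, T) - P(t, S)$ at $t$. The two zero prices are the six month and one year implicit zeros from the bootstrapping example of the previous chapter.

```python
# Illustrative sketch of forward rates and FRA payoffs (names are hypothetical).

def forward_rate(p_T, p_S, T, S):
    """Simply compounded forward rate F(t,T,S) implied by Eq. (12.6)."""
    return (p_T / p_S - 1.0) / (S - T)


def fra_payoff(libor_TS, strike, T, S):
    """Time-S payoff of Eq. (12.8): receive L(T,S), pay the fixed rate K, on a unit principal."""
    return (S - T) * (libor_TS - strike)


def fra_value_today(p_T, p_S, strike, T, S):
    """Time-t value of the payoff in Eq. (12.8), by the standard replication argument (assumption)."""
    return p_T - (1.0 + (S - T) * strike) * p_S


P_HALF_YEAR, P_ONE_YEAR = 0.97557, 0.95247   # P(0, 0.5) and P(0, 1.0)
F = forward_rate(P_HALF_YEAR, P_ONE_YEAR, 0.5, 1.0)

print(f"F(0, 0.5, 1.0) = {100 * F:.3f}%")
print("value at origination:", fra_value_today(P_HALF_YEAR, P_ONE_YEAR, F, 0.5, 1.0))
print("payoff if L(0.5, 1.0) = 5.5%:", fra_payoff(0.055, F, 0.5, 1.0))
```

Striking the FRA at any $K$ above the forward rate makes its value negative for the party receiving the floating rate, and positive for the counterparty.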
12.2.3 A second representation of bond prices
12.2.3.1 Prices as forward looking indicators
Bond prices can be expressed in terms of these forward interest rates, namely in terms of the
instantaneous forward rates. First, rearrange terms in Eq. (12.6) so as to obtain:
$$
F(t, T, S) = -\frac{P(t, S) - P(t, T)}{(S - T)\, P(t, S)}.
$$
The instantaneous forward rate $f(t, T)$ is defined as
$$
f(t, T) \equiv \lim_{S \downarrow T} F(t, T, S) = -\frac{\partial \ln P(t, T)}{\partial T}. \tag{12.9}
$$
It can be interpreted as the marginal rate of return from committing a bond investment for an
additional instant. To express bond prices in terms of $f$, integrate Eq. (12.9), $f(t, u) = -\frac{\partial \ln P(t, u)}{\partial u}$,
with respect to the maturity date $u$, use the condition that $P(t, t) = 1$, and obtain:
$$
P(t, T) = e^{-\int_t^T f(t, u)\,du}. \tag{12.10}
$$
Eq. (12.10) suggests a natural modeling approach of the yield curve that emphasizes the
dynamics of the forward rates, dealt with in detail in Sections 12.5 and 12.6.
Consider the yield-to-maturity introduced in Section 11.2.2.2 of the previous chapter, defined
to be the function $R(t, T)$ such that:
$$
P(t, T) \equiv e^{-(T - t)\, R(t, T)}. \tag{12.11}
$$
Comparing Eq. (12.10) with Eq. (12.11) leaves:
$$
R(t, T) = \frac{1}{T - t} \int_t^T f(t, u)\,du, \tag{12.12}
$$
and differentiating with respect to $T$ yields:
$$
\frac{\partial R(t, T)}{\partial T} = \frac{1}{T - t}\,[f(t, T) - R(t, T)]. \tag{12.13}
$$
Eq. (12.13) underscores the marginal nature of forward rates: the yield curve, $R(t, T)$, is
increasing in, decreasing in, or stationary at $T$, according to whether $f(t, T)$ exceeds, is lower than,
or equals the spot rate for maturity $T$. It is a simple but powerful result: it is intuitive that
forward rates should summarize information about the market expectations regarding future
interest rates, as explained in Section 12.3.1. For example, an increasing yield curve would
signal high rates to be expected in the future, under conditions explained in Section 12.3.1.
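The relations between bond prices, yields and instantaneous forward rates in Eqs. (12.9) through (12.13) can be checked numerically. The Python sketch below is illustrative only: it assumes a hypothetical linear yield curve $R(0, T) = a + bT$ (the parameters and all function names are made up for the example), computes forward rates by finite differences, and verifies that the three representations agree.

```python
import math

# Hypothetical yield curve R(0,T) = A + B*T, chosen only for illustration.
A, B = 0.03, 0.004


def yield_curve(T):
    return A + B * T


def zero_price(T):
    """Eq. (12.11) with t = 0: P(0,T) = exp(-T * R(0,T))."""
    return math.exp(-T * yield_curve(T))


def inst_forward(T, h=1e-5):
    """Eq. (12.9): f(0,T) = -d ln P(0,T) / dT, by a central finite difference."""
    return -(math.log(zero_price(T + h)) - math.log(zero_price(T - h))) / (2.0 * h)


def price_from_forwards(T, steps=2000):
    """Eq. (12.10): P(0,T) = exp(-integral of f(0,u) du), using the midpoint rule."""
    du = T / steps
    integral = sum(inst_forward((k + 0.5) * du) for k in range(steps)) * du
    return math.exp(-integral)


def yield_slope(T, h=1e-5):
    """Left-hand side of Eq. (12.13): dR(0,T)/dT, by a central finite difference."""
    return (yield_curve(T + h) - yield_curve(T - h)) / (2.0 * h)


T = 5.0
print("f(0,5)          =", inst_forward(T))
print("P from yields   =", zero_price(T))
print("P from forwards =", price_from_forwards(T))
print("dR/dT           =", yield_slope(T))
print("(f - R)/T       =", (inst_forward(T) - yield_curve(T)) / T)
```

With an upward sloping curve (a positive `B`), the forward rate lies above the yield at every maturity, as Eq. (12.13) requires.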
( )
)]
(12.16)
Note a very straightforward conclusion we can draw from Eq. (12.16): short-term rates are expected
to be high at some future date $T$ (compared to the corresponding yield) if, and only if, the
yield curve is increasing in $T$. One implication is that an increasing yield curve (i.e. one where
$r(t)$ is always less than $R(t, T)$) signals the expectation that future short-term rates will increase
and, similarly, an inverted yield curve signals the expectation that future short-term rates will
fall.
For example, in December 2012, the yield curve in the United Kingdom was substantially flat
at less than 0.5% for maturities of less than three years, and then steadily increased to 2% for
maturities up to ten years. Taken at face value, the expectation hypothesis would simply imply
that in December 2012, the market did not expect UK short-term rates to move over the following
three years. Naturally, the market reasoning could have changed as new information arrived.
A natural question arises as to whether in the data the forward rate for maturity $T$ is higher
than the short-term rate expected to prevail at time $T$, viz
$$
f(t, T) \geq E_t(r(T)). \tag{12.17}
$$
It is an old issue. One possibility is that the inequality holds in the presence of risk-averse investors.
Consider the Hicks-Keynesian normal backwardation hypothesis reviewed in Chapter 10.³ In
the context of this chapter, forward rates can be higher than expected spot rates as firms
demand long-term funds with fund suppliers preferring to lend at shorter maturity dates. In
this case, the market might be cleared by intermediaries, who require a liquidity premium to
be compensated for their risky activity of borrowing at short and lending at long maturities:
intermediaries worry about their profits at time $T$, thereby lending at higher rates than the
expected spots.⁴
Finally, consider the following definition of the term-premium, the difference between the
spot rate and the future expected average short-term rate, for the same horizon, as follows:
$$
TP(t, T) \equiv R(t, T) - \frac{1}{T - t} \int_t^T E_t(r(u))\,du = \frac{1}{T - t} \int_t^T [f(t, u) - E_t(r(u))]\,du,
$$
where the second equality follows by Eq. (12.12). We see that the expectation hypothesis, and
possible violations of it, bring content to the term-premium. The next section aims to provide
evidence on the expectation hypothesis, as linked to the predictability of bond returns.
12.3.2 Bond returns predictability
What does the empirical evidence suggest about the expectation hypothesis? How does the
expectation hypothesis link to bond returns? First, we explain under which probability
the expectation hypothesis should hold. Second, we explain the main issues regarding the empirical
evidence.
12.3.2.1 Forward-adjusted expectation hypothesis
In more advanced sections of this chapter (see Section 12.9), we shall explain that forward
rates cannot be unbiased expectations of the future spot rates, not even under the risk-neutral
probability. They only are such under a certain probability, known as forward probability,
already introduced in Chapter 4. Section 12.8.3, for example, shows that,
$$
f(t, T) = \mathbb{E}_t^{Q_F^T}[r(T)], \tag{12.18}
$$
where $\mathbb{E}_t^{Q_F^T}$ denotes the time $t$ conditional expectation under the $T$-forward probability.
E [ ( )]
( (
E [ ( )])
sources of risk, which might actually prevent the expectation theory from holding, in practice.
To illustrate, let us elaborate on Eq. (12.18),
(
)=E
( )
=E ( ( )
=E ( )+
=
)
(
( )+
( )
( )
( )
(12.19)
where
( ) denotes the Radon-Nikodym derivative of the physical probability against the
risk-neutral; E and
denote time conditional expectation and covariance under the riskneutral probability, with
and
denoting the counterparts under the physical probability.
That is, forward rates might deviate from future expected short-term rates, because of risk-aversion corrections (the second term in the last equality) and randomness in interest rates (the
third term in the last equality).
Note that we can determine these covariance terms through the Radon-Nikodym theorem.
Show that the second covariance
( ( )
) is zero, when goes to . Then,
as goes
+1
+ 1)
such that the expected change in the yield curve is negatively related to the expected excess
returns and positively related to the slope of the yield curve:
[ ( +1
)] =
+1 +
+ 1)
( ( + 1)) +
+1 ( + 1)
( + 1) +
[ ( )
( + 1)]
1
+1 ( + 1)
( + 1)
where the first equality holds by the expectation hypothesis, and the second is Eq. (12.19).
Therefore, the sum of the last two terms in the last equality is zero, implying that
+1 is
zero. Veronesi (2010, Chapter 7) builds up a parametric example to illustrate these relations,
within an affine model. In the Appendix, we illustrate how these relations work, analytically,
by hinging upon a simple but famous model, that of Vasicek (1977). [In progress]
Empirically, we can test for the expectation theory, by running the following regression:
( +1
)=
[ (
+ 1)] + Residual
+1 =
+
( + 1) + Residual
for many maturities .
delivers statistically signicant and positive values of
Cochrane and Piazzesi (2005) go one step further and consider the following regressions:
+1 =
+ 1) +
5
X
+ Residual
=2
where
1,
5
ln
(
(
)
.
1)
+ Residual
+ 1) +
5
X
=2
where
is the common factor among the bond maturities
{1 5}. Moreover, they
argue that the predictive power of their factors is not destroyed, in sample, while conditioning
on the standard factors known to explain movements in the yield curve (see Section 12.3.5).
12.3.3 The yield curve and the business cycle
There is a simple prediction about the shape of the yield curve that we can make. By Jensen's
inequality, $e^{-(T - t) R(t, T)} \equiv P(t, T) = \mathbb{E}_t[e^{-\int_t^T r(u)\,du}] \geq e^{-\int_t^T \mathbb{E}_t(r(u))\,du}$. Therefore, the yield curve
satisfies: $R(t, T) \leq \frac{1}{T - t} \int_t^T \mathbb{E}_t(r(u))\,du$. For example, suppose that the short-term rate is a martingale under the risk-neutral probability, viz $\mathbb{E}_t(r(u)) = r(t)$. Then, the yield curve is bound to be:
$R(t, T) \leq r(t)$. That is, the yield curve is not increasing in time-to-maturity, $T - t$, at least for small
maturities. Positively sloped yield curves, then, likely arise because the short-term rate is not
a martingale under the risk-neutral probability, which happens because of two fundamental,
and not necessarily mutually exclusive, reasons: (i) interest rates are expected to increase, (ii)
investors are risk-averse. On average, the US yield curve is upward sloping at maturities from
one up to ten years.
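The Jensen's inequality argument can be illustrated with a minimal two-period example in Python. This is a discretized sketch under stated assumptions, not a model used in the text: the short-term rate starts at a hypothetical level `r0` and moves to `r0 + delta` or `r0 - delta` with risk-neutral probability one half each, so that it is a martingale, and each period has unit length with continuous compounding.

```python
import math


def two_period_yield(r0, delta):
    """Yield on a two-period zero when the short rate is a driftless binomial martingale.

    Price: P = E[exp(-(r0 + r1))], with r1 = r0 + delta or r0 - delta, each with probability 1/2.
    Yield: R = -ln(P) / 2.
    """
    expected_discount = 0.5 * (math.exp(-(r0 + delta)) + math.exp(-(r0 - delta)))
    price = math.exp(-r0) * expected_discount
    return -math.log(price) / 2.0


r0 = 0.05
for delta in (0.0, 0.01, 0.03):
    print(f"delta = {delta:.2f}  two-period yield = {two_period_yield(r0, delta):.6f}  short rate = {r0}")
```

The gap between the short rate and the yield grows with `delta`: more interest rate volatility pushes long yields further below the short-term rate when the latter is a martingale.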
There is strong empirical evidence since at least Kessel (1965) or, later, Laurent (1988,
1989), Stock and Watson (1989), Estrella and Hardouvelis (1991) and Harvey (1991, 1993),
that inverted yield curves predict recessions with a lead time of about one to two years. Figure
12.1 illustrates these empirical facts through a plot of the difference between long-term and
short-term yields on Treasuries, in short, the term spread.
FIGURE 12.1. This picture depicts the time series of the term spread, defined as the
difference between the 10 year yield and the 3 month yield on US Treasuries. Sample
data cover the period from January 1957 to December 2008. The shaded areas mark
recession periods, as defined by the National Bureau of Economic Research. The end of
the last recession was announced to have occurred in June 2009.
Naturally, there are recession episodes preceded by mild yield curve inversions. But the really
striking empirical regularity is the sharp movements of the term spread towards a negative
territory, occurring prior to any recession episode. Note, it is not really important whether it is
the short-term rate that goes up or the long-term rate that goes down. The empirical regularity
is that the term spread goes down and becomes negative prior to a recession. The explanations of
these statistical facts are challenging, and might hinge upon both (i) the conduct of monetary
policy and the expectations about it, and (ii) the risk-premiums agents require to invest in
long-term bonds. We discuss these two points below.
(i) The monetary channel :
(i.1) During expansions, monetary policy tends to be restrictive, to prevent the economy
from heating up. At the height of an expansion, then, short-term yields go up.
(i.2) Moreover, during recessions, monetary policy tends to keep interest rates low. At
the height of an expansion, agents might be anticipating an incoming recession and
expecting central banks to lower future interest rates. Therefore, at the height of
an expansion, future interest rates might be expected to lower. The expectation hypothesis in Eq. (12.15) would then predict that the slope of the yield curve would
decrease. Granted, in the previous subsection, we have just learnt that the expectation hypothesis does not hold empirically. Bond markets command risk-premiums.
However, a risk-premium channel would reinforce the conclusion that the slope of
the yield curve decreases during expansions, as argued in the next point.
(ii) The risk-premium channel : From Chapter 7, we know that risk-premiums are countercyclical, being high during recessions and low during expansions. The conditional equity
premium is countercyclical, and so is the long-bond premium.⁵ In fact, long-term yields
and equity expected returns are likely to be driven by the same state variables affecting
the pricing kernel of the economy.⁶
Let's summarize. On the one hand, countercyclical monetary policy might be responsible
for the negative price changes affecting short-term bonds. On the other, expectations about
countercyclical monetary policy as well as procyclical risk-appetite might be responsible for
positive price changes affecting long-term bonds. These price movements, we have argued,
should occur at the height of an expansion. But the sample data we have are those where
expansions are followed by recessions. Whence, the statistical facts about the predictive content
of the yield curve, as we further formalize with a simple model in Section 12.4.3.⁷
Are these explanations plausible? It is interesting to note that these inversions also used to
occur prior to the creation of the Federal Reserve system. The creation of the US Central Bank
might constitute a Natural Experiment to perform statistical inference about the importance
of the gaming between central banks and the market expectations about the future conduct of
monetary policy.
Note also that the inversion of the yield curve occurring in 2006 might have arisen due to a
strong demand for long-term bonds, as warned by some policy-makers at the time (see, e.g.,
the European Central Bank Monthly Bulletin, February 2006, p. 27). It is clearly challenging to
quantify the extent of this demand pressure, which might perhaps be coming from institutional
investors such as Pension Funds over their performance of asset-liability management duties. It
is undeniable, though, that in 2006, the Federal Reserve was targeting higher and higher interest
rates to deal with concerns about inflation generated by a previous loose policy following the
2001 recession, the Twin Towers attacks and, perhaps, the Corporate scandals in 2003 too. It is
an open question as to whether the markets thought that this increased tightening was marking
the end of an expansion, thereby feeding an expectation that future interest rates would drop again
in the near future. Ironically, the sharp tightening of the FED policy at the time would carry
implications on financial and economic developments such as the 2007 subprime crisis and the
crisis arising therefrom, as explained in the next chapter.
5 An objection to this line of reasoning is that countercyclical risk-premiums might lead to expect future bond prices to decrease
over a future recession, thereby destroying the effects of a procyclical short-term rate. In Section 12.3.3.2, we develop a model where
these effects do not arise as soon as the effects of countercyclical risk-premiums are assumed to be bounded.
6 That long term bonds and stock market are informally acknowledged to be tightly related is witnessed by a quite raw rule of
thumb, whereby a stock market correction, such as a crash say, is deemed to be imminent when the spread "30 year bond yield minus
the earning-price ratio" is larger than 3%. This spread, which is usually around 1% or 2% (and on average, zero, once corrected for
inflation), was indeed larger than 3% in 1987 and in 1997.
7 There might be other channels. Inverted yield curves lead to negative margins for banks, which might then contribute to a
credit crunch, determined by a less aggressive attitude to lend due to the negative margins. This might depress demand, leading to
the expectation of an imminent recession. This expectation leads to a negative term spread due to the mechanism analyzed in the
main text, over a vicious cycle.
(i) Yields are highly correlated (say three-year yields with four-year yields, with five-year
yields, etc.), and suggest the existence of common factors driving all of them, discussed
in Section 12.3.5 below.
(ii) Yields are also highly persistent, and this persistence bears important consequences on
derivative pricing, as explained in Section 12.8.1.
(iii) The term-structure of unconditional volatility is downward sloping, a feature also
rationalized in Section 12.8.1.
12.3.5 Common factors affecting the yield curve
Which systematic risks affect the entire term-structure of interest rates? How many factors are
needed to explain the variation of the yield curve? The standard duration hedging practice,
reviewed in detail in Chapter 11, relies on the idea that most of the variation of the yield curve
is successfully captured by a single factor that produces parallel shifts in the yield curve. How
reliable is this idea, in practice? This section reviews famous evidence that most of the variation
of the US yield curve is explained by just a few factors, interpreted as (i) a level factor, (ii)
a steepness factor, and (iii) a curvature factor. It is natural to expect that only a few factors
explain the yield curve: intuitively, there is not so much difference between a 10Y zero and a
10Y+1day zero, and then, 10Y+2days, etc. Repeating this reasoning ad infinitum suggests that
most of the yield curve should be likely driven by common sources of variation.
Litterman and Scheinkman (1991) demonstrate that most of the variation (more than 95%)
of the term-structure of interest rates can be attributed to the variation of three unobservable
factors, which they label (i) a level factor, (ii) a steepness (or slope) factor, and (iii)
a curvature factor. To disentangle these three factors, the authors make an unconditional
analysis based on a fixed-factor model. Succinctly, this methodology can be described as follows.
Suppose that $p$ returns computed from bond prices at $p$ different maturities are generated by
a linear factor model, with a fixed number $k$ of factors,
$$R_t = \bar{R} + B F_t + \epsilon_t \qquad (12.20)$$
where $R_t$ is the $p \times 1$ vector of returns, $F_t$ is the $k \times 1$ vector of common factors
affecting the returns, assumed to be zero mean, $\bar{R}$ is the $p \times 1$ vector of unconditional
expected returns, $\epsilon_t$ is a $p \times 1$ vector of idiosyncratic components of the return
generating process, and $B$ is a $p \times k$ matrix containing the factor loadings. Each row of $B$
contains the factor loadings for all the common factors affecting a given return, i.e. the
sensitivities of a given return with respect to a change of the factors. Each column of $B$ contains
the term-structure of factor loadings, i.e. how a change of a given factor affects the
term-structure of excess returns.
12.3.5.1 Methodological details
Estimating the model in Eq. (12.20) leads to econometric challenges, mainly because the vector of factors
is unobservable.8 However, there exists a simple method, known as principal
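8 Suppose that in Eq. (12.20), $F \sim N(0, I)$, and that $\epsilon \sim N(0, \Psi)$, where $\Psi$ is diagonal. Then,
$R \sim N(\bar{R}, \Sigma)$, where $\Sigma = B B^{\top} + \Psi$. The assumptions that $F \sim N(0, I)$ and that
$\Psi$ is diagonal are necessary to identify the model, but not sufficient. Indeed, any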
components analysis (PCA, henceforth), which leads to empirical results qualitatively similar
to those holding for the general model in Eq. (12.20). We discuss these empirical results in the
next subsection. We now describe the main methodological issues arising within PCA.
The main idea underlying PCA is to transform the original correlated variables into a set
of new uncorrelated variables, the principal components. These principal components are linear
combinations of the original variables, and are arranged in order of decreasing importance: the
first principal component accounts for as much as possible of the variation in the original data,
etc. Mathematically, we are looking for $p$ linear combinations of the demeaned excess returns,
$$Y_i = C_i^{\top}(R - \bar{R}), \quad i = 1, \dots, p \qquad (12.21)$$
such that, for vectors $C_i^{\top}$ of dimension $1 \times p$, (i) the new variables $Y_i$ are uncorrelated,
and (ii) their variances are arranged in decreasing order. The logic behind PCA is to ascertain
whether a few components of $Y = [Y_1 \cdots Y_p]^{\top}$ account for the bulk of variability of the
original data.
Let $C^{\top}$ be the $p \times p$ matrix with rows $C_1^{\top}, \dots, C_p^{\top}$, such that we can write
Eq. (12.21) in matrix format,
$$Y = C^{\top}(R - \bar{R})$$
or, by inverting,
$$R - \bar{R} = (C^{\top})^{-1} Y \qquad (12.22)$$
Next, suppose that the vector $Y_{(k)} = [Y_1 \cdots Y_k]^{\top}$ accounts for most of the variability in the
original data,9 and let $C^{\top}_{(k)}$ denote the $p \times k$ matrix extracted from the matrix
$(C^{\top})^{-1}$ through the first $k$ columns of $(C^{\top})^{-1}$. Since the components of $Y_{(k)}$ are
uncorrelated and they are deemed largely responsible for the variability of the original data, it is
natural to disregard the last $p - k$ components of $Y$ in Eq. (12.22),
$$R - \bar{R} \approx C^{\top}_{(k)} Y_{(k)}$$
where $R - \bar{R}$ is $p \times 1$, $C^{\top}_{(k)}$ is $p \times k$ and $Y_{(k)}$ is $k \times 1$.
If the vector $Y_{(k)}$ really accounts for most of the movements of $R$, the previous approximation
to Eq. (12.22) should be fairly good.
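Let us make more precise what the concept of variability is in the context of PCA. Suppose
that the variance-covariance matrix of the returns, $\Sigma$, has $p$ distinct eigenvalues, ordered from
the highest to the lowest, as follows: $\lambda_1 > \cdots > \lambda_p$. Then, the vector $C_i$ in Eq. (12.21)
is the eigenvector corresponding to the $i$-th eigenvalue. Moreover,
$$\mathrm{var}(Y_i) = \lambda_i, \quad i = 1, \dots, p$$
Finally, we have that
$$R_{\mathrm{PCA}} = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{p} \lambda_i} = \frac{\sum_{i=1}^{k} \mathrm{var}(Y_i)}{\sum_{i=1}^{p} \mathrm{var}(R_i)} \qquad (12.23)$$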
orthogonal rotation of the factors yields a new set of factors which also satisfies Eq. (12.20). Precisely, let $T$ be an orthonormal
matrix. Then, $(BT)(BT)^{\top} = B T T^{\top} B^{\top} = B B^{\top}$. Hence, the factor loadings $B$ and $BT$ have the same
ability to generate the matrix $\Sigma$. To obtain a unique solution, one needs to impose extra constraints on $B$. For example,
Jöreskog (1967) develops a maximum likelihood approach in which the log-likelihood function is proportional to
$-\frac{1}{2}\left[\ln|\Sigma| + \mathrm{Tr}(S\Sigma^{-1})\right]$, where $S$ is the sample covariance matrix of $R$, and the
constraint is that $B^{\top}\Psi^{-1}B$ be diagonal with elements arranged in descending order. The algorithm is: (i) for a given
$\Psi$, maximize the log-likelihood with respect to $B$, under the constraint that $B^{\top}\Psi^{-1}B$ be diagonal with elements
arranged in descending order, thereby obtaining $\hat{B}$; (ii) given $\hat{B}$, maximize the log-likelihood with respect to $\Psi$,
thereby obtaining $\hat{\Psi}$, which is fed back into step (i), etc. Knez, Litterman and Scheinkman (1994) describe this approach in
their paper. Note that the identification device they describe at p. 1869 (Step 3) roughly corresponds to the requirement that
$B^{\top}\Psi^{-1}B$ be diagonal with elements arranged in descending order. Such a constraint is clearly related to principal
component analysis.
9 There are no rigorous criteria to say what "most of the variability" means in this context. Instead, a likelihood-ratio test is
most informative in the context of the estimation of Eq. (12.20) by means of the methods explained in the previous footnote.
(Appendix 4 provides technical details and proofs of the previous formulae.) It is in the sense
of Eq. (12.23) that in the context of PCA, we say that the first $k$ principal components account
for $R_{\mathrm{PCA}}$% of the total variation of the data.
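The PCA steps described above can be sketched numerically. The following is a minimal illustration, not the procedure used by Litterman and Scheinkman: the data are synthetic, generated from a hypothetical level and slope factor plus noise, and all names (`pca_explained_variance`, the maturities, the factor volatilities) are assumptions made for the example. It computes the eigen-decomposition of the sample covariance matrix, the principal components, and the cumulative share of variance explained by the first few components.

```python
import numpy as np

def pca_explained_variance(returns):
    """PCA via the eigen-decomposition of the sample covariance matrix.

    Returns eigenvalues in descending order, the matching eigenvectors
    (as columns), the principal components, and the cumulative share of
    total variance explained by the first k components, k = 1, ..., p.
    """
    demeaned = returns - returns.mean(axis=0)
    cov = np.cov(demeaned, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # re-order: largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = demeaned @ eigvecs               # column i holds Y_i
    r_pca = np.cumsum(eigvals) / eigvals.sum()    # cumulative explained share
    return eigvals, eigvecs, components, r_pca

# Synthetic data (illustrative assumption): a level and a slope factor plus noise.
rng = np.random.default_rng(0)
maturities = np.array([1.0, 2.0, 3.0, 5.0, 7.0, 10.0])
n_obs = 2000
level = rng.normal(0.0, 1.0, n_obs)
slope = rng.normal(0.0, 0.4, n_obs)
noise = rng.normal(0.0, 0.05, (n_obs, maturities.size))
loadings_level = np.ones_like(maturities)
loadings_slope = (maturities - maturities.mean()) / maturities.std()
returns = np.outer(level, loadings_level) + np.outer(slope, loadings_slope) + noise

eigvals, eigvecs, components, r_pca = pca_explained_variance(returns)
print("cumulative explained share:", np.round(r_pca, 4))
```

In this synthetic setup the first two components should capture almost all of the variation by construction; with actual bond returns the shares would have to be estimated from the data.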
12.3.5.2 The empirical facts
The striking feature of the empirical results uncovered by Litterman and Scheinkman (1991)
is that they have been confirmed to hold across a number of countries and sample periods.
Moreover, the economic nature of these results is the same, independently of whether the
statistical analysis relies on a rigorous factor analysis of the model in Eq. (12.20), or a more
back-of-envelope computation based on PCA. Finally, the empirical results that hold for bond
returns are qualitatively similar to those that hold for bond yields.
[Figure 12.2: three panels, titled Level, Slope and Curvature.]
Figure 12.2 visualizes the effects that the three factors have on the movements of the
term-structure of interest rates.
The first factor is called a level factor as its changes lead to parallel shifts in the
term-structure of interest rates. Thus, this level factor produces essentially the same effects
on the term-structure as those underlying the duration hedging portfolio practice. This
factor explains approximately 80% of the total variation of the yield curve.
The second factor is called a steepness factor as its variations induce changes in the
slope of the term-structure of interest rates. After a shock in this steepness factor, the
short-end and the long-end of the yield curve move in opposite directions. The movements
of this factor explain approximately 15% of the total variation of the yield curve.
The third factor is called a curvature factor as its changes lead to changes in the
curvature of the yield curve. That is, following a shock in the curvature factor, the middle
of the yield curve and both the short-end and the long-end of the yield curve move in
opposite directions. This curvature factor accounts for approximately 5% of the total
variation of the yield curve.
Understanding the origins of these three factors is still challenging to financial economists and
macroeconomists. For example, macroeconomists explain that central banks affect the short-end
of the yield curve, e.g. by inducing variations in the Federal Funds rate in the US. However,
decisions taken by the Federal Reserve rely on current macroeconomic conditions. Therefore, the
short-end of the yield-curve likely links to macroeconomic developments. Instead, movements
in the long-end of the yield curve should primarily depend on market expectations and
risk-aversion surrounding future interest rates and economic conditions. Financial economists, then,
should expect to see the long-end of the yield curve as being driven by expectations of future
economic activity, and by risk-aversion.
Empirically indeed, Ang and Piazzesi (2003) demonstrate that macroeconomic factors such
as inflation and real economic activity are able to explain movements at the short-end and the
middle of the yield curve. Interestingly, they show that the long-end of the yield curve is driven
by unobservable factors. However, it is not clear whether such unobservable factors are driven
by time-varying risk-aversion or changing expectations. A compelling lesson is that models of
the yield curve driven by only one factor are likely to be misspecified, due to the complexity of
roles played by many institutions participating in the fixed income markets, and the links with
the macroeconomy that decisions taken by these institutions have.
12.3.5.3 Forecasting the yield curve
Chapter 11 explains that simple techniques are available to fit the yield curve for the purpose
of mere statistical descriptions of the data (see Appendix 1 in Chapter 11). One, proposed by
Nelson and Siegel (1987), and reproduced here for convenience, postulates that the yield curve
can be modeled as:
$$y(\tau) = \beta_1 + \beta_2 \frac{1 - e^{-\lambda\tau}}{\lambda\tau} + \beta_3 \left( \frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau} \right)$$
where $\tau$ denotes time to maturity. The three coefficients, $\beta_1$, $\beta_2$ and $\beta_3$, can be
interpreted in terms of the three factors reviewed in this section. The coefficient $\beta_1$ governs the
level of the yield curve. The coefficient $\beta_2$ relates to the slope, as an increase in this coefficient
increases short yields more than long yields. The coefficient $\beta_3$ shapes the curvature, as an
increase in this coefficient has little effect on very short and very long yields, but increases the
middle of the yield curve. Moreover, the coefficient $\lambda$ controls the exponential decay of the yield
curve: small values of $\lambda$ translate to slow decay and can better fit the curve at long maturities;
large values of $\lambda$, instead, lead to a fast decay, which helps fit the short-end of the yield curve.
Finally, $\lambda$ determines where the loading on $\beta_3$ achieves its maximum. Diebold and Li (2006) rely
on this model, and estimate $\beta_t = (\beta_{1t}, \beta_{2t}, \beta_{3t})$ for each date $t$, and then use these
estimated time series of $\beta_t$ to forecast future values of $\beta_t$ through vector autoregressions and,
then, the future yield curve.
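As an illustration of the cross-sectional step, the sketch below computes the three Nelson-Siegel loadings and recovers the coefficients by ordinary least squares for one date. It assumes that the decay parameter is held fixed, so that the problem is linear in the betas; the maturities, the value of `lam` and the "true" betas are hypothetical numbers chosen for the example, not estimates.

```python
import numpy as np

def nelson_siegel_loadings(tau, lam):
    """Loadings on (beta1, beta2, beta3) for maturities tau and decay lam."""
    tau = np.asarray(tau, dtype=float)
    x = lam * tau
    slope = (1.0 - np.exp(-x)) / x          # starts at 1, decays to 0
    curvature = slope - np.exp(-x)          # hump-shaped, 0 at both ends
    return np.column_stack([np.ones_like(tau), slope, curvature])

def nelson_siegel_yield(tau, betas, lam):
    """Yield curve implied by the three coefficients."""
    return nelson_siegel_loadings(tau, lam) @ np.asarray(betas, dtype=float)

def fit_betas(tau, yields, lam):
    """Least-squares estimate of the betas for one date, with lam held fixed."""
    X = nelson_siegel_loadings(tau, lam)
    betas, *_ = np.linalg.lstsq(X, np.asarray(yields, dtype=float), rcond=None)
    return betas

# Hypothetical inputs: maturities in years, an assumed decay parameter.
maturities = np.array([0.25, 0.5, 1, 2, 3, 5, 7, 10, 20, 30], dtype=float)
true_betas = np.array([0.05, -0.02, 0.01])
lam = 0.6
observed = nelson_siegel_yield(maturities, true_betas, lam)
estimated = fit_betas(maturities, observed, lam)
print("estimated betas:", np.round(estimated, 6))
```

Repeating `fit_betas` date by date would produce the time series of coefficients to which a forecasting model can then be fitted.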
bond prices, as explained in a discrete time setting in Chapter 11. This issue is similar to that
encountered in Chapter 10, where to price options in environments with random volatility, we
needed to replicate options through other options.
12.4.1 Models versus representations
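The fundamental relation in Eq. (12.2), with $\mathbb{E}_t$ denoting the time-$t$ conditional expectation
under the risk-neutral probability,
$$P(t,T) = \mathbb{E}_t\left[ e^{-\int_t^T r_s\,ds} \right] \qquad (12.24)$$
suggests modeling the arbitrage-free price $P$ by assuming the short-term rate, $r$, is an
exogenously given process. For example, we can rely on a Brownian information structure, and
assume $r$ to be the solution to a stochastic differential equation such as:
$$dr_t = b(r_t, t)\,dt + a(r_t, t)\,dW_t \qquad (12.25)$$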
Suppose bond prices are solutions to the following stochastic differential equation:
$$\frac{dP_t}{P_t} = \mu_b\,dt + \sigma_b\,dW_t \qquad (12.26)$$
where $W$ is a standard Brownian motion in $\mathbb{R}^d$, $\mu_b$ and $\sigma_b$ are some progressively
measurable functions ($\sigma_b$ is vector-valued), and $P_t \equiv P(t,T)$. The exact functional form of
$\mu_b$ and $\sigma_b$ is not given, as in the BS case. Rather, it is endogenous and must be found as a
part of the equilibrium.
Based on general results given in Chapter 4, we can show that the price system in (12.26) is
arbitrage-free if and only if
$$\mu_b = r + \sigma_b \lambda \qquad (12.27)$$
for some $\mathbb{R}^d$-dimensional process $\lambda$ satisfying some basic regularity conditions. Even if Eq.
(12.27) follows by standard no-arbitrage arguments, Appendix 1 derives it again to emphasize
what no arbitrage specifically leads to when it comes to bond markets.
The meaning of Eq. (12.27) is better understood by replacing this equation into Eq. (12.26),
obtaining
$$\frac{dP_t}{P_t} = (r_t + \sigma_b \lambda)\,dt + \sigma_b\,dW_t$$
The previous equation tells us that the growth rate of $P$ is the short-term rate plus a
term-premium equal to $\sigma_b \lambda$. In the bond market, there are no obvious economic arguments
enabling us to sign term-premia. Empirical evidence suggests that term-premia may take both signs.
But term-premia would be zero in a risk-neutral world, where bond prices would, then, satisfy:
$$\frac{dP_t}{P_t} = r_t\,dt + \sigma_b\,d\tilde{W}_t$$
where $\tilde{W}_t = W_t + \int_0^t \lambda_s\,ds$ is a Brownian motion under the risk-neutral probability, $\mathbb{Q}$.
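To illustrate the derivation of Eq. (12.27) when $d = 1$, consider the dynamics of the value $V$
of a self-financed portfolio in two bonds, maturing at $T_1$ and $T_2$, and a money market account,
$$dV = \left[ \pi_1(\mu_1 - r) + \pi_2(\mu_2 - r) + rV \right] dt + (\pi_1\sigma_1 + \pi_2\sigma_2)\,dW$$
where $\pi_i$ is the amount invested in the bond maturing at $T_i$, and $\mu_i$ and $\sigma_i$ are the drift and
volatility of that bond in Eq. (12.26). By setting $\pi_1 = -(\sigma_2/\sigma_1)\pi_2$, the diffusion term
vanishes and
$$dV = \left[ \pi_2 A + rV \right] dt, \qquad A \equiv (\mu_2 - r) - \frac{\sigma_2}{\sigma_1}(\mu_1 - r)$$
Note that $\pi_2$ can always be chosen such that the value of this portfolio appreciates at a rate
strictly greater than $r$: just set $\mathrm{sign}(\pi_2) = \mathrm{sign}(A)$. Therefore, to rule out arbitrage, $A = 0$, or
$$\frac{\mu_1 - r}{\sigma_1} = \frac{\mu_2 - r}{\sigma_2}$$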
That is, the Sharpe ratio for any two bonds has to equal a process $\lambda$, say, and Eq. (12.27)
follows. Clearly $\lambda$ is independent of the two maturity dates, $T_1$ or $T_2$, which were indeed
arbitrary. It is natural that $\lambda$ is independent of any bond maturity, as this is the unit price of risk
required to compensate for randomness in the short-term rate.
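The two functions $\mu_b$ and $\sigma_b$ in Eq. (12.26) can be determined through Itô's lemma. Let
$P(r,t,T)$ be the price at $t$ of a bond maturing at $T$ when the state at $t$ is $r$, the rational
bond pricing function. Because $r$ is solution to Eq. (12.25), Itô's lemma implies that:
$$dP = \left( P_t + b P_r + \tfrac{1}{2} a^2 P_{rr} \right) dt + a P_r\,dW$$
where subscripts denote partial derivatives.
Comparing this equation with Eq. (12.26), and identifying drifts and diffusion terms,
$$\mu_b P = P_t + b P_r + \tfrac{1}{2} a^2 P_{rr}, \qquad \sigma_b P = a P_r$$
Now replace these functions into Eq. (12.27) to obtain that the bond price satisfies the following
partial differential equation (PDE, henceforth), for all $(r,t) \in \mathbb{R} \times [0,T)$,
$$P_t + (b - \lambda a) P_r + \tfrac{1}{2} a^2 P_{rr} = rP, \quad \text{with } P(r,T,T) = 1 \qquad (12.28)$$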
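The idea, here, is to replicate the price of a bond expiring at some time $T_1$, say
$P^1 \equiv P(r,t,T_1)$, with a self-financed portfolio comprising a money market account and a second
bond expiring at time $T_2 > T_1$. This approach is the continuous-time equivalent to the pricing
approach in Chapter 11. It is, also, the interest rate counterpart to option evaluation with
stochastic volatility in Chapter 10. Let, then, $V$ be the value of our self-financed portfolio,
$V = \theta P^2 + M$, where $\theta$ is the number of bonds maturing at $T_2$ to include in the portfolio,
$P^2 = P(r,t,T_2)$, and $M$ is the dollar amount in the money market account. Since the portfolio is
self-financed, by the usual arguments, we have that
$$dV = \theta\,dP^2 + rM\,dt = \left[ \theta\left( \mathcal{L}P^2 - rP^2 \right) + rV \right] dt + \theta a P^2_r\,dW \qquad (12.29)$$
where $\mathcal{L}P \equiv P_t + b P_r + \tfrac{1}{2} a^2 P_{rr}$. And, obviously,
$$dP^1 = \mathcal{L}P^1\,dt + a P^1_r\,dW \qquad (12.30)$$
Let the initial value of the portfolio match the bond price. Then, comparing the diffusive terms
in Eq. (12.29) and Eq. (12.30), we find the delta to be:
$$\theta = \frac{\partial P(r,t,T_1)/\partial r}{\partial P(r,t,T_2)/\partial r}$$
With this delta, the portfolio replicates the first bond only if the drifts in Eq. (12.29) and
Eq. (12.30) match as well, $\mathcal{L}P^1 - rP^1 = \theta\left( \mathcal{L}P^2 - rP^2 \right)$, that is,
$$\frac{\mathcal{L}P^1 - rP^1}{a P^1_r} = \frac{\mathcal{L}P^2 - rP^2}{a P^2_r}$$
Because the two maturities are arbitrary, this ratio cannot depend on maturity. Denoting it by
$\lambda$ delivers the PDE in Eq. (12.28).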
Eq. (12.28) shows that the bond price, $P$, depends on both the drift of the short-term rate,
$b$, and the risk-aversion correction, $\lambda$. Let us elucidate the circumstances leading to this fact.
They link to market incompleteness.
Consider a benchmark complete market model, Black-Scholes, in which the option is redundant,
given the initial market structure. As pointed out in Chapter 11, in markets such as Black
& Scholes, the stock price is the source of randomness, and markets are trivially completed by
the very same asset.
By contrast, the bond price is not the risk. Rather, the bond draws value from the interest
rate risk, as if it was a derivative on $r$, where $r$ forms, alone, an incomplete market
structure. For this reason, the bond price cannot be determined in a preference-free format.
Similarly, in equity markets with stochastic volatility (see Chapter 10), two sources of
randomness arise, the stock price and its stochastic volatility, such that a portfolio comprising a
stock and one additional option is needed to hedge against one given option, and no preference-free
formulae are possible. Let us mention another example. We are facing a situation similar to
those encountered in Part II of these lectures, where given a certain dividend risk, the stock
price could not, then, be expressed in a preference-free fashion: the dividend in that context was
the risk, and the stock price was like a derivative written on that risk.
In Chapter 2, we learnt indeed that given an incomplete market, there might be securities
that could be introduced to make markets complete. For example, a market with two securities
and three states is incomplete. Yet it may be completed once a third security is introduced,
which together with the first two, can generate any consumption bundle in all the possible
states of the world: we need that the payoff matrix of the three securities be non-singular. Does
this mean that the third security can be priced in a preference-free format? Of course not.
By construction, the payoffs promised by the third security cannot be replicated by trading
the first two. In other words, the third security is not redundant, for otherwise it would not
complete the initial three-state/two security market. And if it is not redundant, its price reflects
the services it provides society with and thus, it cannot be preference-free. A hypothetical
fourth security could, however, be replicated by trading the three securities, and would thus be
preference-free.
The market in this section shares similarities with this simple example, but with
qualifications. As mentioned, the bond completes the market, and its price cannot be replicated through
the short-term rate, because the short-term rate is not traded. However, we can replicate a
second bond price by trading the first bond (and a money market account). But the second bond
price cannot be expressed in a preference-free format: there are actually no bonds that could
be expressed in a preference-free format, for the very reason that the first bond cannot! The
situation in the context of this section is actually involved. Issues arise not only because we
try to replicate a bond with another bond. Suppose, indeed, that we try to replicate an option
written on this bond with the very same bond. We could not achieve a preference-free solution
either. By the reasoning in the previous sections, we need to make sure that the volatility of
the bond price is matched to that of the option price. Granted, we can achieve this. However,
the volatility of the bond price is not exogenous, as it depends on the price sensitivity with
respect to $r$, which is obviously not preference-free.
Finally, the reason the bond price depends on $b$ and $\lambda$ is intimately related to the previous
explanations. In this market, uncertainty is generated by an untraded risk, $r$, such that the
initial market structure has one untraded risk, $r$, and zero assets, as explained. But because the
short-term rate is not a discounted martingale under $\mathbb{Q}$, its drift does not equal $r \cdot r = r^2$
under $\mathbb{Q}$ but, rather, $b - a\lambda$, thereby leading to Eq. (12.28). All in all, the bond price depends
on the specific functional forms of $b$, $a$ and $\lambda$.
This dependence on risk-appetite might appear as a kind of hindrance to practitioners, but
it may actually be a source of value, for its potential to inform on agents' risk-appetite: once
the two functions $(b, a)$ are estimated, the yield curve could be inverted to infer the implied $\lambda$,
which could help policy makers to take more informed decisions about how to affect the yield
curve. Admittedly though, the task of estimating $(b, a)$ is far from trivial.
By specifying $(b, a)$ and identifying the risk-premium $\lambda$, the PDE in Eq. (12.28) can then
be solved, either analytically or numerically. Choices concerning the exact functional form of
$b$, $a$ and $\lambda$ are often made on the basis of analytical or empirical reasons. In Section 12.4.4,
we introduce the first, famous models in which $b$, $a$ and $\lambda$ have a particularly simple form.
Section 12.4.5 discusses analytical merits but major empirical drawbacks of these models. In
Section 12.4.6 we provide a very succinct description of models exhibiting jump (and default)
phenomena. We, first, highlight how models of the short-term rate could be used to set up a
metric of duration when this concept is more difficult to quantify than in the contexts outlined
in Chapter 11.
12.4.3 Stochastic duration
Duration is a measure of risk tailored to capture the notion of fixed income volatility. Cox,
Ingersoll and Ross (1979) hinge upon models of the short-term rate and introduce the notion
of stochastic duration, generalizing the notion of modified duration discussed in Chapter 11.
Suppose the price of a zero-coupon bond is a function of the short-term rate only, $P(r, T-t)$.
Define the basis risk as the semi-elasticity of the bond price with respect to the short-term rate,
$$\Omega(r, T-t) \equiv -\frac{P_r(r, T-t)}{P(r, T-t)}$$
Naturally, we want to make sure that the measure of duration for a zero-coupon bond equals
time-to-maturity. Therefore, $\Omega$ cannot be a measure of duration: except in the trivial case in
which $r$ is constant, $\Omega$ does not equal $T-t$ for a zero. The idea underlying stochastic duration
of a given bond is to search for the time-to-maturity $T^*-t$ of a hypothetical zero-coupon bond
such that its basis risk is the same as that of the given bond (e.g., a coupon bearing bond or a
callable bond), viz
$$\Omega(r, T^*-t) = -\frac{\mathcal{B}_r(r, S-t)}{\mathcal{B}(r, S-t)}$$
where $\mathcal{B}(r, S-t)$ is the price of the given bond that delivers its face value at time $S$, if no
events preventing this occur prior to time $S$ (say, default or the exercise of the optionalities
embedded into the institutional details of the bond). Stochastic duration of this bond is defined
as the time-to-maturity $T^*-t$ of the hypothetical zero-coupon bond:
$$D(r, S-t) \equiv T^*-t = \Omega^{-1}\left( -\frac{\mathcal{B}_r(r, S-t)}{\mathcal{B}(r, S-t)} \right) \qquad (12.31)$$
where $\Omega^{-1}$ is the inverse function of $\Omega(r, \cdot)$ with respect to time to maturity. It is immediate
to verify that for a zero, $D(r, T-t) = T-t$. Moreover, the stochastic duration, $D(r, S-t)$,
collapses to the modified duration introduced in the previous chapter (see Section 11.4) once
the short-term rate is a constant.10
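A small numerical sketch may help fix ideas. It assumes, for illustration only, that zero-coupon prices are exponential-affine in the short-term rate, $P = e^{A - Br}$ with $B(\tau) = (1 - e^{-\kappa\tau})/\kappa$, as in the Vasicek model discussed later in this chapter. Under that assumption the basis risk of a zero is $B(\tau)$, so its inverse is available in closed form. The bond (annual 5% coupons over ten years), the parameter values and the function names are hypothetical.

```python
import numpy as np

def vasicek_B(tau, kappa):
    """Basis risk of a zero under the assumed model: -P_r / P = B(tau)."""
    return (1.0 - np.exp(-kappa * tau)) / kappa

def vasicek_zero_price(r, tau, kappa, rbar_star, sigma):
    """Assumed zero-coupon price exp(A - B r); rbar_star is the risk-adjusted mean."""
    B = vasicek_B(tau, kappa)
    A = (B - tau) * (rbar_star - sigma**2 / (2 * kappa**2)) - sigma**2 * B**2 / (4 * kappa)
    return np.exp(A - B * r)

def stochastic_duration(r, times, cash_flows, kappa, rbar_star, sigma):
    """Time-to-maturity of the zero whose basis risk matches that of the bond."""
    times = np.asarray(times, dtype=float)
    cash_flows = np.asarray(cash_flows, dtype=float)
    values = cash_flows * vasicek_zero_price(r, times, kappa, rbar_star, sigma)
    weights = values / values.sum()
    basis_risk = np.sum(weights * vasicek_B(times, kappa))   # -B_r / B of the bond
    return -np.log(1.0 - kappa * basis_risk) / kappa         # inverse of B(tau)

# Hypothetical parameters and a 10-year bond paying a 5% annual coupon.
kappa, rbar_star, sigma, r = 0.5, 0.04, 0.02, 0.03
times = np.arange(1.0, 11.0)
cash_flows = np.full(10, 0.05)
cash_flows[-1] += 1.0

d_coupon = stochastic_duration(r, times, cash_flows, kappa, rbar_star, sigma)
d_zero = stochastic_duration(r, [5.0], [1.0], kappa, rbar_star, sigma)
print("stochastic duration, coupon bond:", round(float(d_coupon), 4))
print("stochastic duration, 5y zero:", round(float(d_zero), 4))
```

By construction the zero's stochastic duration equals its time to maturity, while the coupon bond's lies strictly between its first and last payment dates.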
Vasicek (1977) derives a model of the yield curve assuming the short-term rate is a
continuous-time and mean-reverting process with constant basis point volatility, solution to:11
$$dr_t = \kappa(\bar{r} - r_t)\,dt + \sigma\,dW_t \qquad (12.32)$$
where $\bar{r}$, $\kappa$ and $\sigma$ are positive constants. This model is more sensible than that of Merton
(1973), where the short-term rate is an arithmetic Brownian motion. The intuition underlying
the importance of mean-reversion is as follows. Suppose, first, that $\sigma = 0$, in which case, for $s \geq t$,
$$r_s = \bar{r} + e^{-\kappa(s-t)}(r_t - \bar{r}) \qquad (12.33)$$
If the current level of the short-term rate $r_t = \bar{r}$, it will be locked-in at $\bar{r}$ forever. If
$r_t < \bar{r}$, the short-term rate shall steadily increase, and converge to $\bar{r}$ as $s \to \infty$. Likewise,
the short-term rate shall converge to $\bar{r}$ when $r_t > \bar{r}$. The speed of convergence of $r$ to the
long-term value $\bar{r}$ depends on the magnitude of $\kappa$: the higher $\kappa$, the higher the speed of
convergence to $\bar{r}$.
In the general case, $\sigma \neq 0$, the solution to Eq. (12.32) is,
$$r_s = \bar{r} + e^{-\kappa(s-t)}(r_t - \bar{r}) + \sigma \int_t^s e^{-\kappa(s-u)}\,dW_u$$
where the integral has to be understood in the Itô's sense. The interpretation of this solution
is similar to that in the deterministic case, in that the short-term rate now fluctuates around
10 Indeed, note that if $r$ is constant, then, $\Omega(r, T-t) = T-t$, and $\Omega^{-1}(x) = x$.
11 Note that Merton's (1973) seminal paper is also framed in a context with random interest rates solved in closed form. In
Merton's model, the short-term rate is an arithmetic Brownian motion.
its central tendency $\bar{r}$. In other words, shocks are absorbed at a speed depending on the
magnitude of $\kappa$, leading the short-term rate to display a mean-reverting behavior. Indeed, the
conditional expectation of $r$ is the same as that in Eq. (12.33),
$$E(r_s \mid r_t) = \bar{r} + e^{-\kappa(s-t)}(r_t - \bar{r}) \qquad (12.34)$$
Moreover, the conditional variance of $r$ is:
$$\mathrm{var}(r_s \mid r_t) = \frac{\sigma^2}{2\kappa}\left( 1 - e^{-2\kappa(s-t)} \right)$$
Finally, it can be shown that $r$ is normally distributed, with expectation and variance given by
the two functions given above.
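The two conditional moments can be checked against a simulation. The sketch below is only an illustration: it uses a plain Euler discretization of the mean-reverting dynamics, and the parameter values, the number of paths and the function names are assumptions chosen for the example.

```python
import numpy as np

def vasicek_conditional_mean(r0, horizon, kappa, rbar):
    """Conditional expectation of the short-term rate after `horizon` years."""
    return rbar + np.exp(-kappa * horizon) * (r0 - rbar)

def vasicek_conditional_variance(horizon, kappa, sigma):
    """Conditional variance of the short-term rate after `horizon` years."""
    return sigma**2 / (2.0 * kappa) * (1.0 - np.exp(-2.0 * kappa * horizon))

def simulate_vasicek_euler(r0, horizon, kappa, rbar, sigma, n_paths, n_steps, seed=0):
    """Euler scheme for dr = kappa (rbar - r) dt + sigma dW; returns terminal rates."""
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    r = np.full(n_paths, r0, dtype=float)
    for _ in range(n_steps):
        shocks = rng.standard_normal(n_paths)
        r = r + kappa * (rbar - r) * dt + sigma * np.sqrt(dt) * shocks
    return r

# Hypothetical parameter values.
r0, horizon, kappa, rbar, sigma = 0.01, 2.0, 0.5, 0.04, 0.02
sample = simulate_vasicek_euler(r0, horizon, kappa, rbar, sigma, n_paths=20000, n_steps=500)
mean_exact = vasicek_conditional_mean(r0, horizon, kappa, rbar)
var_exact = vasicek_conditional_variance(horizon, kappa, sigma)
print("mean: simulated", sample.mean(), "exact", mean_exact)
print("variance: simulated", sample.var(), "exact", var_exact)
```

Simulated and closed-form moments should agree up to Monte Carlo noise and a small discretization bias.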
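To solve for the entire term-structure of interest rates, we need to make assumptions about
the risk-premium, $\lambda$. A closed-form expression for the bond price obtains, once we assume $\lambda$ is
a constant. Indeed, by replacing a constant risk-premium and the functions $b(r) = \kappa(\bar{r} - r)$
and $a(r) = \sigma$ into Eq. (12.28), and denoting $\bar{r}^* \equiv \bar{r} - \lambda\sigma/\kappa$,
$$P_t + \kappa(\bar{r}^* - r) P_r + \tfrac{1}{2}\sigma^2 P_{rr} = rP, \quad \text{for all } (r,t) \in \mathbb{R} \times [0,T), \qquad (12.35)$$
with boundary condition $P(r,T,T) = 1$. We guess a solution of the form
$$P(r,t,T) = e^{A(T-t) - B(T-t)\,r} \qquad (12.36)$$
for two functions $A$ and $B$ to be determined. Now suppose the guess in Eq. (12.36) is true.
Replacing the partial derivatives of $P$ into Eq. (12.35) leaves the equation
$A_1(T-t) + A_2(T-t)\,r = 0$, where for all $\tau \equiv T-t$,
$$A_1(\tau) \equiv -A'(\tau) - \kappa\bar{r}^* B(\tau) + \tfrac{1}{2}\sigma^2 B(\tau)^2, \qquad A_2(\tau) \equiv B'(\tau) + \kappa B(\tau) - 1$$
That is, $0 = A_1 = A_2$, with boundary conditions $A(0) = B(0) = 0$. The solution is
$$B(\tau) = \frac{1}{\kappa}\left( 1 - e^{-\kappa\tau} \right), \qquad A(\tau) = \left( B(\tau) - \tau \right)\left( \bar{r}^* - \frac{\sigma^2}{2\kappa^2} \right) - \frac{\sigma^2}{4\kappa} B(\tau)^2$$
The term-structure of spot rates predicted by this model is, by the definition in Eq. (12.11),
$$R(t,T) \equiv -\frac{\ln P(r,t,T)}{T-t} = \frac{B(T-t)\,r - A(T-t)}{T-t} \qquad (12.37)$$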
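The closed-form solution can be verified numerically. The sketch below assumes the exponential-affine form $P = e^{A(\tau) - B(\tau) r}$ with the standard Vasicek expressions for $A$ and $B$, writes `rbar_star` for the risk-adjusted long-run mean, and checks by finite differences that the price satisfies the pricing PDE. Parameter values and function names are hypothetical.

```python
import numpy as np

def vasicek_AB(tau, kappa, rbar_star, sigma):
    """Functions A and B of the exponential-affine bond price exp(A - B r)."""
    B = (1.0 - np.exp(-kappa * tau)) / kappa
    A = (B - tau) * (rbar_star - sigma**2 / (2 * kappa**2)) - sigma**2 * B**2 / (4 * kappa)
    return A, B

def vasicek_price(r, tau, kappa, rbar_star, sigma):
    """Zero-coupon bond price for short rate r and time to maturity tau."""
    A, B = vasicek_AB(tau, kappa, rbar_star, sigma)
    return np.exp(A - B * r)

def vasicek_yield(r, tau, kappa, rbar_star, sigma):
    """Spot rate: minus log price divided by time to maturity (tau > 0)."""
    A, B = vasicek_AB(tau, kappa, rbar_star, sigma)
    return (B * r - A) / tau

def pde_residual(r, tau, kappa, rbar_star, sigma, h=1e-4):
    """Finite-difference residual of P_t + kappa (rbar* - r) P_r + 0.5 sigma^2 P_rr - r P."""
    price = lambda rr, tt: vasicek_price(rr, tt, kappa, rbar_star, sigma)
    p = price(r, tau)
    p_t = -(price(r, tau + h) - price(r, tau - h)) / (2 * h)   # d/dt = -d/dtau
    p_r = (price(r + h, tau) - price(r - h, tau)) / (2 * h)
    p_rr = (price(r + h, tau) - 2 * p + price(r - h, tau)) / h**2
    return p_t + kappa * (rbar_star - r) * p_r + 0.5 * sigma**2 * p_rr - r * p

# Hypothetical parameter values.
kappa, rbar_star, sigma, r = 0.5, 0.04, 0.02, 0.03
for tau in (1.0, 5.0, 10.0):
    print(tau, vasicek_price(r, tau, kappa, rbar_star, sigma),
          vasicek_yield(r, tau, kappa, rbar_star, sigma))
print("PDE residual at tau = 5:", pde_residual(r, 5.0, kappa, rbar_star, sigma))
```

A residual close to zero, up to discretization error, indicates that the candidate functions are consistent with the PDE under the stated assumptions.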
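Vasicek's model can be used to illustrate aspects of the expectation hypothesis and the
predictive power of the term spread mentioned in Section 12.3. First, we express the yield
curve in Eq. (12.37) in a way that is more convenient to interpret. We mentioned earlier that
the short-term rate is conditionally normally distributed. Now, by Lamberton and Lapeyre
(1997, Chapter 6), the term $\int_t^T r_s\,ds$ is also conditionally normally distributed, and then, by
Eq. (12.2) and Eq. (12.11), with moments taken under the risk-neutral probability,
$$R(t,T) = \frac{1}{T-t}\,\mathbb{E}_t\left[ \int_t^T r_s\,ds \right] - \frac{1}{2}\,\frac{1}{T-t}\,\mathrm{var}_t\left[ \int_t^T r_s\,ds \right] \qquad (12.38)$$
The second term in Eq. (12.38) reflects Jensen's inequality effects. It equals
$$\frac{1}{2}\,\frac{1}{T-t}\,\mathrm{var}_t\left[ \int_t^T r_s\,ds \right] = \frac{\sigma^2}{4\kappa^3 (T-t)}\left[ 2\kappa(T-t) - 3 + 4e^{-\kappa(T-t)} - e^{-2\kappa(T-t)} \right]$$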
Z
Z
Z
1
1
1
(
) E
=
( ) +
E
(12.39)
where:
1
( )
E
= + (
)
(
(
1
(
)
)
(
(
)
)
1
1
Eq. (12.39) says that long-term rates reflect expectations of the future short-term rate and
risk-premiums terms, defined as the average expected return on the bond.
We can rely on this simple framework to describe some of the business cycle properties of the
yield curve. We assume that a single state variable is capable of tracking some business cycle
conditions, and is the solution to the following stochastic differential equation,
= (
for three positive constants, , and . Next, suppose that: (i) the nominal short-term rate
is procyclical, in that
+
, for two positive constants
and , and that (ii) riskpremiums are countercyclical in that,
( )
0
1 , for two additional constants, where
0,
to
ensure
countercyclicality,
and
might
in
principle take any sign, although it is
1
0
reasonable to assume that 0
0, which would imply that the constant portion of the riskpremium is positive anyway. We shall return to the sign of 0 in a moment.
Given the assumptions made so far, the short-term rate is solution to Eq. (12.32), with
parameters
+ and
. While the risk-premium is time-varying, being equal to
1
( )
(
),
the
short-term
rate is still conditionally normally distributed under the
0
risk-neutral probability, and the yield-curve can be solved in closed form, with a solution like
that in Eq. (12.36), and an approximation like that in Eq. (12.39). It is instructive to calculate
the decomposition in Eq. (12.39) predicted by this model. We have,
Z
(
)
1
1
(
) E
= +(
)
(
)
where
and
0
1
(
)
1
( +
1 (
)
)
(
)
0
=
is the unconditional expectation of the procyclical variable , taken under
where
1
the risk-neutral probability. According to this model, the term spread is the product of two
terms. The rst is negative when
0, and in this case, the model formalizes explanations
given in Section 12.3.3. Suppose, once again, that interest rates are procyclical, in that
0.
Before a peak, i.e. when
, the yield curve is upward sloping. After this peak is achieved
and, then,
, the probability the economy would enter into a recession becomes more
likely, given the mean-reverting nature of , and the yield curve becomes inverted, as nominal
rates are procyclical,
0, and countercyclical risk-aversion is mild enough to guarantee
that
0. In other words, at the peak of an expansion, i.e. when a recession is most likely,
we expect that interest will lower due to their procyclicality,
0, whence the yield curve
inversion. Long term interest rates are driven by expectations of future short-term rates.
Note that if 1 had to be so large to make
0, the model would generate the wrong
predictions, with an inverted yield curve during the rising part of a boom, not the descending
part. The mechanism would be that during a peak, we would expect that the future short-term
rate would be low, but risk-aversion to be so high, to dwarf expectations e ects and push future
prices down, to an extent that would compensate for the procyclical e ects generated by the
short-term rate.
Naturally, this model is very simple, driven as it is by only one factor, i.e., the business cycle
variable . Its merit is that it makes a sharp prediction regarding the slope of the yield curve:
a positive slope occurs before a peak,
, and a negative slope after the peak,
,
similarly as in the data. This model thus isolates the business cycle component of the yield
curve that relates to its inversions. The crucial point is that the model is silent as regards the
business cycle variable . If we knew , we could use it to forecast the business cycle in the
rst place.
Vasicek's model suffers from two main drawbacks. First, the short-term rate is normally
distributed. This circumstance might be mitigated when $\sigma$ is low, compared to $\bar{r}$, in which case
the probability the short-term rate takes negative values can be small. At the same time, even
a small probability of a negative interest rate might lead to severe mispricings when it comes
to pricing interest rate derivatives, due to nonlinearities induced by optionality, as pointed
out by Dybvig [cite reference]. Section 11.5.3 of the previous chapter displays numerical
examples where small changes in assumptions can lead to quite substantial changes in the price of
derivatives.
The second drawback, related to the first, is that the short-term rate volatility is independent
of the level of the short-term rate. It might be argued that short-term rate changes become
more and more volatile as the level of the short-term rate increases, a phenomenon usually
referred to as the level-effect.
Cox, Ingersoll and Ross (1985) (CIR, henceforth) propose a model that addresses these two
drawbacks at once, by assuming that the short-term rate is solution to,
$$dr_t = \kappa(\bar r - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t$$
The CIR model is also referred to as the square-root process, to emphasize that the diffusion
function is proportional to the square root of $r$. This feature makes the model address the level-effect phenomenon. The evidence about the level effect is further discussed below (see Section
12.4.5). Moreover, this property prevents $r$ from taking negative values. Intuitively, when $r$
wanders just above zero, it is pulled back to the strictly positive region at a strength of the
order $dr = \kappa\bar r\,dt$ (see footnote 12). The transition density of $r$ is noncentral chi-square. The stationary density
of $r$ is a gamma distribution. The expected value is as in Vasicek (see footnote 13). However, the variance is
different, although its exact expression is really not important here.
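CIR formulate a set of assumptions (e.g., preferences), leading to a risk-premium function
$\lambda(r) = \frac{\ell}{\sigma}\sqrt{r}$, where $\ell$ is a constant. By replacing this, $b(r) = \kappa(\bar r - r)$ and $a(r) = \sigma\sqrt{r}$ into the
partial differential equation (12.28), one gets (similarly as in the Vasicek model), that the bond
price function takes the form in Eq. (12.36), written here as $P(r,t,T) = e^{A(T-t) - B(T-t)\,r}$, but with functions $A$ and $B$ satisfying the following
differential equations:
$$0 = -\dot A(\tau) - \kappa\bar r\,B(\tau) \quad\text{and}\quad 0 = -\dot B(\tau) + 1 - (\kappa+\ell)\,B(\tau) - \tfrac{1}{2}\sigma^2 B(\tau)^2,$$
where $\tau \equiv T - t$, with boundary conditions $A(0) = 0$ and $B(0) = 0$. The solution is,
$$A(\tau) = \frac{2\kappa\bar r}{\sigma^2}\ln\left(\frac{2\gamma\,e^{\frac{1}{2}(\kappa+\ell+\gamma)\tau}}{(\kappa+\ell+\gamma)(e^{\gamma\tau}-1)+2\gamma}\right),\qquad B(\tau) = \frac{2\,(e^{\gamma\tau}-1)}{(\kappa+\ell+\gamma)(e^{\gamma\tau}-1)+2\gamma},\qquad \gamma = \sqrt{(\kappa+\ell)^2+2\sigma^2}.$$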
This model has been the reference in the industry for many years, and still now, many models
are multidimensional extensions of the basic CIR model, as reviewed in Section 12.5.3.
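The closed-form CIR bond price can be evaluated directly. The following is a minimal sketch, assuming the price is written as exp(A - B r) with the standard CIR loadings and a risk-neutral drift equal to kappa*rbar - (kappa + ell)*r. The function names, the parameter names (kappa, rbar, sigma, ell) and the numerical values are illustrative assumptions, not estimates.

```python
import math

def cir_loadings(tau, kappa, rbar, sigma, ell=0.0):
    """Return (A, B) such that ln P = A - B * r, for time to maturity tau."""
    k = kappa + ell                                  # risk-neutral mean-reversion speed
    gamma = math.sqrt(k * k + 2.0 * sigma * sigma)
    growth = math.exp(gamma * tau) - 1.0
    denom = (k + gamma) * growth + 2.0 * gamma
    B = 2.0 * growth / denom
    A = (2.0 * kappa * rbar / sigma ** 2) * math.log(
        2.0 * gamma * math.exp(0.5 * (k + gamma) * tau) / denom)
    return A, B

def cir_bond_price(r, tau, kappa, rbar, sigma, ell=0.0):
    """Zero-coupon bond price in the CIR model."""
    A, B = cir_loadings(tau, kappa, rbar, sigma, ell)
    return math.exp(A - B * r)

def zero_is_unattainable(kappa, rbar, sigma):
    """Condition kappa * rbar >= sigma^2 / 2 for the zero boundary."""
    return kappa * rbar >= 0.5 * sigma ** 2

if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    print(cir_bond_price(r=0.03, tau=5.0, kappa=0.2, rbar=0.05, sigma=0.08))
```

The loadings can be checked numerically against the two differential equations they are meant to solve, which is a useful guard against sign errors when the risk-premium parameter is changed.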
12 This is only intuition. The exact condition under which the zero boundary is unattainable by $r$ is $\kappa\bar r \ge \frac{1}{2}\sigma^2$. See Karlin and
Taylor (1981, vol II chapter 15) for a general analysis of attainability of boundaries for scalar diffusion processes.
13 The expected value of linear mean-reverting processes is always as in Vasicek, independently of the functional form of the
diffusion coefficient. This property follows by a direct application of a general result for diffusion processes given in Chapter 6
(Appendix A).
Models that are analytically tractable are certainly quite valuable. Vasicek and CIR models
do lead to closed-form solutions, because they have a linear drift, among other things. Is the
empirical evidence consistent with linear mean-reversion of the short-term rate? This issue
is controversial. In the mid 1990s, three papers by Aït-Sahalia (1996), Conley et al. (1997)
and Stanton (1997) produced evidence of nonlinear mean-reverting behavior. For example, Aït-Sahalia (1996) estimates a drift function that has the following form:
$$b(r) = \alpha_0 + \alpha_1 r + \alpha_2 r^2 + \alpha_3 r^{-1} \qquad (12.40)$$
corresponding to a nonlinear diffusion function. Figure 12.3 reproduces this function using the
parameter values in his Table 4, and relating to the sample period from 1983 to 1995. Similar
results are reported in the other papers. To illustrate the forces acting on the short-term rate,
Figure 12.3 also depicts a linear drift, obtained with the parameter estimates of
Aït-Sahalia (1996) (Table 4), and relating to a model with a CEV diffusion.
[Figure 12.3: drift (vertical axis, from -0.005 to 0.005) against the short-term rate r (horizontal axis, from 0.02 to 0.18).]
FIGURE 12.3. Nonlinear mean reversion? The solid line is the drift function in Eq. (12.40),
estimated by Aït-Sahalia (1996), and relating to a parametric model with a nonlinear
diffusion function. The dashed line is the estimated linear drift relating to a model with
CEV diffusion.
The nonlinear drifts in Figure 12.3 might lead bond prices to exhibit unusual properties,
though. As explained in Chapter 7 (Appendix 5), bond prices are concave in the short-term
rate if the risk-neutralized drift function is sufficiently convex (Mele, 2003). While the results
in Figure 12.3 relate to the physical drift functions, the point is nevertheless important, as risk-premiums would have to look quite unusual to destroy the nonlinearities of the short-term rate
under the physical probability.
The compelling lesson from Figure 12.3 is that under the nonlinear drift dynamics, the
short-term rate behaves in a way that is at least roughly comparable with how it would
behave under the linear drift dynamics. However, the behavior at the extremes is dramatically
different. As the short-term rate moves to the extremes, it is pulled back to the center in a
very abrupt way. At the moment, it is not clear whether these preliminary empirical results are
reliable or not. New econometric techniques are currently being developed to address this and
related issues.
One possibility is that such single factor models of the short-term rate are simply misspecified.
For example, there is strong empirical evidence that the volatility of the short-term rate is time-varying, as we shall discuss in the next section. Moreover, the term-structure implications of a
single factor model are counterfactual, since we know that a single factor cannot explain the
entire variation of the yield curve, as explained in Section 12.3.5. We now describe more realistic
models driven by more than one factor.
12.4.5 The Monetary Experiment and interest rate volatility
One of the motivations underlying the early adoption of the CIR model in the industry was the
property of the model to predict that interest rate volatility increases with the level of interest rates.
Is this a robust empirical feature? It is a difficult topic, with the answer relying on particular
historical episodes. Certainly, the episodes of FED QE after 2010, when interest rates were extremely
low, were accompanied by suppressed volatility. And, during the Monetary Experiment of
the Federal Reserve, which occurred between October 1979 and October 1982, both interest rates and
interest rate volatility were high.
Figure 12.4 depicts the time series behavior of the nominal short-term rate, as measured
by the three month TB rate, as well as the volatility of its changes, calculated as (see footnote 14)
$$\mathrm{Vol}_t \equiv \sqrt{6\pi}\cdot\frac{1}{12}\sum_{i=1}^{12}\left|r_{t+1-i} - r_{t-i}\right|,$$
where $r_t$ is the short-term rate as of month $t$.
Figure 12.5 plots a scatterplot of the short-term rate basis point volatility, $\mathrm{Vol}_t$, against $r_t$, for
two sampling periods.
For later use, define percentage volatility as
$$\mathrm{Vol}^{\%}_t \equiv \sqrt{6\pi}\cdot\frac{1}{12}\sum_{i=1}^{12}\left|\ln\frac{r_{t+1-i}}{r_{t-i}}\right|.$$
Interest rates are expressed in percentage.
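The rolling volatility measures can be computed from a monthly series of rates in a few lines. The following is a minimal sketch, assuming the estimator is the annualized mean absolute change over the previous twelve months, scaled by sqrt(6*pi) = sqrt(12) * sqrt(pi/2); the function names and the input series are hypothetical.

```python
import math

SCALE = math.sqrt(6.0 * math.pi)   # sqrt(12) annualization times sqrt(pi/2)

def rolling_bp_vol(rates, window=12):
    """Basis point volatility: scaled mean of |r_{t+1-i} - r_{t-i}|, i = 1..window."""
    out = []
    for t in range(window, len(rates)):
        changes = [abs(rates[t + 1 - i] - rates[t - i]) for i in range(1, window + 1)]
        out.append(SCALE * sum(changes) / window)
    return out

def rolling_pct_vol(rates, window=12):
    """Percentage volatility: scaled mean of |ln(r_{t+1-i} / r_{t-i})|, i = 1..window."""
    out = []
    for t in range(window, len(rates)):
        changes = [abs(math.log(rates[t + 1 - i] / rates[t - i]))
                   for i in range(1, window + 1)]
        out.append(SCALE * sum(changes) / window)
    return out

if __name__ == "__main__":
    # Hypothetical monthly rates, in percent.
    sample = [5.0, 5.1, 5.3, 5.2, 5.0, 4.8, 4.9, 5.1, 5.4, 5.6, 5.5, 5.3, 5.2, 5.0]
    print(rolling_bp_vol(sample))
    print(rolling_pct_vol(sample))
```

Each output value uses the twelve most recent monthly changes, so the first estimate is available only from the thirteenth observation onward.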
One clear property is that interest rate volatility appears to be countercyclical, spiking as it
does in most of the NBER recession episodes. A natural explanation is that the FED is more
aggressive in decreasing interest rates in bad times than in increasing them in good times.
We previously discussed the level effect, defined as the statistical relation arising when
the volatility of the short-term rate increases in the level of the short-term rate. One possible
explanation of these episodes is that when liquidity is erratic, interest rates are high, reflecting
a liquidity risk-premium. But precisely because of erratic liquidity, interest rates are also very
volatile in such periods. A simple statistical model that could help deal with these facts is one
in which the short-term rate has stochastic volatility, as follows:
= (
) +
1
p
(12.41)
2
= (
) +
(
1
1 +
2 )
where $W_1$ and $W_2$ are standard Brownian motions under the physical probability, $v$ is a factor affecting
changes in interest rate volatility, and the remaining notation refers to the model's parameters. In particular,
$\rho$, with $|\rho| < 1$, is the instantaneous correlation between changes in $r$ and changes in $v$. This model generalizes the one-factor
models in the previous section, in which the yield curve is only driven by the short-term
rate. If the elasticity of volatility to the level of the short-term rate is positive, the instantaneous rate volatility increases with the level of the interest rate.
Depending on the correlation coefficient $\rho$, interest rate volatility is also partly related to sources of
volatility not directly affected by the level of the interest rate.
14 This calculation follows that implemented to measure aggregate equity volatility in Section 7.2 of Chapter 7.
Yet the empirical evidence underlying Figure 12.5 does not lend much support to the level
effect. Rather, interest rate volatility seems to be quite flat against the level of rates.
The exception, of course, occurred over the Monetary Experiment, when the FED target was
money supply, rather than interest rates. Over this period, high volatility of money demand
mechanically translated to high interest rate volatility through market clearing. Additionally,
the monetary base over this period was kept deliberately low in an attempt to fight inflation.
Hence, both interest rate volatility and interest rates were very high. One additional reason for
the high nominal rates at the time might link to a compensation for high inflation volatility, not
only high inflation.
FIGURE 12.4. This picture depicts the time-series behavior of the 3 month TB rate
(top panel, in percent) and the rolling, basis point volatility of the 3 month TB rate
changes, Vol (bottom panel), over the sampling period from 1957:01 through 2008:12.
BP volatility is expressed in percentage terms, such that a value of one in the graph is
the same as 100 basis points.
FIGURE 12.5. This picture plots the basis point volatility of the short-term rate changes,
Vol (on the vertical axis), against the level of the short-term rate (on the horizontal
axis), for the sampling period spanning 1957:01 through 2008:12 (top panel), and for a
more recent sample spanning 1990:01 through 2008:12 (bottom panel).
FIGURE 12.6. This picture plots the percentage volatility of the short-term rate Vol%
(on the vertical axis), against the level of the short-term rate (on the horizontal axis), for
the sampling period spanning 1957:01 through 2008:12 (top panel), and for a more recent
sample spanning 1990:01 through 2008:12 (bottom panel).
Figure 12.5 is also suggestive of a change in regime that possibly occurred over the more recent
past. From 1990 on, interest rate volatility does not necessarily appear to link positively to rate
levels, and there is evidence of the opposite. Figure 12.6 suggests that percentage volatility
could actually be inversely related to the short-term rate, over the more recent sample
period too.
All in all, evidence is mixed regarding the relation between the level of interest rates and
interest rate volatility. The interesting property of the model in (12.41) is that it allows interest
rate volatility to fluctuate, driven by its own source of risk, $W_2$. In the next section, we review
models of the yield curve with stochastic volatility in a more systematic fashion. We end the
ongoing section with a succinct account of how to model the yield curve in the presence of
jumps.
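12.4.6 Short-term rates as jump-diffusion processes
Ahn and Thompson (1988) extend the CIR model to one where the short-term rate is a jump-diffusion process. In general, suppose that the short-term rate is a jump-diffusion process:
$$dr_t = b^J(r_t)\,dt + a(r_t)\,dW_t + \ell(r_t)\,\mathcal{S}\,dZ_t,$$
where $W$ and $Z$ (a Brownian motion and a Poisson process with intensity $v^Q$) are under the risk-neutral probability, and $b^J$ is, then, a jump-adjusted risk-neutral drift. The bond price $P(r,t,T)$ is solution to,
$$0 = \frac{\partial P}{\partial t} + b^J\frac{\partial P}{\partial r} + \frac{1}{2}a^2\frac{\partial^2 P}{\partial r^2} - rP + v^Q\int_{\mathrm{supp}(\mathcal{S})}\left[P(r+\ell\,\mathcal{S},t,T) - P(r,t,T)\right]p(d\mathcal{S}) \qquad (12.42)$$
More generally, with several types of jumps,
$$0 = \frac{\partial P}{\partial t} + b^J\frac{\partial P}{\partial r} + \frac{1}{2}a^2\frac{\partial^2 P}{\partial r^2} - rP + \sum_{j=1}^{N} v_j^Q\int_{\mathrm{supp}(\mathcal{S})}\left[P(r+\ell_j\,\mathcal{S},t,T) - P(r,t,T)\right]p_j(d\mathcal{S}),$$
where $N$ is the number of jump types. However, to simplify the exposition, we just set $N = 1$.
To identify risk-premiums related to jumps, we simply note that $v^Q = v\,\lambda^J$, where $v$ is the
intensity of the short-term rate jump under the physical distribution, and $\lambda^J$ is the risk-premium
demanded by agents to be compensated for the presence of jumps.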
Next, consider a defaultable bond. Assume the event of default is a Poisson process with
intensity $v$, and that in the event of default at $\hat\tau$, the bondholder receives a recovery payment
$\mathrm{Rec}(\hat\tau)$, which could be deterministic or dependent on the short-term rate (see footnote 15). Let $\hat\tau$ be the random
time of default, and define a state variable $g$ with the following features:
$$g(t) = \begin{cases} 0 & \text{if } \hat\tau > t \\ 1 & \text{otherwise} \end{cases}$$
15 Chapter 13 contains an account of this approach to modeling defaultable bonds, known as the reduced-form approach. This
approach is distinct from a structural approach, in which the default event is modeled regarding the books of the issuer. The derivations
in this section are based on the partial differential equations in Mele (2003).
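The state variables $(r, g)$ are then solution to,
$$dr_t = b(r_t)\,dt + a(r_t)\,dW_t,\qquad dg_t = \mathcal{S}\,dZ_t,\ \text{where } \mathcal{S} \equiv 1 \qquad (12.43)$$
and $Z$ is a Poisson process with intensity $v$. Let $P(r,g,t,T)$ denote the bond price. It satisfies,
$$0 = \frac{\partial P(r,0,t,T)}{\partial t} + b\frac{\partial P(r,0,t,T)}{\partial r} + \frac{1}{2}a^2\frac{\partial^2 P(r,0,t,T)}{\partial r^2} - rP(r,0,t,T) + v(r)\left[P(r,1,t,T) - P(r,0,t,T)\right],\quad\text{for all } t\in[0,T),$$
with $P(r,1,t,T) = \mathrm{Rec}(r)$. The pre-default bond price $P^{\mathrm{pre}}(r,t,T) \equiv P(r,0,t,T)$ is, therefore, solution to,
$$0 = \frac{\partial P^{\mathrm{pre}}}{\partial t} + b\frac{\partial P^{\mathrm{pre}}}{\partial r} + \frac{1}{2}a^2\frac{\partial^2 P^{\mathrm{pre}}}{\partial r^2} - (r + v(r))\,P^{\mathrm{pre}} + v(r)\,\mathrm{Rec}(r) \qquad (12.44)$$
with boundary condition $P^{\mathrm{pre}}(r,T,T) = 1$. The solution is,
$$P^{\mathrm{pre}}(r,t,T) = \mathbb{E}\left[e^{-\int_t^T (r(s)+v(r(s)))\,ds}\right] + \mathbb{E}\left[\int_t^T e^{-\int_t^s (r(u)+v(r(u)))\,du}\,v(r(s))\,\mathrm{Rec}(r(s))\,ds\right] \qquad (12.45)$$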
where $\mathbb{E}[\cdot]$ is the expectation taken with respect to only the first equation of system (12.43).
Duffie and Singleton (1999, Eq. (10) p. 696) provide a slightly different evaluation formula
than Eq. (12.45), defining a percentage loss process $l \in [0,1]$ : $\mathrm{Rec} = (1-l)\,P^{\mathrm{pre}}$, which inserted
into Eq. (12.44) leaves a partial differential equation, the solution of which is, by Feynman-Kac,
$$P^{\mathrm{pre}}(r,t,T) = \mathbb{E}\left[e^{-\int_t^T (r(s) + l(s)\,v(r(s)))\,ds}\right]$$
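The two evaluation formulas can be compared in the simplest possible case. The following is a minimal sketch, assuming a constant short-term rate, a constant default intensity, and either a constant recovery payment or a constant fractional loss, so that the expectations reduce to elementary integrals. All names and numbers are hypothetical.

```python
import math

def predefault_price_constant_recovery(r, v, rec, tau):
    """Survival value plus discounted recovery, with constant r, v and recovery rec."""
    a = r + v
    survival = math.exp(-a * tau)
    # Integral of exp(-a * s) over [0, tau]; equals tau when a is zero.
    annuity = tau if a == 0.0 else (1.0 - survival) / a
    return survival + v * rec * annuity

def predefault_price_fractional_loss(r, v, loss, tau):
    """Recovery proportional to the pre-default price: discount at r + loss * v."""
    return math.exp(-(r + loss * v) * tau)

if __name__ == "__main__":
    # Hypothetical inputs: 3% rate, 5-year maturity, 40% recovery of face value.
    for v in (0.0, 0.02, 0.05):
        print(v, predefault_price_constant_recovery(0.03, v, 0.4, 5.0))
```

With a recovery payment below the default-free price, the printed values fall as the intensity rises, in line with the comparison between markets with different default intensities.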
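Finally, $P^{\mathrm{pre}}$ is decreasing in the default intensity $v$, in the following sense. Consider two
markets $A$ and $B$ where the default intensities are $v^A$ and $v^B$, and assume that the coefficients
of $r$ are independent of $v$. The pre-default bond price function in market $i$ is $P^i(r,t,T)$,
$i = A, B$, and satisfies:
$$0 = \frac{\partial P^i}{\partial t} + b\frac{\partial P^i}{\partial r} + \frac{1}{2}a^2\frac{\partial^2 P^i}{\partial r^2} - (r+v^i)\,P^i + v^i\,\mathrm{Rec}^i,\quad i = A, B,$$
with the usual boundary condition. Assuming that $\mathrm{Rec}^A = \mathrm{Rec}^B \equiv \mathrm{Rec}$, subtracting these two equations,
and rearranging terms, shows that the price difference $\Delta P(r,t,T) \equiv P^A(r,t,T) - P^B(r,t,T)$ satisfies,
$$0 = \frac{\partial \Delta P}{\partial t} + b\frac{\partial \Delta P}{\partial r} + \frac{1}{2}a^2\frac{\partial^2 \Delta P}{\partial r^2} - (r+v^A)\,\Delta P - (v^A - v^B)\left(P^B - \mathrm{Rec}\right),$$
with boundary condition, $\Delta P(r,T,T) = 0$. It follows that $\Delta P(r,t,T) \le 0$ whenever $v^A \ge v^B$ and $P^B \ge \mathrm{Rec}$ (see
Appendix 3 of Chapter 7).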
How does volatility affect the yield curve? Consider the following two-period example. In the
first period, the short-term rate is $r$ and in the second, it is either $r^+ = r + d$ or $r^- = r - d$ with equal
probability, where $d > 0$. The price of a two-period bond is $P(r,d) = m(r,d)/(1+r)$, where
$m(r,d) = E(1/(1+\tilde r))$ is the discount factor expected to prevail at next period. By Jensen's
inequality, $m(r,d) > 1/(1+E(\tilde r)) = 1/(1+r) = m(r,0)$. That is, two-period bond prices
increase upon activation of randomness. More generally, two-period bond prices are always
increasing in the volatility parameter $d$ in this example, as illustrated by Figure 12.7.
The intuition underlying Figure 12.7 is standard: the bond price is decreasing and convex in
the short-term rate, such that the price is increasing in the interest rate volatility. The price
drop in bad times (i.e. when the interest rate increases) is less than the price increase in
good times (see footnote 16).
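The two-period example is easy to check numerically. The following is a minimal sketch of the additive setting, in which next period's rate is r + d or r - d with equal probability, together with a multiplicative variant in which the rate is r(1 + d) or r/(1 + d). Function names and input values are hypothetical.

```python
def two_period_price(r, d):
    """Two-period bond price when next period's rate is r + d or r - d, equally likely."""
    m = 0.5 * (1.0 / (1.0 + r + d) + 1.0 / (1.0 + r - d))   # expected discount factor
    return m / (1.0 + r)

def two_period_price_multiplicative(r, d):
    """Same bond when next period's rate is r * (1 + d) or r / (1 + d), equally likely."""
    m = 0.5 * (1.0 / (1.0 + r * (1.0 + d)) + 1.0 / (1.0 + r / (1.0 + d)))
    return m / (1.0 + r)

if __name__ == "__main__":
    r = 0.05   # hypothetical current short-term rate
    for d in (0.0, 0.01, 0.02, 0.04):
        print(d, two_period_price(r, d))
    for d in (0.0, 0.25, 0.5):
        print(d, two_period_price_multiplicative(r, d))
```

In the additive case the expected rate does not depend on d, so only the convexity of the discount factor is at work. In the multiplicative case the expected rate itself rises with d, which is why the monotonicity need not survive.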
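[Figure 12.7: the discount factor $1/(1+\tilde r)$ against next period's rate, with points $a$, $b$, $B$, $A$ on the curve at rates $r-d'$, $r-d$, $r+d$, $r+d'$; the chord midpoints are $m(r,d') = (a+A)/2$ and $m(r,d) = (b+B)/2$.]
FIGURE 12.7. If the risk-neutralized interest rate of the next period is either $r^+ = r + d$ or
$r^- = r - d$ with equal probability, the discount factor $1/(1+\tilde r)$ is either $B$ or $b$ with equal
probability. Hence $m(r,d) = E[1/(1+\tilde r)]$ is the midpoint of $bB$. Similarly, if volatility is
$d' > d$, $m(r,d')$ is the midpoint of $aA$. Since $ab > BA$, it follows that $m(r,d') > m(r,d)$,
such that the two-period bond price $P(r,d) = m(r,d)/(1+r)$ satisfies: $P(r,d') > P(r,d)$ for $d' > d$.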
These properties are due to the assumption that the expected short-term rate is independent
of $d$. They may well break down in alternative settings. For example, consider a market in which
an upward rate movement is more likely than a downward one. As a second example, consider a
multiplicative setting, in which either $r^+ = r(1+d)$ or $r^- = r/(1+d)$ with equal probability.
16 This property relates to the theory of mean-preserving spreads and convex payoffs explained in Chapters 7 and 10. Let
$\tilde m_d(\tilde r) = 1/(1+\tilde r)$ denote the random discount factor when volatility is $d$, such that $\tilde r \mapsto -\tilde m_d(\tilde r)$ is increasing and concave and, hence,
for $d'' > d'$, $E(\tilde m_{d''}(\tilde r)) \ge E(\tilde m_{d'}(\tilde r))$, just as Figure 12.7 illustrates.
It can be shown that in these two examples, bond prices are decreasing in volatility for short
maturities, and increasing for longer ones, a property originally illustrated by Litterman, Scheinkman
and Weiss (1991). Below, it is argued that, due to risk-aversion, changes in the expected short-term rate may well depend on the volatility parameter. Then, at short maturities, risk-aversion dominates the convexity effects in Figure 12.7, whereas convexities dominate over
longer maturities. We now build on this intuition and explain the relation between interest rate
volatility and the yield curve.
12.5.1.2 Two-factor models
In the CIR model, the instantaneous volatility of the short-term rate is stochastic, depending
as it does on the level of $r$, which is obviously stochastic. However, empirical evidence suggests
that the short-term rate volatility depends on some additional factors, as discussed in Section
12.4. A natural extension of the CIR model is one where the instantaneous volatility of the
short-term rate depends on (i) the level of the short-term rate, similarly as in the CIR model,
and (ii) some additional random component. This additional component is what we refer to as
the stochastic volatility of the short-term rate. It is the term-structure counterpart to the
stochastic volatility extension of the Black and Scholes (1973) model (see Chapter 10).
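Fong and Vasicek (1991) develop the first model in which the volatility of the short-term rate
is stochastic. They assume that the short-term rate $r$ is solution to
$$dr_t = \kappa_r(\bar r - r_t)\,dt + \sqrt{v_t}\,dW_{1t},\qquad dv_t = \kappa_v(\bar v - v_t)\,dt + \xi\sqrt{v_t}\,dW_{2t} \qquad (12.46)$$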
How does the factor $v$ affect the yield curve? Consider the basic Vasicek (1977) model. Naturally, this model assumes that volatility is constant, yet it could be used to develop intuition
on Eqs. (12.46) and possibly other stochastic volatility models. It is possible to show that Eq.
(12.36) implies that
$$\frac{\partial R(r,t,T)}{\partial\sigma} = -\frac{1}{T-t}\left[\sigma\int_t^T B(T-s)^2\,ds + \lambda\int_t^T B(T-s)\,ds\right] \qquad (12.47)$$
where $B(T-s)$ is as in Section 12.4.4.1. Eq. (12.47) shows that if $\lambda \ge 0$, the whole term-structure is decreasing in $\sigma$, the short-term rate volatility. That is, bond prices increase in $\sigma$, a
conclusion that parallels that for options, where option prices are increasing in the volatility of
the asset price. As explained in Chapter 10, this property arises through the optionality of the
contract, say the convexity of a European call price with respect to the asset price.
But the interesting properties arise in the empirically relevant case, $\lambda < 0$ (see footnote 17). In this case, the
sign of $\partial R(r,t,T)/\partial\sigma$ depends on both convexity and slope effects. Convexity effects, those relating to the second partial
$\partial^2 P(r,t,T)/\partial r^2 = P(r,t,T)\,B(T-t)^2$, arise through the term $\sigma\int_t^T B(T-s)^2\,ds$. Slope effects, those relating to
$\partial P(r,t,T)/\partial r = -P(r,t,T)\,B(T-t)$, arise, instead, through
the term $\lambda\int_t^T B(T-s)\,ds$. If $\lambda$ is negative, and sufficiently large in absolute value, slope effects
dominate convexity effects, and the term-structure can actually increase in $\sigma$. For intermediate
values of $\lambda$, the term-structure can be both increasing and decreasing in $\sigma$. At short maturities,
the convexity effects in Eq. (12.47) are typically dominated by slope effects, and the short-end of
the term-structure can be increasing in $\sigma$. At longer maturity dates, however, convexity effects
are more important and, sometimes, dominate slope effects.
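The interplay between convexity and slope effects can be illustrated numerically. The following is a minimal sketch, assuming the Vasicek yield is written with B(u) = (1 - exp(-kappa*u))/kappa and a risk-neutral drift equal to kappa*rbar - lam*sigma - kappa*r, so that the sensitivity of the yield to sigma is minus the average of sigma times the integral of B squared plus lam times the integral of B. Function names and parameter values are hypothetical.

```python
import math

def B(u, kappa):
    return (1.0 - math.exp(-kappa * u)) / kappa

def int_B(tau, kappa):
    """Integral of B(u) over [0, tau]."""
    return (tau - B(tau, kappa)) / kappa

def int_B2(tau, kappa):
    """Integral of B(u)^2 over [0, tau]."""
    tail = (1.0 - math.exp(-2.0 * kappa * tau)) / (2.0 * kappa)
    return (tau - 2.0 * B(tau, kappa) + tail) / kappa ** 2

def vasicek_yield(r, tau, kappa, rbar, sigma, lam):
    """Yield to maturity tau in the Vasicek model with unit risk-premium lam."""
    log_price = (-B(tau, kappa) * r
                 - (kappa * rbar - lam * sigma) * int_B(tau, kappa)
                 + 0.5 * sigma ** 2 * int_B2(tau, kappa))
    return -log_price / tau

def dyield_dsigma(tau, kappa, sigma, lam):
    """Convexity term (sigma * int B^2) plus slope term (lam * int B), with a minus sign."""
    return -(sigma * int_B2(tau, kappa) + lam * int_B(tau, kappa)) / tau

if __name__ == "__main__":
    # Hypothetical parameters; lam < 0 is the case discussed in the text.
    for tau in (0.25, 1.0, 5.0, 30.0):
        print(tau, dyield_dsigma(tau, kappa=0.1, sigma=0.02, lam=-0.01))
```

With these illustrative values the sensitivity is positive at the short end, where the slope term dominates, and negative at long maturities, where the convexity term takes over.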
More generally, changes in interest rate volatility are not mean-preserving spreads for the risk-neutral distribution, as Eq. (12.47) illustrates for the Vasicek model. In a world with complete
markets, say Black-Scholes, the asset underlying the derivative contracts is traded. In the case
under study, the short-term rate is not a traded risk. Therefore, its risk-neutral drift depends
on volatility through risk-adjustments: to illustrate, in the Vasicek example, this dependence
arises through the risk-premium parameter, $\lambda$.
While Eq. (12.47) relies on a model with constant volatility, the reasoning underlying its
interpretation holds even when volatility is random (see footnote 18). In particular, suppose that the risk-premium required to bear the interest rate risk is negative and sufficiently large in absolute
value. In this case, slope effects may dominate convexity ones at any maturity date, such
that the whole yield curve, now, could be always increasing in volatility. Let us provide some
intuition. It is reasonable to expect that in bad times (i.e., when interest rate volatility is high)
risk-premium effects dominate over convexity, such that the yield curve shifts up following an
increase in volatility. However, in good times, we would expect that convexity dominates, with
the yield curve being decreasing in volatility. Thus, in these examples, if risk-premiums are
sufficiently sensitive to volatility, we would expect that in good times, when volatility is small,
convexity effects dominate and the yield curve lowers as volatility increases. In bad times, when
volatility is high, we would expect that risk-premium effects dominate, such that the yield curve
increases following an increase in volatility.
To illustrate, consider the Vasicek model again, and assume that the risk-premium is $\lambda = \ell\sigma^3$,
for some constant $\ell$. This functional form of the risk-premium ensures that the risk-premium
is quite small when $\sigma$ is small, although then it substantially increases in bad times, i.e. when
$\sigma$ gets larger and larger. With this risk-premium, Eq. (12.47) is:
$$\frac{\partial R(r,t,T)}{\partial\sigma} = -\frac{\sigma}{T-t}\left[\int_t^T B(T-s)^2\,ds + \ell\sigma^2\int_t^T B(T-s)\,ds\right] \qquad (12.48)$$
That is, risk-premium effects become more and more relevant as $\sigma$ increases. The previous
equation reveals that we may also define a threshold value for $\sigma$ such that convexity effects are
exactly offset by risk-premium effects. Eq. (12.48) shows that for each time to maturity $T-t$,
there exists a value of $\sigma$ depending on $T-t$, say $\hat\sigma(T-t)$, such that the partial $\partial R(r,t,T)/\partial\sigma = 0$.
We might go on and define an average value of $\hat\sigma(T-t)$, say $\hat\sigma \equiv \mathcal{T}^{-1}\int_0^{\mathcal{T}}\hat\sigma(\tau)\,d\tau$, where $\mathcal{T}$
denotes the highest time-to-maturity we want to consider. This threshold value, $\hat\sigma$, is the one
that might lead to a definition of what good or bad times can be, in terms of the term-structure
implications of a volatility shock.
How do we interpret these properties in light of the factor dynamics reviewed in Section
12.3? Clearly, the very short-end of the yield curve is not affected by movements in volatility, as
$\lim_{T\to t} R(r,t,T) = r$, for all values of the volatility factor. Moreover, these models predict that $\lim_{T\to\infty} R(r,t,T)
= \bar R$, where $\bar R$ is a constant and, hence, independent of the volatility factor. Therefore, movements in the short-term volatility can only affect the middle portion of the yield curve. For example, if the risk-premium required to bear the interest rate risk is negative and sufficiently large, an upward
movement in volatility can produce an effect on the yield curve qualitatively similar to that depicted
in Figure 12.2 (Curvature panel), and would thus roughly mimic the curvature factor that
we reviewed in Section 12.3.
12.5.2 Three-factor models
We need at least three factors to explain the entire variation of the yield curve. A natural
extension of the model in Eqs. (12.46) is one in which the drift of the short-term rate contains
some predictable component, such that the yield curve is driven by the following three-factor
model:
= (
) +
1
= (
) +
(12.49)
2
= ( ) +
3
where
in Figure 12.2 (Slope panel). However, this interpretation is restrictive, as factor analysis
reveals that the short-end and the long-end of the yield curve move in opposite directions after
a change in the steepness factor. Here, instead, a change in the short-term rate only modifies
the short-end (and, perhaps, the middle) of the yield curve and, hence, does not produce any
variation in the long-end of the curve.
12.5.3 Affine and quadratic term-structure models
12.5.3.1 Affine
The Vasicek and CIR models predict that the bond price is exponential-affine in the short-term rate $r$. This property is the expression of a general phenomenon. Indeed, it is possible
to show that bond prices are exponential-affine in $r$ if, and only if, the functions $b$ and $a^2$ are
affine in $r$. Models that satisfy these conditions are known as affine models. More generally,
these basic results extend to multifactor models, where bond prices are exponential-affine in
the state variables (see footnote 20). In these models, the short-term rate is a function $r(y)$ such that
$$r(y) = r_0 + r_1\cdot y,$$
where $r_0$ is a constant, $r_1$ is a vector, and $y$ is a multidimensional diffusion process, solution to,
$$dy(t) = \kappa\,(\mu - y(t))\,dt + \Sigma\,V(y(t))\,dW(t) \qquad (12.50)$$
where $W$ is a $d$-dimensional Brownian motion, $\Sigma$ is a full rank $n\times d$ matrix, and $V$ is a full
rank $d\times d$ diagonal matrix with elements,
$$V(y)_{(ii)} = \sqrt{\alpha_i + \beta_i^\top y},\quad i = 1,\ldots,d \qquad (12.51)$$
for some scalars $\alpha_i$ and vectors $\beta_i$. Langetieg (1980) develops the first multifactor model of this
kind, in which $\beta_i = 0$.
Next, let $V^-(y)$ be a diagonal matrix with elements
$$V^-(y)_{(ii)} = \begin{cases} \dfrac{1}{V(y)_{(ii)}} & \text{if } \Pr\{V(y(t))_{(ii)} > 0 \text{ all } t\} = 1 \\ 0 & \text{otherwise} \end{cases}$$
and set,
$$\Lambda(y) = V(y)\,\lambda_1 + V^-(y)\,\lambda_2\,y \qquad (12.52)$$
for some $d$-dimensional vector $\lambda_1$ and some $d\times n$ matrix $\lambda_2$. Duffie and Kan (1996) explained
in a comprehensive way the benefit of this model. In their formulation $\lambda_2 = 0_{d\times n}$, and the bond
price is exponential-affine in the state variables $y$. That is, the price of the zero has the following
functional form,
$$P(y,t,T) = \exp\left(A(T-t) + B(T-t)\cdot y\right) \qquad (12.53)$$
for some functions $A$ and $B$ of time to maturity, with $A(0) = 0$ and $B(0)_{(i)} = 0$.
20 More generally, we say that affine models are those that make the characteristic function exponential-affine in the state variables.
In the case of the multifactor interest rate models of the previous section, this condition is equivalent to the condition that bond
prices are exponential-affine in the state variables.
The more general functional form for $\Lambda$ in Eq. (12.52) has been suggested by Duffee (2002).
The rationale is the following. Duffee explains that in bond markets, risk-premiums seem to
relate to both volatility and level of the fundamentals. In this model, risk-premiums reduce
to $V(y)\Lambda(y) = V^2(y)\,\lambda_1 + V(y)V^-(y)\,\lambda_2\,y$. Thus, the inclusion of the term $\lambda_2$ allows one to model the
statistical relations linking risk-premiums to fundamentals. Interestingly, bond prices still have
an exponential-affine form, just as in Eq. (12.53). When $\lambda_2 = 0_{d\times n}$, we say that the model
is completely affine, and essentially affine, otherwise. The clear advantage of these affine
models, then, is that they considerably simplify statistical inference, as explained in Section
12.5.5 below.
Ang and Piazzesi (2003) and Hördahl, Tristani and Vestin (2006) (HTS, henceforth) introduce
no-arbitrage regressions, to model the relations linking macroeconomic variables to the yield
curve. In their models, the factors are taken to be a discrete-time version of Eq. (12.50), where
some components of $y$ are observable, and others are unobservable. The observables relate to
macroeconomic factors such as inflation or industrial production. The authors, then, study how
all these factors affect the yield curve, predicted by a pricing equation such as that in Eq.
(12.53). While HTS have a structural model of the macroeconomy, Ang and Piazzesi (2003)
have a reduced-form model.
Reduced-form models can be exposed to the critique that some of the parameters are not
variation-free. [Explain what variation-free parameters are, in mathematical statistics] For example, in the simple Lucas economy of Part I, we know that the short-term rate
is $r = \rho + \eta\mu_0 - \frac{1}{2}\sigma_0^2\,\eta(1+\eta)$, so by changing the risk-aversion parameter, $\eta$, a change in the interest rate should arise as a result. This simple example shows that the parameters related to
risk-aversion correction in Eq. (12.52) are not free, in that tilting them has an effect on the
parameters of the factor dynamics in Eq. (12.50). At the same time, reduced-form models offer
a great deal of flexibility, as they do not restrict, so to speak, the model to track any market
or economy such as the Lucas economy, say. Moreover, we can always find a theoretical market
supporting the no-arbitrage market underlying the reduced-form model. No-arbitrage regressions such as
those in Ang and Piazzesi (2003) give the data the power to say which parameter constellations make the model likely
to perform, without imposing theoretical restrictions which the data might, then, be likely to reject. For example, the Lucas model, while it clearly illustrates that some of the parameters are not
variation-free, can be simply wrong, and might impose unreasonable restrictions on the data.
For no-arbitrage models, instead, cross-equation restrictions arise through the weaker requirement
of absence of arbitrage.
12.5.3.2 Quadratic
Affine models are known to impose tight conditions on the structure of the volatility of the
state variables. These restrictions arise to keep the square root in Eq. (12.51) real valued. But
these constraints may hinder the actual performance of the models. There exists another class
of models, known as quadratic models, that partially overcome these difficulties.
12.5.4 Unspanned stochastic volatility
Are fixed income markets incomplete? Mele and Obayashi (2015) argue that fixed income
volatility is quite distinct from equity volatility. Consequently, many of the investable products on the
popular gauge of equity volatility, VIX (see Chapter 10), could only be poor surrogates for
exposure to fixed income volatility. The uniqueness of fixed income volatility has actually
been widely acknowledged in the literature. It is well-known since at least Collin-Dufresne and
Goldstein (2002) that fixed income market volatility does not appear to be priced only based
on existing fixed income assets.
Simply, the authors showed that straddle returns on caps and floors (see Section 12.8) cannot
be explained by changes in the term structure of swap rates, but by other factors. That is,
existing fixed income assets (such as bonds) do not help hedge rate volatility. The models
proposed by Collin-Dufresne and Goldstein to address these issues are known as leading to
unspanned stochastic volatility (USV, in the sequel). The reason for this terminology is the
following. Consider any of the stochastic volatility models reviewed in Chapter 10, in which a
stock price $S_t$ is solution to
$$\frac{dS_t}{S_t} = \mu\,dt + \sigma_t\,dW_t \qquad (12.54)$$
under the physical probability, for some constant $\mu$, a Brownian motion $W$, and a volatility
process $(\sigma_t)_{t\ge 0}$ possibly driven by $W$ and other Brownian motions. One example of these models
is the celebrated Heston (1993) model, in which $\sigma_t^2$ is a square root process.
As we know from Chapter 10, these models are typically understood to describe a situation
of incomplete markets. Collin-Dufresne and Goldstein (2002) propose to extend this notion to
fixed income markets, by modeling bond prices in an incomplete markets setup. The idea is
simple. If bond markets have stochastic volatility and are still incomplete, bond prices should
satisfy dynamics in which their instantaneous returns have stochastic volatility, similarly as in
Eq. (12.54) for the equity case. At the same time, the bond price's exposure to volatility should be
zeroed, just as a stock price's exposure to its own volatility is zero in the context of Chapter 10.
In the context of the short-term rate models of this section, the latter condition is met when
$$\frac{\partial P(r,v,t,T)}{\partial v} = 0 \qquad (12.55)$$
where the volatility parameter is now a constant. Second, the value of the elasticity parameter is important. If it is zero,
the short-term rate process is the Gaussian one proposed by Vasicek (1977). If it equals $\frac{1}{2}$, we obtain
the square-root process of Cox, Ingersoll and Ross (1985). As we know, the transition density
of $r$ is Gaussian in the Vasicek market, and a noncentral chi-square in the CIR case. Therefore,
in both Vasicek and CIR markets, we may write down the likelihood function of the diffusion
process. Therefore, ML estimation is possible in these two cases. In more general cases, such
as those in the next section, one needs to go for simulation methods, such as those described in
Chapter 5. However, we could still estimate multifactor affine models through ML.
12.5.5.2 More general models
Estimating the model in Eqs. (12.41) is certainly instructive. Yet a more important question
is to examine the term-structure implications of this model. More generally, how would the
estimation procedure outlined in the previous subsection change if the task is to estimate a
Markov model of the term-structure of interest rates? There are three steps.
Step 1
Collect data on the term structure of interest rates. We will need to use data on three maturities,
say a time series of riskless 6 month, 5 year and 10 year yields.
Step 2
Let us consider the three-factor model in Eqs. (12.49) of Section 12.5.2, where the three Brownian motions
are now allowed to be correlated. The bond price predicted by this model
is:
(12.56)
(
)
(
)=E
R the risk-neutral
respect to the physical probability is exp 12 k k2
Z , for some vector Brownian
m
motion Z, and
(
; ), for some vector-valued function m and some parameter
m
vector . The function
makes risk-adjustment corrections depend on the current value of the
state vector [
], which makes the model Markov, thereby simplifying statistical inference.
To summarize, the issue is now one where we need to estimate both the physical parameter
vector and the risk-adjustment parameter vector . Next, we consider the yield curve in
correspondence of three maturities,
(
ln
=1 2 3
(12.57)
c
by
A. Mele
)=
( ;
)+B( ;
)[
]>
=1 2 3
(12.58)
where ( ;
) and B ( ;
) are some functions of the maturity
(B is vector valued),
and generally depend on the parameter vector (
). Once Eqs. (12.49) are simulated, a time
series of yields
is then straightforward to determine based on Eq. (12.58).21
12.5.5.3 Filtering (and trading)
Once we estimate the model's parameters, we could attempt to infer the state vector. For
example, we could invert Eq. (12.58) given an observation of three yields with three different
maturities. The main conceptual difficulty with this approach is that the estimates of the state
vector rely on the maturities we choose. Changing maturities likely leads to different filtered
states. The usual, and admittedly pragmatic, approach is to assume that Eq. (12.58) only holds with
some additional observation/model error, and then to find the state that minimizes
the error variance using all the available maturities. This procedure delivers the state for each
observation, and it is attractive, as it exploits market-based information summarized by the
cross-section of yields.
Such a pricing-model-based procedure to filter the state has the potential to be used, in
practice, to implement forecasting exercises. We can fit a VAR to the time series of the filtered
values of the state vector and, then, use this VAR to produce forecasts of the state and, in turn, forecasts
of yields using Eq. (12.58), similarly to what we could do in the case of options on equities in markets
with stochastic volatility (see Chapter 10).
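The filtering and forecasting steps just described can be sketched as follows. This is only an illustration under assumptions: yields are taken to be affine in a three-dimensional state as in Eq. (12.58), the intercepts `A` and loadings `B` are hypothetical inputs (in practice they would come from the estimated model), and the helper names `filter_state` and `fit_var1` are ours.

```python
import numpy as np

def filter_state(yields, A, B):
    """Least-squares state given yields = A + B @ state + error.

    yields, A: arrays of shape (n_maturities,); B: (n_maturities, n_states).
    """
    state, *_ = np.linalg.lstsq(B, yields - A, rcond=None)
    return state

def fit_var1(states):
    """OLS estimate of a VAR(1): y[t+1] = c + Phi @ y[t] + e[t+1]."""
    X = np.column_stack([np.ones(len(states) - 1), states[:-1]])
    coef, *_ = np.linalg.lstsq(X, states[1:], rcond=None)
    return coef[0], coef[1:].T           # intercept c and matrix Phi

# Hypothetical loadings for six maturities and three state variables.
rng = np.random.default_rng(0)
A = rng.normal(scale=0.01, size=6)
B = rng.normal(size=(6, 3))
true_state = np.array([0.03, 0.01, 0.05])
observed = A + B @ true_state + rng.normal(scale=1e-4, size=6)
print(filter_state(observed, A, B))
```

Using more maturities than state variables makes the filtered state less sensitive to the choice of any particular triple of maturities, which is the point made in the text. A one-step-ahead yield forecast would then be `A + B @ (c + Phi @ state)`.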
This section surveys the first attempts to deal with these issues within a continuous-time
framework, with one of them actually being the continuous-time version of the Ho and Lee
(1986) model in Chapter 11. In Section 12.6.1, we explain the main issue while relying on a
benchmark option pricing formula (derived in section 2.7), and in Section 12.6.2, we provide
details regarding two benchmark models.
12.6.1 Fitting the yield-curve, perfectly
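Let again $P(r_\tau, \tau, S)$ be the price of a zero coupon bond maturing at some $S$. By no arbitrage,
the price of a European call option on this bond, struck at $K$ and expiring at $T < S$, is:
$$C^b(t) \equiv E_t\left[ e^{-\int_t^T r_\tau d\tau} \left( P(r_T, T, S) - K \right)^+ \right]$$
In Section 12.8, we show that
$$C^b(t) = P(r_t, t, S)\, Q_F^S\left[ P(r_T, T, S) \geq K \right] - K P(r_t, t, T)\, Q_F^T\left[ P(r_T, T, S) \geq K \right] \qquad (12.59)$$
where $Q_F^T$ is the forward probability for maturity $T$ (see Eq. (12.93)).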
The bond option price in Eq. (12.59) depends on theoretical prices, $P(r_t, t, T)$ and $P(r_t, t, S)$,
not market prices. This is problematic for sell-side institutions engaged in intermediating derivatives: as explained many times, the need arises in this case to simultaneously
match the yield curve at the time of evaluation. This section describes models that fit the yield
curve without errors, which we call perfectly fitting models. These models are simply a more
elaborate, continuous-time version of the no-arbitrage models introduced in Chapter 11. They
predict that the price of any bond, say a bond expiring at some $T$, is, of course, random at
any future time $\tau$, but also exactly equal to the current market price, that of the current time $t$. Finally, and
naturally, this price must be arbitrage-free. The aim of this and subsequent sections is to show
how to achieve this task by augmenting the models seen in the previous sections with a set of
infinite dimensional parameters.
A final remark. Section 12.8 explains that, at least for the Vasicek model, the option price
in Eq. (12.59) does not explicitly depend on $r$ because it only depends on $P(r_t, t, T)$ and
$P(r_t, t, S)$. So why do we look for perfectly fitting models in the first place? Wouldn't it be
enough, then, to just replace the theoretical prices $P(r_t, t, T)$ and $P(r_t, t, S)$ with the market
values, say $P_\$(t, T)$ and $P_\$(t, S)$? This way, the model is perfectly fitting. Apart from being
logically inconsistent (you would have a model predicting something generically different from
prices), this way of proceeding also has practical drawbacks.
Section 12.8 reveals indeed that option pricing formulae for European options might well
agree in notation with those relating to perfectly fitting models. However, in Section 12.8.6,
we explain that as we move towards more complex interest rate derivatives, say options on
coupon bearing bonds and swaption contracts, the situation becomes dramatically different.
Finally, some maturity dates might not be actually traded at some point in time. For example,
$P_\$(t, T)$ might not be observed and still, we might well be interested in the pricing of exotic
products requiring knowledge of $P_\$(t, T)$. An intuitive procedure to deal with this difficulty is
to interpolate across the traded maturities. In fact, the objective of perfectly fitting models is
to allow for such an interpolation while preserving absence of arbitrage.
The next two sections discuss two specific, old, and yet very famous examples of perfectly
fitting models: (i) the Ho and Lee (1986) model, and (ii) one generalization of it, introduced by
Hull and White (1990). In Section 12.7, we move on towards a general model-building principle
that includes these two models as special cases.
Ho and Lee (1986) originally set their model in discrete time, which is analyzed in
Chapter 11 along with alternative models. The model below represents the diffusion limit
of the original Ho & Lee model, as put forward in Section 11.6.7 of Chapter 11, in which the
short-term rate $r_\tau$ is solution to,
$$dr_\tau = \theta_\tau d\tau + \sigma d\tilde{W}_\tau \qquad (12.60)$$
where $\tilde{W}$ is a Brownian motion under $Q$, $\sigma$ is a constant, and $\theta_\tau$ is an infinite dimensional
parameter, which we need to pin down the initial, observed yield curve, as we now explain. The
reason we refer to $\theta_\tau$ as infinite dimensional is that $\theta_\tau$ is taken to be a continuous function of
calendar time $\tau$. We assume this function is known at $t$, whence the term parameter.
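Clearly, Eq. (12.60) is an affine model. Therefore, the bond price takes the following form,
$$P(r_t, t, T) = e^{A(t,T) - B(t,T) r_t}$$
where
$$A(t,T) = \int_t^T \theta_s (s - T)\, ds + \frac{1}{6} \sigma^2 (T - t)^3 \quad \text{and} \quad B(t,T) = T - t \qquad (12.61)$$
Let $f_\$(t, \tau)$ denote the instantaneous, observed forward rate. Matching the instantaneous
forward rate $f(t, \tau)$ predicted by the model to $f_\$(t, \tau)$ yields:
$$f_\$(t, \tau) = f(t, \tau) = -\frac{\partial \ln P(r_t, t, \tau)}{\partial \tau} = -\frac{1}{2} \sigma^2 (\tau - t)^2 + \int_t^\tau \theta_s ds + r_t \qquad (12.62)$$
Because $P(r_t, t, \tau) = \exp(-\int_t^\tau f(t, s) ds)$, the drift term $\theta_\tau$ satisfying Eq. (12.62) guarantees
an exact fit of the yield curve. Differentiating Eq. (12.62) with respect to $\tau$ leaves
$\frac{\partial}{\partial \tau} f_\$(t, \tau) = -\sigma^2 (\tau - t) + \theta_\tau$,22 or:
$$\theta_\tau = \frac{\partial}{\partial \tau} f_\$(t, \tau) + \sigma^2 (\tau - t) \qquad (12.63)$$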
(0 ) +
1
2
2 2
(12.64)
Moreover, by Eq. (12.62), and Eq. (12.60), the instantaneous forward rate satisfies,
$$d_\tau f(\tau, T) = \sigma^2 (T - \tau) d\tau + \sigma d\tilde{W}_\tau \qquad (12.65)$$
The predictions of this model are the continuous-time counterparts to the original, discrete-time
version of Ho & Lee, introduced in Section 11.6.6 of the previous chapter. In Section 12.7.3,
they will be shown to be a particular case of a more general framework known as HJM.
22 To verify that $\theta_\tau$ is indeed the fitting parameter we are searching for, we replace Eq. (12.63) into Eq. (12.62) and verify
that Eq. (12.62) holds as an identity.
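The check described in footnote 22 can also be carried out numerically. The sketch below is an illustration under assumptions: the evaluation date is set to $t = 0$, the observed forward curve `market_forward` is a hypothetical smooth function we made up, and the helper names are ours. It builds $\theta_\tau$ from Eq. (12.63), prices the zero through Eq. (12.61), and compares the result with the market price implied by the forward curve.

```python
import numpy as np

def trapz(y, x):
    """Plain trapezoidal rule, written out to avoid version-specific NumPy names."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def market_forward(tau):
    """Hypothetical observed forward curve f_$(0, tau)."""
    return 0.03 + 0.02 * (1.0 - np.exp(-0.5 * tau))

def market_forward_slope(tau):
    """Derivative of the hypothetical curve with respect to tau."""
    return 0.01 * np.exp(-0.5 * tau)

def ho_lee_theta(tau, sigma):
    """Eq. (12.63) with t = 0: theta = d f_$ / d tau + sigma^2 * tau."""
    return market_forward_slope(tau) + sigma**2 * tau

def ho_lee_bond_price(T, sigma, n=4001):
    """Eq. (12.61) with t = 0 and r_0 = f_$(0, 0)."""
    s = np.linspace(0.0, T, n)
    r0 = market_forward(0.0)
    A = trapz(ho_lee_theta(s, sigma) * (s - T), s) + sigma**2 * T**3 / 6.0
    return float(np.exp(A - T * r0))

def market_bond_price(T, n=4001):
    """P_$(0, T) = exp(-integral of the observed forward rates)."""
    s = np.linspace(0.0, T, n)
    return float(np.exp(-trapz(market_forward(s), s)))

for T in (1.0, 5.0, 10.0):
    print(T, ho_lee_bond_price(T, sigma=0.01), market_bond_price(T))
```

The two columns should agree up to quadrature error for any value of `sigma`, which is what "perfectly fitting" means here: the volatility parameter remains free to match option prices, while the drift absorbs the initial yield curve.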
(12.66)
1
)=
2
and
(
)=
1
1
(12.67)
(12.68)
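By reiterating the same reasoning produced to show (12.63), one shows that the solution for
$\theta_\tau$ is:
$$\theta_\tau = \frac{\partial}{\partial \tau} f_\$(t, \tau) + \kappa f_\$(t, \tau) + \frac{\sigma^2}{2 \kappa} \left[ 1 - e^{-2 \kappa (\tau - t)} \right] \qquad (12.69)$$
where $\kappa$ denotes the speed of mean reversion of the short-term rate in Eq. (12.66).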
)=
all
(12.70)
underlies the modeling approach started by Heath, Jarrow and Morton (1992) (HJM, henceforth). Given Eq. (12.70), this approach takes as a primitive the stochastic evolution of the entire
structure of forward rates, not only the special case of the short-term rate, $r_t = \lim_{T \downarrow t} f(t, T) \equiv f(t, t)$.
The goal is to start with Eq. (12.70), take the initial observed forward rates $(f(t, \tau))_{\tau \in [t, T]}$
as given, and, then, find the no-arbitrage, cross-equation restrictions on the stochastic behavior of
$(f(\tau, \ell))_{\tau \in (t, T]}$, for any $\ell \in [\tau, T]$.
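By construction, the HJM approach allows for a perfect fit of the initial term-structure. This
point can be illustrated quite simply, as the bond price $P(\tau, T)$ is,
$$P(\tau, T) = e^{-\int_\tau^T f(\tau, \ell) d\ell} = \frac{P(t, T)}{P(t, \tau)} \cdot \frac{P(t, \tau)}{P(t, T)}\, e^{-\int_\tau^T f(\tau, \ell) d\ell} = \frac{P(t, T)}{P(t, \tau)}\, e^{\int_\tau^T f(t, \ell) d\ell - \int_\tau^T f(\tau, \ell) d\ell} = \frac{P(t, T)}{P(t, \tau)}\, e^{-\int_\tau^T \left( f(\tau, \ell) - f(t, \ell) \right) d\ell} \qquad (12.71)$$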
The key points of the HJM methodology are (i) to take the current forward rates $f(t, \ell)$
as given (i.e., equal to those in the market) and, then, (ii) to model the future forward rate
movements, $f(\tau, \ell) - f(t, \ell)$.
Therefore, the HJM methodology takes the current term-structure as perfectly fitted, as we
observe both $P(t, T)$ and $P(t, \tau)$. In contrast, the approach to interest rate modeling in Section
12.4 is to model the current bond price $P(t, T)$ through assumptions regarding the dynamics
of the short-term rate. Fitting the initial term-structure is, instead, critical for market making
purposes, as we explained in the previous section and in Chapter 11.
Finally, note that the bond price representation in Eq. (12.71) leads to a modeling perspective
that is the continuous-time counterpart to that underlying the discrete-time Ho & Lee model
(see Chapter 11). Indeed, below we shall show that the continuous-time version of the Ho &
Lee model (see Eq. (12.65)) is a special case of the HJM framework.
12.7.2 The model
12.7.2.1 Primitives
We assume information is Brownian, such that for any given $T$, the instantaneous forward rate,
$(f(\tau, T))_{\tau \in [t, T]}$, satisfies,
$$d_\tau f(\tau, T) = \alpha(\tau, T) d\tau + \sigma(\tau, T) dW_\tau \qquad (12.72)$$
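The next step is to derive restrictions on $\alpha$ that rule out arbitrage. Let $X_\tau \equiv -\int_\tau^T f(\tau, \ell) d\ell$.
We have
$$dX_\tau = f(\tau, \tau) d\tau - \int_\tau^T \left( d_\tau f(\tau, \ell) \right) d\ell = \left[ r_\tau - \alpha^I(\tau, T) \right] d\tau - \sigma^I(\tau, T) dW_\tau$$
where
$$\alpha^I(\tau, T) \equiv \int_\tau^T \alpha(\tau, \ell) d\ell, \qquad \sigma^I(\tau, T) \equiv \int_\tau^T \sigma(\tau, \ell) d\ell$$
But $P(\tau, T) = e^{X_\tau}$. By Itô's lemma,
$$\frac{dP(\tau, T)}{P(\tau, T)} = \left[ r_\tau - \alpha^I(\tau, T) + \frac{1}{2} \left\| \sigma^I(\tau, T) \right\|^2 \right] d\tau - \sigma^I(\tau, T) dW_\tau$$
$$= \left[ r_\tau - \alpha^I(\tau, T) + \frac{1}{2} \left\| \sigma^I(\tau, T) \right\|^2 + \sigma^I(\tau, T) \lambda_\tau \right] d\tau - \sigma^I(\tau, T) d\tilde{W}_\tau$$
where $\tilde{W}_\tau = W_\tau + \int_t^\tau \lambda_s ds$ is a $Q$-Brownian motion, and $\lambda$ satisfies:
$$\alpha^I(\tau, T) = \frac{1}{2} \left\| \sigma^I(\tau, T) \right\|^2 + \sigma^I(\tau, T) \lambda_\tau \qquad (12.74)$$
Differentiating the previous relation with respect to $T$ gives us the arbitrage restriction that
we were looking for:
$$\alpha(\tau, T) = \sigma(\tau, T) \int_\tau^T \sigma(\tau, \ell)^\top d\ell + \sigma(\tau, T) \lambda_\tau \qquad (12.75)$$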
Z
= 2( ) + ( ) +
2(
where
2(
)=
2(
+
Z
)> + (
{z
2(
2(
) (
)> +
}
(12.76)
+ (
(12.77)
(12.78)
Eq. (12.76) reveals that the short-term rate is in general non-Markov. A special case of Eq.
(12.76) is the Ho and Lee (1986) model, where $\sigma(\tau, T) = \sigma$, a constant, such that, by Eq.
(12.75), $\alpha(\tau, T) = \sigma^2 (T - \tau) + \sigma \lambda_\tau$, consistently with Eq. (12.64). The Hull and White (1990)
model is dealt with in the next section.
Note that regarding the main objective of these models, pricing, we do not need to be
concerned with estimating any risk-premium, i.e., $\lambda(\cdot)$. We only need to consider the risk-neutral dynamics of the left corner in the forward rate surface, that is, those of the short-term
rate $r$. By Eq. (12.77) and Girsanov's theorem, these are given by
Z
Z
+ ( )
= 2( ) +
2( ) +
)
(12.79)
2(
where 2 ( ) is dened in Eq. (12.78), and denotes as usual a Brownian motion under the
risk-neutral probability.
Eq. (12.79) can be easily simulated for the purpose of the evaluation of exotic products.
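As an illustration of such a simulation, consider the simplest case, Ho and Lee, where the forward-rate volatility is a constant. A minimal sketch, under our own assumptions: the evaluation date is $t = 0$, the initial forward curve `f0` is a hypothetical input, and we use the fact that, with constant volatility, the risk-neutral short-term rate implied by the HJM drift is $r_\tau = f(0, \tau) + \frac{1}{2} \sigma^2 \tau^2 + \sigma \tilde{W}_\tau$. The Monte Carlo price of a zero should then reproduce the initial curve.

```python
import numpy as np

def simulate_ho_lee_discount(f0, sigma, T, n_steps=200, n_paths=20000, seed=0):
    """Monte Carlo estimate of E[exp(-int_0^T r)] under the risk-neutral measure.

    f0 is a vectorized function giving the initial forward curve f(0, tau).
    """
    rng = np.random.default_rng(seed)
    tau = np.linspace(0.0, T, n_steps + 1)
    dt = T / n_steps
    dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)
    # Short-term rate on each path: initial curve + HJM drift + Brownian shock.
    r = f0(tau) + 0.5 * sigma**2 * tau**2 + sigma * W
    # Trapezoidal approximation of the integral of r along each path.
    integral = np.sum(0.5 * (r[:, 1:] + r[:, :-1]) * dt, axis=1)
    return float(np.mean(np.exp(-integral)))

# Hypothetical linear forward curve, for which P_$(0, 5) = exp(-0.175).
f0 = lambda tau: 0.03 + 0.002 * tau
print(simulate_ho_lee_discount(f0, sigma=0.01, T=5.0), np.exp(-0.175))
```

The same simulated paths could be reused to average the discounted payoff of an exotic product. For general volatility functions the short-term rate is non-Markov, and one would simulate the whole forward curve on a grid of maturities instead.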
Naturally, HJM models are not distinct from the short-term rate models of Section 12.4. Under
embeddability conditions, HJM can be turned into short-term rate models, a property known
as universality of HJM models.
12.7.4.1 Markovianity
Under which conditions do HJM models predict the short-term rate to be Markov? This question
naturally links to the early literature reviewed in Section 12.4, where the whole yield curve is
driven by a scalar Markov process, the short-term rate. Carverhill (1994) and Ritchken and
Sankarasubramanian (1995) study conditions under which the original state vector can be
enlarged such that the resulting augmented state vector is Markov and at the same time,
includes the short-term rate as a component. The resulting model quite resembles some of the
short-term rate models surveyed in Section 12.4. In these models, the short-term rate is not
Markov, yet it is part of a system that is Markov. We now illustrate these points within the
simple Markov scalar case.
Assume the forward-rate volatility is deterministic and takes the following form:
(
=
=
=
2(
)+
2(
)+
2(
)+
1(
) 2 ( ) all
(12.80)
satises
)=
Z
Z
2(
0
2(
0
2(
2(
)
+
2( )
2(
0
2(
)
(
2 )
1(
2(
1(
+ (
)
)
+ (
)
+ (
Done. This is Markov. Precisely, the condition in Eq. (12.80) ensures the HJM model predicts
the short-term rate is Markov. Mean reversion, then, obtains assuming that 20 ( ) 0 for all
. For example, take to be a constant, and:
1(
)=
2(
)=
R
(
)
(
)
, and the price volatility is
(
) = 1
. This is
such that ( ) =
the Hull-White model discussed in Section 12.4, of which the Ho and Lee model is a particular
case, namely for = 0.
12.7.4.2 Short-term rate reductions
We prove everything in the Markov case. Let the short-term rate be solution to:
= (
+ (
where is a -Brownian motion, and is some risk-neutralized drift function. The rational
bond price function is (
), and the forward rate implied by the model is:
(
ln
)=
1
+
2
) to be consistent with the solution to Eq. (12.73), it must be the case that
) + (
) (
) (
)+
1
(
2
)2
(12.81)
and
(
)= (
(12.82)
In particular, the last condition can only be satisfied if the short-term rate model under consideration is of the perfectly fitting type.
12.7.5 Stochastic string shocks models
The first papers are Kennedy (1994, 1997), Goldstein (2000) and Santa-Clara and Sornette
(2001). Heaney and Cheng (1984) are also very useful to read.
12.7.5.1 Stochastic singularity
Let
)=[
1)
2 )]
1)
2,
2)
=1
and,
(
2)
1)
k (
)k k (
2 )]
)>
(
(
1)
1 )k k (
=1
k (
+ (
)k (
2)
2 )k
(12.83)
)
)
+ (
One drawback of this model is that the correlation matrix of any vector of forward rates
with more components than there are factors in the model is degenerate. Stochastic string models overcome this difficulty by
modeling the correlation structure of forward rates for all pairs of maturities $T_1$ and $T_2$ in an independent way, rather
than implying it from a given factor model (as in Eq. (12.83)). In other terms, within the
HJM methodology, one uses the volatility functions to model both the volatility and the correlation structure
of forward rates. The outcome might not be a good model, in practice. Instead, stochastic string
models have two separate functions with which to model volatility and correlation.
The starting point is a model where the forward rate is solution to,
(
where the string
)=
+ (
) is continuous in
) is continuous in ;
(iii)
(iv)
(v)
)) =
1)
)) = 0;
;
(
2 ))
2)
(say).
Properties (iii), (iv) and (v) make Markovian. The functional form for is crucially important to guarantee this property. Given the previous properties, one can easily derive a key
property of forward rates. We have
p
( (
)) = (
)
(
2)
1)
2 ))
1)
2)
1)
2)
2)
2)
As claimed before, we now have two separate functions with which to model volatility and
correlation.
12.7.5.2 No-arbitrage restrictions
= ( )
( ) =
where as usual,
(
(
)
=
)
=
1
2
(
(
( (
. But
. We have
Z
[ ( )
) = exp (
)]
). Therefore,
)
1
)+
2
))
satises:
Z
(
1)
2)
2)
where T denotes the set of all risks spanned by the string , and
family of unit risk-premia.
In absence of arbitrage,
0 = [ ( )] =
drift
+ drift
+
By exploiting the dynamics of
Z Z
1
(
)=
2
is the corresponding
and ,
(
1)
2)
2)
=
=
)=
) (
) (
) (
) (
with respect to
By di erentiating
(
we obtain,
) (
+ (
(12.84)
1
7 Vol
ln ( )
where $P(\tau)$ is the price of a zero with maturity equal to $\tau$. It is instructive to see what this
volatility looks like, for a concrete model. Consider the Vasicek model in Eq. (12.32). We know
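that the yield on a zero, $R(\tau)$, is affine in the short-term rate, such that
$$\text{Vol}(R(\tau)) = \frac{B(\tau)}{\tau}\, \text{Vol}(r), \qquad B(\tau) = \frac{1 - e^{-\kappa \tau}}{\kappa} \qquad (12.85)$$
where $\text{Vol}(r) = \sigma / \sqrt{2 \kappa}$ is the long-run volatility of the short-term rate. For example, if $\kappa = 0.2$
and $\sigma = 0.03$, then $\text{Vol}(r) \approx 4.7\%$. Given the previous values for $\kappa$ and $\sigma$, Figure 12.5 depicts
the term-structure of volatility, i.e. Eq. (12.85).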
[Figure: term-structure of yield volatility, Vol(R), against maturity (years).]
Figure 12.8 illustrates how the term-structure of volatility decreases over the maturity of the
zero, attaining its maximum at $\text{Vol}(r) \approx 4.7\%$. This is natural, as the yield curve in this model
flattens out, converging towards a constant long-term value, the asymptotic interest rate, as we
say sometimes.
Despite this, the volatility of bond returns can be much higher, as we now illustrate. We need
to figure out the dynamics of the bond price for the Vasicek model. By Itô's lemma,
( )
= ( )
( )
+(
( )
( ))
(12.86)
Compare Eq. (12.86) with Eq. (12.85). The main difference between these two equations is
that the right hand side of Eq. (12.85) is divided by $\tau$, which makes $\text{Vol}(R(\tau))$ decreasing in $\tau$.
(Otherwise, $\text{Vol}(r)$ and $\sigma$ have roughly the same order of magnitude.) The main point is that
the yield $R(\tau)$ is, simply, an average return achieved once a bond is purchased and held until
its expiry. This average return is progressively less volatile as time to maturity gets larger, and
becomes constant, eventually. The return $dP/P$ does, instead, measure the capital gains achieved
while trading the bond. The volatility of these capital gains increases with time to maturity. Even
if $\sigma$ is very small, the bond return volatility in Eq. (12.86) can be quite high. Suppose, for
example, that $\kappa$ is close to zero, in which case $\text{Vol}(dP/P) \approx \sigma \cdot \tau$, which is 15% for a five year zero.
These properties are illustrated by Figure 12.6, which depicts Eq. (12.86) when the parameter
values are $\kappa = 0.2$ and $\sigma = 0.03$.
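The two term-structures of volatility discussed above are easy to tabulate. The sketch below is an illustration under assumptions: we take the Vasicek semi-elasticity to be $B(\tau) = (1 - e^{-\kappa \tau}) / \kappa$, the yield volatility to be $B(\tau) / \tau$ times the long-run volatility $\sigma / \sqrt{2 \kappa}$, and the bond return volatility to be $\sigma B(\tau)$; the function names are ours.

```python
import math

def vasicek_B(tau, kappa):
    """B(tau) = (1 - exp(-kappa*tau)) / kappa, with the limit tau as kappa -> 0."""
    if kappa == 0.0:
        return tau
    return -math.expm1(-kappa * tau) / kappa

def yield_vol(tau, kappa, sigma):
    """Long-run volatility of the yield on a zero with maturity tau (kappa > 0)."""
    return vasicek_B(tau, kappa) / tau * sigma / math.sqrt(2.0 * kappa)

def bond_return_vol(tau, kappa, sigma):
    """Instantaneous volatility of the return dP/P on a zero with maturity tau."""
    return sigma * vasicek_B(tau, kappa)

for tau in (0.25, 1.0, 5.0, 10.0, 30.0):
    print(tau,
          round(yield_vol(tau, 0.2, 0.03), 4),
          round(bond_return_vol(tau, 0.2, 0.03), 4),
          round(bond_return_vol(tau, 0.0, 0.03), 4))
```

The first column of volatilities decreases with maturity while the other two increase, which is the contrast between yields (average returns to expiry) and holding-period returns (capital gains) drawn in the text.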
[Figure: bond return volatility, Vol(dP/P), against maturity (years).]
FIGURE 12.9. The dashed line depicts the bond return volatility, $\text{Vol}(dP/P)$, arising when
the persistence parameter $\kappa = 0$, and the solid line is the bond return volatility for $\kappa = 0.2$.
The high persistence of the short-term rate, as measured by a low value of $\kappa$, makes long
maturity bond returns quite volatile. Intuitively, this high persistence implies that a shock in
the short-term rate has long lasting effects on the future path of the short-term rate. This
makes the short-term rate very volatile in the long run, which makes the value of long maturity
zeros very volatile as a result. Put differently, interest rates exhibit inertia: (i) it takes a number
of shocks to move interest rates away from their equilibrium paths and so, short-term bonds
are not volatile; and (ii) it takes time for interest rates to absorb shocks and so, medium/long-term
bonds are volatile. For example, Figure 12.7 depicts the dynamics of the three month
rate and those of the three months into five years forward swap rate, an interest rate that refers
to relatively higher maturities, as explained in Section 12.8.7 below (see Eq. (12.106)). The
forward swap rate is orders of magnitude more volatile than the short-term rate.
These facts are confirmed by the implicit (not implied) option-based volatility. In Section
12.8.4, Eq. (12.97), we show that this volatility is,
s
(
)
2 (
)1
1
=
Vol
Vol
2 (
)
As $\kappa$ gets small, Vol tends to $\sigma \cdot (S - T)$, which increases with the bond's time to maturity
left at its expiration, $S - T$.
The previous reasoning does, of course, still hold in the more realistic case of a three-factor
model, such as that in Eqs. (12.49). In that case, as explained,
is large and is small:
the short-term rate is quite persistent because it mean-reverts, quickly, to a persistent process,
which we denoted as . Naturally, in such as a three-factor model, Eq. (12.86) does not hold
anymore, as we should add two more volatility components, related to stochastic volatility, ,
and the persistent process . However, the bond return volatility would be boosted by the high
persistence of .
12.8.2 Hypothetical continuous payoffs
Interest rate derivatives could be priced in a very elegant fashion once we assume payo s are
paid out continuously. Let denote the price of any such derivative, and be the instantaneous
payo paid by it, a function of calendar time and . Consider any model of the short-term rate
in Section 12.4, and to simplify, assume that = 1, such that
in Eq. (12.25) carries all
information. By the FTAP, is solution to the following partial di erential equation:
+
1
2
, for all (
R[
(12.87)
0=E
But since
( ))
( ) is known at time ,
E
( )E
(12.88)
=E (
where
( )
(
( )
(12.89)
, as follows,
(12.90)
( ) equals
(12.91)
That is, forward prices are martingales under the forward probability, whence the expression forward martingale probability, which we shall shorten to forward probability to simplify
the presentation. Naturally, and as usual, futures prices are martingales under the risk-neutral
probability, not the forward, as explained in Chapter 10.
The forward probability is a useful tool, which helps pricing interest-rate derivatives, as we
shall explain in detail below. It was introduced by Geman (1989) and Jamshidian (1989), and
further analyzed by Geman, El Karoui and Rochet (1995). The appendix provides additional
details. Appendix 2 relates forward prices to their certainty equivalent, and Appendix 3 deals
with additional technical details. We now rely on this probability to facilitate the calculation
of options on bonds.
12.8.4 European options on bonds
12.8.4.1 A bond option pricing formula
Let $T$ be the expiration date of a European call option on a zero-coupon bond, and $S > T$ the
expiration date of the bond. We consider a simple one-factor model of the short-term rate,
such that the price of a zero is $P(r_\tau, \tau, S)$, with the usual notation. Consider the price of an
option on this bond, maturing at $T$ and with strike equal to $K$. It equals:
$$C^b(t) = E_t\left[ e^{-\int_t^T r_\tau d\tau} \left( P(r_T, T, S) - K \right)^+ \right] \qquad (12.92)$$
Finding a closed-form solution for this price looks formidable. Note indeed that $C^b$
is solution to a partial differential equation, subject to the boundary condition
$C^b(T) = (P(r_T, T, S) - K)^+$, where $P(r_T, T, S)$ is also solution to another partial differential equation. Relying on the forward probability allows simplifying this problem.
Let us, then, elaborate on Eq. (12.92). Note that the main issue we encounter is that the
payoff $(P(r_T, T, S) - K)^+$ depends on $r_T$, and yet the discounting factor, $e^{-\int_t^T r_\tau d\tau}$, would
also obviously depend on the realization of the short-term rate. As discussed in Chapter 4,
this is a general issue arising whilst evaluating fixed-income instruments, because interest rates
are obviously random in this context. These issues are overcome by turning the risk-neutral
expectation in (12.92) into the forward.
=E
=
)E
)E
)I
(I
[ (
)E
)E
(
)
(I
[ (
(12.93)
))+
( (
)+ +
Taking the risk-neutral, discounted expectations of both sides of this equation leaves,
E
=E
=E
23 By
R
R
))
( (
( (
+
)
+
+
)
+
R
(
( )
( ( )
)Iexe = E
( )
Iexe E
( )
=E
( )
Iexe
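where the last equality follows by the same argument leading to Eq. (12.93). Therefore, we have
the put-call parity relation:
$$\text{Put}(t, T; P(S), K) = \text{Call}(t, T; P(S), K) - P(r_t, t, S) + K P(r_t, t, T) \qquad (12.94)$$
where $\text{Put}(t, T; P(S), K)$ is the price of a put on the zero maturing at $S$, struck at $K$ and expiring at time $T$.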
Suppose that the short-term rate is solution to the Vasicek model considered in Section 12.4
(see Eq. (12.32)), such that under the risk-neutral probability,
= (
( (
)
)+
E
=
[ (
[ (
(12.95)
where $Q_F^T$ denotes the $T$-forward probability.
In Appendix 8, we show that the two probabilities in Eq. (12.95) can be evaluated by the
changes of numeraire described in Section 12.8.3, such that the solution for (
) is:
(
(
(
(
(
) =
) =
)
)
)
)
1
2
1
2
[ (
[ (
)]2
[ (
[ (
]=
where
2
( 1)
Z
[ (
[ (
under
)]
)]
(12.96)
)]2
]=
)]2
(
21
under
. Therefore, simple algebra
)
2 (
ln
)
(
1 2
2
)2
(12.97)
2
It is a very elegant formula. It resembles Black & Scholes, although the inputs to the volatility
function link to both the instantaneous volatility, $\sigma$, and the speed of mean-reversion, $\kappa$, of the
short-term rate. We provided the economic interpretation of this dependence in Section 12.8.1.
12.8.4.4 Perfectly fitting extension
Brigo and Mercurio (2006) survey a number of perfectly fitting models that go well beyond
that in the previous section. The simplest relies on the Hull and White (1990) model in Eq.
(12.66) of Section 12.6.3. Note that while, formally, the solution to Eq. (12.92) is the same as
in the previous section, the value of more complex derivatives depends on whether or not we use
a perfectly fitting extension, as Section 12.8.6 explains in further detail.
This section relies on a continuous time model to illustrate a few key properties of callable and
puttable bonds. As explained in Chapter 11 (Section 11.8.1), callable bonds are assets that give
the issuer the right to buy them back at certain times and predetermined prices; puttable bonds,
instead, give the investor the right to sell them back to the issuer at certain times and strikes.
Chapter 11 explains the pricing mechanisms of these assets within a discrete time framework
in which the option exercise is of the American type, relying on binomial trees. This section
only considers European-style options (and zero coupon bonds), but leads to clear predictions
and analytical evaluation formulae. For simplicity, we consider non-defaultable, zero coupon
bonds.
Consider, first, callable bonds maturing at $S$, and let $K$ be the strike at which they can be
called. Suppose that the date of exercise, if any, is some future time $T < S$. Repeating some
of the reasoning in Chapter 11, assume exercise, in which case the issuer can buy its bonds
back at $K$ and re-issue a zero-coupon bond at better market conditions, $P > K$, where $P$ denotes as
usual the price of a non-callable bond. The difference, $P - K$, is just a net gain for the issuer.
Therefore, the callable bond is worth just $K$ when $P > K$. Instead, if $P < K$, the issuer does
not have any incentives to exercise and, then, the value of the callable bond is just that of a
non-callable bond. Therefore, the callable bond is worth $P$ when $P < K$. To sum up, the value
at $T$ of a callable bond is $\min\{K, P(r_T, T, S)\}$. It is easy to see that,
$$\min\{P, K\} = P - \max\{P - K, 0\}$$
Therefore, we see that the price of a callable bond with maturity date $S$ equals the price of
a non-callable bond with the same maturity date $S$, minus the value to call the bond, which
equals the price of a hypothetical option on the non-callable bond, struck at $K$.
We can apply these insights to price a callable bond in a concrete example. Consider, for
example, the short-term rate in the Vasicek model. Then, if the short-term rate is $r_t$ at time $t$,
the value as of time $t$ of the non-defaultable zero coupon bond maturing at time $S$, callable at
time $T$, at a strike price equal to $K$, is,
$$P^{\text{callable}}(r_t, t, T, S) = P(r_t, t, S) - \text{Call}(t, T; P(S), K) \qquad (12.98)$$
where $P(r_t, t, S)$ is the value of the non-callable zero maturing at time $S$, and $\text{Call}(t, T; P(S), K)$
is the value of a call option on the non-callable $S$-zero, maturing at time $T$ and having a strike
price equal to $K$.
Eq. (12.98) shows that the presence of the option to call the bond raises the cost of capital
for the issuer.
In the context of the Vasicek model, the solution to $\text{Call}(t, T; P(S), K)$ in Eq. (12.98) is given by
Jamshidian's (1989) formula in Eq. (12.95), which we now use. Figure 12.8 depicts the
behavior of the price of the callable bond in Eq. (12.98), $P^{\text{callable}}(r, 0, T, S)$, as a function of the
short-term rate, $r$, when the exercise price is $K = 0.65$, the option maturity is $T = 0.5$, and the bond
maturity is $S = 10$. Finally, to evaluate Eq. (12.98), we make use of the closed-form solution in
Eq. (12.36), and the parameter values = 0 2, = 0 06, = 0 03, = 1 7146 10 2 .
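The callable bond price in Eq. (12.98) can be tabulated along the following lines. This is a sketch under assumptions, not the computation behind the figure: we write the risk-neutral Vasicek dynamics as dr = kappa (theta_rn - r) dt + sigma dW, treat 0.06 as the risk-neutral long-run mean `theta_rn` (the text's own parametrization, which also involves a risk-premium parameter, may differ), and use the standard closed forms for the Vasicek zero and for Jamshidian's bond option formula. All function names are ours.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def B(tau, kappa):
    """Vasicek semi-elasticity, (1 - exp(-kappa*tau)) / kappa."""
    return -math.expm1(-kappa * tau) / kappa

def zero_price(r, tau, kappa, theta_rn, sigma):
    """Vasicek zero with time to maturity tau (assumed risk-neutral parametrization)."""
    b = B(tau, kappa)
    a = (theta_rn - sigma**2 / (2.0 * kappa**2)) * (b - tau) - sigma**2 * b**2 / (4.0 * kappa)
    return math.exp(a - b * r)

def zero_call(r, T, S, K, kappa, theta_rn, sigma):
    """Time-0 price of a European call, expiring at T, on the zero maturing at S."""
    p_T = zero_price(r, T, kappa, theta_rn, sigma)
    p_S = zero_price(r, S, kappa, theta_rn, sigma)
    v = sigma * math.sqrt(-math.expm1(-2.0 * kappa * T) / (2.0 * kappa)) * B(S - T, kappa)
    d1 = math.log(p_S / (K * p_T)) / v + 0.5 * v
    return p_S * norm_cdf(d1) - K * p_T * norm_cdf(d1 - v)

def callable_zero(r, T, S, K, kappa, theta_rn, sigma):
    """Eq. (12.98): non-callable zero minus the call held by the issuer."""
    return zero_price(r, S, kappa, theta_rn, sigma) - zero_call(r, T, S, K, kappa, theta_rn, sigma)

PARAMS = dict(kappa=0.2, theta_rn=0.06, sigma=0.03)
for r in (0.0, 0.02, 0.04, 0.06, 0.08, 0.10):
    plain = zero_price(r, 10.0, **PARAMS)
    capped = callable_zero(r, 0.5, 10.0, 0.65, **PARAMS)
    print(f"r={r:.2f}  non-callable={plain:.4f}  callable={capped:.4f}")
```

As the short-term rate falls, the callable price should flatten out below the discounted strike while the non-callable price keeps rising.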
[Figure: bond prices (vertical axis) against the short-term rate (horizontal axis, from 0.00 to 0.10).]
FIGURE 12.11. Negative convexity. Solid line: the price of a callable bond. Dashed line:
the price of a non-callable bond. The price of a callable bond exhibits negative convexity
with respect to the short-term rate.
As Figure 12.8 illustrates, the convexity of the non-callable bond price is destroyed by the
convexity of the price of the option embedded in the callable bond. Intuitively, as the short-term rate lowers, callable and non-callable bond prices increase. However, the price of callable
bonds increases less because as the short-term rate decreases, bond prices increase and, then, the
probability the issuer will exercise the option increases. As a result, the risk-neutral distribution
of the callable bond price becomes markedly shifted towards the strike price, $K = 0.65$, implying
a progressively lower decay rate for the bond price as the short-term rate gets small.
What is the duration of a callable bond? Roughly, a five year bond with fixed coupons issued
when interest rates are relatively high might resemble, so to speak, a three year conventional
bond, as a likely decrease in the interest rates would lead the bond-issuer to redeem its debt at
the strike price. To formalize this intuition, we determine the stochastic duration of the callable
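bond predicted by this model, using Eq. (12.31). For the Vasicek model, the semi-elasticity of
the non-callable bond price with respect to $r$ is $B(\tau) = \frac{1}{\kappa} \left( 1 - e^{-\kappa \tau} \right)$, with $\tau$ denoting time-to-maturity; its inverse with
respect to time-to-maturity is
$$B^{-1}(x) = -\frac{1}{\kappa} \ln(1 - \kappa x)$$
Therefore, the stochastic duration for the callable bond predicted by the Vasicek model is, by
Eq. (12.31):
$$D(r, t) = -\frac{1}{\kappa} \ln\left( 1 + \kappa \frac{P^{\text{callable}}_r(r, t, T, S)}{P^{\text{callable}}(r, t, T, S)} \right)$$
where subscripts denote partial derivatives.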
[In progress]
Next, we proceed with pricing the puttable bond. As explained in the previous chapter,
Section 11.8, the payoff at the expiration of the bondholder's right to tender the bonds is:
$$\max\{P, K\} = P + \max\{K - P, 0\}$$
where $P$ is the price of a non-puttable bond. We can use, again, the Vasicek model to price
the previous payoff. The price at $t$ of a non-defaultable zero-coupon bond maturing at time $S$,
puttable at time $T$, at a strike price equal to $K$, when the short-term rate is $r_t$, is:
$$P^{\text{puttable}}(r_t, t, T, S) = P(r_t, t, S) + \text{Put}(t, T; P(S), K) = \text{Call}(t, T; P(S), K) + K P(r_t, t, T)$$
where $P(r_t, t, S)$ is the value of the non-puttable zero maturing at time $S$; $\text{Put}(t, T; P(S), K)$ is
the value of a put option on the non-puttable zero maturing at $S$, expiring at $T$, struck at
$K$; and the second equality follows by the put-call parity of Eq. (12.94), with $\text{Call}(t, T; P(S), K)$
defined as in Eq. (12.98).
[In progress]
12.8.6 Options on fixed coupon bonds
For simplicity, we shall ignore issues regarding coupon accruals, and assume the expiration date
of these options occurs at any of the reset dates. Therefore, the payoff of an option maturing
at $T_0$ on a fixed coupon bond paying off at dates $T_1, \cdots, T_n$ is:
$$\left( P_{\text{fcb}}(T_0, T_n) - K \right)^+ = \left( P(T_0, T_n) + \sum_{i=1}^n C_i P(T_0, T_i) - K \right)^+ \qquad (12.99)$$
with $C_i$ denoting the coupon paid at $T_i$.
Evaluating the expectation of the payoff in Eq. (12.99) is somewhat problematic: the maximum
between zero and a sum is obviously not in general the same as the sum of the maxima between
zero and each element of the sum. Even with a model in which bond prices are log-normal, the
sum of log-normals is not log-normal. However, this issue can be dealt with through a very well-known
trick, described next.
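Consider any of the models of the short-term rate reviewed in Section 12.4, in which the
price of a zero is some function $P(\tau, T) = P(r_\tau, \tau, T)$. Assume that $P(r, \tau, T)$ is decreasing in
$r$,24 such that (under additional conditions) there exists a unique value of $r$, say $r^*(K)$, which
solves the following equation:
$$P(r^*, T_0, T_n) + \sum_{i=1}^n C_i P(r^*, T_0, T_i) = K \qquad (12.100)$$
The payoff in Eq. (12.99) can then be written as
$$\left( \sum_{i=1}^n \bar{C}_i P(r_{T_0}, T_0, T_i) - K \right)^+ = \left( \sum_{i=1}^n \bar{C}_i \left[ P(r_{T_0}, T_0, T_i) - P(r^*, T_0, T_i) \right] \right)^+$$
where $\bar{C}_i = C_i$, $i = 1, \cdots, n - 1$, and $\bar{C}_n = 1 + C_n$.
Next, note that the terms $P(r_{T_0}, T_0, T_i) - P(r^*(K), T_0, T_i)$ have all the same sign for all $i$.25
Therefore, the payoff in Eq. (12.99) is
$$\left( \sum_{i=1}^n \bar{C}_i \left[ P(r_{T_0}, T_0, T_i) - P(r^*, T_0, T_i) \right] \right)^+ = \sum_{i=1}^n \bar{C}_i \left( P(r_{T_0}, T_0, T_i) - \mathcal{K}_i(r^*) \right)^+, \qquad \mathcal{K}_i(r^*) \equiv P(r^*(K), T_0, T_i) \qquad (12.101)$$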
24 Bond prices are indeed always decreasing in the short-term rate in all one-factor stationary, Markov models of the short-term
rate. However, this is not a general property in multi-factor models (see Mele, 2003).
25 Suppose that $P(r_{T_0}, T_0, T_1) > P(r^*, T_0, T_1)$. Because bond prices are decreasing in $r$, $r_{T_0} < r^*$. Hence $P(r_{T_0}, T_0, T_2) > P(r^*, T_0, T_2)$, etc.
Each term of the sum in Eq. (12.101) can be evaluated as an option on a pure discount bond
with strike price equal to $\mathcal{K}_i(r^*) = P(r^*(K), T_0, T_i)$, where the threshold $r^*(K)$ is found numerically.
The device to reduce the problem of an option on a fixed coupon bond to a problem involving
the sum of options on zero coupon bonds was invented by Jamshidian (1989).
The price of the call on the fixed coupon bond is, therefore
$$\text{Call}(t, T_0; P_{\text{fcb}}(T_n), K) = \sum_{i=1}^n \bar{C}_i\, \text{Call}(t, T_0; P(T_i), \mathcal{K}_i(r^*)) \qquad (12.102)$$
where
Call (
1
ln
K (
) (
0;
0)
+ 12
K ( )
2
)=
q
1
(
2 ( 0
K ( )
1
)
0)
)=
The price of a put can then be determined through the put-call parity in Eq. (12.94).
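The whole procedure, root search for the threshold rate followed by a sum of options on zeros, can be sketched as follows. This is an illustration under assumptions: we use the Vasicek model with risk-neutral dynamics dr = kappa (theta_rn - r) dt + sigma dW, a simple bisection on a bracket we chose by hand, and hypothetical coupon dates and amounts; `payments` already includes the principal in its last entry. The function names are ours.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def B(tau, kappa):
    return -math.expm1(-kappa * tau) / kappa

def zero_price(r, tau, kappa, theta_rn, sigma):
    """Vasicek zero with time to maturity tau (assumed risk-neutral parametrization)."""
    b = B(tau, kappa)
    a = (theta_rn - sigma**2 / (2.0 * kappa**2)) * (b - tau) - sigma**2 * b**2 / (4.0 * kappa)
    return math.exp(a - b * r)

def zero_call(p_T, p_S, K, T, S, kappa, sigma):
    """Call expiring at T on the S-zero, given today's prices of the two zeros."""
    v = sigma * math.sqrt(-math.expm1(-2.0 * kappa * T) / (2.0 * kappa)) * B(S - T, kappa)
    d1 = math.log(p_S / (K * p_T)) / v + 0.5 * v
    return p_S * norm_cdf(d1) - K * p_T * norm_cdf(d1 - v)

def critical_rate(T0, dates, payments, K, kappa, theta_rn, sigma, lo=-1.0, hi=1.0):
    """Bisection for r* such that the coupon bond is worth K at T0."""
    def bond(x):
        return sum(c * zero_price(x, Ti - T0, kappa, theta_rn, sigma)
                   for c, Ti in zip(payments, dates))
    if not (bond(lo) > K > bond(hi)):
        raise ValueError("strike not bracketed on [lo, hi]")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bond(mid) > K:      # bond prices decrease in r, so the root lies above mid
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def coupon_bond_call(r, T0, dates, payments, K, kappa, theta_rn, sigma):
    """Sum of zero-bond calls with the fictitious strikes P(r*, T0, Ti)."""
    r_star = critical_rate(T0, dates, payments, K, kappa, theta_rn, sigma)
    p_T0 = zero_price(r, T0, kappa, theta_rn, sigma)
    total = 0.0
    for c, Ti in zip(payments, dates):
        strike_i = zero_price(r_star, Ti - T0, kappa, theta_rn, sigma)
        p_Ti = zero_price(r, Ti, kappa, theta_rn, sigma)
        total += c * zero_call(p_T0, p_Ti, strike_i, T0, Ti, kappa, sigma)
    return total

PARAMS = dict(kappa=0.2, theta_rn=0.06, sigma=0.03)
print(coupon_bond_call(0.05, 1.0, [2.0, 3.0, 4.0, 5.0, 6.0],
                       [0.05, 0.05, 0.05, 0.05, 1.05], 1.0, **PARAMS))
```

Note that `zero_call` takes today's zero prices as inputs, so a perfectly fitting extension would feed it market prices, while the fictitious strikes would still come from the model through `critical_rate`.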
Why are perfectly fitting models so important, in practice? Suppose that in Eq. (12.100), the
critical value $r^*$ is determined through Vasicek's model. This assumption is attractive because it
leads to evaluate the payoff in Eq. (12.101) through Jamshidian's formula of Section 12.8.4.
However, this way to proceed does not ensure that the yield curve is perfectly fitted.
The natural alternative is to use the corresponding perfectly fitting extension, the Hull and
White model in Section 12.8.4, i.e. Eq. (12.66), and use this price to calibrate $r^*$ in Eq. (12.100).
Note, now, the importance of a perfectly fitting model. As mentioned in Section 12.8.4, both
Jamshidian's formula and its perfectly fitting extension agree regarding the price of an option on a zero.
However, they would assign different values to options
on coupon bearing bonds, because they would lead to different values for $r^*$ in Eq. (12.100)
and, hence, different values for the fictitious strikes $\mathcal{K}_i(r^*)$ in Eq. (12.101).
12.8.7 Interest rate swaps
A Savings and Loan (S&L, henceforth) is an institution that extends mortgage, car and personal
loans to individual members, financed through savings. During the 1980s through the beginning
of the 1990s, these forms of cooperative ventures entered into a deep and persistent crisis, leading
to a painful Government bailout of about $125b under the George H.W. Bush administration.
There are many causes of this crisis, but one of them was certainly the rise in short-term rates
arising as a result of inflation and the attempts at fighting against it, the so-called Monetary
Experiment mentioned in Section 12.4.5. But banking is risky precisely because it involves
lending at horizons longer than those relating to borrowing, and S&L banking was no
exception to such modus operandi. Certainly, interest rate swaps could have helped in coping
with the inversion of the yield curve of the time. We now examine the pricing of this
derivative in detail.
12.8.7.1 Forward rate agreements again
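Interest rate swaps are, in a sense, baskets of forward rate agreements. Consider a forward rate
agreement in which the fixed rate does not clear its value at origination. Denote its value at
origination with $\mathrm{FRA}(t,T,S;K)$, where $S-T$ is the debt-servicing period, and $K$ is the fixed
rate. We have:
$$
\begin{aligned}
\mathrm{FRA}(t,T,S;K) &= \mathbb{E}_t\left[e^{-\int_t^S r(u)du}\,(S-T)\,(L(T,S)-K)\right] \\
&= \mathbb{E}_t\left[e^{-\int_t^T r(u)du}\,P(T,S)\,(S-T)\,(L(T,S)-K)\right] \\
&= \mathbb{E}_t\left[e^{-\int_t^T r(u)du}\,\bigl(1-(1+(S-T)K)\,P(T,S)\bigr)\right] \\
&= P(t,T)-(1+(S-T)K)\,P(t,S) \qquad (12.103)
\end{aligned}
$$
where the third line holds by the definition of $L$ and the fourth follows by Eq. (12.5) given in
Section 12.2.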
Alternatively, note that the LIBOR, $L(T,S)$, while known at $T$, is only paid off at $S$, such
that the value of $1+(S-T)L(T,S)$ (to be delivered at $S$) is simply one at $T$ and, obviously,
$P(t,T)$ at $t$. That is, the value of $(S-T)L(T,S)$ (to be delivered at time $S$) is $P(t,T)-P(t,S)$
at $t$, whence the fourth equality.26
Finally, replace the basic definition of the forward rate in Eq. (12.6) into Eq. (12.103), which
leaves:
$$\mathrm{FRA}(t,T,S;K) = (S-T)\,[F(t,T,S)-K]\,P(t,S) \qquad (12.104)$$
As is clear, FRA can take on any sign, and is exactly zero when $K=F(t,T,S)$, where
$F(t,T,S)$ solves Eq. (12.6). Interest rate swaps are those where payment exchanges repeatedly
occur over a given time horizon known as the swap tenor, as explained below.
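The two expressions for the value of a forward rate agreement in Eqs. (12.103) and (12.104) can be compared on a small numerical example. The Python sketch below is only illustrative: the discount factors `p_T` and `p_S`, the accrual period `delta` and the function names are hypothetical, and the forward rate is taken to be the simply compounded rate satisfying P(t,T)/P(t,S) = 1 + delta*F.

```python
def forward_rate(p_T, p_S, delta):
    """Simply compounded forward rate F, from P(t,T) / P(t,S) = 1 + delta * F."""
    return (p_T / p_S - 1.0) / delta

def fra_value(p_T, p_S, delta, fixed_rate):
    """Value of a FRA paying delta * (LIBOR - fixed_rate) at S: P(t,T) - (1 + delta*K) * P(t,S)."""
    return p_T - (1.0 + delta * fixed_rate) * p_S

# Hypothetical discount factors for T and S = T + 0.5
p_T, p_S, delta = 0.98, 0.965, 0.5
F = forward_rate(p_T, p_S, delta)

value_direct = fra_value(p_T, p_S, delta, 0.02)
value_via_forward = delta * (F - 0.02) * p_S      # the form in Eq. (12.104)
value_at_forward = fra_value(p_T, p_S, delta, F)  # zero by construction

print(F, value_direct, value_via_forward, value_at_forward)
```

Setting the fixed rate equal to the forward rate makes the agreement worthless at origination, which is the sense in which the forward rate "clears" the contract.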
12.8.7.2 Forward starting swaps
An interest rate swap is simply an exchange of interest rate payments. One counterparty exchanges a fixed against a floating interest rate payment. The floating payment is typically a
short-term interest rate. For example, the counterparty receiving a floating interest rate payment has good (or only) access to markets for variable interest rates, but wishes to pay
fixed interest rates. Alternatively, the counterparty receiving a floating interest rate wants to
hedge against changes in short-term rates, as might have been the case for S&L institutions
during the 1980s. The counterparty receiving a floating interest rate payment and paying a
fixed interest rate $K_{irs}$ has a payoff equal to
$$\delta_{i-1}\,[L(T_{i-1})-K_{irs}]$$
at time $T_i$, $i=1,\dots,n$, where $\delta_{i-1}\equiv T_i-T_{i-1}$ as usual.
Each of these payments is really a FRA, and can be evaluated as in the previous section. By
convention, we say that the swap payer is the counterparty who pays the fixed interest rate
$K_{irs}$, and that the swap receiver is the counterparty receiving the fixed interest rate
$K_{irs}$.
With a dedicated interest rate swap of this kind, a S&L institution would have locked-in the yield
curve: specifically, in this stylized example, the payoff for the financial institution would be
$$\delta_{i-1}\,[L^{long}(T_{i-1})-L(T_{i-1})]+\delta_{i-1}\,[L(T_{i-1})-K_{irs}]=\delta_{i-1}\,[L^{long}(T_{i-1})-K_{irs}],$$
where $L^{long}(T_{i-1})$ is the interest rate gained over long-term assets. Naturally, if short-term interest
rates had to go down, relative to $K_{irs}$, a S&L institution would not have benefited from the
increased long-term/short-term spread, $\delta_{i-1}\,[L^{long}(T_{i-1})-L(T_{i-1})]$. But clearly insuring
against yield curve inversions is the thing to do, if yield curve inversions lead to bankruptcy and
bankruptcy is costly. We shall see, below, that other products exist, such as caps or swaptions,
which insure against the upside while at the same time freeing up the downside.
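The value at $t$ of a forward starting interest rate swap payer (i.e. a swap starting at $T_0>t$),
$\mathrm{IRS}(t)$ say, is:
$$\mathrm{IRS}(t)=\sum_{i=1}^{n}\mathbb{E}_t\left[e^{-\int_t^{T_i}r(u)du}\,\delta_{i-1}\,(L(T_{i-1})-K_{irs})\right]=\sum_{i=1}^{n}\mathrm{FRA}(t,T_{i-1},T_i;K_{irs}) \qquad (12.105)$$
The forward swap rate $R_{sw}$ is the value of $K_{irs}$ such that $\mathrm{IRS}(t)=0$. Simple calculations yield:
$$R_{sw}(t)=\frac{\sum_{i=1}^{n}\delta_{i-1}\,F(t,T_{i-1},T_i)\,P(t,T_i)}{\sum_{i=1}^{n}\delta_{i-1}\,P(t,T_i)}=\frac{P(t,T_0)-P(t,T_n)}{\sum_{i=1}^{n}\delta_{i-1}\,P(t,T_i)} \qquad (12.106)$$
Therefore, the value of the swap can be written as:
$$\mathrm{IRS}(t)=\sum_{i=1}^{n}\delta_{i-1}\,P(t,T_i)\,(R_{sw}(t)-K_{irs})=\mathrm{PVBP}_t(T_1,\dots,T_n)\,(R_{sw}(t)-K_{irs}) \qquad (12.107)$$
where $\mathrm{PVBP}_t(T_1,\dots,T_n)\equiv\sum_{i=1}^{n}\delta_{i-1}P(t,T_i)$ is the so-called swap Present Value of the Basis Point, i.e. the
present value impact of one basis point move in the forward swap rate at $t$.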
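The relations among the swap value, the forward swap rate and the PVBP in Eqs. (12.105)-(12.107) lend themselves to a quick numerical check. The sketch below is a minimal illustration under stated assumptions: the discount curve, the accrual periods and all function names are hypothetical, and each FRA is valued as in Eq. (12.103).

```python
def pvbp(deltas, discounts):
    """Present value of the basis point: sum of delta_{i-1} * P(t, T_i)."""
    return sum(d * p for d, p in zip(deltas, discounts))

def forward_swap_rate(p_T0, deltas, discounts):
    """Forward swap rate: (P(t,T_0) - P(t,T_n)) / PVBP."""
    return (p_T0 - discounts[-1]) / pvbp(deltas, discounts)

def payer_swap_value(p_T0, deltas, discounts, fixed_rate):
    """Sum of FRA values, each equal to P(t,T_{i-1}) - (1 + delta*K) * P(t,T_i)."""
    previous = [p_T0] + discounts[:-1]
    return sum(p_prev - (1.0 + d * fixed_rate) * p
               for p_prev, d, p in zip(previous, deltas, discounts))

# Hypothetical curve: P(t,T_0) and P(t,T_i), i = 1..4, with semiannual resets
p_T0 = 0.99
discounts = [0.98, 0.968, 0.955, 0.94]
deltas = [0.5, 0.5, 0.5, 0.5]
fixed_rate = 0.02

r_sw = forward_swap_rate(p_T0, deltas, discounts)
value_as_fras = payer_swap_value(p_T0, deltas, discounts, fixed_rate)
value_as_pvbp = pvbp(deltas, discounts) * (r_sw - fixed_rate)

print(r_sw, value_as_fras, value_as_pvbp)
```

The payer swap has positive value here only because the hypothetical fixed rate lies below the forward swap rate; at a fixed rate equal to `r_sw` the sum of FRA values is zero.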
12.8.7.3 Marking to market
While a forward starting swap is costless at origination, its marking to market updates are
calculated as follows. Suppose to enter at into a forward starting swap originated at time .
27 To cast this problem in terms of continuous time swap exchanges and, then, PDEs, we set
0 as a boundary condition,
irs ( )
and ( ) =
, where plays the same role as irs above. Then, if the bond price ( ) is solution to Eq. (12.87), the following
function, irs ( ) = 1
( )
( ) , does also satisfy Eq. (12.87).
( )
X
E
1( (
(
))
sw
=1
=1
= PVBP (
( (
)(
sw
sw ( ) + sw ( ))
( )
sw
PVBP (
sw
()
( ))
( (
)+
= 1
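Floors are defined in a similar way. They are baskets of single floorlets that pay off
$\delta_{i-1}(K-L(T_{i-1}))^+$ at time $T_i$, $i=1,\dots,n$.
We will only discuss caps. By the FTAP, the value of a cap as of time $t$, $\mathrm{cap}(t)$, is:
$$\mathrm{cap}(t)=\sum_{i=1}^{n}\mathbb{E}_t\left[e^{-\int_t^{T_i}r(u)du}\,\delta_{i-1}\,(L(T_{i-1})-K)^+\right] \qquad (12.108)$$
Analytical solutions to this problem can be found relying on models of the short-term rate.
First, we use the standard definition of simply compounded rates given in Section 12.2 (see Eq.
(12.1)), viz $\delta_{i-1}L(T_{i-1})=\frac{1}{P(T_{i-1},T_i)}-1$, and rewrite the caplet payoff as follows:
$$\delta_{i-1}\,(L(T_{i-1})-K)^+=\frac{1}{P(T_{i-1},T_i)}\,\bigl(1-(1+\delta_{i-1}K)\,P(T_{i-1},T_i)\bigr)^+$$
We have,
$$
\begin{aligned}
\mathrm{cap}(t)&=\sum_{i=1}^{n}\mathbb{E}_t\left[e^{-\int_t^{T_i}r(u)du}\,\frac{1}{P(T_{i-1},T_i)}\,\bigl(1-(1+\delta_{i-1}K)\,P(T_{i-1},T_i)\bigr)^+\right] \\
&=\sum_{i=1}^{n}\frac{1}{\mathbb{K}_i}\,\mathbb{E}_t\left[e^{-\int_t^{T_{i-1}}r(u)du}\,\bigl(\mathbb{K}_i-P(T_{i-1},T_i)\bigr)^+\right],\qquad \mathbb{K}_i=(1+\delta_{i-1}K)^{-1} \qquad (12.109)
\end{aligned}
$$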
where the last equality follows by a simple calculation.28 For the models of Jamshidian or Hull &
White, bond prices are such that the cap price in Eq. (12.109) can be expressed in closed form.
Indeed, Eq. (12.109) makes clear that a cap is a basket of puts on zero coupon bonds, with strikes
$\mathbb{K}_i$. As such, it can be priced in closed form, using the models in Section 12.8. We have:
cap
()=
X 1
Put (
K
=1
1;
) K
(12.110)
where Put(·) satisfies the put-call parity in Eq. (12.94), and, by the pricing formulae in Section
12.8.4,
Call (
ln
(
(
1;
)
1)
+ 12
) K )= (
q
2 (
1
=
2
(
2
1)
)=
(12.111)
Naturally, caps on interest rates, which are nothing but baskets of calls, are portfolios of puts
on zero coupon bonds, due to the inverse relation between prices and interest rates.29
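The equivalence between a caplet and a put on a zero coupon bond rests on a payoff identity that is easy to verify. The following sketch is illustrative only: the function names, the accrual period and the cap rate are hypothetical, and `p` stands for the price at the reset date of the zero maturing at the payment date.

```python
def caplet_value_at_reset(p, delta, cap_rate):
    """delta * (LIBOR - cap_rate)^+ paid at the payment date, discounted to the reset date by p."""
    libor = (1.0 / p - 1.0) / delta          # simply compounded rate implied by p
    return p * delta * max(libor - cap_rate, 0.0)

def scaled_put_payoff(p, delta, cap_rate):
    """(1 / strike) * (strike - p)^+, with bond strike = 1 / (1 + delta * cap_rate)."""
    strike = 1.0 / (1.0 + delta * cap_rate)
    return max(strike - p, 0.0) / strike

delta, cap_rate = 0.5, 0.03
for p in (0.96, 0.975, 0.985, 0.99, 0.999):
    print(p, caplet_value_at_reset(p, delta, cap_rate), scaled_put_payoff(p, delta, cap_rate))
```

The caplet is in the money exactly when the bond price falls below the bond strike, which is the inverse relation between prices and interest rates at work.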
12.8.9 Swaptions
Let us elaborate further on the example of the S&L institution in the previous sections. The
benefit for a S&L institution of buying caps is protection against upward movements in
the short-term rates while ensuring the downside is freed up. These benefits arise, so to speak,
period by period, in that a cap is a basket of options with different maturities. A swaption
works differently, in that the optionality kicks in all at once.
Suppose at time $t$, the S&L institution is still concerned about future inversions of the yield
curve and, therefore, anticipates it might need to go long a payer swap at some future
date. At the same time, the institution might fear that in the future, swap rates will be lower
relative to some reference strike. Swaptions allow one to free up such a downside risk, being options
to enter a swap contract on a future date. Let $T_0$ be the maturity date of this option. Then,
come time $T_0$, the payoff for a payer swaption is the maximum between zero and the value of
a payer interest rate swap at $T_0$, $\mathrm{IRS}(T_0)$, viz
!+
!+
X
X
FRA ( 0
; irs )
=
)
( 0 )
( irs ( 0 ))+ =
1
1( ( 0
1
irs )
=1
=1
(12.112)
28 By
[1
)]+
=E
=E
E
=E
( )
(1
(1
(1
K
(
))+ F
))+ F
))+ F
=E
(1
))+
29 We might also price caps and oors through the partial di erential equation (12.87), after setting
( )=(
)+ (caps) and
( )=(
)+ (oors), for some strike . However, this type of contracts, where payo s are paid continuously in time, is highly
stylized, and does not exist in the markets.
X
0
swpn ( ) = E
1( ( 0
1
= E
"
irs )
=1
irs
!+ #
!+ #
(12.113)
=1
( ) = Put (
0;
fcb
) 1 )
where Put(·) satisfies the put-call parity in Eq. (12.94). By the pricing formulae in Section
12.8.4,
Call (
0;
fcb
) 1 )=
irs
1 Call (
0;
)+Call (
0;
=1
where Call ( 0 ; (
)
to Eq. (12.100) for = 1.
), and
solution
1+
( ) =
( )
( )
= 0
, is solution
(12.114)
(
(
)
+1 )
= ln (1 +
(12.115)
The logic we follow, now, is the same as that underlying the HJM representation of Section
12.7. We wish to express the volatility of bond prices in terms of the volatility of forward rates.
To achieve this task, we first assume that bond prices are driven by Brownian motions and
expand the L.H.S. of Eq. (12.115) (step 1). Then, we expand the R.H.S. of Eq. (12.115) (step
2). Finally, we identify the two diffusion terms derived from the previous two steps (step 3).
Step 1: Let
is solution
to:
=
)=
(12.116)
(
(
)
+1 )
1
k
2
k2
31 Brace,
2
+1 k
+(
+1 )
(12.117)
Gatarek and Musiela (1997) formulate a model in terms of the spot simply-compounded LIBOR interest rates. Because
( ) = ( ), the two derivations are essentially the same.
32 It is well-known that lognormal instantaneous forward rates are problematic as they imply the money market account is illbehaved. Sandmann and Sondermann (1997) provide a succinct overview on how this problem is handled with simply-compounded
forward rates.
ln (1 +
) =
=
1+
(
)2
!
1 2 2 k k2
2 (1 +
)2
1+
)2
+
1+
(12.118)
Step 3: By Eq. (12.115), the di usion terms in Eqs. (12.117) and (12.118) have to be the same.
Therefore,
( )
+1 (
)=
( )
[
]
1+
By summing over , we get the following no-arbitrage restriction applying to the volatility
of the bond prices:
1
X
( )
( )
(12.119)
0( ) =
1+
=0
Eq. (12.119) is, thus, a restriction to the general HJM framework. In other words, assume
the instantaneous forward rates are as in Eq. (12.72) of Section 12.7. As shown in Section 12.7,
the bond price volatility is, then, given by Eq. (12.116). But if we also assume that simply-compounded forward rates are solution to Eq. (12.114), then, the bond price volatility needs
to equal that in Eq. (12.119). Comparing Eq. (12.116) with Eq. (12.119) produces,
Z
1
X
( ) =
( )
1+
0
=0
The practical interest in restricting the forward-rate volatility dynamics in this way lies in the
possibility of obtaining closed-form solutions for some of the interest rate derivatives surveyed in
Section 12.8.
12.9.3 Applications to derivative evaluation
12.9.3.1 Forward rates as martingales
Forward rates are martingales under the forward probability. Regarding the continuously compounded notion, we have, by the definition of the instantaneous forward rate, the usual change
of probability and standard regularity conditions,
(
ln
)=
=
=E
=E (
)
)
(
( )
=E
( )
=E
( (
))
)
!
=E
( (
+1
+1 ))
=E
+1
( (
+1 ))
( (
)
(
))
(
):0=E
!
(
)=E
( (
=E
(12.120)
+1 ).
To show Eq.
) satises,
))
(12.121)
)=
=1
=1
)E
=1
( (
( (
( (
)+
)+
where E
(12.122)
ability
; the first equality is Eq. (12.108); and the second equality follows by a change of
probability, from the risk-neutral to the forward.
A key point explained above is that
) is a martingale under
for
1
1(
1
[
1 ] (see Eq. (12.121), such that by Eq. (12.114),
1 is solution to:
1
( )
( )
= 1
1]
under
. Therefore, the cap price in Eq. (12.122) reduces to Black's (1976) formula
discussed in Chapter 10 (see Section 10.4.4 and Appendix 2 to Chapter 10), once we assume
is deterministic:
E
( (
)+ =
where
1
ln
1 2
2
697
1)
1(
(12.123)
+
irs ( 0 )) = PVBP 0 (
)(
+
PVBP 0 (
irs )
sw ( 0 )
0
(
)
=
E
PVBP 0 ( 1
swpn
= PVBP (
where E
by:
)E
sw
)(
is:
)=
+
irs )
sw ( 0 )
=1
+
irs )
sw ( 0 )
(12.124)
denotes the expectation taken under the so-called forward swap probability, defined
R 0
PVBP 0 ( 1
)
sw
=
PVBP ( 1
)
F 0
sw
It is easy to see that E
= 1, by using the denition of PVBP 0 ( 1
), the
F 0
0
)=E
( 0 ) , as anticipated in Chapter 4. As also menpricing equation, (
tioned in Chapter 4, swap is also sometimes referred to as annuity probability.
The key point underlying this change of probability is that the forward swap rate swap is a
33
and clearly, positive. Therefore, it must satisfy:
swap -martingale,
swap
( )
=
sw ( )
sw
sw
sw
( )
0]
(12.125)
sw
( );
ln
) =
irs
sw ( ) + 1
2
irs
sw
12.9.3.4 Inconsistencies
()
R
( )
2
sw
irs
( )
swap
swap (
)] = E
swap
(
0)
PVBP (
)
=E
)
( )
( (
PVBP (
698
0)
(
)
))
( 0)
PVBP (
)
=
)
swap (
market models. A couple of tricks seem to work in practice. The best known is based on
a suggestion by Rebonato (1998), to replace the true pricing problem with an approximating
pricing problem where the volatility of the forward swap rate is deterministic. That works in practice, but in a world with stochastic volatility, we should expect that trick to generate unstable results in periods experiencing
highly volatile volatility. See, also, Rebonato (1999) for an essay on related issues. The next
section suggests using numerical approximations based on Monte Carlo techniques.
12.9.3.5 Numerical approximations
Suppose forward rates are lognormal. Then, we can price caps using Black's formula and we
proceed to price swaptions relying on the general HJM framework, as summarized by its restrictions in Eq. (12.119), and Monte Carlo integration, as follows. By a change of probability,
"
!+ #
X
0
)
) ( 0 )
swpn ( ) = E
1( ( 0
1
=1
0 )E
( (
) (
!+
=1
where ( 0
), = 1 , can be simulated under 0 .
1
Details are as follows. We know that
(
) is solution to,
1
1
1
1(
(12.127)
( )
1
X
1+
=0
0(
))
( )
where the second line follows from Eq. (12.119). Replacing this into Eq. (12.127) leaves:
1
1
1
X
=0
1+
( )
1(
= 1
These can easily be simulated with the methods described in any standard textbook on the
numerical solution of stochastic differential equations, such as that of Kloeden and Platen (1992).
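As a pointer to what such a simulation looks like, the sketch below applies an Euler scheme to the logarithm of a single lognormal forward rate under its own forward probability, where the rate is a driftless martingale. It is a simplified illustration and not the drift-corrected multi-rate system of this section: the volatility function `gamma`, the seed and all numerical values are hypothetical.

```python
import random
from math import exp, sqrt

def gamma(t):
    """Hypothetical deterministic volatility of the forward rate."""
    return 0.20 + 0.10 * t

def simulate_forward(f0, horizon, n_steps, rng):
    """Euler scheme on ln F: d ln F = -0.5 * gamma(t)^2 dt + gamma(t) dW."""
    dt = horizon / n_steps
    log_change = 0.0
    for k in range(n_steps):
        g = gamma(k * dt)
        log_change += -0.5 * g * g * dt + g * sqrt(dt) * rng.gauss(0.0, 1.0)
    return f0 * exp(log_change)

rng = random.Random(12345)   # fixed seed so that the run is reproducible
f0, horizon = 0.03, 1.0
paths = [simulate_forward(f0, horizon, 20, rng) for _ in range(20000)]
mc_mean = sum(paths) / len(paths)

# Under the forward probability the simulated mean should stay close to f0
print(f0, mc_mean)
```

Discretizing the logarithm rather than the level keeps every simulated rate positive, and with deterministic volatility each step preserves the martingale property exactly, so the only error in the mean is Monte Carlo noise.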
Market quoting conventions rely on volatility surfaces stemming from the market models of the previous section, rather than from the models in Sections 12.8.7-12.8.9. The models in Sections
12.8.7-12.8.9 could actually be exploited to produce volatility surfaces, albeit indirectly, after
calibration of the two model parameters, as Eq. (12.110) indicates. However, it is easier to
provide volatility surfaces in the first place, through the models of this section. Quite simply,
practitioners use Eq. (12.123) and quote volatilities such that the market price of a cap equals
the value predicted by Eq. (12.123) using the desired implied volatility. In Eq. (12.123),
p
=
()
1
for some
$
cap
(; )=
=1
where
$
cap
Black76 ( 1 ; ) =
1
1
1
Given
0=
) Black76 (
ln
+ 12 2
=1
) [Black76 (
Black76 (
= 1
)]
where
is the latest available maturity, and =
( ). The values of ( ) amount
1
to what is typically referred to as the term structure of caps volatilities.
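The quoting convention can be illustrated for a single caplet. The sketch below is a minimal example under stated assumptions: `black76_call` is the undiscounted Black (1976) call on a forward, a caplet is that value times the accrual period and the discount factor to the payment date, and the implied volatility is recovered by bisection. Function names and numerical inputs are hypothetical.

```python
from math import erf, log, sqrt

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black76_call(forward, strike, vol, tau):
    """Undiscounted Black (1976) call, with total volatility vol * sqrt(tau)."""
    s = vol * sqrt(tau)
    d1 = (log(forward / strike) + 0.5 * s * s) / s
    return forward * norm_cdf(d1) - strike * norm_cdf(d1 - s)

def caplet_price(p_pay, delta, forward, strike, vol, tau):
    """Caplet value: accrual period * discount factor * Black (1976) call."""
    return delta * p_pay * black76_call(forward, strike, vol, tau)

def implied_vol(price, p_pay, delta, forward, strike, tau, lo=1e-6, hi=5.0):
    """Bisection on the volatility; the caplet price is increasing in vol."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if caplet_price(p_pay, delta, forward, strike, mid, tau) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical caplet: forward 3%, cap rate 3.5%, reset in 2 years
p_pay, delta, forward, strike, tau = 0.95, 0.5, 0.03, 0.035, 2.0
quoted_price = caplet_price(p_pay, delta, forward, strike, 0.25, tau)
recovered_vol = implied_vol(quoted_price, p_pay, delta, forward, strike, tau)
print(quoted_price, recovered_vol)
```

A term structure of cap volatilities would be obtained by repeating the inversion maturity by maturity on cap prices, which this single-caplet sketch does not attempt.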
12.10.1.2 Swaptions
As for swaptions, the situation is much simpler. The market practice is to quote swaptions
through standard implied vols, i.e. those vols IV such that, once inserted into Eq. (12.126),
they deliver the swaption market price:
swpn (
) = PVBP (
) Black76 (
sw
( );
irs
IV )
1
1 00loc 12 ( + )
(
1
) = loc 2 ( + ) 1 +
)2 +
iv (
24 loc 2 ( + )
where the omitted terms are likely to be numerically negligible for practical purposes.
What are the dynamics of the market smile implied by this local volatility model? To illustrate, consider what happens to the rst term in the previous expansion, say iv (
), when
the forward increases from
to
+
,
1
+
)
+
+
) = iv ( +
)
iv (
loc 2 (
In other words, provided $\sigma_{loc}$ is decreasing, the local volatility model predicts that as the
forward $R_n$ increases, the skew moves to the left, which might contradict market behavior. For
example, let us assume the local volatility function is $\sigma_{loc}(R_n)=0.04\,R_n^{-1/2}$. The left panel
of Figure 12.12 plots the implied volatility $\sigma_{iv}(K,R_n)$ for $R_n=3\%$ (solid line) and $R_n=4\%$
(dashed line).
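The leftward shift of the skew can be reproduced with the leading term of the expansion, which evaluates the local volatility at the midpoint between forward and strike. The sketch below is illustrative only: the functional form of `sigma_loc` (a coefficient of 0.04 with a negative one-half exponent) is an assumption made to obtain a decreasing function, and the function names are hypothetical.

```python
def sigma_loc(x):
    """Assumed decreasing local volatility function (hypothetical form)."""
    return 0.04 * x ** -0.5

def iv_first_order(strike, forward):
    """Leading term of the expansion: local volatility at the forward/strike midpoint."""
    return sigma_loc(0.5 * (forward + strike))

shift = 0.01
for strike in (0.02, 0.03, 0.04, 0.05):
    before = iv_first_order(strike, 0.03)
    after = iv_first_order(strike, 0.03 + shift)
    shifted_strike = iv_first_order(strike + shift, 0.03)
    # 'after' equals the old curve read at strike + shift: the curve has moved left
    print(strike, before, after, shifted_strike)
```

Because the approximation depends on forward and strike only through their sum, raising the forward by a given amount relabels the old curve at a strike higher by the same amount, so at any fixed strike the implied volatility falls when the local volatility is decreasing.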
[Figure 12.12: two panels plotting implied volatility, in %, against the strike, in %. Left panel: local volatility model, for forward rates of 3% and 4%. Right panel: SABR model, for forward rates of 3%, 4% and 5%.]
FIGURE 12.12. The left panel depicts the approximated implied volatility iv (
)
1
2
predicted by a local volatility model with loc ( ) = 0 04
.Solid and dashed lines
equal to 3% and 4%, respectively. The right panel depicts the
correspond to values of
approximated implied volatility iv (
; ) in Eq. (12.129) predicted by the SABR
model in Eq. (12.128), with
= 0 02, = 0 5, = 0 5, = 0 5, and = 1.Solid,
dashed and dotted lines correspond to values of
equal to 3%, 4% and 5%, respectively.
Hagan, Kumar, Lesniewski and Woodward (2002) (HKLW in the sequel) consider a richer model,
which they call SABR, for "Stochastic alpha, beta, rho", in which the forward
satisfies,
(
=
1
p
(12.128)
2
=
+
1
1
2
where
are two standard Brownian motions under the market probability, , and are
constants, and
is interpreted as the initial condition for the unobserved stochastic volatility
component of the forward. Note that the model allows the forward and its volatility to be
conditionally correlated with instantaneous correlation equal to $\rho$.
HKLW show that the implied volatility predicted by this model is,
2
(1 )2
2 3 2 2
1
1+
+ 4(
+ 24
+
24 (
)1
)(1 ) 2
; )=
iv (
2
4
( )
(
)(1 ) 2
1 + (1 24 ) ln2
+ (11920) ln4
+
(12.129)
where,
p
1 2 + 2+
(1 ) 2
(
)
ln
( ) ln
1
The right panel of Figure 12.12 depicts the approximated implied volatility predicted by the
SABR model obtained with hypothetical parameter values. The model can fix the counterfactual
behavior of the skew predicted by a local vol model: as the forward increases, the implied volatility
shifts to the right while at the same time generating a downward-sloping backbone, defined
as the curve traced by the at-the-money volatility as the forward varies.
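The behavior of the backbone and of the skew can be explored with a direct implementation of the HKLW approximation. The sketch below follows the lognormal implied volatility formula as published in Hagan et al. (2002); the parameter names (`alpha` for the initial volatility, `beta`, `rho`, `nu` for the volatility of volatility, `T` for the expiry) are the conventional ones and are assumptions here, as are the numerical values, chosen with a negative correlation.

```python
from math import log, sqrt

def sabr_implied_vol(K, f, T, alpha, beta, rho, nu):
    """Hagan et al. (2002) lognormal implied volatility approximation."""
    one_b = 1.0 - beta
    fk = (f * K) ** (0.5 * one_b)            # (f K)^((1 - beta) / 2)
    log_fk = log(f / K)
    correction = 1.0 + (one_b**2 / 24.0 * alpha**2 / fk**2
                        + 0.25 * rho * beta * nu * alpha / fk
                        + (2.0 - 3.0 * rho**2) / 24.0 * nu**2) * T
    if abs(log_fk) < 1e-12:                  # at the money: z / x(z) tends to one
        return alpha / fk * correction
    z = nu / alpha * fk * log_fk
    x = log((sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
    denom = fk * (1.0 + one_b**2 / 24.0 * log_fk**2 + one_b**4 / 1920.0 * log_fk**4)
    return alpha / denom * (z / x) * correction

# Hypothetical parameter values
params = dict(T=1.0, alpha=0.02, beta=0.5, rho=-0.5, nu=0.5)

backbone = [sabr_implied_vol(f, f, **params) for f in (0.03, 0.04, 0.05)]
smile = [sabr_implied_vol(K, 0.03, **params) for K in (0.02, 0.03, 0.04)]
print(backbone)
print(smile)
```

With `beta` below one the at-the-money values in `backbone` decline as the forward rises, and with negative `rho` the low-strike end of `smile` lies above the high-strike end.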
The reason for a downward-sloping backbone is the coefficient $\beta<1$. HKLW also show the
origins of the skew (i.e., the asymmetric smile) predicted by their model, due to (i) a coefficient
$\beta<1$, which makes the instantaneous volatility in Eq. (12.128) decreasing in the forward, and
(ii) a correlation $\rho<0$, which makes the transition density of the log-changes in the forward skewed towards the
left, as in classical explanations of Heston (1993) given in the equity case (see Chapter 10).
Finally, the volatility of volatility parameter helps determine the curvature of the skew. The
implied volatility shifts up as the initial volatility increases: option prices increase with volatility in this model
(see Chapter 10), and so does implied volatility.
The SABR model is widely used in market practice, especially for modeling the swaption skew. Note, however, that the model does not allow for a perfect matching of all available
swaption prices, which by construction the local volatility model can, at least theoretically. Finally,
the comparative statics exercise in Figure 12.12 regards a change in the forward. Because the forward is correlated with volatility, an alternative comparative statics exercise is one in which both the forward
and volatility change in accordance with their assumed correlation (see Bartlett, 2006).
Figure 12.13 shows that in this case with negative correlation, an increase in the forward accompanied by a decrease in the volatility (consistent with the negative correlation) implies the
skew shifts toward the left although the backbone (defined above) is still downward sloping.
[Figure 12.13: two panels plotting implied volatility, in %, against the strike, in %. Left panel legend: forward rates of 3% and 4%. Right panel legend: forward rates of 3%, 3.5% and 4%.]
FIGURE 12.13. The left panel is the same as the left panel in Figure 12.12. The right panel
depicts the approximated implied volatility in Eq. (12.129) predicted by
the SABR model in Eq. (12.128), obtained with the parameter values used in Figure
12.12. Solid, dashed and dotted lines correspond to values of the forward equal to 3%, 3.5%
and 4%, respectively. The values of the initial volatility in correspondence of the three values of the forward are
0.02 and, then, two increments obtained consistently with the negative correlation
in Eq. (12.128),
=
.
) satisfying:
= 1
(12A.1)
where
is a Brownian motion in R , and
and
(
is vector-valued) are such that there exists
a strong solution to the previous system. The value of a self-nancing portfolio in these
bonds and
a money market account satises:
1 )+
+ >
= >(
where
is some portfolio, 1
is a
=[
2]
>
=[
2]
>
Now suppose there exists a portfolio such that > = 0. This is an arbitrage opportunity if there
1 6= 0. (Use as usual, when
1
0, and
when
exist events for which at some time,
1
0: the drift of
will then be appreciating at a deterministic rate that is strictly greater
than .) Therefore, arbitrage is ruled out if:
>
) = 0 whenever
>
=0
is orthogonal to
In other terms, there is no arbitrage as soon as every vector in the null space of
1 , or when there exists a in R and satisfying a few integrability conditions, and such that
1 =
, or
=
= 1
(12A.2)
In this case,
=( +
= 1
R
R
R
>
1
= exp(
k k2 ). It is easy to show, now, that
Next, dene =
+
,
2
by Girsanovs theorem the discounted bond price is a martingale under . Indeed, dene for a generic
)
, and:
, (
)
(
(
( )
, under
( )
) = E ( ( )) = E [
(
)] = E
| {z }
=1
or
(
)=
=E
, all
2
2
In other terms, the Sharpe ratio of any two bonds must be identical. Eq. (12A.2) is used several
times in this chapter. In Section 12.4, the market primitive is the short-term rate, solution of a
multidimensional diffusion process, and the bond price drift and volatility are derived via Itô's lemma. In Section 12.7,
the bond price drift and volatility are restricted by a model for the forward rates.
(
)
=E
is known at
(
=E
=E
or
=E (
( )
(12A.3)
( ) as:
( )=
E
( )
E
[
], and in particular, ( ) = 1.
Therefore, E[
F ] = E[ ( )| F ] = ( ) all
We demonstrate these claims under a slightly di erent angle. Let us consider the price dynamics of
),
a zero-coupon bond in Eq. (12A.1), (
)
(
=
and
where we have dened
Under the risk-neutral probability
R
where =
+
By Itos lemma,
is a
))
.
,
-Brownian motion.
( )
=
( )
( ) = 1.
1
2
k (
)k
))
Under the usual integrability conditions, we can use Girsanov's theorem and conclude that
Z
+
(
)>
(12A.4)
.
is a Brownian motion under the -forward probability
Assuming for example that the driving state variable is the short-term rate, we have that the drift
of the same short-term rate is lower and that of the bond price is higher, due to negative
under
bond price volatility,
0.
Finally, note that for all and non-decreasing sequences of dates { } =0 1 ,
Z
(
= 0 1
= +
)>
Therefore,
1
)>
>
1)
707
= 1 2
(12A.5)
1 ))
>
1
s.t.
where
lead to,
1)
>
1
1,
=1
=0
where is a Lagrange multiplier. The previous condition tells us that must be one eigenvalue of
( 1) =
the matrix , and that 1 must be the corresponding eigenvector. Moreover, we have
>
=
which
is
clearly
maximized
by
the
largest
eigenvalue.
Suppose
that
the
eigenvalues
of
1
1
. Then,
are distinct, and let us arrange them in descending order, i.e. 1
(
1)
>
2 ))
>
2
s.t.
= 1 and
>
2
=0
where
( 2 ) = 2> 2 . The rst constraint, 2> 2 = 1, is the usual identication constraint. The
second constraint, 2> 1 = 0, is needed to ensure that 1 and 2 are orthogonal, i.e. ( 1 2 ) = 0.
The rst order conditions for this problem are,
0=
where is the Lagrange multiplier associated with the rst constraint, and is the Lagrange multiplier
associated with the second constraint. By pre-multiplying the rst order conditions by 1> ,
0=
>
1
where we have used the two constraints 1> 2 = 0 and 1> 1 = 1. Post-multiplying the previous
>
>
expression by 1> , one obtains, 0 = 1> 2 1>
1 =
1 , where the last equality follows by
>
= 0. So the rst order conditions can be rewritten as,
1 2 = 0. Hence,
(
=0
P
( ) = Tr ( ) = Tr
=1
Hence, Eq. (12.23) follows.
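The two properties used in this derivation, that the first principal component is associated with the largest eigenvalue and that the eigenvalues sum to the trace of the covariance matrix, can be verified numerically. The sketch below is a minimal illustration using power iteration with deflation in plain Python; the covariance matrix `cov` and all function names are hypothetical, and the method assumes distinct positive eigenvalues.

```python
from math import sqrt

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_eigenpair(m, n_iter=2000):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    v = [1.0 / (i + 1) for i in range(len(m))]   # generic starting vector
    for _ in range(n_iter):
        w = mat_vec(m, v)
        norm = sqrt(dot(w, w))
        v = [x / norm for x in w]
    return dot(v, mat_vec(m, v)), v

def deflate(m, lam, v):
    """Remove the component lam * v v' so that the next eigenpair becomes dominant."""
    n = len(m)
    return [[m[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]

def principal_components(m):
    pairs = []
    for _ in range(len(m)):
        lam, v = top_eigenpair(m)
        pairs.append((lam, v))
        m = deflate(m, lam, v)
    return pairs

# Hypothetical covariance matrix of three yield changes
cov = [[2.0, 1.0, 0.0],
       [1.0, 2.0, 1.0],
       [0.0, 1.0, 2.0]]

pairs = principal_components(cov)
eigenvalues = [lam for lam, _ in pairs]
trace = sum(cov[i][i] for i in range(len(cov)))
print(eigenvalues, sum(eigenvalues), trace)
```

Each eigenvalue is the variance of the corresponding principal component, so the ratio of an eigenvalue to the trace is the share of total variance that the component explains.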
708
>
>
= Tr
= Tr ( )
>
12.15 Appendix 5: A few analytical details regarding the Hull and White
model
As in the Ho and Lee model, the instantaneous forward rate (
) predicted by the Hull and White
model is as in Eq. (12.62), where functions 2 and 2 can be easily computed from Eqs. (12.67) and
(12.68) as:
2(
)=
2(
2(
2(
)=
2
(
)
(
)
(
)
(
=
(
)
+
+
+
1
$
2
2
(
)
(
)
(
=
1
1
(
)
+
+
(
)
+
$
$
2 2
which reduces to Eq. (12.69) after using simple algebra.
and
() =
(12A.6)
where and are constants. We derive the dynamics of , compare them with , and formulate some
basic claims regarding the expectation theory. We have:
Z
= ( )+
( ) + (
)
where
(
)= (
Hence,
+ (
1
2
)+
)+
Finally,
= (
and since
|F ) =
1
2
)+
)+
)+ (
)+
,
(
|F ) = (
)+
1
2
and
)=
exp(
)) and
() =
2 (
where
(
)=
Finally,
(
|F ) = (
)+
= (
)+
|F )
) for any .
710
= 0, it always holds
We now embed the Ho and Lee model in Section 12.6.2 in the HJM format. In the Ho and Lee model,
=
where is a
where
2(
)=
1 2
(
2
) and
) =
)
2(
)=
2(
2(
2(
12 (
)+
=
)+
12 (
) +
2(
Next, we embed the Vasicek model in Section 12.6 into the HJM format. The Vasicek model is:
=(
where is a
) . By Eqs. (12.81),
2(
where
= 1 1
)=
2(
2(
)=
2(
)=
)+
12 (
)+
2(
2(
)
) ,
2(
12 (
and
;
2
) =
) +(
2(
)=
Naturally, this model can never be embedded within a HJM model because it is not of the perfectly
fitting type. In practice, condition (12.82) can never hold in the simple Vasicek model. However, the
model is embeddable once its drift parameter is turned into an infinite dimensional parameter
à la Hull and White (see
Section 12.4).
)=
2)
1
2
R
(
(
1)
2)
2)
+
(
712
), where
2)
2)
)+
)
))
{
2
(
(
(
(
)
=
)
)
),
+(
under
) under
(12A.7)
as well as under
. We aim to
as well as under
)
=
)
)
=
)
),
)
) (
[ (
)]
(12A.8)
)
into Eq. (12A.8),
)
=
)
)
=
)
(
(
(
(
(
)
(
)
)
=
)
)
=
)
1
2
1
2
[ (
[ (
713
)]2
)]2
[ (
[ (
)]
)]
References
Aït-Sahalia, Y. (1996): Testing Continuous-Time Models of the Spot Interest Rate. Review
of Financial Studies 9, 385-426.
Ahn, C.-M. and H.E. Thompson (1988): Jump-Diffusion Processes and the Term Structure
of Interest Rates. Journal of Finance 43, 155-174.
Ang, A. and M. Piazzesi (2003): A No-Arbitrage Vector Autoregression of Term Structure
Dynamics with Macroeconomic and Latent Variables. Journal of Monetary Economics
50, 745-787.
Balduzzi, P., S. R. Das, S. Foresi and R. K. Sundaram (1996): A Simple Approach to Three
Factor Affine Term Structure Models. Journal of Fixed Income 6, 43-53.
Bartlett, B. (2006): Hedging Under SABR Model. Wilmott Magazine July/August, 68-70.
Black, F. (1976): The Pricing of Commodity Contracts. Journal of Financial Economics 3,
167-179.
Black, F. and M. Scholes (1973): The Pricing of Options and Corporate Liabilities. Journal
of Political Economy 81, 637-659.
Brace, A., D. Gatarek and M. Musiela (1997): The Market Model of Interest Rate Dynamics.
Mathematical Finance 7, 127-155.
Brigo, D. and F. Mercurio (2006): Interest Rate Models - Theory and Practice, with Smile,
Inflation and Credit. Springer Verlag Finance (2nd Edition).
Brunnermeier, M. (2009): Deciphering the Liquidity and Credit Crunch 2007-08. Journal of
Economic Perspectives 23, 77-100.
Carverhill, A. (1994): When is the Short-Rate Markovian? Mathematical Finance 4, 305-312.
Cochrane, J. H. and M. Piazzesi (2005): Bond Risk Premia. American Economic Review 95,
138-160.
Collin-Dufresne, P. and R. S. Goldstein (2002): Do Bonds Span the Fixed-Income Markets?
Theory and Evidence for Unspanned Stochastic Volatility. Journal of Finance 57, 1685-1729.
Conley, T. G., L. P. Hansen, E. G. J. Luttmer and J. A. Scheinkman (1997): Short-Term
Interest Rates as Subordinated Diffusions. Review of Financial Studies 10, 525-577.
Cox, J. C., J. E. Ingersoll and S. A. Ross (1979): Duration and the Measurement of Basis
Risk. Journal of Business 52, 51-61.
Cox, J. C., J. E. Ingersoll and S. A. Ross (1985): A Theory of the Term Structure of Interest
Rates. Econometrica 53, 385-407.
Dai, Q. and K. J. Singleton (2000): Specification Analysis of Affine Term Structure Models.
Journal of Finance 55, 1943-1978.
Diebold, F.X. and C. Li (2006): Forecasting the Term Structure of Government Bond Yields.
Journal of Econometrics 130, 337-364.
Duffie, D. and R. Kan (1996): A Yield-Factor Model of Interest Rates. Mathematical Finance
6, 379-406.
Duffie, D. and K. J. Singleton (1999): Modeling Term Structures of Defaultable Bonds.
Review of Financial Studies 12, 687-720.
Estrella, A. and G. Hardouvelis (1991): The Term Structure as a Predictor of Real Economic
Activity. Journal of Finance 46, 555-76.
Fama, E. F. and R. R. Bliss (1987): The Information in Long-Maturity Forward Rates.
American Economic Review 77, 680-692.
Fong, H. G. and O. A. Vasicek (1991): Fixed Income Volatility Management. The Journal
of Portfolio Management (Summer), 41-46.
Geman, H. (1989): The Importance of the Forward Neutral Probability in a Stochastic Approach to Interest Rates. Unpublished working paper, ESSEC.
Geman H., N. El Karoui and J. C. Rochet (1995): Changes of Numeraire, Changes of Probability Measures and Pricing of Options. Journal of Applied Probability 32, 443-458.
Goldstein, R. S. (2000): The Term Structure of Interest Rates as a Random Field. Review
of Financial Studies 13, 365-384.
Hagan, P. S. and D. E. Woodward (1999): Equivalent Black Volatilities. Applied Mathematical Finance 6, 147-157.
Hagan, P. S., D. Kumar, A. S. Lesniewski, and D. E. Woodward (2002): Managing Smile
Risk. Wilmott Magazine, September, 84-108.
Harvey, C. R. (1991): The Term Structure and World Economic Growth. Journal of Fixed
Income 1, 4-17.
Harvey, C. R. (1991): The Term Structure Forecasts Economic Growth. Financial Analysts
Journal May/June 6-8.
Heaney, W. J. and P. L. Cheng (1984): Continuous Maturity Diversification of Default-Free
Bond Portfolios and a Generalization of Efficient Diversification. Journal of Finance 39,
1101-1117.
Heath, D., R. Jarrow and A. Morton (1992): Bond Pricing and the Term-Structure of Interest
Rates: a New Methodology for Contingent Claim Valuation. Econometrica 60, 77-105.
Heston, S. L. (1993): A Closed Form Solution for Options with Stochastic Volatility with
Applications to Bond and Currency Options. Review of Financial Studies 6, 327-344.
Ho, T. S. Y. and S.-B. Lee (1986): Term Structure Movements and the Pricing of Interest
Rate Contingent Claims. Journal of Finance 41, 1011-1029.
Hördahl, P., O. Tristani and D. Vestin (2006): A Joint Econometric Model of Macroeconomic
and Term Structure Dynamics. Journal of Econometrics 131, 405-444.
Hull, J. (2003): Options, Futures, and Other Derivatives. Prentice Hall. 5th edition (International Edition).
Hull, J. and A. White (1990): Pricing Interest Rate Derivative Securities. Review of Financial
Studies 3, 573-592.
Jamshidian, F. (1989): An Exact Bond Option Pricing Formula. Journal of Finance 44,
205-209.
Jamshidian, F. (1997): Libor and Swap Market Models and Measures. Finance and Stochastics 1, 293-330.
Jöreskog, K. G. (1967): Some Contributions to Maximum Likelihood Factor Analysis. Psychometrika 32, 443-482.
Karlin, S. and H. M. Taylor (1981): A Second Course in Stochastic Processes. San Diego:
Academic Press.
Kennedy, D. P. (1994): The Term Structure of Interest Rates as a Gaussian Random Field.
Mathematical Finance 4, 247-258.
Kennedy, D. P. (1997): Characterizing Gaussian Models of the Term Structure of Interest
Rates. Mathematical Finance 7, 107-118.
Kessel, R. A. (1965): The Cyclical Behavior of the Term Structure of Interest Rates. National
Bureau of Economic Research Occasional Paper No. 91.
Kloeden, P. and E. Platen (1992): Numerical Solution of Stochastic Differential Equations.
Berlin: Springer Verlag.
Knez, P. J., R. Litterman and J. Scheinkman (1994): Explorations into Factors Explaining
Money Market Returns. Journal of Finance 49, 1861-1882.
Lamberton, D. and B. Lapeyre (1997): Introduction au Calcul Stochastique Appliqué à la
Finance. Paris: Ellipses.
Langetieg, T. (1980): A Multivariate Model of the Term Structure of Interest Rates. Journal
of Finance 35, 71-97.
Laurent, R. D. (1988): An Interest Rate-Based Indicator of Monetary Policy. Federal Reserve
Bank of Chicago Economic Perspectives 12, 3-14.
Laurent, R. D. (1989): Testing the Spread. Federal Reserve Bank of Chicago Economic
Perspectives 13, 22-34.
Litterman, R. and J. Scheinkman (1991): Common Factors Affecting Bond Returns. Journal
of Fixed Income 1, 54-61.
Litterman, R., J. Scheinkman, and L. Weiss (1991): Volatility and the Yield Curve. Journal
of Fixed Income 1, 49-53.
Longstaff, F. A. and E. S. Schwartz (1992): Interest Rate Volatility and the Term Structure:
A Two-Factor General Equilibrium Model. Journal of Finance 47, 1259-1282.
Mele, A. (2003): Fundamental Properties of Bond Prices in Models of the Short-Term Rate.
Review of Financial Studies 16, 679-716.
Mele, A. and F. Fornari (2000): Stochastic Volatility in Financial Markets: Crossing the Bridge
to Continuous Time. Boston: Kluwer Academic Publishers.
Mele, A. and O. Obayashi (2015): The Price of Fixed Income Market Volatility. Springer Verlag
Finance (forthcoming).
Merton, R. C. (1973): Theory of Rational Option Pricing. Bell Journal of Economics and
Management Science 4, 141-183.
Miltersen, K., K. Sandmann and D. Sondermann (1997): Closed Form Solutions for Term
Structure Derivatives with Lognormal Interest Rate. Journal of Finance 52, 409-430.
Nelson, C.R. and A.F. Siegel (1987): Parsimonious Modeling of Yield Curves. Journal of
Business 60, 473-489.
Rebonato, R. (1998): Interest Rate Option Models. Wiley.
Rebonato, R. (1999): Volatility and Correlation. Wiley.
Ritchken, P. and L. Sankarasubramanian (1995): Volatility Structure of Forward Rates and
the Dynamics of the Term Structure. Mathematical Finance 5, 55-72.
Sandmann, K. and D. Sondermann (1997): A Note on the Stability of Lognormal Interest
Rate Models and the Pricing of Eurodollar Futures. Mathematical Finance 7, 119-125.
Santa-Clara, P. and D. Sornette (2001): The Dynamics of the Forward Interest Rate Curve
with Stochastic String Shocks. Review of Financial Studies 14, 149-185.
Stanton, R. (1997): A Nonparametric Model of Term Structure Dynamics and the Market
Price of Interest Rate Risk. Journal of Finance 52, 1973-2002.
Stock, J. H. and M. W. Watson (1989): New Indexes of Coincident and Leading Economic
Indicators. In: Blanchard, O. J. and S. Fischer (Eds.): NBER Macroeconomics Annual
1989, MIT Press, 352-394.
Stock, J. H. and M. W. Watson (2003): Forecasting Output and Inflation: The Role of Asset
Prices. Journal of Economic Literature 41, 788-829.
Vasicek, O. (1977): An Equilibrium Characterization of the Term Structure. Journal of
Financial Economics 5, 177-188.
Veronesi, P. (2010): Fixed Income Securities: Valuation, Risk and Risk Management. John
Wiley and Sons.
13
Risky debt and credit derivatives
13.1 Introduction
This chapter deals with the pricing of securities that carry credit risk. It examines the main
conceptual approaches to credit risk, as well as how this risk can be transferred through
dedicated credit derivatives. It is instructive to review the historical reasons leading up to the
creation and trading of these derivatives. The next subsection contains a succinct account;
Section 13.1.2 provides a roadmap to this chapter.
[In progress]
13.1.1 A brief history of credit risk and financial innovation
During the mid 1980s, a market began to develop for the first interest rate derivatives
reviewed in the previous chapter. This market grew to such an extent that, already by the late
1980s, the appetite for these derivatives had proliferated and led to additional and fairly
complex products, arising through innovation and competition. This is natural: financial
innovation is relatively easy to imitate, which pushes banks towards increasing creativeness.
Increasing creativeness is needed to keep the innovator's initial competitive advantage as long
as possible.
The early 1990s were extraordinary years. On the one hand, interest rates were low amid
concerns the U.S. economy had still not recovered from the 1991 recession. On the other hand,
capital market volatility was quite muted. Low interest rates and low capital market volatility are
natural drivers for the introduction of new derivatives that aim to boost investors'
returns (footnote 1). But the financial turmoil of 1994 caused the interest rate climate to change
suddenly, and some of these products produced large losses. These losses triggered a call for
regulation by public opinion and certain policy makers, even while the ISDA (International
Swaps and Derivatives Association) argued that more regulation would destroy market
creativity.
Regulatory pressures would vanish by the mid 1990s, when the market started to innovate
again amid a general consensus that derivative risks could be controlled through market
discipline, not regulation. Swap markets recovered. They did so slowly though, as these derivatives
were already at the end of the innovation cycle: the process of imitation had led swap-related
derivatives to become a mass product, with profit margins having been eroded in the meantime.
The market was ready for a new major innovation wave.
Footnote 1: Examples of products introduced at the time are LIBOR squared, inverse floaters, or
power options, sponsored by JPMorgan.
Credit risk was the next innovation stream. Global institutions (e.g., JPMorgan, Credit Suisse,
Bankers Trust) soon realized that borrower default was a source of substantial risk
that could be conveniently re-allocated through dedicated derivatives. Just as classic
derivatives transfer market risk, credit derivatives could transfer credit risk. Institutions such
as JPMorgan had additional motivations to innovate in this space, given the vast pools of loans
contained in their books: importantly, these loans required too many reserves and were therefore
expensive.
One solution was to proceed with securitization. Securitization is a process by which some
illiquid assets (say some loans) are gathered (packaged) into a common pool that backs the
issuance of new securities, designed to display enhanced liquidity obtained through packaging
and through credit and liquidity enhancements. These new securities are, in fact, derivatives
written on the initial illiquid assets. Two leading examples of this process include the
securitization of mortgages and receivables. Financial institutions find the securitization process
attractive, as they can carve out certain items in their balance sheet, thus boosting their return
on investments; moreover, by securitizing assets, less capital is needed to meet capital
requirements standards. For example, the accounts receivable of a corporation may be used to back
the issue of commercial paper known as asset-backed commercial paper. A well-functioning
securitization system is a way (not the only way) to transfer and trade credit risk.
Global institutions would then repackage loans into derivatives, in a way that default risk
and/or part of the securitized loans could be transferred to outside investors. Note that credit
derivatives were also a regulatory mitigation device, partly useful as a response to regulation.
The underlying ideas were (i) to turn loans into derivatives that could be sold, and (ii) to create
new insurance products such as credit default swaps. At the very beginning, derivatives were just
designed to have single loans as the underlying. Afterwards, the idea emerged to create
structures organized in derivatives bundles, with cash flows indexed to baskets of loans: the
ancestors of collateralized debt obligations (CDOs). For example, JPMorgan created Bistro (Broad
Index Secured Trust Offering), a structure relying on a variety of assets, ranging from corporate
debt to student loans; ABN-Amro created similar structures (Heineken and Amstel).
During this innovation process, competition increased and profit margins fell again, leading to
renewed motivation for additional innovation.
Year    US ABS           Global CDO    US Agency MBS    US Agency CMO
        (Outstanding)    (Issuance)    (Issuance)       (Issuance)
2000    1084             68            474              586
2001    1230             78            1086             1455
2002    1381             83            1447             2019
2003    1507             87            2131             2762
2004    1814             157           1015             1379
2005    2111             251           983              1345
2006    2700             455           923              1240
2007    2945             430           1189             1471
2008    2599             62            1169             1339
2009    2326             4             1734             2022
2010    2034             7             1420             1885
[Figure. Left panel: series for Europe, U.S. and U.K., 1985 to 2010. Right panel: ABX AAA and
ABX BB, 2006 to 2009.]
Left hand side panel: U.S. and European House Price Change (year-to-year, in percentage). Right
hand side panel: Indexes of CDS on U.S. Mortgage-Related Securities Prices. Source: IMF, Global
Financial Stability Report, April 2008.
The response to increased competition was the creation of structured products referenced
to riskier assets. Importantly, during the mid 1990s, derivatives teams began to interact with
teams managing loans extended to borrowers with poor credit history: subprime mortgages.
Subprime loans would begin to be securitized and structured into CDOs with global financial
institutions involved (e.g., Merrill Lynch or UBS). The mortgage banking model had shifted
from one of "buy to hold" to one of "originate and distribute." The subprime crisis erupted in
2007, and sent the global financial system into nearly two years of turmoil, in a situation where
the payments and settlement system was jeopardized.
The mechanics of the 2007-2008 crisis are well understood. While low interest rates helped
sustain the boom in the housing market, the originate-and-distribute model had operated in a
way that lending standards were not in line with market expectations (see Section 13.6.2). The
subprime mortgage market would sink amid increasing interest rates and rapidly decelerating
housing prices. The shadow banking system, which helped sustain the originate-and-distribute
model, was actually an important piece of the crisis: the uncertainties related to the entities
financing this system triggered a sharp liquidity dry-up and, then, a credit crunch, followed by
a drop in real economic activity, which magnified the credit crunch, in a spiral.
[In progress]
13.1.2 Plan of the chapter
This chapter reviews conceptual approaches to the pricing of defaultable securities as well as the
basic mechanics of derivatives that transfer credit risk. Section 13.2 reviews classical irrelevance
results: the capital structure does not matter for the value of a firm. This result should serve as
a reminder for some of the subsequent developments in this chapter. For example, Section 13.3
deals with structural approaches to debt evaluation, relying on assumptions that are at times
consistent and at other times inconsistent with those underlying classical irrelevance results.
Section 13.3 also deals with reduced-form approaches by which default risk is modeled as an
exogenously given event. Section 13.4 reviews the main credit derivatives that aim to re-allocate
credit risk, such as credit default swaps or securitized obligations.
These lectures have never really dealt with issues regarding risk-management. Section 13.5
contains both introductory discussions regarding risk-management in general and some details
of credit risk management. It also discusses regulatory developments over the relatively recent
history. Section 13.6 discusses a few more details regarding the 2007-08 financial crisis, which
originally erupted when losses mounted in credit markets, and then spread to the overall
economy. It is an exemplary case study (very well utilized in the literature) that illustrates how
a shock in capital markets can affect global developments in a vicious circle: an important
instance of endogenous risk of the type defined and discussed in general terms in Chapter 8 of
these lectures.
a constant for all the unlevered firms belonging to sector $k$. Naturally, the value of the firm is
equal to the value of equity, $V_j = S_j$, say. Next, consider a levered firm operating within the
$k$-th sector. This firm issues debt with nominal value equal to $D_j$, such that its value, denoted
as $V_j$, equals the sum of equity and debt, $V_j = S_j + D_j$. In the absence of any market
frictions, we have the following irrelevance result:
Theorem 13.1 (Modigliani & Miller theorem). In the absence of arbitrage and frictions, the
market value of any firm is independent of its capital structure and is given by discounting its
expected profits, $\bar{X}_j$, at the discount rate, $\rho_k$, appropriate to its class:
$$V_j = S_j + D_j = \frac{\bar{X}_j}{\rho_k}, \quad \text{for any firm } j \text{ in class } k.$$
In other words, the return on investment (ROI), defined as $\rho_k = \bar{X}_j / V_j$, is the same
for two firms that earn the same expected profit $\bar{X}_j$, regardless of the capital structure.
Naturally, the ROE and ROI are the same for the unlevered firm.
The proof of Theorem 13.1 can proceed by applying the modern tools reviewed in Chapters
2 through 4. For the sake of completeness, we use the original Modigliani and Miller arguments,
which are very simple. Consider two firms: a first, unlevered, and a second, levered. They both
earn the same expected profit, $\bar{X}$. Suppose we purchase the shares of the unlevered firm and
borrow the same amount of money issued by the levered firm. In the absence of arbitrage or
any frictions, the value of this portfolio should equal the value of the levered firm, which is
possible as soon as the value of the levered and the unlevered firm are the same.
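Mathematically, let $V_U$ denote the value of the unlevered firm and $V_L = S_L + D$ the value of
the levered firm. Given an arbitrary $\alpha \in (0, 1)$, we do the following trade: (i) we buy a
fraction $\alpha$ of the unlevered firm, worth $\alpha V_U$; (ii) we sell $\alpha S_L = \alpha (V_L - D)$
worth of shares of the levered firm. These two trades make the balance of the position worth
$-\alpha V_U + \alpha (V_L - D)$, and so (iii) we borrow $\alpha (V_U - V_L + D)$ at the interest
rate $r$, to make this initial position worthless. This portfolio yields: (i) $+\alpha \bar{X}$, due to
the purchase of the shares of the unlevered firm, (ii) $-\alpha (\bar{X} - rD)$, due to the sale of
the shares of the levered firm, which of course has to pay interest on its debt, and
(iii) $-r \alpha (V_U - V_L + D)$, arising to honour the debt we are making to build up the
worthless portfolio. Summing up, the profits are
$$\alpha \bar{X} - \alpha (\bar{X} - rD) - r \alpha (V_U - V_L + D) = r \alpha (V_L - V_U).$$
If $V_L > V_U$, we have an arbitrage opportunity as we may make money out of a worthless
portfolio, and if $V_L < V_U$, we have an arbitrage as well, as we could reverse the positions of
the worthless portfolio. So we need to have that
$$V_L = V_U = \frac{\bar{X}}{\rho}.$$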
[As mentioned, Theorem 13.1 can be proved through the modern tools in Chapters 2 through
4]
We have: $\bar{X} = \text{ROI} \cdot V$ and $\text{ROE} = (\bar{X} - rD)/S$. Therefore,
$$\text{ROE} = \frac{\text{ROI} \cdot (S + D) - rD}{S} = \text{ROI} + (\text{ROI} - r) \frac{D}{S}.$$
If the financial conditions of the firm do not affect the interest rate on debt, the ROE is
increasing in the leverage ratio, $D/S$, provided $\text{ROI} > r$. This situation arises when the
arbitrage arguments underlying Theorem 13.1 assume no-arbitrage trades can be implemented with a
cost of borrowing money equal to that of the firm. In the presence of market frictions such
as asymmetric information between borrowers and lenders, this need not be the case. For
example, debt markets might be concerned about the size of the leverage ratio. Assume, for
example, that $r = r(L)$, where $L = D/S$, and in particular that $r(L) = 0.03 \cdot L$. Then, we
have that: $\text{ROE} = \text{ROI} + (\text{ROI} - 0.03 \cdot L) L$. The picture below depicts the
behavior of ROE as a function of $L$, assuming that ROI = 5% and that the risk-free rate in case of
no such frictions is $r = 3\%$.
[Figure: ROE (vertical axis) plotted against the leverage ratio (horizontal axis, from 0 to 2).]
The solid line depicts the ROE for a firm sustaining a cost of debt independent of the
leverage ratio, with ROI = 5% and $r = 3\%$. The dashed line is the ROE for a firm that
has a cost of debt increasing in the leverage ratio $L$, $r(L) = 0.03 \cdot L$.
Consider the firm with cost of capital depending on the current leverage ratio, $L$. For a low
level of $L$, the ROE increases with $L$, so as to magnify the difference $\text{ROI} - 0.03 L$
through the multiplying effect $(\text{ROI} - 0.03 L) L$. However, for higher leverage ratios, the
difference $\text{ROI} - 0.03 L$ becomes thinner and thinner, and an increase in $L$ then leads to
marginally lower ROE. In this example, there is an interior value for the leverage ratio that
maximizes the ROE, which is, approximately, $L = 0.83$.
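As a rough numerical check of this example, the following sketch evaluates the ROE formula, ROE = ROI + (ROI - r(L)) L, on a grid of leverage ratios, both for a cost of debt fixed at 3% and for r(L) = 0.03 L. The function names and the grid are illustrative choices, not part of the original notes.

```python
def roe(leverage, roi, cost_of_debt):
    """ROE = ROI + (ROI - r(L)) * L, where L is the leverage ratio D/S."""
    return roi + (roi - cost_of_debt(leverage)) * leverage


def flat_rate(leverage):
    """Cost of debt independent of leverage (3%)."""
    return 0.03


def rising_rate(leverage):
    """Cost of debt increasing in leverage, r(L) = 0.03 * L."""
    return 0.03 * leverage


ROI = 0.05
# Illustrative grid of leverage ratios between 0 and 2, in steps of 0.001.
grid = [i / 1000 for i in range(2001)]

best_leverage = max(grid, key=lambda lev: roe(lev, ROI, rising_rate))
print(f"ROE-maximizing leverage (rising rate): {best_leverage:.3f}")
print(f"ROE at that leverage:                  {roe(best_leverage, ROI, rising_rate):.4f}")
print(f"ROE at L = 2 with the flat rate:       {roe(2.0, ROI, flat_rate):.4f}")
```

Analytically, the first-order condition ROI - 0.06 L = 0 gives L = 0.05/0.06, i.e. roughly 0.83, in line with the value reported in the text.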
Assets ($A_t$)      Equity ($S_t$) (Shares)
                    Debt ($D_t$) (Bonds)
Therefore, we have the accounting identity: Assets = Equity + Debt, or
$$A_t = S_t + D_t.$$
When debt expires, debt-holders receive the minimum between the nominal value of debt and
the value of the assets the firm can liquidate to honour debt. Debt-holders are senior claimants.
Equity holders are juniors, i.e., they are residual claimants to the firm's assets.
We use these basic insights and illustrate the first approach to the modeling of the
risk-structure of interest rates: the Merton-KMV approach. In this approach, equity is the same as
a European call option written on the firm's assets, with expiration equal to the debt expiration,
and strike equal to the nominal value of debt. The current value of debt equals the value of the
assets minus the value of equity, i.e. the value of a risk-free discount bond minus the value of
a put option on the firm with strike price equal to the nominal value of debt, as shown by Eq.
(13.3) below.
Merton (1974) uses the Black and Scholes (1973) formula to derive the price of debt. The
main assumption underlying this model is that the assets of the firm can be traded, and that
their value $A_t$ satisfies (footnote 2)
$$\frac{dA_t}{A_t} = r\,dt + \sigma\,dW_t, \tag{13.1}$$
where $W_t$ is a Brownian motion under the risk-neutral probability, $\sigma$ is the instantaneous
standard deviation, and $r$ is the short-term rate on riskless bonds.
Let $N$ be the nominal value of debt, $T$ be the time of expiration of debt, and $D_t$ the debt
value as of time $t \le T$. As argued earlier, shareholders are long a European call option on the
assets, and the bond-holders hold the firm's assets net of this option. Mathematically,
$$D_T = \min\{A_T, N\} = A_T - \underbrace{\max\{A_T - N, 0\}}_{\text{Equity at } T} \tag{13.2}$$
$$\phantom{D_T = \min\{A_T, N\}} = N - \underbrace{\max\{N - A_T, 0\}}_{\text{Put on the firm}} \tag{13.3}$$
Footnote 2: [...], where [...] is the instantaneous cash flow to the [...]. For example, one could
take [...] to be a geometric Brownian [...], forever, but we're just ignoring this complication.
Concavity: The value of debt, instead, is decreasing in the firm asset volatility, as we shall show
in detail in the next section.
FIGURE 13.1. Dashed line: the value of equity at the debt maturity, $T$, $\max\{A_T - N, 0\}$,
plotted as a function of the firm asset value, $A_T$. Solid line: the value of debt at maturity,
$\min\{A_T, N\}$, as a function of $A_T$. Nominal value of debt is fixed to $N = 1$.
13.3.1.1 Merton
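The current value of the bonds equals the current value of the assets, $A_0$, minus the current
value of equity. The current value of equity can be obtained through the Black & Scholes formula,
as equity is a European call option on the firm, struck at $N$. By Eq. (13.2), and standard
risk-neutral evaluation, the current value of debt, $D_0$, is (footnote 3),
$$D_0 = A_0 \Phi(-d_1) + N e^{-rT} \Phi(d_1 - \sigma\sqrt{T}), \tag{13.4}$$
where $\Phi$ denotes the cumulative distribution function of a standard normal variable, and
$$d_1 = \frac{\ln(A_0/N) + \left(r + \frac{1}{2}\sigma^2\right) T}{\sigma\sqrt{T}}.$$
Footnote 3: By Eq. (13.2),
$$D_0 = e^{-rT} E(D_T \mid A_0) = e^{-rT} E[A_T \mid A_0] - e^{-rT} E[\max\{A_T - N, 0\} \mid A_0]
= A_0 - \left[A_0 \Phi(d_1) - N e^{-rT} \Phi(d_1 - \sigma\sqrt{T})\right],$$
where the last equality follows by the Black & Scholes formula. Eq. (13.4) follows after
rearranging terms in the previous equation.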
FIGURE 13.2. Solid line: the no-arbitrage bound, $\min\{A_0, N e^{-rT}\}$, depicted as a function
of $A_0$, when the nominal value of debt is fixed to $N = 1$. Dashed line: the bond value
predicted by Merton's model when $T = 1$, $r = 3\%$ and $\sigma = 20\%$, annualized. Dotted
line: same as the dashed line, but with a higher asset volatility, $\sigma = 40\%$.
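The bond values discussed here can be reproduced with a few lines of code. The sketch below is only illustrative: it assumes the standard Black & Scholes expressions for Eq. (13.4), uses the parameter values quoted for Figure 13.2, and the helper names (`norm_cdf`, `merton_debt`, `merton_equity`) are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def d1_d2(a0, n, r, sigma, t):
    """Black & Scholes d1 and d2 = d1 - sigma * sqrt(T)."""
    d1 = (log(a0 / n) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    return d1, d1 - sigma * sqrt(t)


def merton_debt(a0, n, r, sigma, t):
    """Current debt value: A0 * Phi(-d1) + N * exp(-r T) * Phi(d2)."""
    d1, d2 = d1_d2(a0, n, r, sigma, t)
    return a0 * norm_cdf(-d1) + n * exp(-r * t) * norm_cdf(d2)


def merton_equity(a0, n, r, sigma, t):
    """Current equity value: a European call on the assets, struck at N."""
    d1, d2 = d1_d2(a0, n, r, sigma, t)
    return a0 * norm_cdf(d1) - n * exp(-r * t) * norm_cdf(d2)


N, R, T = 1.0, 0.03, 1.0  # parameter values quoted for Figure 13.2
results = []
for a0 in (0.6, 1.0, 1.4):  # illustrative asset values
    bound = min(a0, N * exp(-R * T))
    low_vol = merton_debt(a0, N, R, 0.20, T)
    high_vol = merton_debt(a0, N, R, 0.40, T)
    results.append((a0, bound, low_vol, high_vol))
    print(f"A0 = {a0:.1f}  bound = {bound:.4f}  "
          f"debt (20% vol) = {low_vol:.4f}  debt (40% vol) = {high_vol:.4f}")
```

For each asset value, the debt value lies below the no-arbitrage bound and falls when volatility rises from 20% to 40%, while debt and equity add up to the asset value.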
Bond prices are decreasing in the asset volatility as bad outcomes are exaggerated on the
downside, due to the concavity properties depicted in Figure 13.1. Note that the property that
the term structure of interest rates increases with the volatility of the fundamentals is quite
sharp here. In Chapter 12 (Section 12.3.4.1), it was argued that the relation between the yield
curve and the volatility of the fundamentals (i.e. the volatility of the short-term rate) was quite
complex, as it depends on which of two effects dominates: a convexity effect or a risk-premium
effect. In bad times, the risk-premium effect should dominate, thereby leading to a
positive link between the volatility of the fundamentals and the yield curve. In good times,
a convexity effect would lead the yield curve to be negatively related to the volatility of the
fundamentals. Instead, the prediction in this section is quite neat: the term structure of interest
rates always increases with the volatility of the fundamentals. Naturally, this prediction relies
on a channel that is completely distinct from the risk-premium channel discussed in Chapter
12.
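The term structure of interest rates is defined as usual as:
$$-\frac{1}{T} \ln \frac{D_0}{N} = r + s_0,$$
where
$$s_0 = -\frac{1}{T} \ln \left[\frac{A_0}{N e^{-rT}} \Phi(-d_1) + \Phi(d_1 - \sigma\sqrt{T})\right]. \tag{13.5}$$
We usually refer to $s_0$ as the term-spread for a given fixed maturity, and to the mapping
maturity-spreads as the risk-structure of interest rates.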
Figure 13.3 depicts the spread predicted by this model. Credit spreads shrink to zero as
time-to-maturity becomes smaller and smaller. This property of the model stands in sharp
contrast with the empirical behavior of credit spreads, which are high even for short-maturity
bonds. This property arises because the model is driven by Brownian motions, which
have continuous sample paths, such that given a firm asset value $A_0 > N$, the probability of
bankruptcy, arising when $A$ hits $N$ from above, approaches zero very fast as time-to-maturity
goes to zero. Because credit spreads reflect default probabilities, as explained in detail below
(see Eq. (13.9)), credit spreads shrink to zero quickly as time-to-maturity approaches zero.
Naturally, one might end up with credit spreads sufficiently high at short maturities, by
assuming the firm asset value is sufficiently small. For example, in Figure 13.3.1, credit spreads
are high at short maturities, when $A_0 = 1.1$. However, even with $A_0 = 1.1$, credit spreads are
still zero at very short maturities. More fundamentally, requiring such a small value for $A_0$ is
problematic. Firms with such a low asset value would command a much higher spread than
that in Figure 13.3.1. All in all, the Brownian motion model in this section lacks some source of
risk driving the behavior of short-term spreads. In Section 13.3.2, we will show that this issue
can be addressed by assuming that a firm's default can be triggered by jumps.
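The short-maturity behavior of spreads can be checked numerically. The sketch below is illustrative only: it codes Eq. (13.5) under the standard Black & Scholes definitions of $d_1$ and $d_2 = d_1 - \sigma\sqrt{T}$, with $r = 3\%$, $\sigma = 20\%$ and $N = 1$; the function names and the chosen maturities are assumptions made for the example.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def credit_spread(a0, n, r, sigma, t):
    """Spread s0 = -(1/T) * ln[(A0 / (N e^{-rT})) * Phi(-d1) + Phi(d2)]."""
    d1 = (log(a0 / n) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    ratio = a0 / (n * exp(-r * t))
    return -log(ratio * norm_cdf(-d1) + norm_cdf(d2)) / t


N, R, SIGMA = 1.0, 0.03, 0.20

# Asset value above the nominal value of debt: spreads vanish as T -> 0.
for t in (0.01, 0.1, 0.5, 1.0, 5.0):
    bp = 1e4 * credit_spread(1.1, N, R, SIGMA, t)
    print(f"A0 = 1.1, T = {t:5.2f} years: spread = {bp:9.3f} bp")

# Asset value below the nominal value of debt: spreads blow up as T -> 0.
for t in (0.05, 0.5, 1.0, 5.0):
    bp = 1e4 * credit_spread(0.9, N, R, SIGMA, t)
    print(f"A0 = 0.9, T = {t:5.2f} years: spread = {bp:9.1f} bp")
```

With the asset value above the nominal value of debt, the spread at very short maturities is a tiny fraction of a basis point; with the asset value below it, the spread grows without bound as maturity shrinks.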
FIGURE 13.3.1. The term structure of spreads, $s_0$, in basis points, predicted by Merton's
model, plotted against time to maturity, obtained with initial asset values $A_0 = 1.1$ (solid
line), $A_0 = 1.2$ (dashed line), and $A_0 = 1.3$ (dotted line). The short-term rate is $r = 3\%$,
and asset volatility is $\sigma = 0.20$. Nominal debt $N = 1$.
Naturally, the term-structure of credit spreads has a rather different shape when the current
firm asset value is below $N$, as depicted in Figure 13.3.2. In this case, the probability the firm
defaults is close to one when time to maturity is close to zero, such that the spreads would then
be arbitrarily large as we get closer and closer to maturity. For visualization purposes, Figure
13.3.2 is truncated to only include values of the spreads for maturities higher than one year.
FIGURE 13.3.2. The term structure of spreads, $s_0$, in basis points, predicted by Merton's
model, plotted against time to maturity, obtained with initial asset values $A_0 = 0.9$ (solid
line), $A_0 = 0.8$ (dashed line), and $A_0 = 0.7$ (dotted line). The short-term rate is $r = 3\%$,
and asset volatility is $\sigma = 0.20$. Nominal debt $N = 1$.
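First, at maturity,
$$D_T = \min\{A_T, N\} = A_T \, \mathbb{I}\{A_T < N\} + N \, \mathbb{I}\{A_T \ge N\},$$
where $\mathbb{I}\{\mathcal{E}\}$ is the indicator function, i.e. $\mathbb{I}\{\mathcal{E}\} = 1$ if the
event $\mathcal{E}$ is true and $\mathbb{I}\{\mathcal{E}\} = 0$ if the event $\mathcal{E}$ is false.
Second, we have,
$$D_0 = e^{-rT} E(D_T) = e^{-rT} \left[E\left(A_T \mathbb{I}\{A_T < N\}\right) + N \, E\left(\mathbb{I}\{A_T \ge N\}\right)\right]
= e^{-rT} \left[E(A_T \mid \text{Default}) \, Q(\text{Default}) + N \, Q(\text{Survival})\right], \tag{13.6}$$
where $E(A_T \mid \text{Default})$ is the expected firm asset value given the event of default,
$Q(\text{Default})$ is the probability of default, and $Q(\text{Survival}) = 1 - Q(\text{Default})$ is
the probability the firm does not default. The last equality follows by the Law of Iterated
Expectations,
$$E\left(A_T \mathbb{I}\{A_T < N\}\right) = E\left[E\left(A_T \mathbb{I}\{A_T < N\} \mid \mathbb{I}\{A_T < N\}\right)\right]
= E\left[\mathbb{I}\{A_T < N\} \, E\left(A_T \mid \mathbb{I}\{A_T < N\}\right)\right]
= E\left(\mathbb{I}\{A_T < N\}\right) E(A_T \mid \text{Default}).$$
Comparing Eq. (13.6) and Eq. (13.4) reveals that for Merton's model,
$$Q(\text{Survival}) = \Phi(d_2), \tag{13.7}$$
where $d_2 = d_1 - \sigma\sqrt{T}$ is the distance-to-default,
$$d_2 = \frac{\ln(A_0/N) + \left(r - \frac{1}{2}\sigma^2\right) T}{\sigma\sqrt{T}}. \tag{13.8}$$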
FIGURE 13.4. Probability of survival for a given firm predicted by Merton's model,
$\Phi(d_2)$, depicted as a function of the asset volatility, $\sigma$. Firm asset value is fixed at
$A_0 = 1.1$, and plotted are survival probabilities for bonds maturing at $T = 0.5$ years (solid
line), $T = 1$ year (dashed line) and $T = 2$ years (dotted line). The short-term rate is $r = 3\%$.
Nominal debt $N = 1$.
Property (i) is not a general property, though. For example, we already pointed out that for
large $T$, the probability of survival is close to one as soon as $r > \frac{1}{2}\sigma^2$, a
condition ensuring the firm asset value grows so large that default becomes unlikely, eventually.
The next picture shows that for a fixed $\sigma$, such that $r > \frac{1}{2}\sigma^2$, the
probability of survival is non-monotonic in $T$.
[Figure: probability of survival, Pr(surv), plotted against maturity, in years.]
This property arises for the following reason. Assuming that $A_0 > N$ and
$r > \frac{1}{2}\sigma^2$, the first term of $d_2$ in Eq. (13.8) is decreasing in $T$, whereas the
second is increasing. When $T$ is small, the first term (and its sensitivity to $T$) dominates,
such that distance-to-default decreases with maturity. But for $T$ large, the second term of $d_2$
dominates, and distance-to-default becomes eventually large. Non-monotonicities arise even at
finite maturities, once we consider low values of $A_0$, in which case the relation between
maturity and probability of survival can be increasing or decreasing, according to the values of
$\sigma$, as shown in Figure 13.5. Intuitively, when $A_0 \approx N$, the probability of survival is:
$$Q(\text{Survival}) = \Phi(d_2), \quad \text{with } d_2 = \frac{\ln(A_0/N) + \left(r - \frac{1}{2}\sigma^2\right) T}{\sigma\sqrt{T}}
\approx \frac{r - \frac{1}{2}\sigma^2}{\sigma} \sqrt{T},$$
such that the survival probability decreases in $T$ for large $\sigma$, although it increases in $T$
for small $\sigma$. The intuition underlying this property is that for large $\sigma$, the
probability the firm asset value will end up below $N$ from $A_0 \approx N$ can only increase with
time to maturity, $T$. Analytically,
$E(\ln A_T \mid A_0) = \ln A_0 + \left(r - \frac{1}{2}\sigma^2\right) T \approx \ln N + \left(r - \frac{1}{2}\sigma^2\right) T$,
such that, when $\sigma^2 > 2r$, the probability the firm asset value will be below $N$ does indeed
increase with $T$.
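A short numerical sketch of these maturity effects follows. It evaluates the survival probability $\Phi(d_2)$ of Eqs. (13.7)-(13.8) at a few maturities; the parameter values and maturities are assumptions chosen for illustration and are not those underlying the pictures in the text.

```python
from math import erf, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def survival_probability(a0, n, r, sigma, t):
    """Q(Survival) = Phi(d2), with d2 the distance-to-default."""
    d2 = (log(a0 / n) + (r - 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    return norm_cdf(d2)


N, R = 1.0, 0.03

# Case 1: A0 > N and r > sigma^2 / 2. Survival first falls, then rises with T.
for t in (0.1, 9.5, 200.0):
    p = survival_probability(1.1, N, R, 0.20, t)
    print(f"A0 = 1.1, sigma = 0.20, T = {t:6.1f}: Q(Survival) = {p:.4f}")

# Case 2: A0 = N. Survival falls with T for large sigma, rises for small sigma.
for sigma in (0.40, 0.10):
    p_short = survival_probability(1.0, N, R, sigma, 0.5)
    p_long = survival_probability(1.0, N, R, sigma, 2.0)
    print(f"A0 = 1.0, sigma = {sigma:.2f}: T = 0.5 -> {p_short:.4f}, T = 2 -> {p_long:.4f}")
```

In the first case, the distance-to-default is smallest at a maturity of about $\ln(A_0/N) / (r - \frac{1}{2}\sigma^2)$, roughly 9.5 years for these inputs, which is why that maturity is included.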
FIGURE 13.5. Probability of survival for a given firm predicted by Merton's model,
$\Phi(d_2)$, depicted as a function of the asset volatility, $\sigma$. The firm asset value is fixed
at $A_0 = 1.01$, and plotted are survival probabilities for bonds maturing at $T = 0.5$ years
(solid line), $T = 1$ year (dashed line) and $T = 2$ years (dotted line). The short-term rate is
$r = 3\%$. Nominal debt $N = 1$.
A final useful concept is that of loss-given-default (under Q), denoted as LGD in the sequel.
Comparing Eq. (13.6) with Eq. (13.4) reveals another property of Merton's model,
$$E(A_T \mid \text{Default}) = \frac{e^{rT} A_0 \Phi(-d_1)}{Q(\text{Default})}
= A_0 e^{rT} \frac{\Phi(-d_1)}{\Phi(-d_2)} = E(A_T) \frac{\Phi(-d_1)}{\Phi(-d_2)}.$$
Recovery rates are defined as the fraction of the bond value the bond-holders expect to obtain
at maturity and in the event of default:
$$\text{Rec} \equiv \frac{E(A_T \mid \text{Default})}{N} = \frac{A_0 e^{rT}}{N} \frac{\Phi(-d_1)}{\Phi(-d_2)}.$$
Loss-given-default is defined as the fraction of the bond value the bond-holders expect to lose
at maturity and in the event of default, i.e., $\text{LGD} = 1 - \text{Rec}$. Finally, by Eq. (13.5),
we can write,
$$s_0 = -\frac{1}{T} \ln \left[\frac{A_0}{N e^{-rT}} \Phi(-d_1) + \Phi(d_2)\right]
= -\frac{1}{T} \ln \left[1 - \text{LGD} \cdot Q(\text{Default})\right]. \tag{13.9}$$
This is actually a general formula, which goes beyond Merton's model. It can easily be
obtained through Eq. (13.6).
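The link between spreads, default probabilities and loss-given-default can be verified numerically. The sketch below assumes the standard Black & Scholes definitions of $d_1$ and $d_2$ and uses illustrative inputs ($A_0 = 1.1$, $N = 1$, $r = 3\%$, $\sigma = 20\%$, $T = 1$); the variable names are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


# Illustrative inputs (assumptions made for this sketch).
a0, n, r, sigma, t = 1.1, 1.0, 0.03, 0.20, 1.0

d1 = (log(a0 / n) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
d2 = d1 - sigma * sqrt(t)

# Risk-neutral default probability, recovery rate and loss-given-default.
q_default = norm_cdf(-d2)
recovery = a0 * exp(r * t) * norm_cdf(-d1) / (n * q_default)
lgd = 1.0 - recovery

# Spread computed directly from asset values, and from LGD * Q(Default).
spread_direct = -log(a0 / (n * exp(-r * t)) * norm_cdf(-d1) + norm_cdf(d2)) / t
spread_from_lgd = -log(1.0 - lgd * q_default) / t

print(f"Q(Default)        = {q_default:.4f}")
print(f"Recovery rate     = {recovery:.4f}")
print(f"LGD               = {lgd:.4f}")
print(f"Spread (direct)   = {1e4 * spread_direct:.2f} bp")
print(f"Spread (from LGD) = {1e4 * spread_from_lgd:.2f} bp")
```

The two spread computations agree up to floating-point error, which is the content of the identity relating the spread to LGD times the default probability.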
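Assume a firm has asset value $A_0 = 110$, and that the asset value volatility is $\sigma = 30\%$,
annualized. The safe interest rate is $r = 2\%$, annualized, and the expected growth rate of the
asset value is $\mu = 5\%$, annualized. The firm has outstanding debt with nominal value $N = 100$,
which expires in two years.
First, we compute the distance-to-default implied by Merton's model, which is,
$$\text{D-t-D} = \frac{\ln(A_0/N) + \left(r - \frac{1}{2}\sigma^2\right) T}{\sigma\sqrt{T}}
= \frac{\ln(1.1) + \left(0.02 - \frac{1}{2} 0.3^2\right) \cdot 2}{0.3 \sqrt{2}} = 0.10680.$$
Accordingly, the probability of default is,
$$Q(\text{Default}) = 1 - \Phi(0.10680) = 1 - 0.54253 = 0.45747.$$
We can compute the same probability, under the physical probability, by simply replacing
$r = 2\%$ with $\mu = 5\%$, in the formula for D-t-D. We have,
$$\text{D-t-D}_{\text{physical}} = \frac{\ln(A_0/N) + \left(\mu - \frac{1}{2}\sigma^2\right) T}{\sigma\sqrt{T}}
= \frac{\ln(1.1) + \left(0.05 - \frac{1}{2} 0.3^2\right) \cdot 2}{0.3 \sqrt{2}} = 0.24822.$$
Therefore, the probability of default under the physical distribution is,
$$P(\text{Default}) = 1 - \Phi(\text{D-t-D}_{\text{physical}}) = 1 - \Phi(0.24822) = 1 - 0.59802 = 0.40198.$$
It is, of course, lower under the physical probability than under the risk-neutral probability,
due to the larger asset growth rate, $\mu > r$.
Finally, we can compute the spread on this bond, which is given by:
$$\text{Spread} = -\frac{1}{T} \ln \left[\frac{A_0}{N e^{-rT}} \Phi(-d_1) + \Phi(d_2)\right],$$
where $d_2 = \text{D-t-D}$, and $d_1 = d_2 + \sigma\sqrt{T}$. So we have,
$$\text{Spread} = -\frac{1}{2} \ln \left[1.1 \, e^{0.02 \times 2} \, \Phi\left(-\left(0.10680 + 0.30\sqrt{2}\right)\right) + \Phi(0.10680)\right]
= -\frac{1}{2} \ln \left[1.1 \, e^{0.02 \times 2} \times 0.29769 + 0.54253\right] = 6.20\%.$$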
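The figures in this example can be reproduced with the short script below. It is a sketch that takes the inputs stated in the example (asset value 110, nominal debt 100, volatility 30%, safe rate 2%, asset growth rate 5%, two years to maturity) and uses only the Python standard library; the function and variable names are illustrative.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def distance_to_default(a0, n, drift, sigma, t):
    """[ln(A0/N) + (drift - sigma^2 / 2) * T] / (sigma * sqrt(T))."""
    return (log(a0 / n) + (drift - 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))


a0, n, sigma, r, mu, t = 110.0, 100.0, 0.30, 0.02, 0.05, 2.0

# Risk-neutral quantities: the drift is the safe rate r.
dtd_q = distance_to_default(a0, n, r, sigma, t)
pd_q = 1.0 - norm_cdf(dtd_q)

# Physical quantities: the drift is the expected asset growth rate mu.
dtd_p = distance_to_default(a0, n, mu, sigma, t)
pd_p = 1.0 - norm_cdf(dtd_p)

# Credit spread, with d2 equal to the risk-neutral distance-to-default.
d2 = dtd_q
d1 = d2 + sigma * sqrt(t)
spread = -log(a0 / (n * exp(-r * t)) * norm_cdf(-d1) + norm_cdf(d2)) / t

print(f"D-t-D (risk-neutral) = {dtd_q:.5f}, default probability = {pd_q:.5f}")
print(f"D-t-D (physical)     = {dtd_p:.5f}, default probability = {pd_p:.5f}")
print(f"Spread               = {100 * spread:.2f}%")
```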
On the 5th of August 2011, the rating agency Standard & Poor's downgraded the US debt from
AAA to AA+ for the first time in history. The US and global equity markets sank (the DJIA lost
nearly 6%) on the first trading day (Monday the 8th) following the announcement. Somewhat
paradoxically, US Treasuries rallied on the very same day, a phenomenon many commentators
described as a flight-to-quality response to a quite unique event. The reason this rally seems
paradoxical is that the downgrade regarded, obviously, US debt! Moreover, at the time, the US
debt/GDP ratio was hovering at about 100%, a fact that weakens the case for US Treasuries
as safe-haven assets.
But additional arguments made US Treasuries a safe haven. First, some of the signals leading
to the downgrade regarded the political gaming about an increase in the debt ceiling, a gaming
that could have led to delinquencies (footnote 4). However, this gaming and its effects were
arguably transitory. Moreover, AA+ debt is accepted as collateral in many transactions, and
considered to be high grade debt in most investment mandates. Furthermore, an issue at the time was
to speculate whether other rating agencies (Moody's and Fitch) would proceed with similar
downgrades of US debt that could have made the 05-08 decision more solidly grounded, so to
speak. Finally, the Standard & Poor's decision was not totally unexpected, as rumours about
it had started to circulate months earlier.
However, flight-to-quality does not seem to be an exhaustive explanation for the US Treasury
rally during these events. First, during the period around the 5th of August, the US was flooded
with bad news regarding the economic fundamentals, with many leading indicators reaching
levels historically consistent with recessions. This news made it likely (at the time) that
the US economy would spiral towards a second recession in less than four years, i.e., right after
that related to the subprime crisis (discussed in Section 13.6). As we know from Chapter 12,
recession fears translate into an expectation that future short rates will be lowered as a result
of the FED's attempt to stimulate growth, whence the Treasury rally.
Therefore, the rally of US Treasuries at the time might merely reflect the expectation
hypothesis that future rates would be lowered. In this period of bad news (which also included
adverse developments regarding the debt crisis in Europe), the Standard & Poor's downgrade might
have come as yet another negative signal about the general health of the US system, which
would further deteriorate the general investment climate. Therefore, a rally in US Treasuries
could be explained by both flight-to-quality effects (i.e., an uncertainty premium about the
occurrence of possible disorderly tail events) and the market expectations about the FED response
to particularly severe economic developments.
The model in this section suggests one additional potential channel conducive to the equity
crash and the rally in US debt after the Standard & Poor's downgrade decision. What happens
to the price of a stock and a bond, after an adverse shock hits the fundamentals? According
to Merton's model, they both fall after a decrease in $A_0$. But which of the two prices will drop
more? After all, debt is less risky than stocks (due to its seniority), and bad news about the
fundamentals should affect stocks more than bonds.
Footnote 4: In [...]
The next picture depicts the price of bonds and stocks predicted by Merton's model.
Naturally, Merton's model is a very rough approximation to the events we are discussing
in this section. These events relate to sovereign debt, not firms' debt! At the same time, this
model can shed some light on these events. It predicts that bond prices do not move too much
when the probability of default is small (i.e. when $A_0$ is large enough), which might roughly
correspond to the situation where an agency announces a name to be downgraded from AAA
to AA+. Instead, stock prices fall, and substantially, due to convexity, following an increase
in the probability of default (which occurs when $A_0$ falls in Merton's model).
[Figure: values of the bond and of the stock (vertical axis, from 0.0 to 1.0) against the asset value $A_0$ (horizontal axis, from 0.0 to 2.0).]
The solid line depicts the value of the bond and the dashed line depicts that of the stock, as predicted by Merton's model, when the nominal value of debt is $N = 1$, and $T = 1$, $r = 3\%$ and $\sigma = 20\%$, annualized.
While Merton's model obviously does not address the joint behavior of stock and sovereign bond prices, it makes a sharp prediction regarding stock and bond prices for a firm. It is an open issue whether these predictions would also apply to market developments related to sovereigns. Naturally, the informal arguments of this section do not aim to rule out additional explanations around the 05-08 events, such as flight-to-quality effects and the expectation hypothesis. Rather, they suggest one additional hypothesis: even absent flight-to-quality effects or the expectation hypothesis, bond prices should not fall substantially, as long as the probability of default for a name remains very small.
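The asymmetry between bond and stock reactions can be checked numerically. The following is a minimal sketch, assuming the parameters of the picture above (nominal debt 1, maturity 1 year, short-term rate 3%, asset volatility 20%) and the standard Merton decomposition where the stock is a call option on the assets and the bond is the residual claim. The function names and the size of the shock (assets falling from 2.0 to 1.8) are illustrative choices, not part of the original text.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def merton_values(a0, n=1.0, t=1.0, r=0.03, sigma=0.20):
    """Return (bond, stock) values in Merton's model for asset value a0."""
    d1 = (log(a0 / n) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    # The stock is a European call on the assets, struck at the nominal debt.
    stock = a0 * norm_cdf(d1) - n * exp(-r * t) * norm_cdf(d2)
    # The bond is what is left of the assets once the stock is accounted for.
    bond = a0 - stock
    return bond, stock


# Hypothetical adverse shock while default is still unlikely.
bond_hi, stock_hi = merton_values(2.0)
bond_lo, stock_lo = merton_values(1.8)
bond_drop = bond_hi - bond_lo
stock_drop = stock_hi - stock_lo
print(f"bond drop:  {bond_drop:.6f}")
print(f"stock drop: {stock_drop:.6f}")
```

Under these assumptions, nearly all of the fall in the asset value is borne by the stock, while the bond price is almost unaffected.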
13.3.1.5 First passage
The timing of default can be triggered by some exogenously specified events. For example, default occurs if the value of the assets hits some exogenous lower bound even before the expiration of debt. These models are known as first passage models, because they rely on mathematical techniques that solve for the probability distribution of the first time the asset value hits some exogenous barrier, as in Black and Cox (1976). It is an interesting case of how asset evaluation theories can develop. The evaluation methods in this section mostly rely on the option-pricing toolkit available since Black & Scholes and Merton. However, the first passage approach by Black and Cox has inspired further work on the design and pricing of derivatives with barriers (i.e. barrier options), such as down (or up-) and-out and down (or up-) and-in options.
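To make the first passage idea concrete, the sketch below computes the risk-neutral probability that the asset value touches a constant barrier before a given horizon. It assumes the assets follow a geometric Brownian motion with drift equal to the short-term rate, and uses the standard reflection-principle formula for the running minimum of a Brownian motion with drift. This is a simplification: Black and Cox (1976) allow for a time-dependent barrier. The function name and all parameter values are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def first_passage_prob(a0, barrier, horizon, r=0.03, sigma=0.20):
    """Risk-neutral probability that assets starting at a0 touch `barrier`
    (assumed below a0) at some time before `horizon`."""
    mu = r - 0.5 * sigma ** 2      # drift of the log-asset value
    b = log(barrier / a0)          # log-distance to the barrier (negative)
    s = sigma * sqrt(horizon)
    return (norm_cdf((b - mu * horizon) / s)
            + exp(2.0 * mu * b / sigma ** 2) * norm_cdf((b + mu * horizon) / s))


p_1y = first_passage_prob(1.5, 1.0, 1.0)
p_5y = first_passage_prob(1.5, 1.0, 5.0)
print(f"1-year hitting probability: {p_1y:.4f}")
print(f"5-year hitting probability: {p_5y:.4f}")
```

The hitting probability is larger than the probability of ending below the barrier at the horizon, which is the only default event in Merton's model.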
13.3.1.6 Strategic defaulting
The timing of default can actually be endogenous. We now analyze a simple model in which equity holders choose a defaulting barrier (i.e. the firm asset value that triggers bankruptcy) so as to maximize equity value. Naturally, strategic defaulting cannot arise under the assumptions underlying the Modigliani-Miller theorem. This section analyzes the following mechanism. The firm issues debt and needs to ensure coupon payments for this debt. Issuing debt adds value as debt acts as a tax-shielding device. However, issuing debt exposes the firm to default, which triggers bankruptcy costs (e.g., legal expenses). The presence of bankruptcy costs prevents the firm from only issuing debt. Shareholders finance debt coupons by continuously raising equity capital whenever the firm cash flows are not sufficient to honour the coupon payments. The firm cash flows include both asset performance and tax benefits associated with the issuance of debt.
Now, there is a possibility of bankruptcy open to the firm, which occurs (endogenously) when the equity holders consider that the value of the assets is too small to warrant them a lifetime positive expected return. Specifically, in bad times, when the assets perform poorly, shareholders do not necessarily liquidate the firm, because there might be chances that the assets could perform better in the future. However, should the assets value become small enough, the equity holders will stop paying the coupons and will liquidate the firm. Naturally, equity holders choose the value of the asset that triggers bankruptcy to maximize the value of equity.
The model we analyze is developed by Leland (1994, Section VI.B), and extended to one with finite maturity debt by Leland and Toft (1996). Anderson and Sundaresan (1996) consider cases of debt re-negotiation. [In progress: Literature review far from being completed]
Leland's model considers liquidation of the firm as a strategic choice of the equity holders, as explained. In fact, the US bankruptcy code includes both a liquidation process (Chapter 7) and a reorganization process (Chapter 11), but Leland's model only considers the firm's liquidation at bankruptcy. Broadie, Chernov and Sundaresan (2007) generalize this setting to one where the firm may choose to default through a reorganization process, in which case no equity is issued to honour debt services as in the model analyzed in this section.
Formally, the terms leading to strategic defaulting are as follows. First, to model the firm cash flows, we relax the assumption that the value of the assets, $A$, is solution to Eq. (13.1). Instead, we assume that the firm's instantaneous cash flows are equal to a constant fraction $\zeta$ of the assets $A$, such that, generalizing Eq. (13.1),5
$$\frac{dA_\tau}{A_\tau} = (r - \zeta)\,d\tau + \sigma\,d\tilde{W}_\tau$$
Second, debt is infinitely lived, in that it pays off an instantaneous coupon equal to $C$, forever, conditionally upon survival; in the absence of default risk, the value of debt would simply equal $C/r$. Third, tax benefits are assumed to be proportional to the coupon, $\varrho C$. Fourth, there are bankruptcy costs: if the firm defaults at $A = A_B$, recovery is $(1-\alpha) A_B$. Equity holders choose $A_B$. Naturally,
0.
5 In particular, it is straightforward to check that the evaluation formula in Footnote 2 of this chapter collapses to $A_t$, once the instantaneous cash flows equal $\zeta A_\tau$.
The value of debt is a function of the firm asset value, $A$, say $D(A)$. Moreover, the firm finances the net cost of the coupon by issuing additional equity, as explained above, and until the equity value is zero, i.e. until $A = A_B$, as seen below. That is, bankruptcy occurs when the firm cannot meet the instantaneous coupon payments. Under the risk-neutral probability, the value of debt satisfies:
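$$\underbrace{\frac{E[dD(A_\tau) \mid A_\tau]}{d\tau}}_{=\text{Expected capital gains}} + \underbrace{C}_{=\text{coupon}} = r D(A_\tau) \qquad (13.10)$$
The solution is
$$D(A) = (1 - p_B(A))\frac{C}{r} + p_B(A)(1-\alpha)A_B, \quad \text{where } p_B(A) \equiv \left(\frac{A}{A_B}\right)^{-\gamma} \qquad (13.11)$$
with $\gamma$ a positive constant.6 Because the value of equity, $E(A)$, and the value of debt add up to the value of the firm, we have that
$$E(A) + D(A) = \text{Firm value} = A + (1 - p_B(A))\frac{\varrho C}{r} - p_B(A)\,\alpha A_B \qquad (13.12)$$
Summing up,
$$E(A) = A - (1 - p_B(A))(1-\varrho)\frac{C}{r} - p_B(A) A_B$$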
Equity equals (i) the firm asset value, $A$; minus (ii) the present value of debt contingent on no-bankruptcy, net of tax benefits, $(1 - p_B(A))(1-\varrho)\frac{C}{r}$; minus (iii) the present value of debt contingent on bankruptcy, and net of bankruptcy costs, $p_B(A) A_B$. The second term decreases with the default boundary, $A_B$ or, equivalently, $p_B(A)$. The third term, instead, increases with $A_B$. So the time equity-holders wait before declaring bankruptcy, which is inversely related to $A_B$, affects these two terms in opposite ways.
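6 Notably, $\gamma = \sigma^{-2}\left(r - \zeta - \frac{1}{2}\sigma^2 + \sqrt{\left(r - \zeta - \frac{1}{2}\sigma^2\right)^2 + 2\sigma^2 r}\right)$.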
Equity-holders choose $A_B$ to maximize the value of equity. The solution is a default boundary, $A_B$, such that the value of equity does not change for small changes in the value of the assets around $A_B$, or $A_B: E'(A)|_{A = A_B} = 0$, a smooth pasting condition, as argued below. The result is:
$$A_B = \frac{(1-\varrho)C}{r}\,\frac{\gamma}{1+\gamma} \qquad (13.13)$$
Similarly as in the American option case, the value of the option to wait can be shown to be increasing with uncertainty, $\sigma^2$. Finally, it is easy to check that this solution for $A_B$ does7 maximize the value $E(A; A_B)$ in that $A_B: 0 = \frac{\partial}{\partial A_B} E(A; A_B)$. In other words, the equity holders have access to an instantaneous dividend given by $\zeta A - C + \varrho C$, such that their optimization program is
$$E(A_t) = \sup_{\tau_B} E_t\left[\int_t^{\tau_B} e^{-r(s-t)}\left(\zeta A_s - (1-\varrho)C\right)ds\right]$$
where $\tau_B$ denotes the time of bankruptcy. The solution to this real option problem is that described by Eq. (13.13). That is, the shareholders are willing to accept some temporary negative values of the dividends (and in this case, they would inject new equity capital to make sure debt is honoured), although they would then declare bankruptcy as soon as the assets value reaches the threshold level $A_B$ in (13.13) such that the expected net present value of the dividends, and, hence, equity, $E$, is zero.
How is it that tax shielding incentives do not seem to affect the existence of a solution to this problem? That is, the default boundary, $A_B$, is well-defined even with $\varrho = 0$. In fact, if $\varrho = 0$, there are no reasons to issue debt in the first place: with $\varrho = 0$, equity value is negative at bankruptcy level $A_B$ in Eq. (13.13). In fact, when $\varrho > 0$, there is a level of leverage that maximizes the value of the firm, according to simulations reported in Leland (1994). Finally, note that the solution for $A_B$ is independent of $\alpha$. However, as noted, without bankruptcy costs, the firm would only issue debt.
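The smooth pasting condition behind Eq. (13.13) can be verified numerically. The sketch below is only illustrative and relies on assumed notation: `zeta` is the cash flow fraction, `tax` the proportional tax benefit, `coupon` the instantaneous coupon, and `gamma` the positive root of the quadratic equation associated with a geometric Brownian motion with drift `r - zeta`. The equity function is the three-term decomposition discussed above (asset value, minus debt contingent on no-bankruptcy net of tax benefits, minus the bankruptcy term). All parameter values are hypothetical.

```python
from math import sqrt


def leland_exponent(r, zeta, sigma):
    """Positive exponent gamma such that (A / A_B) ** (-gamma) is the present
    value of one unit of numeraire paid when assets first hit A_B."""
    m = r - zeta - 0.5 * sigma ** 2
    return (m + sqrt(m ** 2 + 2.0 * sigma ** 2 * r)) / sigma ** 2


def default_boundary(coupon, r, tax, gamma):
    """Default boundary chosen by equity holders, as in Eq. (13.13)."""
    return (1.0 - tax) * coupon / r * gamma / (1.0 + gamma)


def equity_value(a, a_b, coupon, r, tax, gamma):
    """Equity value for asset value a, given a default boundary a_b."""
    p = (a / a_b) ** (-gamma)
    return a - (1.0 - p) * (1.0 - tax) * coupon / r - p * a_b


# Hypothetical parameter values.
r, zeta, sigma, coupon, tax = 0.03, 0.02, 0.20, 0.05, 0.35
gamma = leland_exponent(r, zeta, sigma)
a_b = default_boundary(coupon, r, tax, gamma)

# One-sided numerical derivative of equity at the boundary (smooth pasting).
h = 1e-6
slope = (equity_value(a_b + h, a_b, coupon, r, tax, gamma)
         - equity_value(a_b, a_b, coupon, r, tax, gamma)) / h
print(f"gamma = {gamma:.4f}, A_B = {a_b:.4f}, slope at A_B = {slope:.2e}")
```

With these numbers, equity is zero and flat at the boundary, and moving the boundary in either direction lowers the value of equity at any higher asset level.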
13.3.1.7 Pros and cons of structural approaches to risky debt assessment
Pros. First, they allow us to think about more complicated structures or instruments easily (e.g., convertibles, as we see in the next section). Second, they lead to simple yet consistent relations between different securities issued by the same name. Structural approaches were very useful for theoretical research during the 1990s.
Cons. The firm asset value and asset volatility are not observed, so that one must rely on calibration or estimation methods. Bond prices generated by the model differ from market prices. These models are therefore a bit difficult to use in practice, for trading or hedging purposes, as we know that in this case we need theoretical prices that exactly match market prices. Finally, how do we go for sovereign issuers?
Most important. Structural models predict unrealistically low short-term spreads: see, e.g., Figure 13.3. The intuition is that diffusion processes are smooth: the probability of default tends to zero as time to maturity approaches zero, because default cannot just jump in an unexpected way. This is not exactly what we observe. Jumps seem to be a more realistic device to model spreads, and will be introduced in Section 13.3.3.
7 More formally, it is easy to see that the rm value is maximized by setting
as small as possible. This maximization process
would imply that ( )
0 for all
(1
)
0. But limited liability requires that ( )
0 for all
.
,
(1
) , then, 00 ( )
It is easy to see that if
consistent with positive equity value is
: 0 ( )| =
737
( )|
out
We assume that the market value of the firm is equal to the value of its assets, $A$, which is a Geometric Brownian motion, as in Eq. (13.1). Let $\mathrm{conv}(A_t, T-t; N)$ be the aggregate value of the convertible bond with time to maturity $T-t$ and face value $N$. To simplify the presentation, we do not consider callability issues. However, we shall provide some intuition about this issue later. Let us assume that the stocks and the convertible bonds are the only two claims in the capital structure of the firm. Since, after conversion, only the stocks will remain, the post-conversion value of the convertible bonds is simply the conversion value of the convertible,
8 Strictly speaking, the option embedded into this kind of asset is a warrant, not an option. A warrant gives the holder a right to purchase new shares, i.e. shares issued by the firm.
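With $\gamma$ denoting the dilution factor, the value of the convertible satisfies the following no-arbitrage bounds:
$$\gamma A_t \leq \mathrm{conv}(A_t, T-t; N) \leq A_t \qquad (13.15)$$
The first inequality in (13.15) is simple to understand. Indeed, suppose that $\mathrm{conv}(A_t, T-t; N) < \gamma A_t$. Then, we can purchase the convertibles, convert them into shares and, finally, sell the shares for $\gamma A_t$. The second inequality follows by the limited liability of the equity holders, and the Modigliani-Miller theorem.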
At maturity,
$$\mathrm{conv}(A_T, 0; N) = \min\{A_T, \max\{N, \gamma A_T\}\} \qquad (13.16)$$
Indeed, $\max\{N, \gamma A_T\}$ is the value of the convertible, in case of no-default. Then, $\min\{A_T, \max\{N, \gamma A_T\}\}$ is what the firm will pay to the bond-holders: $A_T$ in case of default, and $\max\{N, \gamma A_T\}$ in case of no-default.
We can re-express the terminal payoff in Eq. (13.16) in a manner that allows for a better understanding of the issues underlying the exercise of the convertibles. In particular, we have that,
$$\mathrm{conv}(A_T, 0; N) = \min\{A_T, \max\{N, \gamma A_T\}\} = \max\{\gamma A_T, \min\{A_T, N\}\} \qquad (13.17)$$
Indeed, let $\min\{A_T, N\}$ be the payment the firm is ready to supply if the bond-holders do not convert. Then, $\max\{\gamma A_T, \min\{A_T, N\}\}$ is obviously the payoff profile to the bond-holders.
The terminal payoff in Eq. (13.17) illustrates very clearly that convertible bonds embed an option to convert, on top of the plain vanilla non-convertible bond. Intuitively, at maturity, a non-convertible bond is worth $\min\{A_T, N\}$, and the option to convert is either worthless (in case of non conversion) or worth $\gamma A_T - N$ (in case of conversion), i.e. it is $\max\{\gamma A_T - N, 0\}$. This intuition is confirmed, mathematically, as we have that:
$$\max\{\gamma A_T, \min\{A_T, N\}\} = \min\{A_T, N\} + \max\{\gamma A_T - N, 0\}$$
such that
$$\mathrm{conv}(A_T, 0; N) = \min\{A_T, N\} + \max\{\gamma A_T - N, 0\} \qquad (13.18)$$
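One can show that it is not optimal to exercise the option to convert before maturity. Therefore, to price the convertible bond, we only need to deal with the terminal payoff in Eq. (13.18).
Eq. (13.18) shows that the current value of the convertible bond is the sum of the value of a straight bond plus the value of $\gamma$ options on the firm with strike price equal to $N/\gamma$. Accordingly, let $B(A_t, T-t; N)$ and $O(A_t, T-t; N/\gamma)$ be the prices of the straight bond and the option on the firm. We have,
$$\mathrm{conv}(A_t, T-t; N) = B(A_t, T-t; N) + \gamma\, O(A_t, T-t; N/\gamma) \qquad (13.19)$$
We may use Merton's (1974) model to find the price of the straight bond, $B(A_t, T-t; N)$. By the results in Section 13.2, it is:
$$B(A_t, T-t; N) = A_t \Phi(-d_1) + N e^{-r(T-t)} \Phi\left(d_1 - \sigma\sqrt{T-t}\right), \quad d_1 = \frac{\ln(A_t/N) + \left(r + \frac{1}{2}\sigma^2\right)(T-t)}{\sigma\sqrt{T-t}} \qquad (13.20)$$
where $\sigma$ is the instantaneous volatility of the assets, $r$ is the (constant) instantaneous short-term rate, and $\Phi$ is the cumulative distribution of a standard normal. Similarly, we may use the Black-Scholes formula to compute the function $O$:
$$O(A_t, T-t; N/\gamma) = A_t \Phi(\hat{d}_1) - \frac{N}{\gamma} e^{-r(T-t)} \Phi\left(\hat{d}_1 - \sigma\sqrt{T-t}\right) \qquad (13.21)$$
where $\hat{d}_1$ is defined as $d_1$ in Eq. (13.20), with $N$ replaced by $N/\gamma$.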
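The decomposition of the convertible into a straight bond plus options on the firm can be sketched numerically. The code below assumes the parameter values of Figure 13.6 (short-term rate 3%, asset volatility 0.20, three years to maturity, dilution factor 30%, nominal debt 1), prices the straight bond with Merton's formula and the option with the Black-Scholes formula, and checks the no-arbitrage bounds in Eq. (13.15). Function and argument names are illustrative only.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_on_assets(a, strike, tau, r, sigma):
    """Black-Scholes price of a European call on the firm's assets."""
    d1 = (log(a / strike) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return a * norm_cdf(d1) - strike * exp(-r * tau) * norm_cdf(d2)


def straight_bond(a, n, tau, r, sigma):
    """Merton's price of a zero coupon bond with nominal value n."""
    d1 = (log(a / n) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return a * norm_cdf(-d1) + n * exp(-r * tau) * norm_cdf(d2)


def convertible(a, n=1.0, tau=3.0, r=0.03, sigma=0.20, dilution=0.30):
    """Straight bond plus `dilution` calls struck at n / dilution."""
    bond = straight_bond(a, n, tau, r, sigma)
    option = call_on_assets(a, n / dilution, tau, r, sigma)
    return bond + dilution * option


for a in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(f"A = {a:5.1f}  straight = {straight_bond(a, 1.0, 3.0, 0.03, 0.20):.5f}"
          f"  convertible = {convertible(a):.5f}")
```

For large asset values the straight bond flattens out to the discounted nominal value, whereas the convertible approaches its conversion value.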
[Figure: values of the convertible bond and of the straight bond (vertical axis, from 0.0 to 2.0) against the current asset value.]
FIGURE 13.6. The value of convertible and straight bonds as a function of the current asset value, $A$, when the short-term rate $r = 3\%$, the asset volatility $\sigma = 0.20$, time to maturity $T - t = 3$ years, the dilution factor $\gamma = 30\%$, and nominal debt $N = 1$. The solid line depicts the value of the convertible bond. The dashed line starting from the origin, and flattening out to the constant $N e^{-r(T-t)} = 0.91393$, is the value of the straight bond. The two dashed straight lines starting from the origin are the no-arbitrage bounds $A$ and $\gamma A$ in Eq. (13.15).
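FIGURE 13.7. [Figure: path of the jump counter $N_t$ against time, with the dates $t_0$, $t_1$, $t_2$, $t_3$ and the default event marked.]
$$\Pr\{k \text{ jumps over } [0, T]\} = \frac{(\lambda T)^k}{k!} e^{-\lambda T} \qquad (13.22)$$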
We rely on these heuristic calculations and derive a few basic properties of default. We have,
$$\Pr\{\text{Survival by } T\} = \Pr\{0 \text{ jumps over } [0, T]\} = e^{-\lambda T}$$
$$\Pr\{\text{Default by } T\} = \Pr\{\text{at least one jump over } [0, T]\} = 1 - \Pr\{\text{Survival by } T\} = 1 - e^{-\lambda T}$$
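in that at $T$, the bond is equal to either the recovery value (in case of default over any time before $T$) or the nominal value (in case of survival). In this case we have, by Eq. (13.6), that:
$$P_0 = e^{-rT}\left[\mathrm{Rec} \cdot Q(\text{Default}) + N \cdot Q(\text{Survival})\right] \qquad (13.23)$$
where Rec is the expected recovery value of the asset. Using the probabilities predicted by the Poisson model, we obtain:
$$P_0 = e^{-rT}\left[\mathrm{Rec}\left(1 - e^{-\lambda T}\right) + N e^{-\lambda T}\right] \qquad (13.24)$$
such that, for small $T$,
$$\text{Spread} = -\frac{1}{T}\ln\left[1 - \left(1 - \frac{\mathrm{Rec}}{N}\right)\left(1 - e^{-\lambda T}\right)\right] \approx \lambda \cdot \text{LGD}, \quad \text{where LGD} \equiv 1 - \frac{\mathrm{Rec}}{N}.$$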
Note that in contrast to the structural models reviewed in Section 13.3.1, the spread is not zero when $T$ is small. Rather, it is given by the expected default loss per period, defined as the instantaneous probability of default times LGD,
Short-Term Spread = $\lambda \cdot$ LGD.
Therefore, models with jumps have the potential to explain the empirical behavior of credit spreads at short maturities discussed in Section 13.3.1. As explained, structural models, being typically driven by Brownian motions, cannot lead to positive spreads at very short maturities, as they imply that the probability of default decays quickly as time-to-maturity goes to zero. Instead, models with jumps predict a possibility that the firm can experience a "sudden death": default can occur with positive probability at any time, even when the debt is about to expire.
A theoretical model of Duffie and Lando (2001) shows how a structural model of the firm can lead to positive short-term spreads, once we assume incomplete information and learning about the asset value. In their model, learning takes place with some delay, which leaves investors concerned about what they really know about the firm asset value. It is this concern that leads to positive credit spreads in their model, to an extent comparable to that generated by a jump process.
Figure 13.8.1 depicts the behavior of the spread predicted by the model at all maturities, given by,
$$\text{Spread} = -\frac{1}{T}\ln\frac{P_0}{N} - r = -\frac{1}{T}\ln\left[\frac{\mathrm{Rec}}{N}\left(1 - e^{-\lambda T}\right) + e^{-\lambda T}\right]$$
[Figure: spread (in basis points, vertical axis from 231 to 240) against time to maturity.]
FIGURE 13.8.1. The term structure of bond spreads (in basis points) implied by an intensity model, with recovery rate equal to 40% and intensity equal to $\lambda = 0.04$, implying an expected time-to-default equal to $\lambda^{-1} = 25$ years.
equal 400 basis points, i.e. the default intensity, $\lambda$. When the dissipation rate is 0.03, which is lower than $\lambda = 0.04$, they are higher and when the dissipation rate is 0.013, they are lower. In fact, when the dissipation rate is 0.013, the term structure of the spreads is even hump-shaped, a feature not visible from the picture. As is clear, this very simple model predicts features of both short-term and long-term spreads that Merton's model in Section 13.3.1.1 cannot, realistically.
[Figure: spread (in basis points, vertical axis from 240 to 310) against time to maturity (horizontal axis, from 0 to 20).]
FIGURE 13.8.2. The term structure of bond spreads (in basis points) implied by an intensity model with recovery rate equal to $0.40\,e^{-\kappa T}$, where $T$ is time to maturity and $\kappa$ is the recovery dissipation rate, taken to equal $\kappa = 0.05$ (solid line), $\kappa = 0.03$ (dashed line), and $\kappa = 0.013$ (dotted line). The instantaneous probability of default is taken to equal $\lambda = 0.04$, implying an expected time-to-default equal to $\lambda^{-1} = 25$ years.
We can interpret this behavior of term spreads as follows. Suppose we are in good times, when $\lambda$ is small relative to the dissipation rate. Precisely because we are in good times, we might expect things to change adversely in the future, which is captured by a large value of the dissipation rate. In this case, the term structure of spreads is increasing. Instead, in bad times, when $\lambda$ is large compared to the dissipation rate, we might expect future times to improve, which we might model by assuming the dissipation rate is small.
Figure 13.8.2 shows that, in bad times, long maturity spreads are smaller than in good times. Naturally, we would expect that spreads should increase for any maturity in bad times, although this property is not captured by the numerical examples in Figure 13.8.2, where we fix $\lambda = 0.04$. The point of this exercise is to show that the slope of the term structure of the spreads lowers as we enter bad times, when we only consider changes in the dissipation rate. Allowing for a countercyclical $\lambda$ would reinforce the conclusions of this exercise. While these conclusions rely on comparative statics, Section 13.5.5.5 shows that they still hold in a dynamic context, where the intensity follows a mean-reverting continuous-time model.
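The term structures of spreads in Figures 13.8.1 and 13.8.2 can be reproduced with a few lines. The sketch below assumes a unit nominal value, an intensity of 0.04, a recovery rate of 40% that is either constant or decays exponentially with time to maturity at a given dissipation rate (the exponential form is an assumption made for this illustration), and defines the spread as minus the log of the expected payoff, divided by maturity. The function name and its arguments are hypothetical.

```python
from math import exp, log


def spread_bp(maturity, intensity=0.04, rec0=0.40, dissipation=0.0):
    """Spread in basis points for a given time to maturity (in years)."""
    rec = rec0 * exp(-dissipation * maturity)   # recovery rate at this maturity
    survival = exp(-intensity * maturity)       # risk-neutral survival probability
    expected_payoff = rec * (1.0 - survival) + survival
    return -1e4 * log(expected_payoff) / maturity


for t in (0.25, 1.0, 5.0, 10.0, 20.0):
    print(f"T = {t:5.2f}  constant recovery: {spread_bp(t):7.2f}"
          f"  dissipation 0.05: {spread_bp(t, dissipation=0.05):7.2f}")
```

With a constant recovery rate, spreads start at the intensity times the loss-given-default (240 basis points here) and decline slowly with maturity. With a dissipation rate above the intensity, they increase towards the intensity itself.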
13.3.3.3 One example
Naturally, the intensity, $\lambda$, is the risk-neutral instantaneous probability of default, not the physical probability of default, $\lambda^P$ say. The ratio $\lambda/\lambda^P$ is generally larger than one. Its inverse, $\lambda^P/\lambda$, is an indicator of the risk-appetite in the credit market. Similarly, LGD is an expectation under the risk-neutral probability, and should contain useful indications about market participants' risk appetite.
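Assume that under the risk-neutral probability, the instantaneous intensity of default for a given firm is $\lambda = 4\%$, annualized, and that under the physical probability, the instantaneous probability of default for the same firm is $\lambda^P = 2\%$, annualized. From here, we can compute the probability of survival of the firm within 5 years, under both probabilities. They are:
$$Q(\text{Survival}) = e^{-5 \times 0.04} = 0.81873, \qquad P(\text{Survival}) = e^{-5 \times 0.02} = 0.90484$$
Accordingly, the price and the spread of a five-year bond with unit nominal value are:
$$P_0 = e^{-5r}\left[\mathrm{Rec} \cdot Q(\text{Default}) + Q(\text{Survival})\right] = e^{-5r}\left[\mathrm{Rec}\,(1 - 0.81873) + 1 \times 0.81873\right]$$
$$\text{Spread} = -\frac{1}{5}\ln\left[\mathrm{Rec}\,(1 - 0.81873) + 1 \times 0.81873\right]$$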
13.3.4 Ratings
In practice, corporate debt is rated by rating agencies, such as Moody's and Standard and Poor's. Depending on the rating, corporate debt may be either investment grade or non-investment grade (junk). Moody's ratings range from Aaa to C. Standard and Poor's ratings range from AAA to D. One can compute the probability of migrations based on past experience: transition probabilities. Consider, for example, the following table:
One year rating transition probabilities (%), S&P's 1981-1991

                                    To
From     AAA     AA      A       BBB     BB      B       CCC     D
AAA      89.1    9.63    0.78    0.19    0.3     0       0       0
AA       0.86    90.1    7.47    0.99    0.29    0.29    0       0
A        0.09    2.91    88.94   6.49    1.01    0.45    0       0.09
BBB      0.06    0.43    6.56    84.27   6.44    1.6     0.18    0.45
BB       0.04    0.22    0.79    7.19    77.64   10.43   1.27    2.41
B        0       0.19    0.31    0.66    5.17    82.46   4.35    6.85
CCC      0       0       1.16    1.16    2.03    7.54    64.93   23.19
D        0       0       0       0       0       0       0       100

TABLE 13.1
13.3.4.1 Foundations
A natural approach is to assess credit risk by making reference to probabilities of default built up on transition probabilities like those in Table 13.1.
Such an approach, also known as a migration approach, is somewhat less drastic than that based on rare events, and hopefully more realistic. However, it is also technically more complex than the intensity approach of the previous section. We provide the most foundational issues of this approach, leaving some details in the Appendix.
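At time $t$, there exist several rating classes, $K$ say, denoted as $\mathrm{Rat}_t$, $\mathrm{Rat}_t \in \{1, 2, \ldots, K\}$. The transition probabilities are,
$$P(T-t)_{ij} \equiv \Pr(\mathrm{Rat}_T = j \mid \mathrm{Rat}_t = i) \geq 0 \quad \text{and} \quad \sum_{j=1}^{K} P(T-t)_{ij} = 1$$
where $P(T-t)_{ij}$ only depends on $T - t$.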
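For example, the probability of transition from rating $\mathrm{Rat}_t = i$ to rating $\mathrm{Rat}_{t+1} = j$ in one year is $P(1)_{ij}$. Table 13.1 contains one possible example of $P(1)$. The probability of transition from rating $\mathrm{Rat}_t = i$ to rating $\mathrm{Rat}_{t+2} = j$ in two years is $P(2)_{ij}$, and is obtained as follows,
$$P(2)_{ij} = \sum_{k=1}^{K} \underbrace{P(1)_{ik}}_{\Pr(\text{transition from } i \text{ to } k \text{ in one year})} \; \underbrace{P(1)_{kj}}_{\Pr(\text{transition from } k \text{ to } j \text{ in one year})}$$
With entries in percent, and rows and columns ordered as AAA, AA, A, BBB, BB, B, CCC, D,
$$P(1) = \begin{pmatrix}
89.1 & 9.63 & 0.78 & 0.19 & 0.3 & 0 & 0 & 0 \\
0.86 & 90.1 & 7.47 & 0.99 & 0.29 & 0.29 & 0 & 0 \\
0.09 & 2.91 & 88.94 & 6.49 & 1.01 & 0.45 & 0 & 0.09 \\
0.06 & 0.43 & 6.56 & 84.27 & 6.44 & 1.6 & 0.18 & 0.45 \\
0.04 & 0.22 & 0.79 & 7.19 & 77.64 & 10.43 & 1.27 & 2.41 \\
0 & 0.19 & 0.31 & 0.66 & 5.17 & 82.46 & 4.35 & 6.85 \\
0 & 0 & 1.16 & 1.16 & 2.03 & 7.54 & 64.93 & 23.19 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 100
\end{pmatrix}$$
$$P(15) = \begin{pmatrix}
20.01 & 35.82 & 23.91 & 9.92 & 4.05 & 3.06 & 0.43 & 2.66 \\
3.38 & 30.28 & 32.71 & 15.91 & 6.38 & 5.11 & 0.77 & 5.34 \\
1.17 & 13.12 & 34.21 & 21.93 & 9.69 & 8.01 & 1.29 & 10.33 \\
0.64 & 6.76 & 22.21 & 22.40 & 12.42 & 11.93 & 2.09 & 21.39 \\
0.33 & 3.22 & 10.71 & 13.616 & 11.36 & 14.68 & 2.78 & 43.16 \\
0.14 & 1.65 & 5.01 & 6.75 & 7.48 & 13.17 & 2.64 & 63.04 \\
0 & 1.08 & 3.54 & 3.90 & 3.51 & 5.60 & 1.22 & 81.02 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 100
\end{pmatrix}$$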
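Multi-year transition probabilities follow from repeated multiplication of the one-year matrix. The sketch below assumes the one-year probabilities of Table 13.1, rescales each row so that it sums exactly to one (the published percentages are rounded), and computes two-year and fifteen-year matrices in plain Python. The helper names are illustrative, and the output is not claimed to reproduce any published multi-year matrix to the last digit.

```python
RATINGS = ["AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D"]

# One-year transition probabilities (%), rows = from, columns = to (Table 13.1).
P1_PERCENT = [
    [89.1, 9.63, 0.78, 0.19, 0.3, 0, 0, 0],
    [0.86, 90.1, 7.47, 0.99, 0.29, 0.29, 0, 0],
    [0.09, 2.91, 88.94, 6.49, 1.01, 0.45, 0, 0.09],
    [0.06, 0.43, 6.56, 84.27, 6.44, 1.6, 0.18, 0.45],
    [0.04, 0.22, 0.79, 7.19, 77.64, 10.43, 1.27, 2.41],
    [0, 0.19, 0.31, 0.66, 5.17, 82.46, 4.35, 6.85],
    [0, 0, 1.16, 1.16, 2.03, 7.54, 64.93, 23.19],
    [0, 0, 0, 0, 0, 0, 0, 100],
]


def normalize_rows(m):
    """Rescale each row so that it sums to one."""
    return [[x / sum(row) for x in row] for row in m]


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def matrix_power(m, n):
    """n-th power of a square matrix, for n >= 1."""
    result = m
    for _ in range(n - 1):
        result = matmul(result, m)
    return result


P1 = normalize_rows(P1_PERCENT)
P2 = matrix_power(P1, 2)
P15 = matrix_power(P1, 15)

for name, row in zip(RATINGS, P15):
    print(f"{name:>3}: 15-year default probability = {100 * row[-1]:6.2f}%")
```

Because default is an absorbing state, the cumulative probability of default increases with the horizon for every initial rating.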
13.3.4.2 Evaluation
The previous probabilities, $\{P(n)_{ij}\}$, are meant to be taken under the physical world, not the risk-neutral. They can be used for risk-management purposes, but certainly not for pricing. Indeed, historical default rates are too low to explain the price of defaultable securities. A natural explanation relies on the presence of risk-premia. To use migration data for pricing, it is vital to implement a number of steps.
The first step relates to cleaning up the data. For example, it might be that downgrades from class $i$ to class $i + 2$ are more frequent than downgrades from class $i$ to class $i + 1$, an occurrence which we wish to smooth. Moreover, we would need to remove zero entries: although some rating events did not happen in the past, they might well occur in the future. Finally, we need to add positive risk-premia to the previous smoothed data, to recover realistic asset prices.
As for the pricing details, the migration model relies on the assumption that there are $K$ classes of assets. Each single asset may migrate from one class to another. Because evaluation is a dynamic business, we cannot evaluate defaultable securities within a given class without simultaneously evaluating the defaultable securities in the remaining classes. For example, there could be a chance that a given asset will mutate into a different one over the next year (i.e. one belonging to another rating class). Therefore, the price of this asset, today, needs to reflect the price of the asset in the other classes where it can possibly migrate. As a result, we must simultaneously solve for all the asset prices in all the rating classes. This approach, developed by Jarrow, Lando and Turnbull (1997), is quite complex and is given a succinct account in the Appendix.
Consider a simple case, arising when default can occur at any time before maturity $T$, but default implications arise only at maturity, in that the bond pays a recovery value only at $T$, should the firm default at any time prior to $T$. Let $Q_i(T)$ denote the risk-neutral probability the firm defaults, by time $T$, given it belongs to rating $i$ at time $0$. By Eq. (13.6),
$$P_0 = e^{-rT}\left[\mathrm{Rec} \cdot Q_i(T) + N\,(1 - Q_i(T))\right]$$
The risk-neutral probabilities, $Q_i(T)$, must be found using migration frequencies such as those in Table 13.1, which we must clean up and correct with appropriate risk-premia as discussed.
13.3.4.3 One example
                  To
            A       B       Def
From  A     0.90    0.07    0.03
      B     0.15    0.75    0.10
      Def   0       0       1
where Def denotes the state of default. What is the probability that a name A will remain name
A in two years? What is the probability that a name A will default in two years?
Consider the following two year transition matrix:
$$Q(2) = \underbrace{\begin{pmatrix} 0.90 & 0.07 & 0.03 \\ 0.15 & 0.75 & 0.10 \\ 0 & 0 & 1 \end{pmatrix}}_{Q(1)} \underbrace{\begin{pmatrix} 0.90 & 0.07 & 0.03 \\ 0.15 & 0.75 & 0.10 \\ 0 & 0 & 1 \end{pmatrix}}_{Q(1)}$$
such that:
$$\Pr\{\text{A is A in 2 years}\} = 0.90 \times 0.90 + 0.07 \times 0.15 + 0.03 \times 0 = 0.8205$$
and
$$\Pr\{\text{A defaults in 2 years}\} = 0.90 \times 0.03 + 0.07 \times 0.10 + 0.03 \times 1 = 0.064$$
In general, we have that:
$$Q(2)_{ij} = \sum_{k=1}^{3} Q(1)_{ik}\, Q(1)_{kj}, \qquad Q(n) = Q(1)^n = \begin{pmatrix} 0.90 & 0.07 & 0.03 \\ 0.15 & 0.75 & 0.10 \\ 0 & 0 & 1 \end{pmatrix}^n$$
Next, consider the following transition matrix, under the risk-neutral probability:

                  To
            A       B       Def
From  A     0.80    0.20    0
      B     0.15    0.75    0.10
      Def   0       0       1
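From here, we may easily compute, again, the (risk-neutral) probability A will default in two years, and the probability B will default in two years. We have,
$$Q(2) = \underbrace{\begin{pmatrix} 0.80 & 0.20 & 0 \\ 0.15 & 0.75 & 0.10 \\ 0 & 0 & 1 \end{pmatrix}}_{Q(1)} \underbrace{\begin{pmatrix} 0.80 & 0.20 & 0 \\ 0.15 & 0.75 & 0.10 \\ 0 & 0 & 1 \end{pmatrix}}_{Q(1)}$$
such that:
$$\Pr\{\text{A defaults in 2 years}\} = 0.80 \times 0 + 0.20 \times 0.10 + 0 \times 1 = 0.02$$
$$\Pr\{\text{B defaults in 2 years}\} = 0.15 \times 0 + 0.75 \times 0.10 + 0.10 \times 1 = 0.175$$
Consider two zero coupon bonds issued by A and B, with unit nominal value and expiring in two years, and assume that in case of default, the recovery value, equal to 0.30, is paid only at the end of the second period. From here, we can compute the credit spreads for the two bonds. We have,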
$$\text{Price A} = (0.30)(0.02) + (1 - 0.02) = 0.986, \qquad \text{Spread A} = -\frac{1}{2}\ln(0.986) = 7.0495 \times 10^{-3}$$
and,
$$\text{Price B} = (0.30)(0.175) + (1 - 0.175) = 0.8775, \qquad \text{Spread B} = -\frac{1}{2}\ln(0.8775) = 6.5339 \times 10^{-2}$$
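The calculations of this example are easy to replicate. The sketch below takes the risk-neutral one-year matrix above, squares it, and recomputes prices and spreads for names A and B, assuming a recovery value of 0.30, a unit nominal value, and prices expressed relative to the riskless discount factor, as in the formulas above. Variable and function names are illustrative.

```python
from math import log

# Risk-neutral one-year transition matrix; states ordered as A, B, Def.
Q1 = [[0.80, 0.20, 0.00],
      [0.15, 0.75, 0.10],
      [0.00, 0.00, 1.00]]


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def price(q_default, rec=0.30):
    """Expected payoff of a unit bond that pays `rec` in case of default."""
    return rec * q_default + (1.0 - q_default)


def spread(p, maturity=2.0):
    """Continuously compounded spread implied by the expected payoff p."""
    return -log(p) / maturity


Q2 = matmul(Q1, Q1)
q_a, q_b = Q2[0][2], Q2[1][2]          # two-year default probabilities
price_a, price_b = price(q_a), price(q_b)
spread_a, spread_b = spread(price_a), spread(price_b)
print(f"A: default prob = {q_a:.4f}, price = {price_a:.4f}, spread = {spread_a:.6f}")
print(f"B: default prob = {q_b:.4f}, price = {price_b:.4f}, spread = {spread_b:.6f}")
```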
In a total return swap (TRS, henceforth), one party, who owns some asset (the asset underlying the TRS), receives from the counterparty payments based on a mutually agreed rate, either fixed or variable, and makes payments to the counterparty based on the return of the underlying asset, which includes both the income it generates and any capital gains. The underlying asset can be a loan, a bond, an equity index, or a basket of assets. The interest payments are typically based on the LIBOR plus a spread. Consider the following example. Party A receives LIBOR + a fixed spread equal to 3%. Party B receives the total return of the S&P 500 on a principal amount of $1 million. If the LIBOR is 7% and the S&P 500 is up by 12%, A pays B 12% and B pays A 7% + 3%. By netting, A pays B $20,000, i.e. $1 million × (12% − 10%).
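The netting in this example amounts to a single line of arithmetic, sketched below. The function name and its arguments are hypothetical. A positive result is a payment from the party paying the total return (A) to the party receiving it (B), and a negative result is a payment in the opposite direction.

```python
def trs_net_payment(notional, libor, spread, total_return):
    """Net amount paid by the total return payer to the total return receiver."""
    return notional * (total_return - (libor + spread))


# Numbers from the example: LIBOR at 7%, spread of 3%, index up by 12%.
net = trs_net_payment(1_000_000, 0.07, 0.03, 0.12)
print(f"A pays B: ${net:,.0f}")
```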
While TRS are usually categorized as credit derivatives, they combine both market risk and credit risk. The main benefit from going long a TRS is that the party with the asset on the balance sheet buys protection against loss in value. The main benefit from shorting a TRS is that it allows the counterparty to receive the payoffs of the underlying without necessarily having to put this underlying in the balance sheet. Hedge funds find it quite convenient to short a TRS, as this allows them to have views with limited collateral upfront. The market for TRS is over-the-counter and market participants include institutions only.
13.4.1.2 Spread Options
Spread Options (SO, henceforth) are options written on the difference between two indexes. For example, let $S_1(\tau)$ and $S_2(\tau)$ be the prices of two assets at time $\tau$. The payoff promised by a SO entered at some time $t < T$, might be $\max\{S_1(T) - S_2(T) - K, 0\}$, where $K$ is the strike of the SO. A SO can be written on the spread between two rates of returns too. Importantly, a SO can be written on the spread between the yield of a corporate bond and the yield of a Treasury bond. Examples include: (i) NOB spread (notes - bonds), which are spreads between maturities; (ii) Spreads between quality levels, such as the TED spread (Treasury bills - Eurodollars); (iii) MOB spreads, i.e. the difference between municipal bonds and Treasury bonds. More generally, the definition of a SO has now been extended to include payoffs written as a linear combination of indexes, interest rates and yields.
Credit spread options (CSO, henceforth) are options where the payoff is the difference between (i) the spread between two reference securities (say Italian Government bonds and US Government bonds having the same maturity, or the spread between some stock return in excess of the LIBOR, or two credit instruments), and (ii) a given strike spread, for a certain maturity date. A CSO may be an American or a European option. CSOs thus allow one to hedge against, or take specific views about, changes in credit spreads. For example, an investor, while bullish on Italian bonds, might hedge against the uncertain outcome of a political election, which could trigger a widening of short-term spreads of Italian versus US bonds. The investor, then, might go long a CSO, with time to maturity around the days of the political election, where the underlying are the Italian and US Government bonds expiring in ten years, say. A possible payoff to the CSO holder might be proportional to $\max\{\text{ITA-US} - K, 0\}$, where ITA-US is the ten year Italian-US spread in three months, and $K$ is the strike spread.
13.4.2 Credit Default Swaps
13.4.2.1 Single name swaps
TRS provide protection against a general loss in asset value, which could be triggered by either market or credit risk, although it is obviously more often market risk than credit risk that kicks in. Credit Default Swaps (CDS, henceforth) differ from TRS insofar as they provide protection against a credit event.
The premium, assumed to be paid quarterly, on a CDS contract agreed at time $t$, is obtained by equating the expected discounted value of the protection (the floating protection leg), to the expected discounted value of all the premiums paid over the life of the contract (the fixed premium leg), i.e. at dates $T_1, T_2, \ldots, T_{4M}$, where $T_i = t + \frac{i}{4}$, and $M$ is the number of years the CDS refers to. The discounted expected floating protection leg is:
$$\text{Protection}_t = \sum_{i=1}^{4M} e^{-r(T_i - t)}\, \text{LGD}(T_i) \Pr\{\text{Default} \in (T_{i-1}, T_i)\}$$
and the discounted expected fixed premium leg is:
$$\text{Premium}_t = \sum_{i=1}^{4M} e^{-r(T_i - t)}\, \text{CDS}_t(M) \Pr\{\text{Survival at } T_i\}$$
where $r$ is the (constant) risk free rate, $\text{CDS}_t(M)$ is the premium paid every quarter, prevailing at time $t$, and $\text{LGD}(T_i)$ is the LGD at time $T_i$, which for simplicity is assumed to be constant, i.e. known at time $t$.
Equating $\text{Premium}_t$ and $\text{Protection}_t$, and solving for $\text{CDS}_t(M)$, leaves:
$$\text{CDS}_t(M) = \frac{\sum_{i=1}^{4M} e^{-r(T_i - t)}\, \text{LGD}(T_i) \Pr\{\text{Default} \in (T_{i-1}, T_i)\}}{\sum_{i=1}^{4M} e^{-r(T_i - t)} \Pr\{\text{Survival at } T_i\}} \qquad (13.26)$$
are, obviously, the same as those we use to price the bonds underlying the CDS contract. Therefore, there are no-arb relations that link bond prices to CDS premiums, which shall be emphasized later on (see Section 13.4.5.4). This point illustrates in a remarkable way one key difference between finance and insurance. Even if in insurance, one may end up pricing some products through risk-adjusted probabilities, finance is where we typically end up having many more traded risks than in insurance, and these risks are tightly related through no-arb restrictions.
Eq. (13.26) is a general formula we can use, once we have a model determining the risk-neutral probability of default. In this chapter, we implement Eq. (13.26) through a reduced-form approach, which will allow us to find the quarterly premium (or spread) $\text{CDS}_t(M)$ quite easily, as follows.
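We have, denoting again with $\lambda$ the instantaneous probability of default, that $\Pr\{\text{Survival at } T_i\} = e^{-\lambda(T_i - t)}$, and that $\Pr\{\text{Default at any } \tau \in (T_{i-1}, T_i)\} = e^{-\lambda(T_{i-1} - t)} - e^{-\lambda(T_i - t)}$. Intuitively, if the name survives at $T_i$ (event $S_i$), it must necessarily have survived at $T_{i-1}$ (event $S_{i-1}$), but the converse is not true: $S_i \subseteq S_{i-1}$, and the complement of $S_i$ to $S_{i-1}$ is nothing but the event of default between $T_{i-1}$ and $T_i$.9 Substituting the previous probabilities into Eq. (13.26), we find that:
$$\text{CDS}_t(M) = \frac{\sum_{i=1}^{4M} e^{-(r+\lambda)(T_i - t)}\left(e^{\lambda(T_i - T_{i-1})} - 1\right)\text{LGD}(T_i)}{\sum_{i=1}^{4M} e^{-(r+\lambda)(T_i - t)}} \qquad (13.27)$$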
The denominator in the RHS of Eq. (13.27) is the defaultable-PVBP (Present Value of a Basis Point), in perfect analogy with the expressions for the forward swap rate given in Chapter 12. Assuming that $\text{LGD}(T_i)$ is constant and equal to LGD for each $T_i$, then, for a generic $\Delta \equiv T_i - T_{i-1}$, Eq. (13.27) can be simplified to,
$$\text{CDS}_t(M) = \text{LGD}\left(e^{\lambda\Delta} - 1\right) \qquad (13.28)$$
That is, approximately, for small $\Delta$,
$$\text{CDS}_t(M) \approx \lambda\,\Delta \cdot \text{LGD} \qquad (13.29)$$
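9 Indeed, $\Pr\{\text{Default at any } \tau \in (T_{i-1}, T_i)\} = \int_{T_{i-1}}^{T_i} \Pr\{\text{Default at } \tau\}\,d\tau$, where $\Pr\{\text{Default at } \tau\} = \lambda e^{-\lambda(\tau - t)}$.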
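The reduced-form CDS premium can be computed directly from the two legs and compared with the simplified expression in Eq. (13.28). The sketch below assumes quarterly payments, a constant short-term rate, a constant intensity and a constant loss-given-default, and measures time from the inception of the contract. Function and argument names, and the parameter values, are illustrative.

```python
from math import exp


def cds_premium(years, r, lam, lgd, per_year=4):
    """Periodic CDS premium equating the protection and premium legs."""
    dt = 1.0 / per_year
    protection = 0.0   # discounted expected protection payments
    annuity = 0.0      # discounted survival probabilities (defaultable PVBP)
    for i in range(1, per_year * years + 1):
        t_i = i * dt
        discount = exp(-r * t_i)
        p_default = exp(-lam * (t_i - dt)) - exp(-lam * t_i)  # default in (t_i - dt, t_i)
        p_survive = exp(-lam * t_i)
        protection += discount * lgd * p_default
        annuity += discount * p_survive
    return protection / annuity


premium = cds_premium(years=5, r=0.03, lam=0.04, lgd=0.60)
closed_form = 0.60 * (exp(0.04 / 4) - 1.0)
print(f"quarterly premium: {premium:.6f}  closed form: {closed_form:.6f}")
```

Under these assumptions the premium does not depend on the maturity of the contract or on the short-term rate, and it is close to the intensity times the loss-given-default, scaled by the length of the payment period.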
Suppose you go long a CDS, meaning that at time $t$, you commit to a swap agreement: you pay $\text{CDS}_t(M)$ at time $T_i$ if the name survives by time $T_i$, and receive $\text{LGD}(T_i)$, if default occurs in the time interval $[T_{i-1}, T_i]$, for $4M$ time intervals. Each swap payoff (the "CDS-let," so to speak) is:
$$\text{cds}(T_i) \equiv \text{LGD}(T_i)\, I\{\text{Default} \in (T_{i-1}, T_i)\} - \text{CDS}_t(M)\, I\{\text{Survival at } T_i\} \qquad (13.30)$$
such that
$$\text{CDS}_t(M): 0 = \sum_{i=1}^{4M} e^{-r(T_i - t)} E_t(\text{cds}(T_i))$$
where $E_t$ denotes the expectation conditional upon the information set at time $t$, taken under the risk-neutral probability. The solution to this equation is just that in Eq. (13.27).
What happens to the value of this contract at any subsequent time $\tau > t$? The marking to market value of the CDS is the present value of the risk-neutral expectation of the single swaps payments $\text{cds}(T_i)$ in Eq. (13.30), consistently with the explanations in Section 10.4.6 of Chapter 10. So the marking to market value of the CDS at $\tau$ is, with sums taken over the payment dates $T_i > \tau$,
$$\text{MtM}_\tau = \sum_{i} e^{-r(T_i - \tau)} E_\tau(\text{cds}(T_i))$$
$$= \sum_{i} e^{-(r+\lambda)(T_i - \tau)}\left(e^{\lambda(T_i - T_{i-1})} - 1\right)\text{LGD}(T_i) - \text{CDS}_t(M) \sum_{i} e^{-(r+\lambda)(T_i - \tau)}$$
$$= \left[\text{CDS}_\tau - \text{CDS}_t(M)\right] \sum_{i} e^{-(r+\lambda)(T_i - \tau)}$$
where $\text{CDS}_\tau$ is the premium prevailing at $\tau$ for the remaining payment dates, and the last line follows by the definition of $\text{CDS}_\tau$, i.e. by setting $t = \tau$ in Eq. (13.27). These results perfectly match those in Chapter 12 regarding forward swap rates.
Note that in this model, marking-to-market is deterministic, because CDS premiums for a fixed maturity are constant over time, due to the fact that both $r$ and $\lambda$ are constant. Marking-to-market is actually zero once we assume that loss-given-default is constant, as Eq. (13.28) shows that CDS spreads do not change in this case.
13.4.2.3 CDS on indexes, and options based thereon
A CDS index is a basket of credit entities in which the protection buyer pays the same premium
on all the names in the index, until a fixed expiration date. Credit events are typically bound
to bankruptcy or delinquencies. After a credit event, the entity is removed from the index and
the contract goes through with a reduced notional amount, until expiration, as explained in
more detail below.
While CDS on single names are over-the-counter, CDS indexes are standardized and at the
time of writing give rise to relatively more liquid markets, as historical data on bid-ask spreads
show. In fact, it can be cheaper to hedge a portfolio of CDS or bonds with a CDS index than it
would be to buy many CDS to achieve a similar effect. There exist two main indices: (i) CDX
index, which contains North American and Emerging Market companies; and (ii) iTraxx index,
which contains companies from the rest of the world.
Credit default swaptions are options to enter a CDS, typically a CDS index. Consider, first, swaptions on single names. A payer swaption gives the right to buy protection at some future date at some fixed CDS strike, and a receiver swaption gives the right to sell protection. If default of the name occurs prior to the swaption maturity, the contract is terminated. Note that evaluating credit default swaptions is trivial in the pricing context of Section 13.6.3.1, because CDS premiums move deterministically over time when the intensity of default and the short-term rate are both constant. Section 13.6.3.9 hinges upon a continuous-time model of stochastic intensity rates, and supplies an evaluation framework for these products.
Credit default index swaptions work differently. Firstly, as noted, at inception, a credit default index swap (CDIS) is referenced to a number of fixed companies chosen by a market maker, each carrying a given weight. Secondly, buyers of CDIS are typically those who provide protection to market makers: they stand ready to pay a predetermined loss-given-default for any default that occurs before maturity, which is constant and identical for all reference entities in the index. In exchange, the market makers pay the CDIS buyer a periodic fixed premium, the credit default index spread. After a default takes place, the nominal value of the CDIS is reduced by one, and no replacement of the defaulted firm would take place, as further explained in Section 13.6.3.9.
13.4.2.4 Disentangling default probability from risk-aversion
The following picture, taken from Fender and Hordahl (2007), illustrates the behavior of the
credit market risk appetite before the 2007 credit market turmoil.
FIGURE 13.9. Antonio Mele does not claim any copyright on this picture, which is taken
from Fender and Hordahl (2007). The picture has been put here for illustrative purposes
only, and permission to the authors shall be duly asked before the book will be published.
How did the authors estimate the price of risk? Consider the expected losses under the actuarial, or physical probability for a given security. The counterpart to Eq. (13.29), under the physical probability, is:
$$\text{Expected Losses} \approx \text{LGD}\,\lambda^{P}\,\Delta$$
where $\lambda^{P}$ is the physical instantaneous probability of default for a given security. Assume that LGD is constant, to simplify. If investors require compensation for default events, the actuarial losses should be less than the CDS spread, i.e. Expected Losses $<$ CDS, or,
$$\lambda^{P} < \lambda$$
The risk-premium is defined as the difference between the CDS premium and the actuarial losses, Expected Losses,
$$\text{Risk-Premium} = \left(\lambda - \lambda^{P}\right) \text{LGD}\,\Delta$$
The price of risk in Figure 13.9 is defined as the ratio of the CDS spread over Expected Losses,
$$\text{Price-of-Risk} = \frac{\lambda}{\lambda^{P}}$$
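To make the decomposition concrete, the short Python sketch below computes the approximate CDS premium, the expected losses, the risk-premium and the price of risk from a risk-neutral and a physical default intensity. It relies on the small-$\Delta$ approximation of Eq. (13.29); the function name `risk_decomposition` and the numerical inputs are hypothetical, chosen only for illustration.

```python
def risk_decomposition(lam_risk_neutral, lam_physical, lgd, delta=0.25):
    """Approximate per-period CDS premium, expected losses and their gap.

    Uses CDS ~ LGD * lam * delta under each probability (an approximation).
    """
    cds = lgd * lam_risk_neutral * delta
    expected_losses = lgd * lam_physical * delta
    return {
        "cds": cds,
        "expected_losses": expected_losses,
        "risk_premium": cds - expected_losses,
        "price_of_risk": cds / expected_losses,
    }

# Hypothetical inputs: risk-neutral intensity 3.3%, physical intensity 1.1%.
out = risk_decomposition(lam_risk_neutral=0.033, lam_physical=0.011, lgd=0.60)
print(out)
```

In this example the price of risk is about three: investors pay roughly three times the actuarial expected loss for protection.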
Early references to estimation methods are Duffie et al. (2005) and Amato (2005). Typically, Expected Losses are proxied by Moody's KMV's Expected Default Frequencies (EDFs™), obtained through fully specified structural models for credit risk. The next pictures are taken from Amato (2005). As we can see, during the 2003-2005 period, credit spreads were very low, which in turn gave incentives to CDO issuers to look for illiquid and relatively more complex assets to put as collateral, which led to the issuance of CDOs relying on ABS such as MBS, or CDO$^2$, explained below.
FIGURE 13.10. Antonio Mele does not claim any copyright on this picture, which is
taken from Amato (2005). The picture has been put here for illustrative purposes only,
and permission to the author shall be duly asked before the book will be published.
FIGURE 13.11. Antonio Mele does not claim any copyright on this picture, which is
taken from Amato (2005). The picture has been put here for illustrative purposes only,
and permission to the author shall be duly asked before the book will be published.
The following picture illustrates the behavior of CDS indexes during approximately 20 years
before the 2007-2009 credit market turmoil.
FIGURE 13.12. Valuation of Financial Instruments Based on Implied Probability of Default. Antonio Mele does not claim any copyright on this picture, which is taken from
IMF (2008). The picture has been put here for illustrative purposes only, and permission
to the authors shall be duly asked before the book will be published.
We may relax the assumption that the instantaneous intensity of default, $\lambda$, is constant. This intensity is defined under the risk-neutral probability and can change either because the intensity of default under the physical probability changes or because risk-appetite changes, or both. We examine the asset pricing implications of time-varying intensities, by exploring how probabilities of survival change in a simple setting, where we do not single out the reasons leading to variations in $\lambda$.
First, we assume the instantaneous probability of default can only change discretely, giving rise to random intensities $\lambda_i$, meaning that $\lambda_i$ is the intensity of default in the time interval $[T_{i-1}, T_i]$. Let $F_{T_i}$ be the information set as of time $T_i$. We assume that $\lambda_i$ is $F_{T_i}$-measurable. What is the probability of survival of any given name in this case? We have, by Bayes's theorem,
$$\Pr\{\text{Surv at } T_i \mid \text{Surv at } T_{i-1}\} = \frac{\Pr\{\text{Surv at } T_i\}}{\Pr\{\text{Surv at } T_{i-1}\}} \tag{13.31}$$
1}
(13.32)
=1
[Diagram: a one-period tree in which nature first selects state 1 or state 2, with probabilities Pr{1} and Pr{2}; each state then branches into survival or default.]
FIGURE 13.13. This picture illustrates the determination of the probability of survival in the case of random default intensities going over one period and two states. At the beginning of the period, nature draws the event defining the intensity of default, which is either $\lambda(1)$ with probability $\Pr\{1\}$, or $\lambda(2)$ with probability $\Pr\{2\} = 1 - \Pr\{1\}$. Then, the two paths leading to survival have probability of occurrence equal to $\Pr\{1\}\, e^{-\lambda(1)\Delta}$ and $\Pr\{2\}\, e^{-\lambda(2)\Delta}$, such that the total probability of survival equals $\Pr\{1\}\, e^{-\lambda(1)\Delta} + \Pr\{2\}\, e^{-\lambda(2)\Delta}$.
$$\Pr\{\text{Surv at } T_n\} = E\left( e^{-\sum_{i=1}^{n} \lambda_i (T_i - T_{i-1})} \right)$$
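The two-state logic of Figure 13.13 can be sketched in a few lines of Python. The function names and the numbers are illustrative assumptions; the multi-period helper further assumes that the intensities are drawn independently across periods, which is a simplification not stated in the text.

```python
import math

def survival_two_states(prob_1, lam_1, lam_2, delta):
    """Pr{1} * exp(-lam(1) * delta) + Pr{2} * exp(-lam(2) * delta)."""
    return (prob_1 * math.exp(-lam_1 * delta)
            + (1.0 - prob_1) * math.exp(-lam_2 * delta))

def survival_multi_period(prob_1, lam_1, lam_2, delta, n_periods):
    """E[exp(-sum_i lam_i * delta)], assuming independent draws each period."""
    return survival_two_states(prob_1, lam_1, lam_2, delta) ** n_periods

# Hypothetical example: intensity is 2% or 6% with equal probability.
p_random = survival_two_states(0.5, 0.02, 0.06, delta=1.0)
p_at_mean = math.exp(-0.04 * 1.0)  # survival if the intensity were fixed at its mean
print(p_random, p_at_mean)
```

Because the exponential is convex, the survival probability under a random intensity exceeds the one obtained by plugging in the average intensity.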
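Under regularity conditions, we can easily extend the previous result to a continuous time setting. For example, we may assume that the risk-neutral default intensity, $\lambda_\tau$, is solution to:
$$d\lambda_\tau = \kappa \left( \bar{\lambda} - \lambda_\tau \right) d\tau + \sigma \sqrt{\lambda_\tau}\, dW_\tau, \quad \lambda_0 = \lambda \tag{13.33}$$
where $W_\tau$ is a standard Brownian motion under the risk-neutral probability, and $\kappa$, $\bar{\lambda}$ and $\sigma$ are three positive constants. This is the same as the Cox, Ingersoll and Ross (1985) (CIR) model of the short-term rate reviewed in Chapter 12. Therefore, under the parameter restrictions in Chapter 12, $\lambda_\tau$ is always positive, and
$$P_{\text{surv}}(\lambda, T) \equiv \Pr\{\text{Surv at } T\} = E\left( e^{-\int_0^T \lambda_\tau\, d\tau} \right) \tag{13.34}$$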
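Eq. (13.34) is, formally, the same as the Feynman-Kac representation of a solution to a partial differential equation, solved by a bond price in the Cox, Ingersoll and Ross (CIR) (1985) model of the previous chapter (Section 12.4.3.3). In other words, the survival probability in Eqs. (13.33)-(13.34) is mathematically the same as the price of a zero coupon bond in the CIR model. Therefore, the closed-form solution for $P_{\text{surv}}(\lambda, T)$ is:
$$P_{\text{surv}}(\lambda, T) = A(T)\, e^{-B(T)\lambda}, \quad A(T) = \left( \frac{2\gamma\, e^{\frac{1}{2}(\kappa + \gamma)T}}{(\kappa + \gamma)\left(e^{\gamma T} - 1\right) + 2\gamma} \right)^{\frac{2\kappa\bar{\lambda}}{\sigma^2}}, \quad B(T) = \frac{2\left(e^{\gamma T} - 1\right)}{(\kappa + \gamma)\left(e^{\gamma T} - 1\right) + 2\gamma}, \quad \gamma = \sqrt{\kappa^2 + 2\sigma^2} \tag{13.35}$$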
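The closed form can be coded directly from the standard CIR zero-coupon bond formula, as in the following Python sketch. The function name `cir_survival` and the parameter values ($\kappa = 0.5$, $\sigma = 0.1$) are assumptions made for illustration only; the long-run mean of 0.04 is the value used later in Figure 13.14.

```python
import math

def cir_survival(lam, horizon, kappa, lam_bar, sigma):
    """Survival probability A(T) * exp(-B(T) * lam) in the CIR intensity model."""
    gamma = math.sqrt(kappa ** 2 + 2.0 * sigma ** 2)
    growth = math.expm1(gamma * horizon)              # e^{gamma * T} - 1
    denom = (kappa + gamma) * growth + 2.0 * gamma
    a = (2.0 * gamma * math.exp(0.5 * (kappa + gamma) * horizon) / denom) ** (
        2.0 * kappa * lam_bar / sigma ** 2)
    b = 2.0 * growth / denom
    return a * math.exp(-b * lam)

# Hypothetical parameters; they satisfy 2 * kappa * lam_bar >= sigma ** 2.
params = {"kappa": 0.5, "lam_bar": 0.04, "sigma": 0.1}
for horizon in (1.0, 5.0, 10.0):
    print(horizon, cir_survival(0.04, horizon, **params))
```

When the current intensity equals its long-run mean, the survival probability lies slightly above $e^{-\bar{\lambda} T}$, reflecting the convexity of the exponential.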
More generally, we can build up a whole family of models with a closed-form solution, the affine class reviewed in Chapter 12, by assuming that:
=
(13.36)
$$\Pr\{\text{Default} \in (T_{i-1}, T_i)\} = P_{\text{surv}}(\lambda, T_{i-1}) - P_{\text{surv}}(\lambda, T_i) \tag{13.37}$$
We can look at the bond spreads and the CDS spreads implied by this modeling choice. In Appendix 3, we show the price of a defaultable pure discount bond expiring in $N$ years is:
$$P(\lambda, N) = e^{-rN}\, P_{\text{surv}}(\lambda, N) + \int_0^N e^{-r\tau}\, \Pr\{\text{Default} \in (\tau, \tau + d\tau)\}\, \text{Rec}(\tau) \tag{13.38}$$
where $\text{Rec}(\tau)$ denotes the recovery value in case of default, supposed to be known. This evaluation result is, naturally, consistent with a similar derivation provided in Section 12.4.7 of Chapter 12, although in this chapter we place more emphasis on survival arguments.
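As for the forward CDS spreads, we have, by Eq. (13.26),
$$\text{CDS}_t(N) = \frac{\sum_{i=1}^{4N} e^{-r(T_i - t)}\, \text{LGD}(T_i) \left[ P_{\text{surv}}(\lambda, T_{i-1}) - P_{\text{surv}}(\lambda, T_i) \right]}{\sum_{i=1}^{4N} e^{-r(T_i - t)}\, P_{\text{surv}}(\lambda, T_i)}$$
where $N$ is, again, the number of years the CDS refers to, and $T_i = T_0 + \frac{i}{4}$.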
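Assume the short-term rate, $r$, is zero, and that loss-given-default is constant and equal to LGD. Then, as shown in Appendix 3, the price of a defaultable pure discount bond, $P(\lambda, N)$, and the CDS premium, obtained from the forward once we set $t = 0$, $\text{CDS}_0(N)$, are given by:
$$P(\lambda, N) = 1 - \text{LGD} \left( 1 - P_{\text{surv}}(\lambda, N) \right), \quad \text{CDS}_0(N) = \text{LGD}\, \frac{1 - P_{\text{surv}}(\lambda, N)}{\sum_{i=1}^{4N} P_{\text{surv}}(\lambda, T_i)} \tag{13.39}$$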
[Figure: two panels plotting bond spreads and annualized CDS spreads against maturity in years, from 0 to 10.]
FIGURE 13.14. Spreads on bonds and CDS predicted by the affine model in Eq. (13.33). The left panel depicts the spreads when the current default intensity equals the long-run mean, $\lambda = \bar{\lambda} = 0.04$. The right panel depicts the spreads in good times, i.e., when the current intensity of default takes a low value, $\lambda = 0.02$. In each case the recovery rate equals 40%.
The mechanism is that given the mean-reverting behavior of $\lambda$, good times are likely followed by bad, such that when $\lambda = 0.02$, we expect default rates to rise in the future. Spreads increase with maturity as a result. Moreover, bond spreads are approximately equal to CDS spreads at short maturities. At longer maturities, the two spreads diverge, with CDS spreads, $4\,\text{CDS}_0(N)$, dominating bond spreads, $-\frac{1}{N} \ln P(\lambda, N)$. Moreover, the two curves are decreasing in time to maturity even when the current value of the intensity equals the long-run one, $\bar{\lambda}$. This property is due to the assumption that recovery rates are constant, as explained in the constant intensity case dealt with in Section 13.3.3.2. In Appendix 5, we provide additional details and explanations regarding these properties.
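The qualitative behavior described above can be explored with the Python sketch below. It assumes a zero short-term rate and a constant recovery equal to $1 - \text{LGD}$, so that the defaultable zero is worth $1 - \text{LGD}\,(1 - P_{\text{surv}})$ and the CDS premium is LGD times the cumulative default probability divided by the sum of quarterly survival probabilities. The values of `kappa` and `sigma` are hypothetical (the text does not report the ones behind Figure 13.14), so the numbers will not reproduce the figure.

```python
import math

def cir_survival(lam, horizon, kappa, lam_bar, sigma):
    """CIR closed-form survival probability A(T) * exp(-B(T) * lam)."""
    gamma = math.sqrt(kappa ** 2 + 2.0 * sigma ** 2)
    growth = math.expm1(gamma * horizon)
    denom = (kappa + gamma) * growth + 2.0 * gamma
    a = (2.0 * gamma * math.exp(0.5 * (kappa + gamma) * horizon) / denom) ** (
        2.0 * kappa * lam_bar / sigma ** 2)
    return a * math.exp(-2.0 * growth / denom * lam)

def bond_spread(lam, n_years, lgd, params):
    """-(1/N) ln P, with P = 1 - LGD * (1 - P_surv) and zero short-term rate."""
    p_surv = cir_survival(lam, n_years, **params)
    return -math.log(1.0 - lgd * (1.0 - p_surv)) / n_years

def cds_spread_annualized(lam, n_years, lgd, params):
    """4 * CDS_0(N) with quarterly payment dates T_i = i / 4."""
    n_dates = int(round(4 * n_years))
    annuity = sum(cir_survival(lam, i / 4.0, **params) for i in range(1, n_dates + 1))
    p_end = cir_survival(lam, n_dates / 4.0, **params)
    return 4.0 * lgd * (1.0 - p_end) / annuity

params = {"kappa": 0.5, "lam_bar": 0.04, "sigma": 0.1}  # kappa, sigma are assumed
for lam in (0.04, 0.02):
    for n_years in (1, 5, 10):
        print(lam, n_years,
              round(1e4 * bond_spread(lam, n_years, 0.60, params), 1),
              round(1e4 * cds_spread_annualized(lam, n_years, 0.60, params), 1))
```

The loop prints both spreads in basis points for the "long-run mean" and "good times" cases, which makes it easy to see the two curves separating as maturity grows.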
13.4.2.6 A trading strategy
Bond prices and CDS spreads are driven by the same state variable, the default intensity, and so they are restricted to lie on some space, to be consistent with no-arbitrage. To illustrate, consider, first, the simple case where the default intensity is constant, such that CDS spreads are given by Eq. (13.27). Given this model, we can look at the market data for CDS spreads, and infer the risk-neutral intensity, as in the picture below.
[Figure: Inferring risk-neutral intensity from CDS market data. CDS spreads (vertical axis, 50 to 350) plotted against the default intensity (horizontal axis, 0.01 to 0.05).]
In this picture, the CDS spreads predicted by Eq. (13.27) are depicted as a function of the risk-neutral intensity, $\lambda$, assuming $N = 5$ years, LGD = 0.60 and the short-term rate is zero. For example, if we were to observe a CDS premium equal to 200 basis points, we would infer a risk-neutral intensity approximately equal to $\lambda = 0.033$. The key point is this same $\lambda$ should
The previous example relies on a constant default intensity; however, the same strategy applies when default intensities are stochastic. The picture below shows the no-arbitrage restrictions between bond spreads and CDS spreads, obtained with the same parameter values as those in Figure 13.14, and current intensity values ranging from 0.0050 to 0.05. It also provides indications of a strategy aiming to exploit deviations from theoretical parity.
[Figure: No-arbitrage restrictions between bond spreads and CDS spreads. Horizontal axis: CDS spreads, model-based, in basis points (100 to 260); the vertical axis runs from 100 to 240.]
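For the constant-intensity case, the inference step and the resulting parity check can be sketched in Python as follows. The sketch assumes that the quoted CDS premium is annualized (four times the quarterly premium), that the short-term rate is zero, and that recovery equals $1 - \text{LGD}$; the function names and the "market" bond spread are hypothetical.

```python
import math

def implied_intensity(cds_annualized, lgd, delta=0.25):
    """Invert CDS = LGD * (exp(lam * delta) - 1) for the per-period premium."""
    per_period = cds_annualized * delta
    return math.log(1.0 + per_period / lgd) / delta

def model_bond_spread(lam, n_years, lgd):
    """Spread on a defaultable zero: zero short-term rate, recovery 1 - LGD."""
    p_surv = math.exp(-lam * n_years)
    return -math.log(1.0 - lgd * (1.0 - p_surv)) / n_years

lam = implied_intensity(0.0200, lgd=0.60)          # 200 basis points observed
fair_spread = model_bond_spread(lam, n_years=5, lgd=0.60)
market_spread = 0.0230                             # hypothetical bond quote
deviation_bp = 1e4 * (market_spread - fair_spread)
print(round(lam, 4), round(1e4 * fair_spread, 1), round(deviation_bp, 1))
```

A nonzero `deviation_bp` flags a bond spread that is inconsistent with the intensity implied by the CDS market under these assumptions; whether the gap is tradable depends on frictions the sketch ignores.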
In a pricing context, the relevant probabilities of survival are obviously conditioned upon the time of evaluation, time 0 say. For example, the probability of default in Eq. (13.37) is only conditioned on the information we have at time zero. More generally, the probability of defaulting in the time interval $(T_{i-1}, T_i)$, conditional upon survival at time $T_{i-1}$, is:
$$\Pr\{\text{Default} \in (T_{i-1}, T_i) \mid \text{Survival at } T_{i-1}\} = \frac{P_{\text{surv}}(\lambda, T_{i-1}) - P_{\text{surv}}(\lambda, T_i)}{P_{\text{surv}}(\lambda, T_{i-1})} \tag{13.40}$$
surv
)| Survival at }
surv
default
surv
)
(
1)
)
(
1)
default
(
)
1)
with straightforward notation. The previous expressions are known as hazard rates. They coincide with $\lambda$, when $\lambda$ is deterministic. If $\lambda$ is not deterministic, simple computations lead to:
(13.41)
Pr{Default ( + )| Survival at } = E ( )
0
=
)
surv (
(13.42)
B1 ( )
B1 ( ) =
(0 ]
() as in Eq. (13.35),
(13.43)
Pr{Default
Z
R
( )
B ( )
0 1
+ B0
( )
)| Survival at } =
()
()
0
Appendix 5 provides a proof of these results, which to the best of our knowledge, are developed here for the first time.
13.4.2.8 Extracting probabilities of default from market data
Markets obviously convey information about default probabilities, which could be extracted under a number of assumptions. To illustrate, assume zero recovery and that both the short-term rate and the default intensity are continuous-time Markov and independent of each other. Then, the price of a defaultable zero is $P_{\text{def}}(T) = P(T)\, P_{\text{surv}}(T)$, where $P_{\text{def}}(T)$ and $P(T)$ are the prices of a defaultable and a non-defaultable zero. Therefore, we can read the risk-neutral probability of survival from the defaultable/non-defaultable price ratio:
$$P_{\text{surv}}(T) = \frac{P_{\text{def}}(T)}{P(T)} \tag{13.44}$$
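Furthermore, $P_{\text{surv}}(T_2) = P_{\text{surv}}(T_1)\, P_{\text{surv}}(T_1, T_2)$, where $P_{\text{surv}}(T_1, T_2)$ is the risk-neutral probability of survival between $T_1$ and $T_2$. Using Eq. (13.44), then, we can extract this probability, as follows:
$$P_{\text{surv}}(T_1, T_2) = \frac{P_{\text{def}}(T_2)}{P_{\text{def}}(T_1)} \cdot \frac{P(T_1)}{P(T_2)}$$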
The previous example relies on the simplifying assumption of a zero recovery rate, but it can be generalized to the case where the recovery rate is nonzero. However, in this case, an identification issue arises, as prices would convey information about both default probabilities and recovery rates.
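Under the zero-recovery and independence assumptions above, the extraction amounts to taking price ratios, as in this minimal Python sketch. The zero-coupon prices are made-up numbers and the function names are illustrative.

```python
def survival_from_prices(p_def, p_riskless):
    """Risk-neutral survival probability: defaultable over non-defaultable price."""
    return p_def / p_riskless

def survival_between(p_def_1, p_def_2, p_riskless_1, p_riskless_2):
    """Risk-neutral probability of survival between T1 and T2, given survival at T1."""
    return (p_def_2 / p_def_1) * (p_riskless_1 / p_riskless_2)

# Hypothetical zero prices for maturities T1 = 1 year and T2 = 2 years.
p_1, p_2 = 0.97, 0.94            # non-defaultable zeros
p_def_1, p_def_2 = 0.95, 0.90    # defaultable zeros, zero recovery

surv_1 = survival_from_prices(p_def_1, p_1)
surv_2 = survival_from_prices(p_def_2, p_2)
surv_12 = survival_between(p_def_1, p_def_2, p_1, p_2)
print(surv_1, surv_2, surv_12)
```

The conditional probability `surv_12` is just the ratio `surv_2 / surv_1`, written directly in terms of observable prices.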
13.4.2.9 Pricing credit default swaptions
Swaptions on single names
With stochastic intensity rates, we can think about the pricing of the credit default swaptions that we briefly mentioned in Section 13.4.2.3. We now actually assume that both default intensities and the short-term rate are stochastic: we assume that the short-term rate is a diffusion process, and that default arrives as a Cox process with an adapted intensity.$^{10}$
Consider the following definition of a default swap in Section 13.4.2.1: a contract whereby a party stands ready to pay his counterparty a loss determined by a credit event for a given flow of premiums, referred to as CDS premiums. At this level, we are not assuming that this swap is worthless at origination. We assume that loss-given-default is constant and equal to LGD. Denote the CDS premium, or coupon, agreed at time $t$ with $c_t$, $t \le T_0$. Note that like in Section 13.4.2.3, the contract we are dealing with is a forward default swap. We assume that the contract is terminated should the underlying obligor default prior to the start date, $T_0$.
In this contract, the protection buyer commits to a swap agreement whereby it pays $c_t$ at time $T_i$, if the name survives by time $T_i$, and receives LGD, if default occurs in the time interval $[T_{i-1}, T_i]$, for $4N$ time intervals. Each swap payoff is:
$$\text{cds}(T_i) \equiv \text{LGD} \cdot I\{\text{Default} \in (T_{i-1}, T_i)\} - c_t \cdot I\{\text{Survival at } T_i\} \tag{13.45}$$
4
X
=1
cds ( ) = LGD
(13.46)
where E denotes the risk-neutral expectation, taken conditional upon the information set at
time , and,
I{
0 4
]}
4
X
=1
I{Survival
at
(13.47)
The interpretation of $v_{0t}$ is that of the value of one dollar paid off the first time after default, provided default occurs prior to the maturity of the default swap, $T_{4N}$. Instead, $v_{1t}$ is the value of an annuity of one dollar paid at the dates $T_1, T_2, \ldots, T_{4N}$, until default or maturity of the default swap, whichever occurs first. In other words, $v_{1t}$ is the value of a basket of defaultable bonds with zero recovery value, a defaultable present value of the basis point.
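The forward default spread is the value of the premium $c_t$ such that $\text{DS}_t = 0$. It is:
$$\text{CDS}_t(N) = \text{LGD}\, \frac{v_{0t}}{v_{1t}} \tag{13.48}$$
such that the value of the default swap at time $\tau$, agreed at time $t$, and denoted as $\text{DS}_\tau(t)$, can be expressed as:
$$\text{DS}_\tau(t) = \text{LGD}\, v_{0\tau} - \text{CDS}_t(N)\, v_{1\tau} = \left( \text{CDS}_\tau(N) - \text{CDS}_t(N) \right) v_{1\tau} \tag{13.49}$$
Note that the derivation leading to Eq. (13.49) generalizes that underlying the marks-to-market updates in Section 13.4.2.2.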
10 Section 13.4.2.5 shows how to generalize Eq. (13.26) to allow for stochastic intensities. It is easy to generalize further while
allowing for stochastic interest rates.
I{Survival at }
R
I{Survival at } F
=E E
=E
E I{Survival at } F
R
( + )
(13.50)
=E
where $F_T$ is the information set at time $T$, which includes the path of the short-term rate only.
Define the probability $Q_{sc}$ through the Radon-Nikodym derivative:
R
sc
1
( + )
(13.51)
=
1
where $Q$ is the risk-neutral probability. It is easy to see that $Q_{sc}$ does indeed integrate to one.$^{11}$ Following Schönbucher (2003, p. 180), we refer to $Q_{sc}$ as the survival contingent probability. Chapter 4 provides foundations on changes of probability, with this probability being a special case of a general framework.
We can also show that for any
0,
R
( + )
(13.52)
0 = E
0
Indeed, $v_{0t}$ is the value of a basket of securities paying off contingent upon default not having occurred prior to time $T$, with $T \le T_0$, in which case the value drops to zero. Following derivations in Chapter 12 of these lectures, we have that $\frac{\partial v_0}{\partial t} + L v_0 - (r + \lambda)\, v_0 = 0$, where $L$ is the infinitesimal generator for diffusions, whence Eq. (13.52). By a similar reasoning, and prior to $T_0$, $v_{1t}$ in Eq. (13.47) satisfies the same partial differential equation satisfied by $v_{0t}$, whence the claim that $Q_{sc}$ integrates to one.
Therefore, given the definition of $Q_{sc}$ in Eq. (13.51) and the martingale property of $v_{0t}$ in Eq. (13.52), we have that the forward default spread in Eq. (13.48) is a martingale under the survival contingent probability:
Esc (CDS (
R
=E
))
( +
1
1
CDS (
) =E
( +
0
1
LGD = CDS (
where $E^{sc}_t$ denotes the time-$t$ conditional expectation taken under the survival contingent probability, and where we have used the pricing equation (13.52) and the definition of $\text{CDS}_t(N)$ in Eq. (13.48).
11 We could complete the arguments while relying on Schönbucher (2003, Chapter 7), Lando (2004, Chapter 5) and Chapter 12 of these lectures (see below).
) = I{Survival
at
(CDS (
)+
DS ( )
E
R
+
( + )
= 1 Esc (CDS (
)
)
=E
1 (CDS (
for a strike
)+
We know that $\text{CDS}_\tau(N)$ is a martingale under the survival contingent probability. Let $W^{sc}_\tau$ be a Brownian motion under $Q_{sc}$. Assume that
$$\frac{d\,\text{CDS}_\tau(N)}{\text{CDS}_\tau(N)} = \sigma_{sc}\, dW^{sc}_\tau$$
where $\sigma_{sc}$ is the volatility parameter, a constant. We can apply Black (1976) to obtain evaluation formulae in this environment.
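A minimal Python sketch of the resulting Black (1976) valuation follows. It assumes the swaption value equals the defaultable annuity factor times the survival-contingent expectation of the positive part of the spread minus the strike, with a lognormal spread of constant volatility. The function names, the annuity value and the other inputs are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def _d1_d2(forward_spread, strike, vol, expiry):
    std = vol * math.sqrt(expiry)
    d1 = (math.log(forward_spread / strike) + 0.5 * std ** 2) / std
    return d1, d1 - std

def black_payer(annuity, forward_spread, strike, vol, expiry):
    """annuity * E[(CDS_T - K)^+] with a lognormal forward spread."""
    d1, d2 = _d1_d2(forward_spread, strike, vol, expiry)
    return annuity * (forward_spread * norm_cdf(d1) - strike * norm_cdf(d2))

def black_receiver(annuity, forward_spread, strike, vol, expiry):
    """annuity * E[(K - CDS_T)^+] with a lognormal forward spread."""
    d1, d2 = _d1_d2(forward_spread, strike, vol, expiry)
    return annuity * (strike * norm_cdf(-d2) - forward_spread * norm_cdf(-d1))

# Hypothetical inputs: annuity 4.2, forward spread 200bp, strike 250bp.
payer = black_payer(4.2, 0.020, 0.025, vol=0.60, expiry=0.5)
receiver = black_receiver(4.2, 0.020, 0.025, vol=0.60, expiry=0.5)
print(payer, receiver)
```

Payer minus receiver equals the annuity times the forward spread minus the strike, the usual parity relation, which offers a simple sanity check on any implementation.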
Swaptions on CDS indexes
Consider, first, a CDS index, as succinctly described in Section 13.4.2.3. Let $n$ be the number of names in the index decided at time $t$. Each name has notional value equal to $\frac{1}{n}$, the same loss-given-default LGD and the same default intensity $\lambda$. Denote with $D(T_{i-1}, T_i)$ the number of names having defaulted over the time interval $(T_{i-1}, T_i)$,
$$D(T_{i-1}, T_i) = \sum_{j=1}^{n} I\{\text{Def}_j \in (T_{i-1}, T_i)\}$$
where $I\{\text{Def}_j \in (T_{i-1}, T_i)\}$ is the indicator of the event that the $j$-th name defaults over the time interval $(T_{i-1}, T_i)$. Define the following swap payoff, occurring at time $T_i$, and generalizing that in Eq. (13.45) holding for single names,
!
1X
1
1
D( 1 )
(13.53)
cdx ( ) LGD D ( 1 )
=0
where $D(T_{-1}, T_0)$ denotes the number of defaults occurred over the time interval $(t, T_0)$. The first term of $\text{cdx}(T_i)$ is the loss in the index occurring at time $T_i$, paid off by the protection seller, whereas the second term is the protection premium, which equals the constant premium times the outstanding notional.$^{12}$
The value of the protection leg minus that of the premium leg over the life of the index is
obtained as:
4
!
X R
cdx ( ) = LGD 0
1
for
(13.54)
DSX = E
0
=1
12 According to standard market practice, the loss in the index would actually occur as soon as obligors default, without any need to wait until the end of any of the time intervals. However, we cast the discussion in terms of a different timing convention, as this makes the nature of the swap transaction in Eq. (13.53) transparent.
where $v_{0t}$ and $v_{1t}$ are as in Eqs. (13.47), and can be interpreted as the values of securities indexed on default events of a hypothetical representative name.
A CDS index at the time of origination
0 is, then, simply, the value of
0 in Eq. (13.54),
which makes DSX 0 = 0, viz
0
) = LGD
Next, consider a forward starting credit default index, which is an index starting at time 0 , as
before, but decided at some point prior to 0 , say at time . Clearly, the value of the protection
leg minus that of the premium leg over the life of the index is the same as that in Eq. (13.54),
for a generic
( 0 ),
0 . Moreover, in Appendix 7, we show that for any time
DSX (
0)
4
X
=1
where
cdx ( ) =
(LGD
(13.55)
1X
=1
I{Surv
()
at }
(13.56)
Finally, an index default swaption payer with strike $K$ gives the holder the option to enter a CDS index as a protection buyer with an index strike spread equal to $K$. Upon exercise, the protection buyer would also receive a front-end protection, defined as the losses occurring from the option origination to the exercise date. Let $t$ be the option origination and $T = T_0$ the maturity of the swaption. The front-end protection is $F_T = \text{LGD}\, \frac{1}{n}\, D(t, T)$, where $D(t, T)$ is the number of defaults occurred over the time interval $(t, T)$. In Appendix 6, we show that the value of the front-end protection is,
$$v^F_t = E_t\left( e^{-\int_t^T r_s\, ds}\, \frac{1}{n}\, \text{LGD}\, D(t, T) \right) = \text{LGD} \left( P_t(T) - P^{\text{def}}_t(T) \right) \tag{13.57}$$
where $P_t(T)$ and $P^{\text{def}}_t(T)$ denote the price of a non-defaultable and a defaultable zero expiring at time $T$, with zero recovery value and default intensity equal to that of the representative firm, $\lambda$. The underlying of a default swaption payer equals $\text{DSX}_T + F_T$. Accordingly, we can define the loss-adjusted forward default swap index as $\text{DSX}^{L}_\tau \equiv \text{DSX}_\tau + v^F_\tau$, and find the value of the index spread such that newly issued forwards are worthless, $\text{DSX}^{L}_t = 0$, denoted as $\text{CDX}_t$, which is,
CDX (
) = LGD
(13.58)
such that,
DSX (
)=
(CDX (
We wish to use $v_{1\tau}$ as a numéraire such that $\text{CDX}_\tau$ is a martingale under a suitable probability,$^{13}$ similarly as for the probability $Q_{sc}$ in Eq. (13.51). Define the probability $Q_{sc}$
13 A technical issue with the definition of CDX in Eq. (13.58) relates to a denominator problem: the possibility of a total collapse of the index, in which case the denominator is zero. The occurrence of such an event has been taken into account by Rutkowski and Armstrong (2009) and Morini and Brigo (2011).
sc
=
(13.59)
The probability sc is the index counterpart to sc in Eq. (13.51). For simplicity, we shall
keep on referring to sc as the survival contingent probability. Appendix 7 contains a proof
that sc does indeed integrate to one. It also shows that CDX ( ) in Eq. (13.58) is a martingale
under sc . The price of a swaption payer with strike
is, for any
[ ],
R
sc
SW (
) E
)
)+ =
)
)+
1 (CDX (
1 E (CDX (
where $E^{sc}_\tau(\cdot)$ denotes the time-$\tau$ conditional expectation under the survival contingent probability $Q_{sc}$ in Eq. (13.59). We know $\text{CDX}_\tau$ is a martingale under $Q_{sc}$. We can use Black (1976) to evaluate the previous expression, once we assume that under $Q_{sc}$, $\text{CDX}_\tau$ is a geometric Brownian motion with constant volatility.
[Explain the post Big-Bang corrections]
13.4.3 Collateralized Debt Obligations (CDOs)
13.4.3.1 A crash description of securitization
From a historical perspective, an important input to the process of securitization relates to financial innovation put forward by the US government during the 1980s. Until the 1970s, the financial system used to live in a "buy to hold" system, in which banks making loans to businesses or individuals would typically hold the loans in their balance sheets. During the 1970s, another trend began, where the Government National Mortgage Association (GNMA or "Ginnie Mae") would buy the mortgages from banks to incentivize them to extend more loans, thereby making houses accessible to families. The second step would then be for GNMA to sell securities based on the cash flows generated by these mortgages. Securitization would then begin to take on a higher level when the Federal National Mortgage Association ("Fannie Mae") and the Federal Home Loan Mortgage Corporation ("Freddie Mac") would securitize the assets through tranching. Once the tranching model was initially developed, investment banks applied this same idea to other kinds of assets, such as corporate bonds, student loans, small business loans, automobile loans, etc.
How does tranching work? Tranching relies on CDOs, which are securitized shares in pools of assets. Collateral assets include loans or debt instruments. A CDO may be a collateralized loan obligation (CLO) or collateralized bond obligation (CBO) according to whether it relies only on loans or bonds, respectively. CDO investors bear the credit risk of the collateral. Multiple tranches of securities are issued by the CDO, offering investors various maturity and credit risk characteristics. Tranches are categorized as senior, mezzanine, and subordinated, or junior, or equity, according to their degree of credit risk. If there are defaults or the CDO's collateral otherwise underperforms, scheduled payments to senior tranches take precedence over those of mezzanine tranches, and scheduled payments to mezzanine tranches take precedence over those to junior tranches. Typically, senior tranches are rated, with ratings of A to AAA. Mezzanine tranches are also rated, typically with ratings of B to BBB. In principle, these ratings should reflect both the credit quality of the collateral and the protection a given tranche is given by the tranches subordinated to it.
CDOs are part of a more complex securitization process, which could also involve the inclusion of assets of different nature. The stylized example in the diagram below illustrates this process. In a first step, subprime mortgages are securitized; in a second step, a CDO is created out of the securitized subprime mortgages and additional Asset Backed Securities (ABS); in a third step, the structuring process involves creating seniority rules.
[Diagram: Step 1, monthly payments from subprime mortgages are pooled into asset backed securities (ABS), which are sold to ABS investors. Steps 2 and 3, subprime ABS and other ABS are pooled into a collateralized debt obligation (CDO), whose tranches are sold to CDO investors.]
Investors in CDO senior tranches include banks and pension funds, which might benefit from the expertise of the asset managers, and the risk-return profiles difficult to find in the market. Investors in junior tranches are hedge funds searching for highly risky investment opportunities that at the same time, are quite rewarding and certainly unavailable in the market. Additional investors in junior tranches were dedicated off-balance-sheet entities such as SIV, conduits, and SIV-lites, which will be reviewed in Section 13.4.7.
Typical CDO underwriters are investment banks. They work closely with the asset manager and create the right debt/equity ratio and perform collateral quality tests. They liaise with law firms and create the special purpose vehicle (possibly in some tax haven system) that will purchase the assets and issue the tranches, price the various tranches, and obviously find the investors. Fees to underwriters are generous due to the complexity of the CDOs.$^{14}$
Also involved in the structuring process are (i) trustee and collateral administrator, who distribute noteholder reports, check compliance and execute priority of payments; (ii) accountants, who perform due diligence on the CDO's collateral pool, verifying for example credit ratings for each asset; and (iii) rating agencies, which we shall discuss in the next subsection.
The economics behind structured finance is interesting. An originator may have private information about the quality of certain assets and/or a comparative advantage in evaluating these assets relative to other market participants. If the originator intends to sell some of its assets, an adverse selection problem arises: because investors do not know the true quality of the assets, they will demand a premium to purchase them or even worse, a market might fail to arise.
Structured finance helps originators mitigate this problem. First, by pooling the assets, diversification benefits can be achieved. Second, tranching allows relatively poorly informed investors to access senior tranches, and be relatively protected from default. In the process, the originator or arranger may retain subordinated exposure to alleviate investors' concerns about incentive compatibility. The following scheme summarizes the structuring process.
14 According to Thomson Financial, top underwriters in 2006 were: Bear Stearns, Merrill Lynch, Wachovia, Citigroup, Deutsche Bank, and Bank of America Securities.
Source: Committee on the Global Financial System: The role of ratings in structured finance: issues and implications, January 2005.
13.4.3.2 The role of rating agencies
Structured finance has always been a rated market. Issuers of structured instruments had a natural appetite for a rating to occur at a scale comparable with that applying to debt: the main reason is that rating should facilitate the sale of these products to investors bound by ratings-based constraints defined by their investment mandates.
However, the involvement of rating agencies in delivering their opinion about credit risk differs from that related to traditional bonds. As regards traditional instruments, rating agencies simply aim to assess the risk of default, which they take as given. As regards structured finance transactions, rating agencies play a much more ex-ante, "reverse engineering" role. A tranche rating reflects a view about both the credit risk of the asset pool and the extent of credit support to be provided. These two elements are organized to reverse engineer the tranche rating targeted by the deal's arrangers. Deal origination involves rating agencies in the structuring process.
13.4.3.3 Types of CDOs
In practice, CDOs are considerably more complex than the stylized examples outlined earlier. We have a number of cases. We say that a CDO is static, if it holds the same set of assets. Instead, a CDO is managed, if the asset manager is allowed to change the composition of assets. If the claims to the CDO arise from the cash flows originated by the assets, we have a cash-flow CDO. If the claims to the CDO arise from the cash flows originated by the assets and/or active asset management, we have a market-value CDO. CDOs can also be created to carve out balance sheets, in which case we have balance-sheet CDOs. Moreover, and interestingly, CDOs can be created (i) to achieve investment grade bonds through a pool of noninvestment grade bonds, and (ii) to create riskier securities than those in the asset pool. In these cases, we have "arbitrage" CDOs. Naturally, arbitrage CDOs do not give rise to any arbitrage opportunity. These instruments merely reshuffle risk and returns of the assets in the pool, as illustrated by the examples in the next section. Arbitrage CDOs differ from balance-sheet CDOs, because issuers of arbitrage CDOs do not necessarily hold the underlying collateral in advance, which is obviously the case for issuers of balance-sheet CDOs. Therefore, the assets to be put into an arbitrage CDO pool have to be reasonably liquid.
Furthermore, we have synthetic CDOs, which are exposed to a pool of assets that are not
strictly owned or in the asset pool, typically through CDS underwriting. Like a cash-flow CDO,
the vehicle receives payments (the premium), which are then transferred to the tranche holders.
Naturally, there can be default events, which are also passed through to the investors, according
to the prespecified seniority rules. A synthetic CDO is funded, if the relevant tranche holders have
to pay in the case of a credit event related to the assets the CDO is exposed to. Typically,
some funding is made available at the very time of investment. At maturity, the investor receives
a payoff equal to the funding minus the realized losses. Junior tranches are typically funded,
and senior tranches are typically not. However, senior tranche investors might have to make payments
in the unlikely event that losses were ever to erode their tranches.
Finally, we have hybrid CDOs, which are partly cash-flow CDOs and partly synthetic CDOs.
In a single-tranche CDO, the entire CDO is structured to accommodate the specific needs of a
small group of investors, with some remaining tranche held by the dealer. And we have CDO$^2$,
where a large portion of the assets in the pool are tranches from other CDOs; or more generally,
CDO$^n$.
13.4.3.4 Pricing
CDOs repackage cash flows from a set of assets. We provide simple examples to show how to
price this repackaging process. We begin with an example taken from McDonald (2006, p.
583), which we further elaborate. Suppose we have three one-year bonds with face value = 100.
For each of these bonds, the risk-neutral probability of default equals 10% and the recovery
value is 40. The safe interest rate for one year is 6%. So each bond price equals,
$$P = e^{-0.06}\,(\underbrace{0.10}_{\text{Def Prob}} \times 40 + \underbrace{0.90}_{\text{Surv Prob}} \times 100) = 88.526.$$
[Figure: the asset pool (three bonds) and the CDO claims on it: senior tranche = 140, mezzanine tranche = 90, junior tranche = 70.]
In this example, each tranche receives the minimum between (i) the nominal value claimed by
the tranche and (ii) what is left available to the tranche after having satisfied the other tranches
by order of seniority.
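Let $N_i$ be the nominal values claimed by the tranches, so that $N_1 = 140$, $N_2 = 90$ and
$N_3 = 70$. Let $\pi$ be the realized payoff of the asset pool, defined as,
$$\pi = \text{No of Defaults} \times 40 + \underbrace{(3 - \text{No of Defaults})}_{\text{No of surviving bonds}} \times 100.$$
(i) At the expiration, the senior tranche receives the minimum between $N_1$ and $\pi$.
(ii) The mezzanine tranche receives the minimum between $N_2$ and the left-over from the senior tranche.
(iii) Finally, at the expiration, the junior tranche receives the minimum between $N_3$ and the
left-over from the senior and mezzanine tranches.
In general, the left-over available to tranche $i$ is
$$\text{Left-over}_{i-1} \equiv \max\Big\{\pi - \sum_{j=1}^{i-1} N_j,\ 0\Big\}.$$
That is, the payoff to tranche $i$ is
$$\pi_i = \min\Big\{\max\Big\{\pi - \sum_{j=1}^{i-1} N_j,\ 0\Big\},\ N_i\Big\}.$$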
All we need now is a model for the risk-neutral probability of default for each firm. Initially,
we assume the default events are independent across firms, so that the number of defaults is binomially distributed,
$$\Pr(\text{No of Defaults} = k) = \binom{3}{k} p^k (1-p)^{3-k}, \quad p = 10\%, \quad k \in \{0, 1, 2, 3\},$$
leading to the payoffs in the following table:
Payoffs to CDO tranches, and prices: with independent defaults

Defaults   Pr(Defaults)   $\pi$: pool payoff   $\pi_1$: Senior   $\pi_2$: Mezzanine   $\pi_3$: Junior
0          0.729          300                  140               90                    70
1          0.243          240                  140               90                    10
2          0.027          180                  140               40                    0
3          0.001          120                  120               0                     0
Price                                          131.8281994       83.40266709           50.34673197
Yield                                          0.060142867       0.076129382           0.329561531
The price of each tranche is computed as the tranche payoff, averaged across states, discounted
at the safe interest rate. For example, the price of the mezzanine tranche is,
$$\text{Price Mezzanine} = e^{-0.06}\,(0.729 \times 90 + 0.243 \times 90 + 0.027 \times 40 + 0.001 \times 0) = 83.403.$$
Its yield is, Yield Mezzanine = $\ln(90/83.403) = 7.61\%$. Naturally, the sum of the three bond prices,
$88.526 \times 3 = 265.58$, is equal to the total value of the three tranches, $131.828 + 83.403 + 50.347 = 265.58$. As anticipated, a CDO is a mere re-packaging device. It does not add or destroy value.
It merely redistributes risks (and returns).
The assumption that defaults among names are uncorrelated is unrealistic, as argued in Section
13.5.4. We now remove this assumption. First, what happens in the special case where default
events are perfectly correlated? In this case, either the three firms all default (with probability
0.10) or none defaults (with probability 0.90), and we have the situation summarized by the
table below.
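Payoffs to CDO tranches, and prices: with perfectly correlated defaults

Defaults   Pr(Defaults)   $\pi$: pool payoff   $\pi_1$: Senior   $\pi_2$: Mezzanine   $\pi_3$: Junior
0          0.9            300                  140               90                    70
1          0              NA                   NA                NA                    NA
2          0              NA                   NA                NA                    NA
3          0.1            120                  120               0                     0
Price                                          129.9635056       76.28292722           59.33116562
Yield                                          0.074388737       0.165360516           0.165360516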
Note that mezzanine and junior tranches now yield the same, because they each pay off either
their nominal value or zero in exactly the same states of nature. The junior tranche is also worth more than with independent defaults (59.33 against 50.35). In other words, default clustering implies that good times are really good, in that the probability of having no defaults is
now 90%, much higher than the 72.9% arising when the correlation of defaults is zero.
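The two tables above can be reproduced with a few lines of code. The following is a minimal sketch, not the author's own implementation: the function names are illustrative, and the only inputs are the figures of the example (face value 100, recovery value 40, safe rate 6%, default probability 10%, nominal values 140, 90 and 70).

```python
from math import comb, exp, log

FACE, RECOVERY, RATE, P_DEFAULT = 100.0, 40.0, 0.06, 0.10
NOMINALS = [140.0, 90.0, 70.0]  # senior, mezzanine, junior


def pool_payoff(n_defaults, n_bonds=3):
    """Payoff of the asset pool given the number of defaulted bonds."""
    return n_defaults * RECOVERY + (n_bonds - n_defaults) * FACE


def tranche_payoffs(pool, nominals=NOMINALS):
    """Waterfall: each tranche gets min(nominal, what is left after more senior claims)."""
    payoffs, left = [], pool
    for nominal in nominals:
        paid = min(nominal, max(left, 0.0))
        payoffs.append(paid)
        left -= paid
    return payoffs


def price_tranches(probabilities):
    """Discounted expected payoff and yield of each tranche.

    probabilities[k] is the risk-neutral probability of k defaults.
    """
    expected = [0.0] * len(NOMINALS)
    for k, prob in enumerate(probabilities):
        for i, paid in enumerate(tranche_payoffs(pool_payoff(k))):
            expected[i] += prob * paid
    prices = [exp(-RATE) * e for e in expected]
    yields = [log(n / p) for n, p in zip(NOMINALS, prices)]
    return prices, yields


# Independent defaults: binomial probabilities for k = 0, 1, 2, 3
independent = [comb(3, k) * P_DEFAULT**k * (1 - P_DEFAULT) ** (3 - k) for k in range(4)]
# Perfectly correlated defaults: either none or all three bonds default
perfect = [1 - P_DEFAULT, 0.0, 0.0, P_DEFAULT]

for label, probs in [("independent", independent), ("perfectly correlated", perfect)]:
    prices, yields = price_tranches(probs)
    print(label, [round(p, 3) for p in prices], [round(y, 4) for y in yields])
```

In both cases the tranche prices add up to the value of the three bonds, which is the re-packaging property stressed in the text.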
The previous cases (with independent or perfectly correlated defaults) are extreme. What
happens when defaults are only imperfectly correlated? In this case, the pricing of tranches
is more complex, and requires a model of default correlations. We use the so-called Gaussian
copulae, reviewed in Appendixes 7 and 8, and simulations. Figure 13.14 illustrates how the
yield on each tranche changes as a result of a change in the default correlation underlying the assets in the structure.
[Figure: yield on each tranche (vertical axis) against default correlation (horizontal axis).]
FIGURE 13.15. Yields on the three CDO tranches, as functions of the default correlation
among the assets in the structure, with probability of default for each name $p$ = 10%. The
thick, horizontal line is the yield on each securitized asset.
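A simulation of the kind behind this figure can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used for the figure: we assume an equicorrelated one-factor Gaussian copula, in which name $i$ defaults when a latent variable $\sqrt{\rho}F + \sqrt{1-\rho}\,\epsilon_i$ falls below $\Phi^{-1}(p)$, and the function name, seed and number of simulations are arbitrary choices.

```python
import random
from math import exp, log, sqrt
from statistics import NormalDist

FACE, RECOVERY, RATE = 100.0, 40.0, 0.06
NOMINALS = [140.0, 90.0, 70.0]  # senior, mezzanine, junior


def simulate_tranche_yields(p_default, rho, n_sims=50_000, seed=0):
    """Monte Carlo tranche prices and yields under a one-factor Gaussian copula.

    Assumption of this sketch: z_i = sqrt(rho) * F + sqrt(1 - rho) * eps_i, and
    name i defaults when z_i is below the threshold Phi^{-1}(p_default).
    """
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(p_default)
    totals = [0.0] * len(NOMINALS)
    for _ in range(n_sims):
        common = rng.gauss(0.0, 1.0)
        n_defaults = sum(
            sqrt(rho) * common + sqrt(1.0 - rho) * rng.gauss(0.0, 1.0) < threshold
            for _name in range(3)
        )
        left = n_defaults * RECOVERY + (3 - n_defaults) * FACE
        for i, nominal in enumerate(NOMINALS):  # waterfall by seniority
            paid = min(nominal, left)
            totals[i] += paid
            left -= paid
    prices = [exp(-RATE) * t / n_sims for t in totals]
    yields = [log(n / p) for n, p in zip(NOMINALS, prices)]
    return prices, yields


for rho in (0.1, 0.5, 0.9):
    _, yields = simulate_tranche_yields(0.10, rho)
    print(rho, [round(y, 4) for y in yields])
```

With $\rho = 0$ the simulated prices should be close to those of the independent-defaults table, up to Monte Carlo error.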
Arbitrage CDOs
Figure 13.15 illustrates how arbitrage CDOs work. The CDO has three assets yielding the same,
12.19% (the horizontal line in the picture). However, by restructuring the asset base through a
CDO, we can create claims (Senior and Mezzanine tranches) that yield less than 12.19%, as they
are considerably less risky than the asset base. Such an excess return, $(12.19\% - \text{Yield}_{\text{tranche}})$,
with tranche $\in$ {Senior, Mezzanine}, is made available to the Junior tranche/equity
holders, once management fees and expenses are accounted for. Note that such a redistribution of risk works quite effectively as long as the default correlation is relatively low. As the
default correlation in the asset base increases, the situation may change dramatically, with the
mezzanine tranche becoming more risky and, then, yielding a higher expected return. Finally,
Figure 13.16 depicts the output of a comparative statics exercise where we increase $p$ from 10% to
20%. The yields are obviously higher for each tranche, and the three assets now yield 18.78%,
reflecting the higher marginal probability of default for each of the securities in the pool, $p$.
Correlation assumptions
In Figures 13.15 and 13.16, the yield on the junior tranche decreases with default correlation.
This happens because we are assuming that the probability of default is fixed at $p = 10\%$ (say) for
each default correlation $\rho$. As $\rho$ increases, the probability of clustering events increases,
which makes the Senior and Mezzanine tranches relatively less valuable and, correspondingly,
the Junior tranche more valuable. A more appropriate model is one in which $p$ increases as $\rho$
increases, to capture the fact that in bad times, both default correlation and probability of
defaults increase, as these two things are intimately connected by, e.g., some common business
cycle factors.
[Figure: yield on each tranche (vertical axis) against default correlation (horizontal axis).]
FIGURE 13.16. Yields on the three CDO tranches, as functions of the default correlation
among the assets in the structure, with probability of default for each name $p$ = 20%. The
thick, horizontal line is the yield on each securitized asset.
We relax the assumption that the probability of default, $p$, and the default correlation, $\rho$,
are independent. We assume that $p$ and $\rho$ are tied up through the following relation,
$\rho = 3.8116 \ln(p + 1)$, and let $p$ vary from 0.10 to 0.30, such that $\rho$ varies from 0.3633 to 1.
The situation now changes, dramatically. Figure 13.17 depicts the results, which show how
modeling might substantially affect effective pricing. First, and naturally, the yield on each
securitized asset is increasing in $\rho$ because $\rho$ is also increasing in the probability of default.
Second, the Junior tranche has a yield that increases over a wide spectrum of values for the
default correlation, $\rho$. Note that the yield on the Junior tranche bends back to lower values as the default
correlation is close to one, reflecting the fact that default clustering makes this tranche quite
valuable in good times, as explained earlier.
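As a quick check of the mapping between the probability of default and the default correlation, the sketch below tabulates the correlation and the yield on each securitized bond over the range of default probabilities considered in the text. The function names are illustrative; the bond figures (recovery rate 40%, safe rate 6%) are those of the running example.

```python
from math import log

RATE, RECOVERY_RATE = 0.06, 0.40


def implied_correlation(p_default):
    """Default correlation tied to the default probability: rho = 3.8116 * ln(p + 1)."""
    return 3.8116 * log(p_default + 1.0)


def asset_yield(p_default):
    """Yield on one bond: ln(100 / price), with price = e^{-r} (40 p + 100 (1 - p))."""
    return RATE - log(1.0 - (1.0 - RECOVERY_RATE) * p_default)


for p in (0.10, 0.15, 0.20, 0.25, 0.30):
    print(f"p = {p:.2f}  rho = {implied_correlation(p):.4f}  asset yield = {asset_yield(p):.4f}")
```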
[Figure: yields on the Junior, Mezzanine and Senior tranches (vertical axis) against default correlation (horizontal axis).]
FIGURE 13.17. Yields on the three CDO tranches, as functions of the default correlation
among the assets in the structure, with probability of default $p$ and default correlation $\rho$
related by $\rho = 3.8116 \ln(p + 1)$, $p \in [0.10, 0.30]$. The thick curved line depicts the yield
on each securitized asset.
13.4.3.5 Nth to default
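In this contract, the owner of the 1st to default bears the risk of the first default that occurs in
the asset pool. With the three bonds of the previous section, the expected payoff is:
$$\text{Payoff}_{1\text{st}} = \Pr(\text{No of Defaults} \geq 1) \times 40 + \Pr(\text{No of Defaults} < 1) \times 100.$$
Likewise, the owner of the 2nd to default bears the risk of the second default that occurs in the
asset pool:
$$\text{Payoff}_{2\text{nd}} = \Pr(\text{No of Defaults} \geq 2) \times 40 + \Pr(\text{No of Defaults} < 2) \times 100.$$
Finally, the owner of the 3rd to default bears the risk of the third default that occurs in the
asset pool:
$$\text{Payoff}_{3\text{rd}} = \Pr(\text{No of Defaults} = 3) \times 40 + \Pr(\text{No of Defaults} < 3) \times 100.$$
Let us assume that default correlation is zero for simplicity. The relevant probabilities follow
from the binomial probabilities computed earlier:
$$\Pr(\text{No of Defaults} \geq 1) = 0.243 + 0.027 + 0.001 = 0.271,$$
$$\Pr(\text{No of Defaults} \geq 2) = 0.027 + 0.001 = 0.028,$$
$$\Pr(\text{No of Defaults} = 3) = 0.001.$$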
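Therefore, the prices of the three contracts are:
$$P_{1\text{st-to-default}} = e^{-0.06}\,[0.271 \times 40 + (1 - 0.271) \times 100] = 78.863,$$
$$P_{2\text{nd-to-default}} = e^{-0.06}\,[0.028 \times 40 + (1 - 0.028) \times 100] = 92.594,$$
$$P_{3\text{rd-to-default}} = e^{-0.06}\,[0.001 \times 40 + (1 - 0.001) \times 100] = 94.120.$$
From here, we can compute the yields as follows, Yield$_{1\text{st-to-def}}$ = $-\ln(78.863/100) = 23.74\%$,
Yield$_{2\text{nd-to-def}}$ = $-\ln(92.594/100) = 7.69\%$, and Yield$_{3\text{rd-to-def}}$ = $-\ln(94.120/100) = 6.06\%$.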
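The prices and yields of the three contracts can be checked numerically. The sketch below assumes independent defaults, as in the text, and uses illustrative function names.

```python
from math import comb, exp, log

FACE, RECOVERY, RATE, P_DEFAULT, N_BONDS = 100.0, 40.0, 0.06, 0.10, 3


def prob_at_least(n):
    """Pr(No of Defaults >= n) with independent defaults (binomial distribution)."""
    return sum(
        comb(N_BONDS, k) * P_DEFAULT**k * (1 - P_DEFAULT) ** (N_BONDS - k)
        for k in range(n, N_BONDS + 1)
    )


def nth_to_default(n):
    """Price and yield of the n-th to default claim."""
    q = prob_at_least(n)
    price = exp(-RATE) * (q * RECOVERY + (1 - q) * FACE)
    return price, -log(price / FACE)


for n in (1, 2, 3):
    price, yld = nth_to_default(n)
    print(f"{n}-th to default: price = {price:.3f}, yield = {yld:.4f}")
```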
13.4.3.6 One numerical example of a stylized structured product
A. Defaultable bonds
Suppose we observe the following risk-structure of spreads, related to two bonds maturing in
two years:
$$\text{Spread}_A(\text{2 years}) = 1.5\%, \qquad \text{Spread}_B(\text{2 years}) = 2.5\%,$$
where A and B denote the rating classes the bond issuers belong to. Assume that the one-year
transition rating matrix, defined under the risk-neutral probability, is:

                 To
             A      B      Def
From   A     0.7    0.3    0
       B     0.3    0.5    0.2
       Def   0      0      1

where Def denotes default. We assume that in the event of default, the recovery value of the
bond is paid off at the end of the second period. We want to determine the expected recovery
rates for the two bonds, and which expected recovery rate is the largest. We have:
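$$\frac{P_i}{P_0} = \text{Rec}_i\, Q_i(2) + (1 - Q_i(2)), \quad i \in \{A, B\},$$
where $P_i$ is the price of the bond issued by firm $i$, $P_0$ is the price of a default-free bond with the same maturity, $\text{Rec}_i$ is the expected recovery rate, and $Q_i(2)$ is the risk-neutral probability that firm $i$ defaults within two years.
Therefore,
$$\text{Spread}_A(\text{2 years}) = 1.5\% = -\frac{1}{2}\ln\left[\text{Rec}_A\, Q_A(2) + (1 - Q_A(2))\right] \qquad (13.60)$$
$$\text{Spread}_B(\text{2 years}) = 2.5\% = -\frac{1}{2}\ln\left[\text{Rec}_B\, Q_B(2) + (1 - Q_B(2))\right] \qquad (13.61)$$
We have to find $Q_A(2)$ and $Q_B(2)$. The transition matrix for two years is,
$$Q(2) = \begin{pmatrix} 0.7 & 0.3 & 0 \\ 0.3 & 0.5 & 0.2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0.7 & 0.3 & 0 \\ 0.3 & 0.5 & 0.2 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0.58 & 0.36 & 0.06 \\ 0.36 & 0.34 & 0.30 \\ 0 & 0 & 1 \end{pmatrix},$$
such that,
$$\Pr\{\text{A defaults in 2 years}\} = Q_A(2) = 0.70 \times 0 + 0.30 \times 0.20 + 0 \times 1 = 0.06,$$
$$\Pr\{\text{B defaults in 2 years}\} = Q_B(2) = 0.20 \times 1 + 0.50 \times 0.20 + 0.30 \times 0 = 0.20 + 0.10 = 0.30.$$
Replacing these probabilities into Eqs. (13.60) and (13.61),
$$1.5\% = -\frac{1}{2}\ln\left[\text{Rec}_A \times 0.06 + (1 - 0.06)\right], \qquad 2.5\% = -\frac{1}{2}\ln\left[\text{Rec}_B \times 0.30 + (1 - 0.30)\right].$$
Solving, yields,
$$\text{Rec}_A = 50.7\%, \qquad \text{Rec}_B = 83.7\%.$$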
The expected recovery rate for the second bond is the largest. This is because the probability
firm B defaults is much larger than the probability firm A defaults and yet the two spreads are
relatively close to each other. So to rationalize the two spreads, we need a large recovery rate
for the second bond.
What would happen to the two credit spreads, once we assume that the recovery rates are
the same, and equal to 50%? This question sheds additional light on the previous findings. If
the recovery rates are the same and both equal 50%,
$$\text{Spread}_A(\text{2 years}) = -\frac{1}{2}\ln\left[0.50\, Q_A(2) + (1 - Q_A(2))\right], \qquad \text{Spread}_B(\text{2 years}) = -\frac{1}{2}\ln\left[0.50\, Q_B(2) + (1 - Q_B(2))\right].$$
Then, using the previously computed transition probabilities for two years, we obtain:
$$\text{Spread}_A(\text{2 years}) = 1.52\%, \qquad \text{Spread}_B(\text{2 years}) = 8.13\%.$$
When the recovery rates are the same, the spread on the second bond diverges substantially
from that on the first bond.
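The calculations of this example can be verified with a short script. It is only a sketch with illustrative function names: it squares the one-year transition matrix, reads off the two-year default probabilities, and inverts the spread formula for the recovery rate.

```python
from math import exp, log

# One-year risk-neutral transition matrix; rows and columns ordered A, B, Def
Q1 = [
    [0.7, 0.3, 0.0],
    [0.3, 0.5, 0.2],
    [0.0, 0.0, 1.0],
]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def implied_recovery(spread, q_default, years=2):
    """Recovery rate solving spread = -(1/years) * ln(Rec * q + (1 - q))."""
    return 1.0 - (1.0 - exp(-years * spread)) / q_default


def spread_from_recovery(recovery, q_default, years=2):
    """Credit spread implied by a given recovery rate."""
    return -log(recovery * q_default + (1.0 - q_default)) / years


Q2 = matmul(Q1, Q1)
q_a, q_b = Q2[0][2], Q2[1][2]  # two-year default probabilities of A and B

print("Rec_A, Rec_B:", implied_recovery(0.015, q_a), implied_recovery(0.025, q_b))
print("Spreads with Rec = 50%:", spread_from_recovery(0.5, q_a), spread_from_recovery(0.5, q_b))
```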
B. Collateralized debt obligations
Let us keep on using the same framework as before, but use different figures, so as to figure out
the implications for CDO pricing. Consider the following one-year transition matrix, under the
risk-neutral probability:

                 To
             A      B      Def
From   A     0.7    0.3    0
       B     0.1    0.6    0.3
       Def   0      0      1

where Def denotes default. Consider (i) 1 one-year bond issued by a company rated A, and
(ii) 3 one-year bonds issued by a company rated B. All bonds have face value equal to 100.
We assume that the recovery values in case of default of all these bonds are the same, and equal
to 50. Finally, we assume the safe interest rate is equal to zero.
Consider a collateralized debt obligation (CDO, in the sequel), which gathers the previous
four bonds. Therefore, the CDO has nominal value of 400, and pays off in one year. The CDO
has (i) a senior tranche, with nominal value equal to 150; (ii) a mezzanine tranche, with nominal
value equal to $N_1$; and (iii) a junior tranche, with nominal value equal to $N_2$. We assume that
the structure is such that $N_1 \geq 100$.
First, we determine the price and yields on all the four bonds. Since the safe interest rate is
zero, and the company rated A is safe, up to the next year, the price of the A bond is 100, and
its yield is zero. As for the three bonds rated B, we have:
$$P_B = 50 \times 0.3 + 100 \times 0.7 = 85.0, \qquad \text{Yield}_B = -\ln 0.85 = 16.25\%.$$
Second, we determine the yield on the junior tranche, and derive the yield on the mezzanine,
as a function of its nominal value $N_1$. To determine the yield on the tranches, we need to figure
out the following table:

No Def   Pr    $\pi$   $\pi_0$   $\pi_1$   $\pi_2$
0        0.7   400     150       $N_1$     $N_2$
1        0     na      na        na        na
2        0     na      na        na        na
3        0.3   250     150       100       0
4        0     na      na        na        na

where No Def denotes the number of defaults, Pr is the probability of No Def, $\pi$ is the pool
payoff, defined as,
$$\pi = \text{No Def} \times 50 + (4 - \text{No Def}) \times 100,$$
and, finally: $\pi_0$ is the payoff to the senior tranche, $\pi_1$ is the payoff to the mezzanine tranche,
and, $\pi_2$ is the payoff to the junior tranche. Therefore, we have:
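$$\text{price mezzanine} = 0.70\, N_1 + 0.30 \times 100, \qquad \text{price junior} = 0.70\, N_2,$$
such that:
$$\text{Yield mezzanine} = \ln\frac{N_1}{0.70\, N_1 + 0.30 \times 100} = -\ln\left(0.70 + 0.30\,\frac{100}{N_1}\right),$$
$$\text{Yield junior} = \ln\frac{N_2}{0.70\, N_2} = -\ln 0.70 = 35.67\%.$$
Note that
$$\text{Yield junior} = -\ln(0.70) > -\ln\left(0.70 + 0.30\,\frac{100}{N_1}\right) = \text{Yield mezzanine}.$$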
A reverse engineering question is, now, to determine which nominal value of the mezzanine
tranche, $N_1$, is needed to ensure that the yield on the mezzanine tranche is equal to or greater
than the yields on the bonds issued by the company with credit rating B. The answer is
$N_1 = 200$, for in this case, the mezzanine tranche would have the same payoff structure as the
bond rated B: it would deliver (i) the face value, in the event the company rated B does not
default; and (ii) half of its nominal value, 100, in the event the company rated B does default.
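Finally, we ask which nominal value of the mezzanine tranche, $N_1$, is needed to ensure that
the yield on the mezzanine is equal to 18%. And what is the corresponding nominal value of
the junior tranche, $N_2$? To address these issues, we first want that:
$$\text{Yield mezzanine} = \ln\frac{N_1}{0.70\, N_1 + 0.30 \times 100} = 18\%,$$
and, second, that the nominal values of the three tranches add up to that of the pool,
$$150 + N_1 + N_2 = 400.$$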
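The two conditions can be solved in closed form. The following sketch, with illustrative names, inverts the mezzanine yield formula and backs out the junior nominal value from the requirement that the three nominal values add up to 400. The inversion is only meaningful for target yields below the junior yield, $-\ln 0.70 = 35.67\%$.

```python
from math import exp, log

PROB_NO_DEFAULT, PROB_DEFAULT = 0.70, 0.30
POOL_NOMINAL, SENIOR_NOMINAL = 400.0, 150.0
# In default the pool pays 250, of which 150 go to the senior tranche
MEZZ_PAYOFF_IN_DEFAULT = 100.0


def mezzanine_yield(n1):
    """Yield on the mezzanine tranche with nominal value n1 >= 100 (zero safe rate)."""
    price = PROB_NO_DEFAULT * n1 + PROB_DEFAULT * MEZZ_PAYOFF_IN_DEFAULT
    return log(n1 / price)


def mezzanine_nominal(target_yield):
    """Invert the yield formula: n1 = 0.30 * 100 / (exp(-yield) - 0.70)."""
    return PROB_DEFAULT * MEZZ_PAYOFF_IN_DEFAULT / (exp(-target_yield) - PROB_NO_DEFAULT)


n1 = mezzanine_nominal(0.18)
n2 = POOL_NOMINAL - SENIOR_NOMINAL - n1
print(f"N1 = {n1:.2f}, N2 = {n2:.2f}")
```

Under these assumptions, a mezzanine yield of 18% requires a nominal value of roughly 222, which leaves roughly 28 to the junior tranche.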
Definition I: We are $(1 - p)\%$ certain that a given portfolio will not suffer a loss larger
than \$W over the next $N$ weeks, $\Pr(\text{Loss} > \$W) = p$. That is, \$VaR$_p$ = \$W.
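Let $\Delta V / V_0$ = portfolio return,
where $\Delta V$ denotes the change in value of the portfolio over the next $N$ days, and \$$V_0$ is the
current value of the portfolio. Hence,
$$p = \Pr(\text{Loss} > \$\text{VaR}_p) = \Pr(-\Delta V > \$\text{VaR}_p) = \Pr\left(\frac{\Delta V}{V_0} < -\text{VaR}_p\right), \quad \text{where } \text{VaR}_p = \frac{\$\text{VaR}_p}{V_0}.$$
The corresponding \$VaR is just \$VaR$_p$ = VaR$_p \cdot V_0$. For example, suppose that the portfolio return
over the next 2 weeks, $\Delta V / V_0$, is normally distributed with mean zero and unit variance. We know
that $0.01 = \Pr(\Delta V / V_0 < -2.32)$. Hence, \$VaR = $2.32\, V_0$.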
[Figure: standard normal density of the portfolio return, with the 1% left tail marking VaR/V.]
We are 99% certain that our portfolio will not suffer a loss larger than 2.32 times its
current value over the next 2 weeks. We are 99% certain that our portfolio will not experience
a relative loss larger than 2.32 over the next 2 weeks.
As a second example, note that the previous assumption about the portfolio return was
extreme. Assume, instead, the portfolio return over the next 2 weeks, $\Delta V / V_0$, is normally distributed
with mean zero and variance $\sigma^2 = \frac{2}{52}\sigma^2_{\text{year}}$, where $\sigma^2_{\text{year}}$ is the annualized variance. We assume
that $\sigma^2_{\text{year}} = 0.15^2$. We have to re-scale the previous formulas, as follows. First, we introduce a
variable $\epsilon \sim N(0, 1)$, i.e. $\epsilon$ is normally distributed with mean zero and variance = 1. So we can
write $\Delta V / V_0 = \sigma \epsilon$ and, hence,
$$0.01 = \Pr(\epsilon < -2.32) = \Pr\left(\frac{\Delta V}{V_0} < -2.32\,\sigma\right),$$
whence, \$VaR = $2.32\,\sigma V_0$. We know the annualized variance, $\sigma^2_{\text{year}} = 0.15^2$, from which we
can derive the two-week variance, $\sigma^2 = \frac{2}{52}\sigma^2_{\text{year}} \approx 0.03^2$, and, hence, \$VaR$/V_0 = 2.32\,\sigma =
2.32 \times 0.03 \approx 7\%$. That is, we are 99% certain that our portfolio will not suffer a loss larger
than 7% of its current value over the next 2 weeks. We are 99% certain that our portfolio
will not experience a relative loss larger than 7% over the next 2 weeks.
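More generally, we may assume the portfolio return over the next 2 weeks, $\Delta V / V_0$, is normally
distributed with mean $\mu$ and variance $\sigma^2$. In this case,
$$\epsilon = \frac{\Delta V / V_0 - \mu}{\sigma} \sim N(0, 1),$$
and, hence,
$$0.01 = \Pr(\epsilon < -2.32) = \Pr\left(\frac{\Delta V}{V_0} < -(2.32\,\sigma - \mu)\right),$$
whence, \$VaR = $(2.32\,\sigma - \mu) V_0$. In practice, $\mu$ is small relative to $\sigma$ over horizons of a few
weeks.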
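The re-scaling above is easy to automate. The sketch below is an illustration with hypothetical function and parameter names; it uses the exact normal quantile (about 2.33) rather than the rounded value 2.32 used in the text, so results differ slightly in the last digits.

```python
from math import sqrt
from statistics import NormalDist


def relative_var(sigma_year, weeks=2, mu=0.0, p=0.01):
    """VaR as a fraction of the current portfolio value, under normal returns.

    sigma_year: annualized return volatility; mu: expected return over the horizon.
    """
    sigma = sigma_year * sqrt(weeks / 52.0)  # volatility over the horizon
    quantile = -NormalDist().inv_cdf(p)      # about 2.33 for p = 1%
    return quantile * sigma - mu


print(f"Two-week 99% VaR / V0: {relative_var(0.15):.4f}")
```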
The assumption that data are generated by a normal distribution does not describe asset
returns well. In previous chapters of this Part and Part II of these Lectures, we explain that
we need ARCH effects, stochastic volatility and multifactor models. More generally, data can
exhibit changes in regimes, nonlinearities and fat tails. Fat tails are particularly important to
understand, since this is what we are interested in after all. More generally, it is quite challenging
to understand what the data generating process is, especially in so far as we consider portfolios
of assets. Asset returns and volatilities are typically correlated, with correlation rising in bad
times: correlation is stochastic.
We may make distributional assumptions but then, these assumptions have to be carefully
assessed through, for example, backtesting (to be explained below). We may proceed with
nonparametric methods, and this is indeed a promising avenue, but with its caveats.
How do nonparametric methods work? These methods rely on an old idea, which is to
estimate the data distribution through histograms. These histograms can be readily used to
compute VaR. This approach is nonparametric in nature, as it does not rely on any model.
A more refined method replaces rough histograms with smoothed histograms, as follows.
Suppose we have access to a time series of data $x_t$, which are drawn from a certain probability
law, with density $f(x)$. We may define the following estimate of the density $f(x)$,
$$\hat{f}_T(x) = \frac{1}{T h} \sum_{t=1}^{T} K\left(\frac{x - x_t}{h}\right),$$
where $T$ is the sample size, and $K$ is some symmetric function integrating to one. We may
think of $\hat{f}_T(x)$ as a smoothed histogram, with window bin equal to $h$. It is possible to show
that as $T$ goes to infinity and $h$ goes to zero at a certain rate, $\hat{f}_T(x)$ converges in probability
to $f(x)$, for all $x$. But we are not done, since there are no obvious rules to choose $K$ and $h$.
The choice of $h$ is notoriously difficult. Unfortunately, the bias, $\hat{f}_T(x) - f(x)$, tends to be
large exactly on the tails of $f(x)$, which do represent the region we are interested in. In general,
we can use Monte Carlo simulations out of a smoothed density like this to compute VaR.
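A smoothed histogram can be sketched in a few lines. The example below makes two assumptions that are not in the text: it uses a Gaussian kernel, and it reads VaR off the smoothed density by numerical integration on a grid rather than by Monte Carlo simulation. Function names and the grid size are illustrative.

```python
from math import exp, pi, sqrt


def gaussian_kernel(u):
    """Standard normal density, a symmetric function integrating to one."""
    return exp(-0.5 * u * u) / sqrt(2.0 * pi)


def kernel_density(x, data, bandwidth):
    """Smoothed histogram: (1 / (T h)) * sum_t K((x - x_t) / h)."""
    return sum(gaussian_kernel((x - x_t) / bandwidth) for x_t in data) / (len(data) * bandwidth)


def var_from_density(data, bandwidth, p=0.01, grid_size=4001):
    """VaR as minus the p-quantile of the smoothed return density (Riemann sum on a grid)."""
    lo = min(data) - 5.0 * bandwidth
    hi = max(data) + 5.0 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    mass = 0.0
    for i in range(grid_size):
        x = lo + i * step
        mass += kernel_density(x, data, bandwidth) * step
        if mass >= p:
            return -x
    return -hi


# Hypothetical two-week returns, for illustration only
returns = [-0.06, -0.03, -0.01, 0.0, 0.005, 0.01, 0.02, 0.025, 0.04]
print(var_from_density(returns, bandwidth=0.01, p=0.05))
```

The result is sensitive to the bandwidth, which is the practical counterpart of the difficulty in choosing $h$ noted above.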
Nonlinearities
Finally, portfolios of assets can behave in a nonlinear fashion, especially when the portfolio
contains derivatives. In general, the value of a portfolio including
assets is,
=
X
=1
X
=1
X
=1
There are technical difficulties with the very definition of VaR. VaR lacks solid statistical-theoretic foundations. VaR tells us that 1% of the time, losses will exceed the VaR figure, but
it does not tell us the size of the loss. So we need to compute the expected shortfall, i.e. the expected loss conditional on the loss exceeding the VaR. Any
risk measure should enjoy a number of sensible properties. Artzner et al. (1999) have noted a
number of such properties, and showed that VaR does not enjoy the so-called subadditivity property,
according to which the sum of the risk measures for any two portfolios should be no smaller than the
risk measure for the sum of the two portfolios. Expected shortfall, instead, does satisfy the
subadditivity property.
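A small discrete example makes the failure of subadditivity concrete. The loan below is hypothetical (it loses 100 with probability 4%), and the two functions are a sketch for discrete loss distributions only. At the 95% level, each loan taken alone has zero VaR, whereas a portfolio of two independent loans has a VaR of 100; expected shortfall does not display this anomaly.

```python
def value_at_risk(dist, alpha):
    """Smallest loss level l with Pr(Loss <= l) >= alpha; dist is a list of (loss, prob)."""
    cumulative = 0.0
    for loss, prob in sorted(dist):
        cumulative += prob
        if cumulative >= alpha:
            return loss
    return max(loss for loss, _ in dist)


def expected_shortfall(dist, alpha):
    """Average loss over the worst (1 - alpha) probability mass."""
    tail = 1.0 - alpha
    remaining, total = tail, 0.0
    for loss, prob in sorted(dist, reverse=True):
        used = min(prob, remaining)
        total += used * loss
        remaining -= used
        if remaining <= 0.0:
            break
    return total / tail


# Hypothetical loan: loses 100 with probability 4%, nothing otherwise
loan = [(0.0, 0.96), (100.0, 0.04)]
# Two independent such loans held together
portfolio = [(0.0, 0.96**2), (100.0, 2 * 0.96 * 0.04), (200.0, 0.04**2)]

ALPHA = 0.95
print("VaR:", value_at_risk(loan, ALPHA), value_at_risk(portfolio, ALPHA))
print("ES: ", expected_shortfall(loan, ALPHA), expected_shortfall(portfolio, ALPHA))
```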
13.5.2 Backtesting
How well would the VaR estimate have performed in the past? How often did the loss in a given
sample exceed the reference-period 99% VaR? If the exceptions occur more than 1% of the
time, there is evidence that the models leading to VaR estimates are misspecified, a nice
word for saying bad models.
The mechanics of backtesting is as follows. Suppose the models leading to the VaR are
good. By construction, the probability the VaR number is exceeded in any reference period
is $p$, where $p$ is the coverage rate for the VaR. Next, we go to our sample, which we assume
comprises $T$ days, and let $M$ be the number of days the VaR is exceeded. We wish to test
whether the number of exceptions we observe in the sample conforms to the expected number
of exceptions based on the VaR. For example, it might be that the number of exceptions we
have observed, $M$, is larger than the expected number of exceptions, $pT$. We want to make
sure this circumstance arose due to sample variability, rather than model misspecification. A
simple one-tail test is described below.
Let us compute the probability that in $T$ days, the VaR is exceeded for $M$ or more days.
Assuming exceptions are binomially distributed, this probability is,
$$\Pr(\text{No of exceptions} \geq M) = \sum_{k=M}^{T} \frac{T!}{k!\,(T - k)!}\, p^k (1 - p)^{T - k}.$$
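The tail probability of the one-tail test is straightforward to compute. In the sketch below, `n` and `m` play the role of the number of days in the sample and the number of exceptions; the 5% significance level and the function names are assumptions made for illustration.

```python
from math import comb


def prob_exceptions_at_least(m, n, p):
    """Pr(number of exceptions >= m) over n days, with i.i.d. exceptions of probability p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def reject_var_model(m, n, p, significance=0.05):
    """One-tail test: reject the VaR model if observing >= m exceptions is too unlikely."""
    return prob_exceptions_at_least(m, n, p) < significance


# Hypothetical sample: 250 days, 99% VaR, 7 exceptions observed
print(prob_exceptions_at_least(7, 250, 0.01), reject_var_model(7, 250, 0.01))
```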
time , +1 =
1
(v) The scenarios are generated for all the market variables, which would give us an artificial
multivariate sample of observations. We can use this sample for many things, including
VaR.
13.5.4 Credit risk and VaR
We can use the tools in Section 13.2 to assess the likelihood of default for a given name. The
important thing to do is to use the physical probability of default, not the risk-neutral one. The
risk-neutral probability of default is likely to be larger than the physical one. Therefore, using
the risk-neutral probability leads to too conservative estimates.
VaR for credit risks poses delicate issues as well. The key issue is the presence of default
correlation. In practice, defaults among names or loans are likely to be correlated, for many
reasons. First, there might be direct relationships or, more generally, network effects, among
names. Second, firms' performance could be driven by common economic conditions, as in the
one factor model which we now describe. This one factor model, developed by Vasicek (1987),
is at the heart of Basel II. In the appendix, we provide additional technical details about how
this model is related to a modeling tool known as copulae functions. We now proceed to develop
this model in an intuitive manner. Let us define the following variable:
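$$z_i = \sqrt{\rho}\, F + \sqrt{1 - \rho}\, \epsilon_i, \qquad (13.62)$$
where $F$ is a common factor among the names in the portfolio, $\epsilon_i$ is an idiosyncratic term,
and $F \sim N(0, 1)$, $\epsilon_i \sim N(0, 1)$. As we explain in the Appendix, $\rho \geq 0$ is meant to capture the
default correlation among the names.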
Next, assume that the physical probability each firm defaults by time $T$, say $P(T)$, is the same
for each firm within the same class of risk, and given by,
$$P(T) = \Pr\left(z_i < \Phi^{-1}(\text{PD})\right) = \text{PD},$$
where $\Phi$ denotes the standard normal distribution function and $\Phi^{-1}$ denotes the inverse of $\Phi$. One economic interpretation of Eq. (13.62) is that $z_i$ is
the value of a firm and that the firm defaults whenever this value hits some exogenously given
barrier, $\Phi^{-1}(\text{PD})$.
Conditionally upon the realization of the macroeconomic factor $F$, the probability of default
for each firm is,
$$p(F) \equiv \Pr(\text{Default} \mid F) = \Phi\left(\frac{\Phi^{-1}(\text{PD}) - \sqrt{\rho}\, F}{\sqrt{1 - \rho}}\right). \qquad (13.63)$$
By the law of large numbers, this is quite a good approximation to the default rate for a portfolio
of a large number of assets falling within the same class of risk.
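We see that this conditional probability is decreasing in $F$: the larger the level of the common
macroeconomic factor, the smaller the probability each firm defaults. Hence, we can fix a value
of $F$ such that $\Pr(\text{Default} \mid F)$ = Default rate is what we want. Note, the probability that $F$ is larger
than $-\Phi^{-1}(x)$ is just $x$! Formally,
$$\Pr\left(F > -\Phi^{-1}(x)\right) = \Pr\left(-F < \Phi^{-1}(x)\right) = \Phi\left(\Phi^{-1}(x)\right) = x.$$
Therefore, with probability 99.9%, the default rate does not exceed
$$\text{VaR}_{\text{Credit Risk}}(0.999) \equiv p\left(-\Phi^{-1}(0.999)\right) = \Phi\left(\frac{\Phi^{-1}(\text{PD}) + \sqrt{\rho}\, \Phi^{-1}(0.999)}{\sqrt{1 - \rho}}\right).$$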
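The conditional default probability of Eq. (13.63), and the default rate that is not exceeded with 99.9% confidence, can be computed as follows. This is a sketch with illustrative function names; the inputs in the usage line (PD = 2%, $\rho$ = 0.1) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_pd(pd, rho, factor):
    """Eq. (13.63): default probability given the realization of the common factor."""
    return STD_NORMAL.cdf((STD_NORMAL.inv_cdf(pd) - sqrt(rho) * factor) / sqrt(1.0 - rho))


def credit_var(pd, rho, confidence=0.999):
    """Default rate not exceeded with the given confidence: p(F) at F = -Phi^{-1}(confidence)."""
    return conditional_pd(pd, rho, -STD_NORMAL.inv_cdf(confidence))


print(f"99.9% worst-case default rate: {credit_var(0.02, 0.1):.4f}")
print(f"Excess over expected loss rate: {credit_var(0.02, 0.1) - 0.02:.4f}")
```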
The reason Basel II requires the term $\text{VaR}_{\text{Credit Risk}}(0.999) - \text{PD}$, rather than just $\text{VaR}_{\text{Credit Risk}}(0.999)$,
is that what is really needed here is the capital in excess of the 99.9% worst case loss over the
expected idiosyncratic loss, PD. Well functioning capital markets should already discount the
idiosyncratic losses.
Finally, Basel II requires banks to compute $\rho$ through a formula in which $\rho$ is inversely related
to PD. The formula is based on empirical research (see Lopez, 2004): for a firm which becomes
less creditworthy, the PD increases and its probability of default becomes less affected by market
conditions. Basel II also requires banks to compute a maturity adjustment factor that takes into
account that the longer the maturity, the more likely it is that a given name might eventually migrate
towards a more risky asset class.
The previous model can be further elaborated. We ask: (i) What is the unconditional probability of defaults, and (ii) what is the density function of the fraction of defaulting loans?
First, note that conditionally upon the realization of the macroeconomic factor $F$, defaults
are obviously independent, being then driven by the idiosyncratic terms in Eq. (13.62). Given
$N$ loans, and the realization of the macroeconomic factor $F$, these defaults are binomially
distributed as:
$$\Pr(\text{No of defaults} = k \mid F) = \binom{N}{k}\, p(F)^k \left(1 - p(F)\right)^{N - k},$$
where $p(F)$ is as in Eq. (13.63). Therefore, the unconditional probability of defaults is:
$$\Pr(\text{No of defaults} = k) = \int_{-\infty}^{\infty} \Pr(\text{No of defaults} = k \mid F)\, \phi(F)\, dF,$$
where $\phi$ denotes the standard normal density. This formula provides a valuable tool of analysis in
risk-management. It can be shown that VaR levels increase with the correlation $\rho$.
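The unconditional distribution of the number of defaults can be obtained by numerical integration over the common factor. The sketch below uses the trapezoid rule on a truncated grid, which is an implementation choice rather than anything prescribed by the text; function names, grid size and the inputs in the usage lines are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_pd(pd, rho, factor):
    """Eq. (13.63): default probability given the realization of the common factor."""
    return STD_NORMAL.cdf((STD_NORMAL.inv_cdf(pd) - sqrt(rho) * factor) / sqrt(1.0 - rho))


def default_count_distribution(n_loans, pd, rho, grid_size=2001, bound=8.0):
    """Pr(No of defaults = k), k = 0..n_loans, integrating the binomial over the factor.

    The integral against the standard normal density uses the trapezoid rule on [-bound, bound].
    """
    step = 2.0 * bound / (grid_size - 1)
    probs = [0.0] * (n_loans + 1)
    for i in range(grid_size):
        factor = -bound + i * step
        weight = STD_NORMAL.pdf(factor) * step * (0.5 if i in (0, grid_size - 1) else 1.0)
        p = conditional_pd(pd, rho, factor)
        for k in range(n_loans + 1):
            probs[k] += weight * comb(n_loans, k) * p**k * (1.0 - p) ** (n_loans - k)
    return probs


low = default_count_distribution(10, 0.05, 0.01)
high = default_count_distribution(10, 0.05, 0.50)
print("Pr(4 or more defaults):", sum(low[4:]), sum(high[4:]))
```

The expected number of defaults is the same for every $\rho$; what the correlation changes is the weight in the tail of the distribution, which is what drives VaR.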
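Next, let $\omega$ denote the fraction of defaulting loans. For a large portfolio of loans, $\omega = p(F)$,
such that:
$$\Pr(\omega \leq x) = \int \Pr(\omega \leq x \mid F)\, \phi(F)\, dF = \int 1I_{\{p(F) \leq x\}}\, \phi(F)\, dF = \int_{F^*}^{\infty} \phi(F)\, dF = \Phi(-F^*), \qquad (13.64)$$
where $1I$ denotes the indicator function, and $F^*$ solves $p(F^*) = x$, i.e. $\sqrt{1 - \rho}\, \Phi^{-1}(x) = \Phi^{-1}(\text{PD}) - \sqrt{\rho}\, F^*$. Solving for $F^*$ leaves:
$$F^* = \frac{\Phi^{-1}(\text{PD}) - \sqrt{1 - \rho}\, \Phi^{-1}(x)}{\sqrt{\rho}}.$$
It is the threshold value taken by the macroeconomic factor that guarantees a frequency of defaults less than $x$. Replacing $F^*$
into Eq. (13.64) delivers the cumulative distribution function
for $\omega$. The density function $f(x)$ for the frequency of defaults is then:
$$f(x) = \sqrt{\frac{1 - \rho}{\rho}}\, \exp\left\{\frac{1}{2}\left(\Phi^{-1}(x)\right)^2 - \frac{1}{2\rho}\left(\sqrt{1 - \rho}\, \Phi^{-1}(x) - \Phi^{-1}(\text{PD})\right)^2\right\}.$$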
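The distribution function and the density of the fraction of defaults are simple to code, and one can check numerically that the density is the derivative of the distribution function. The sketch below uses illustrative function names and hypothetical inputs (PD = 2%, $\rho$ = 0.1).

```python
from math import exp, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def default_rate_cdf(x, pd, rho):
    """Pr(fraction of defaults <= x) for a large homogeneous portfolio."""
    arg = (sqrt(1.0 - rho) * STD_NORMAL.inv_cdf(x) - STD_NORMAL.inv_cdf(pd)) / sqrt(rho)
    return STD_NORMAL.cdf(arg)


def default_rate_density(x, pd, rho):
    """Density of the fraction of defaults, obtained by differentiating the cdf."""
    y = STD_NORMAL.inv_cdf(x)
    z = sqrt(1.0 - rho) * y - STD_NORMAL.inv_cdf(pd)
    return sqrt((1.0 - rho) / rho) * exp(0.5 * y * y - z * z / (2.0 * rho))


x, h = 0.05, 1e-5
numerical = (default_rate_cdf(x + h, 0.02, 0.1) - default_rate_cdf(x - h, 0.02, 0.1)) / (2 * h)
print(numerical, default_rate_density(x, 0.02, 0.1))
```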
This model can be generalized to one where the asset value of the rm,
multifactor model,
X
=
+
, is given by a
=1
values, feeding a vicious feedback loop. Bernanke, Gertler and Gilchrist (1999) present a unified
view of how agency problems make funding opportunities depend on firms' collateral.15
Fisher (1933) is one of the earliest proponents of these procyclicality issues, in his attempt
to explain the origins of the Great Depression through a debt-deflation spiral. In an economy
with highly levered firms, such as that of the US during the 1930s, a negative productivity
shock leads to bankruptcy of a fraction of these firms, which generates less investment and,
hence, depresses aggregate demand and creates deflation. In turn, deflation boosts the real
value of debt borne by firms, increasing the firms' burden and leading a higher fraction of these
firms to default. Such a debt-deflation spiral results in a deterioration of the balance sheet of
financial intermediaries (banks obviously bleed money as their borrowers default) and in a
default contagion, from firms to financial intermediaries. As a result, financial intermediation
shrinks and the vicious feedback loop might go on for a considerably long period.16
[Explain the connections with the credit view and the previous footnote on Friedman and
Schwartz (1963)]
This section provides a discussion of these procyclicality problems, arising through the balance
sheets of financial intermediaries. Section 13.6.1 is an overview of the extant regulatory framework, which is useful whilst framing procyclicality issues dealt with later in this section. Section
13.6.2 reviews a few institutional facts surrounding the 2007 subprime turmoil. Section 13.6.3
develops a few models where the amplification of small shocks occurs because financial intermediaries have concerns over the structure of their books. Thus, following a negative shock
affecting the assets in the balance sheet, banks need to restore their Tier 1 and Tier 2 capital
(in short, their top tier capital) and leverage ratios. Since they cannot raise fresh capital in
the short-run, they cash-in by selling some of their assets. These sales create a vicious feedback
loop where banks sell assets, contributing to a further drop in the value of these assets, triggering further sales into a depressed market. We may have situations where this loop leads to
a complete market dry-up, which is even more likely to occur in the presence of capital market
frictions, where some initially moderately low liquidity frictions can turn into spots of liquidity
black holes.
Even absent such extreme situations, the equilibria in these markets can be those where an
initial small loss in the banking system is amplified, to an extent determining a very substantial
lending shrinkage, a credit crunch. Section 13.6.4 discusses the policy that monetary authorities
have implemented in their attempt to mitigate the credit crunch originating from the 2007
subprime crisis. The standard policy action against a recession is to target low interest rates
in the interbank markets for mandatory reserves. However, the cost of capital that matters
to a recovery in the economic activity is that faced by firms whilst demanding new funds from
banks (through loans) and/or the market (through issuance of corporate bonds). This cost can
be substantially higher than the interest rates targeted by the monetary authority, due to the
credit crunch. Quantitative easing is an unconventional policy action, where the monetary
15 Borio, Furfine and Lowe (2001) explain that the agents' misperception of risk might constitute an additional amplification
mechanism. For example, the credit/GDP ratio might be procyclical because financial intermediaries under-estimate risk in good
times, and over-estimate risk in bad, thereby lending too much in good times and too little in bad.
16 This view of the Great Depression was challenged by Friedman and Schwartz (1963), who proposed a monetary view instead.
According to this view, the causes of the prolonged recession and the banking crises over the 1930s need to be linked to a nonaccommodating monetary policy. Friedman and Schwartz examine the US economy from the Civil War through 1960, and find a
statistical relation between monetary policy and developments in the real macroeconomic aggregates: an expansionary monetary
policy is associated with an expansion of the real economy. Friedman and Schwartz find that this linkage is particularly strong
over the 1930s, and go further, suggesting a causality from monetary policy to developments in the real economy. According
to them, the only role banks might have played over the crisis was their contribution to the shrinkage in money supply through a
lower money multiplier, defined in Section 13.6.3.
authority engages in the purchase of some of the assets held by banks (including the most
illiquid ones), so as to give banks incentives to start lending again.
13.6.1 Regulatory framework
Banks have to set up capital buffers to guarantee the debt they issue against their risky activities
of lending and investing. The Basel Committee on Banking Supervision (BCBS)17 drafts accords
aiming to create an international standard for the capital necessary to cope with these risks,
together with rigorous tools for risk measurement and management. Quite simply, the greater
the risk a bank is exposed to, the greater the amount of capital the bank needs to hold to
safeguard its solvency, in the interest of overall economic stability. The main issue, then, is to
correctly measure this risk.
The first accord of 1988, known as Basel I, focussed on minimal capital requirements to cope
with credit risk, and was enforced by law by the Group of Ten in 1992 and then by more than
100 countries. It relied on the so-called Cooke ratio (after Peter Cooke of the Bank of England),
a minimum capital adequacy standard of 8% of the total risk-weighted assets. The accord was
quite coarse, in that it considered five broad classes of credit risk with which to weigh the assets,
and did not discriminate among credit qualities within each class. For example, corporate loans
had 100% weightings and loans to OECD countries had zero weightings, independently of the
ratings of the borrowing entities.
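As a minimal sketch of how the Cooke ratio works, the following Python snippet computes risk-weighted assets and the resulting capital ratio for a hypothetical bank. Only the two risk weights quoted above (100% for corporate loans, 0% for loans to OECD countries) are used; the exposures, the capital figure and all function names are illustrative assumptions.

```python
# Basel I style risk weights; only the two quoted in the text are included.
RISK_WEIGHTS = {"corporate_loans": 1.00, "oecd_sovereign_loans": 0.00}
COOKE_MINIMUM = 0.08  # minimum capital as a fraction of risk-weighted assets


def risk_weighted_assets(exposures):
    """Sum of exposures, each scaled by the risk weight of its class."""
    return sum(RISK_WEIGHTS[asset_class] * amount
               for asset_class, amount in exposures.items())


def cooke_ratio(capital, exposures):
    """Regulatory capital divided by total risk-weighted assets."""
    rwa = risk_weighted_assets(exposures)
    if rwa <= 0:
        raise ValueError("risk-weighted assets must be positive")
    return capital / rwa


# Hypothetical balance sheet (amounts in arbitrary units).
book = {"corporate_loans": 80.0, "oecd_sovereign_loans": 120.0}
capital = 8.0

rwa = risk_weighted_assets(book)
ratio = cooke_ratio(capital, book)
print(f"RWA = {rwa:.1f}, capital ratio = {ratio:.1%}, "
      f"compliant = {ratio >= COOKE_MINIMUM}")
```

In this example the 120 units lent to OECD countries carry no capital charge at all, whatever the borrowers' ratings, which is the coarseness the text refers to.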
The first amendment to Basel I occurred in 1996, and aimed to include tools to cope with
market risk. In 1999, a first consultative paper was drafted on a new accord, known as Basel II.
One of the main issues under reform was the one-size-fits-all approach of Basel I, that is, the fact
that default risk could be substantially lower for some of the assets within the same class of risk
in the banks' accounts. For example, banks could have securitized the loans with default risk
lower than that implied by the flat rate within the same class, and held those loans with higher
default risk. This might have led to an increase in the overall riskiness of financial institutions.
In 1998, the Federal Reserve Chairman Alan Greenspan pointed to the existence of incentives
left to banks to implement regulatory arbitrage:
Banks arbitrage away inappropriately high capital requirements on their safest assets by
removing these assets from the balance sheet via securitization. The issue is not solely
whether capital requirements on the banks' residual risk in the securitized assets are
appropriate. We should also be concerned with the sufficiency of regulatory capital requirements on the assets remaining on the book. In the extreme, such cherry picking
would leave on the balance sheet only those assets for which economic capital allocations
are greater than the 8 percent regulatory standard. [Greenspan, 1998, p. 166]
There is a consensus that Basel II did indeed considerably mitigate these issues, by paying
more attention to risk-sensitivity by means of a more precise set of indications about classes of
risk and, also, distinguishing among credit risk, market risk and even operational risks.18 Moreover, the Basel II accords aimed at a flexible supervisory system whereby banks could choose
17 The BCBS is a committee of banking supervisory authorities established by the central bank governors of the Group of Ten
countries in 1975. It consists of senior representatives of bank supervisory authorities and central banks from Belgium, Canada,
France, Germany, Italy, Japan, Luxembourg, the Netherlands, Spain, Sweden, Switzerland, the United Kingdom, and the United
States. It usually meets at its permanent Secretariat, located at the Bank for International Settlements in Basel.
18 Operational risk is defined as the risk of losses resulting from inadequate or failed internal processes, people and systems, or
external events. Examples of operational risk include two famous cases of rogue trading: Nick Leeson, who in 1995 led Barings
Bank to bankruptcy, through a loss of $1.3bn, and Jérôme Kerviel, who in 2008 led Société Générale to a loss of €5bn.
=1 0 08
For example, the capital requirements for market risk can be determined through dedicated
VaR models, such as (and, possibly, more sophisticated versions of) those surveyed in Section
13.5. The (total) minimum capital requirements are taken to be 8% of the total risk-weighted
assets,
$$\frac{\text{Regulatory capital}}{\text{Total RWA}} \geq 8\%.$$
One immediate issue arising with Basel II is its heavy reliance on credit rating agencies as
regards the standardized approach to credit risk. This approach might be misguided
due to conflicts of interest between credit rating agencies and the firms whose debt these
agencies rate. At the time of writing, rating agencies are mostly unregulated, with the quality
of their credit risk estimates being observable only with lags and, importantly, too late should
serious mispricing take place.
19 Note that within the internal rating approach, banks are not allowed to use internal models of credit risk. Banks that have
received supervisory approval to use the internal approach may rely on their own internal estimates of risk components in determining
the capital requirement for a given exposure. The risk components include: (i) measures of the probability of default, (ii) loss given
default, (iii) exposure at default, (iv) effective maturity.
A second issue is procyclicality: in bad times, banks reduce lending, exacerbating the
current economic developments, which makes banks reduce lending even further, in a vicious
circle. Sections 13.6.3 and 13.6.4 deal with procyclicality issues. The danger of procyclicality
in a regulatory context is that in bad times, when risks intensify, the banking system is given
an additional burden due to regulatory capital, which might lead to a further lending shrinkage.
Basel III introduces a regulatory device that helps mitigate cyclicality by requiring banks to
build up capital buffers in good times with which to cope in adverse times.
After a number of additional consultative papers, national regulators indicated to the Financial Stability Institute (FSI)20 that they would implement Basel II by 2015. While the European Union was implementing Basel II through its EU Capital Adequacy Directive (CAD III), the global financial crisis following the 2007 subprime events prompted a
re-thinking of Basel II, leading to a new set of rules, known as Basel III. The main innovations
of Basel III are summarized by the following points: (i) new capital requirements, such as those
summarized in the table below, as well as a new mandatory capital conservation buffer of 2.5%
of the Total RWA, to face economic stress; (ii) new rules that allow national supervisors to
require banks to set up capital up to 2.5% of the Total RWA in times of high credit growth
(countercyclical capital buffers); and (iii) a target for the leverage ratio, defined as the ratio of
Tier 1 capital (i.e. equity plus reserves minus intangible assets) over total assets net of intangible assets, to be at least 3%
($\frac{\text{Tier 1}}{\text{Assets}} \geq 3\% \iff \text{Lev} \equiv \frac{\text{Non-Tier 1}}{\text{Tier 1}} \leq \frac{1}{0.03} - 1$), as well as additional
liquidity ratios. The following table summarizes the main differences in capital requirements
that Basel III introduces against Basel II.
Capital requirements as a % of the Total RWA
                                 Basel II    Basel III
Common equity                    2%          4.5%
Tier 1                           4%          6%
Total capital                    8%          8%
Common equity (conservation)     -           2.5%
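The Basel III thresholds in the table, together with the 3% leverage ratio target, can be turned into a simple compliance check. The sketch below is illustrative only: the bank's figures and the function name are hypothetical, and it assumes that the conservation buffer is met with common equity on top of the 4.5% minimum.

```python
# Basel III minima from the table above, as fractions of Total RWA.
MIN_COMMON_EQUITY = 0.045
MIN_TIER1 = 0.06
MIN_TOTAL_CAPITAL = 0.08
CONSERVATION_BUFFER = 0.025
MIN_LEVERAGE_RATIO = 0.03  # Tier 1 over total assets net of intangibles


def basel3_checks(common_equity, tier1, total_capital, rwa, net_assets):
    """Return a dict of pass/fail flags, one per requirement."""
    ce_ratio = common_equity / rwa
    return {
        "common_equity": ce_ratio >= MIN_COMMON_EQUITY,
        # Assumption: the buffer is held as common equity above the minimum.
        "common_equity_plus_buffer":
            ce_ratio >= MIN_COMMON_EQUITY + CONSERVATION_BUFFER,
        "tier1": tier1 / rwa >= MIN_TIER1,
        "total_capital": total_capital / rwa >= MIN_TOTAL_CAPITAL,
        "leverage": tier1 / net_assets >= MIN_LEVERAGE_RATIO,
    }


# Hypothetical bank (amounts in arbitrary units).
checks = basel3_checks(common_equity=6.0, tier1=7.0, total_capital=9.0,
                       rwa=100.0, net_assets=250.0)
for requirement, passed in checks.items():
    print(f"{requirement:28s} {'ok' if passed else 'shortfall'}")
```

This hypothetical bank meets the three risk-weighted minima, yet falls short both of the conservation buffer (6% against 7%) and of the leverage target (2.8% against 3%), which shows that the leverage requirement is a separate constraint from the risk-weighted ones.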
13.6.2 The 2007 subprime crisis
The 2007 subprime crisis developed out of a mixture of coincident factors. One factor
certainly regards the institutional details through which MBS securities were traded: a shadow
banking system that escaped the official financial community.
A second factor was model misspecification, that is, the fact that the evaluation framework
for these securities relied on unrealistic assumptions such as stability of delinquencies, reliance
on expected (linear) actuarial losses (not tail risk losses), and small risk-aversion or liquidity
adjustments. For example, the picture below shows delinquencies were actually rising fast
for the relatively newly created subprime mortgages; in retrospect, this piece of information
could have helped predict that a crisis in the mortgage market was about to arrive. Another
dimension of model misspecification was a reliance on an inappropriate rating mapping system, by which rating agencies tended to rate structured products relying on MBS by using the
rating system they had in place for corporations. Finally, additional elements such as default
risk correlation were not duly taken into account while calibrating the models.
20 The FSI, headquartered by the Bank for International Settlements in Basel, was established in 1999 in response to the Asian
crisis of 1997.
This section pulls all these elements together while providing a succinct account of the subprime crisis. The crisis erupted while MBS derivatives were producing losses in a market where
the identity of the players was unclear as these products had been channelled through the
off-balance-sheet vehicles of the shadow banking system.
[Figure: Subprime mortgage delinquencies by vintage year (60+ day delinquencies, in % of balance), with one curve for each of the 2000 and 2003-2007 vintages.]
On the funding side, a typical SIV (Structured Investment Vehicle) issues long-maturity notes.
On the asset side, a SIV typically relies on assets that are more complex than those conduits rely
on. SIVs tended to be more leveraged than conduits. Recall: SPV = Special Purpose
Vehicle, i.e. a vehicle that organizes securitization of assets; SIV = Structured Investment
Vehicle, i.e. a fund that manages asset backed securities. In a sense, SIVs are virtual banks, in
that they borrow through low-interest securities and invest in longer term securities yielding
large rewards (and risk), as we discuss below. SIVs and conduits typically have an open-ended
lifespan.
SIV-lites are less conservatively managed and are structured with greater leverage. Their
portfolios are not much diversified, and are much smaller in size than those of SIVs. SIV-lites had a
finite lifespan, with a one-off issuance vehicle. They were greatly exposed to the U.S. subprime
market, more so than SIVs.
Off-balance-sheet entities borrow in the shorter term, typically through commercial paper or
auction rate securities with average maturity of 90 days, as well as medium term notes with
average maturity of a year. They purchase long-maturity debt, such as financial corporate bonds
or asset-backed securities, which is high-yielding. Naturally, the profits made by these entities
are paid to the capital note holders, and the investment managers. The capital note holders
are, of course, the first-loss investors.
The obvious risk incurred by these entities is solvency, a risk that materializes when the value
of long-term assets falls below the value of short-term liabilities. This risk has great chances
to materialize when the pricing of the assets is informal, as argued below. A second risk
is funding liquidity, the risk related to duration mismatch: refinancing occurs on a short-term
frequency, but if short-term market conditions are bad, the entities need to sell the assets into
a depressed market. To cope with this risk, the sponsoring banks would grant credit lines.21
13.6.2.2 Credit ratings and model misspecification
The role of credit ratings was crucial to determine the riskiness of MBS-related derivatives, as
explained in Section 13.4.3. Credit ratings were endogenous in the securitization and tranching
process: they would not be applied to an exogenously given tranching scheme; rather, they
would be used to determine the riskiness of the tranching scheme. Another important point is
that many of these MBS securitized assets were illiquid, which did not facilitate pricing. Thus,
the pricing of structured products would rely on the pricing of products that were similarly
rated and for which quotes were available. For example, the price of AAA ABX subindices
would be used to estimate the value of AAA-rated tranches of MBS. Or, the price of BBB
subindices would be used to value BBB-rated MBS tranches. This is the mapping role credit
ratings played for the pricing of customized or illiquid structured credit products.
Yet it is well-known that the risk profile of structured products differs from that of corporate
bonds. Even if a tranche has the same expected loss as an otherwise similar corporate bond,
unexpected loss or tail risk can be much larger than that for corporate bonds given the complexity of the product (see the Matryoshka, or Russian doll, scheme below). All in all, it would be
misleading to extrapolate structured product ratings from corporate bond ratings. Typically,
corporate bond ratings only capture the first moments of the distribution. Finally, credit rating
inertia for bonds does not necessarily work for structured products. Rating deterioration for
structured products can travel very fast.22
Two additional fundamental aspects contributed to the meltdown. First, there was an erosion in lending standards: statistical models were based on historically low mortgage default
and delinquency rates that arose in a credit environment with tight credit standards. Second,
21 Typical sponsors at the time were Citibank ($100bn), JP Morgan Chase ($77bn), Bank of America ($60bn). In the European
Union: HBOS ($42bn), ABN Amro ($40bn), HSBC ($32bn).
22 See IMF 2008 report.
there were correlation issues: past data suggested a quite weak correlation between regional
mortgages, which gave investors a sense of diversification. However, the housing
market grinding to a halt turned out to be a nation-wide phenomenon.
13.6.2.3 The meltdown
One crucial element of the crisis was the market fear of contagion from the rising level of
defaults in subprime underlying instruments, many of which were incorporated in complex
products. Fears of contagion concerned safer tranches as well. They came from investors'
understanding that the pricing models were misspecified, and from their lack of trust vis-à-vis the rating
agencies.
Banks were affected for a number of reasons: (i) they had invested in subprime securities
directly; (ii) they had provided credit lines to SIVs (indebted through commercial paper) and
conduits that held these securities, thereby creating a shadow banking system, which escaped
accounting and supervision rules; and (iii) this very same shadow system generated banks' loss
of confidence in the ability of their counterparties to meet their contractual obligations. So
the Asset Backed Commercial Paper market dried up, triggering credit lines. The result was a
sell-off of anything related to structured finance, from junk to AAA, which led to a complete
liquidity black hole, and a severe reappraisal of structured finance.
The reappraisal of structured finance determined severe writedowns, also due to a liquidity
black hole fueled by a difficult repricing. Indeed, in the absence of a liquid market, writedowns
largely rely on marking-to-model. But investors began not to trust the models and the rating
process leading to them. Meanwhile, credit agencies proceeded to severe downgrades, confirming
the investors' beliefs that previous ratings were based on misspecified assumptions, a quite self-reinforcing mechanism. These events escalated to a complete dry-up in September-October
2008, partly restored by painful bank bail-outs and recapitalizations.
[In progress, explain Lehman's experience]
13.6.3 Top tier capital ratio targets and endogenous volatility
Treating volatility or credit risk as exogenous could be a good approximation whilst living
in good times. The quality of this approximation deteriorates in times of crisis. The implicit
assumption made in many instances of this and previous chapters is that one's own actions,
based on a volatility forecast, do not affect future volatility, just like forecasting weather does
not influence future weather. Arguably, the actions of many heterogeneous market participants
should tend to cancel each other out during periods of calm. However, market participants
tend to cluster their decision rules in periods of crisis. The literature on this endogenous risk
is quite fascinating, as are the surveys in Shin (2010), or the recent modeling framework put
forward by Danielsson, Shin and Zigrand (2011).
This section develops a simple model of endogenous risk, where markets can be destabilized
by one instance of procyclicality, arising because financial institutions need to comply with a
given top tier capital (i.e. the capital comprising Tier 1 and Tier 2, using the terminology of
Section 13.6.1) ratio. After a negative shock in the value of the assets on the balance sheet, a
financial institution needs to restore its top tier capital ratio. In the short-run, the institution
can only restore this ratio through asset sales. Because every institution is doing the same,
these asset sales have a market impact, collectively, determining a further fall in the value of
the assets, and so on. The final outcome is an increased volatility of the risky asset price, as well
as a disproportionate asset sell-off, if compared to the initial shocks triggering it. The model
is useful to think about the subprime events in 2007, as well as the ensuing credit crunch and
the new solutions that monetary authorities have experimented to help mitigate these adverse
developments, as we shall discuss.
13.6.3.1 Model
We consider a model with many identical financial institutions complying with regulation or,
more generally, concerned with a pre-specified target of top tier capital ratio against risky
assets. Each institution has the following balance sheet.
Balance sheet, Time 1
  Assets                           Liabilities
  Risky assets        $A$          Debt      $D$
  Cash and reserves   $C$          Equity    $E$
The notation is a bit more elaborate than that used in Section 13.3: whilst we still define
$E$ as equity (including past retained earnings) and $D$ as debt, we now define $A$ as the value
of risky assets, no matter how liquid these can be. Moreover, we define $C$ to equal cash and
reserves. We suppose the 8% Cooke ratio is in place, or $E/A \geq 0.08$, and to simplify let $E = 0.08A$.
Note that a top tier capital rule does not determine a leverage rule: there are, obviously, many
leverage ratios, $L \equiv D/E$, consistent with a given top tier capital ratio $\kappa \equiv E/A$. In fact, the new Basel III
explicitly considers leverage ratios, thereby innovating upon Basel II, as discussed in Section
13.6.1. We shall deal with the procyclicality induced by this new rule later in this section.
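Next, assume that some exogenous shock takes place, which makes the value of the risky
assets decrease by some amount, $\Delta A$, after which each institution would have the following
balance sheet.
Balance sheet, Time 2
  Assets                                    Liabilities
  Risky assets        $A - \Delta A$        Debt      $D$
  Cash and reserves   $C$                   Equity    $E - \Delta A$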
But, each institution has to comply with its top tier ratio target. Therefore, at time 2, the
new top tier capital, $E - \Delta A$, must be at least 8% of the risky assets, $A - \Delta A$, or,
$$E - \Delta A = 0.08\,(A - \Delta A). \tag{13.65}$$
At time 1, the financial institution had set $E = 0.08A$. Therefore, Eq. (13.65) cannot hold, as
a simple computation reveals. The intuition behind this impossibility is simple. As the value
of the risky assets falls, the value of equity falls by a larger percentage than that of the risky
assets, due to leverage: the value of risky assets falls by $\Delta A / A$, whereas the value of equity falls
by a larger percentage, $\Delta A / E$, due to $E < A$.
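The "simple computation" can be spelled out as follows. The notation is assumed for illustration: $E$ is equity, $A$ the value of risky assets, $\Delta A$ the loss, $x \equiv \Delta A / A \in (0,1)$ the percentage loss, and $\kappa$ the capital ratio held at time 1, so that $E = \kappa A$ (with $\kappa = 0.08$ in the text).

```latex
\[
\frac{E-\Delta A}{A-\Delta A}
  = \frac{\kappa A - xA}{A - xA}
  = \frac{\kappa - x}{1-x}
  < \kappa
\quad\Longleftrightarrow\quad
\kappa - x < \kappa - \kappa x
\quad\Longleftrightarrow\quad
x\,(1-\kappa) > 0 ,
\]
\[
\text{while}\qquad \frac{\Delta A}{E} = \frac{x}{\kappa} > x .
\]
```

The last inequality in the first line always holds for $x > 0$ and $\kappa < 1$, so the ratio necessarily falls below its target after a loss. For instance, with $\kappa = 0.08$ and a 1% fall in the risky assets, equity falls by $0.01/0.08 = 12.5\%$ and the capital ratio drops to $0.07/0.99 \approx 7.07\%$.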
Two solutions are available to the financial institution: (i) to inject fresh capital; (ii) to sell
some of the risky assets. The first solution is not quite viable in the short-run. Let us analyze
the second solution. We are looking for some quantity of the risky asset to sell, such that the
reduction in value of the assets, say $\Delta_s$, is able to meet the top tier capital ratio target. In
terms of the balance sheet, we have the following situation.
Balance sheet, Time 3
  Assets                                               Liabilities
  Risky assets        $A - \Delta A - \Delta_s$        Debt      $D$
  Cash and reserves   $C + \Delta_s$                   Equity    $E - \Delta A$
How much of the risky asset value should the financial institution precisely get rid of? To
maintain the new top tier ratio, $\Delta_s$ must satisfy:
$$E - \Delta A = 0.08\,(A - \Delta A - \Delta_s).$$
Using $E = 0.08A$ yields,
$$\frac{\Delta_s}{A} = \frac{1 - 8\%}{8\%}\,\frac{\Delta A}{A} = 11.5\,\frac{\Delta A}{A}.$$
That is, roughly, the number of risky assets to sell is proportional to the percentage loss in their
value. In general,
$$\frac{\Delta_s}{A} = \frac{1-\kappa}{\kappa}\,\frac{\Delta A}{A}, \tag{13.66}$$
where $\kappa$ denotes the top tier capital ratio against risky assets. This result is intuitive: following
a negative shock affecting the value of the risky assets, the amount of asset sales is decreasing
in the pre-specified top tier capital ratio, $\kappa$, because the closer $\kappa$ is to 100%, the easier the
adjustment is to maintain the same $\kappa$. Eq. (13.66) would end the description of this market, if
the financial institution had no price impact.
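The sale rule in Eq. (13.66) can be checked numerically on a small balance sheet. The sketch below uses hypothetical figures and names (`kappa` for the capital ratio, `required_sales` for the rule) and assumes, as in the text, that equity is exactly `kappa` times risky assets before the shock and that sale proceeds are held as cash.

```python
def required_sales(loss, kappa=0.08):
    """Asset sales restoring equity / risky assets = kappa after a loss."""
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie strictly between 0 and 1")
    return (1.0 - kappa) / kappa * loss


# Hypothetical time-1 balance sheet (arbitrary units).
risky_assets, cash = 100.0, 5.0
equity = 0.08 * risky_assets
debt = risky_assets + cash - equity

loss = 1.0                      # shock: risky assets lose 1 unit of value
sales = required_sales(loss)    # risky assets converted into cash

equity_after = equity - loss    # the loss is borne by equity holders
risky_after = risky_assets - loss - sales
cash_after = cash + sales
ratio_after = equity_after / risky_after

print(f"sales = {sales:.2f}, capital ratio after sales = {ratio_after:.4f}")
```

A loss of 1 unit calls for sales of 11.5 units, after which equity of 7 stands against risky assets of 87.5, i.e. the 8% target is met again while debt is unchanged.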
We assume that financial institutions have a market impact, collectively: while the behavior
of one single institution cannot affect the price of the risky asset, many institutions doing the
same thing at the same time more likely could, thereby creating price pressures, with the price
of assets falling and triggering new sales into a depressed market, over a vicious feedback loop.
Is there an equilibrium for this loop? The answer relies on the way we think of selling pressure.
We model selling pressure by assuming that there is a continuum of financial institutions, and
that the asset value changes according to:
$$\frac{\Delta A}{A} = \epsilon + \lambda\,\frac{\Delta_s}{A}, \tag{13.67}$$
where $\epsilon$ denotes the initial shock and $\lambda \geq 0$ the price impact of asset sales. Without any
asset sales, the change in the asset value would simply equal the initial shock,
$$\frac{\Delta A}{A} = \epsilon. \tag{13.68}$$
Note that this solution is also that arising when the market is perfectly liquid, $\lambda = 0$. However,
we assume the existence of price pressure, as formalized by Eq. (13.67), determined by the
concern financial institutions have to comply with a top tier capital rule, leading them to a sale
$\Delta_s$ satisfying Eq. (13.66). The loop we have created is, then, the following. After an initial shock
affecting the risky asset value, $\epsilon$, financial institutions sell risky assets, to an extent proportional
to the percentage change in the asset value, according to Eq. (13.66). In turn, the sell-off entails
a further percentage drop in the asset value, as determined by Eq. (13.67). An equilibrium,
provided it exists, is a fixed point of this feedback loop, i.e. a situation where the loop stops
because the drops in the risky asset value do not happen anymore, and the asset sell-offs are
interrupted as a result, thereby providing no reasons for the asset value to fall any further.
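The feedback loop just described can be simulated round by round. The following is a minimal sketch, assuming a linear sale rule with slope (1 - kappa)/kappa and a linear price impact lam, where `eps`, `kappa` and `lam` are names chosen here for the initial shock, the capital ratio and the price impact; it is not meant as the author's own algorithm.

```python
def feedback_loop(eps, kappa=0.08, lam=0.05, tol=1e-12, max_rounds=10_000):
    """Iterate shock -> sales -> further price drop until nothing changes.

    Returns (price_drop, sales) as fractions of the initial asset value,
    or raises RuntimeError if the loop does not settle.
    """
    sales_per_drop = (1.0 - kappa) / kappa   # linear sale rule
    drop = eps                               # first round: the shock itself
    for _ in range(max_rounds):
        sales = sales_per_drop * drop        # sales triggered by the drop
        new_drop = eps + lam * sales         # price impact of those sales
        if abs(new_drop - drop) < tol:
            return new_drop, sales_per_drop * new_drop
        drop = new_drop
    raise RuntimeError("no fixed point reached: price impact too large")


drop, sales = feedback_loop(eps=0.01)
print(f"asset value drop = {drop:.4%}, sales = {sales:.4%}")
```

With kappa = 0.08 and lam = 0.05, each round adds 0.05 x 11.5 = 0.575 times the previous drop, so the iteration converges. Raising lam to 0.10 makes that factor exceed one, and the function reports that no fixed point is reached.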
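Such a fixed point is found by replacing Eq. (13.66) into Eq. (13.67), leaving:
$$\frac{\Delta A}{A} = m_A\,\epsilon, \qquad m_A \equiv \frac{\kappa}{\kappa - \lambda(1-\kappa)}. \tag{13.69}$$
Note that in order for this equilibrium to exist, we need that $\lambda$, the slope of the line
$\frac{\Delta_s}{A} \mapsto \frac{\Delta A}{A}$ in
Eq. (13.67), be less than $\frac{\kappa}{1-\kappa}$, the slope of the line
$\frac{\Delta_s}{A} \mapsto \frac{\Delta A}{A}$ in Eq. (13.66), or that $\lambda(1-\kappa) < \kappa$.
Intuitively, if the price impact was too large, the feedback from asset sales to the asset value
drops would create a perverse spiral such that the market would collapse.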
Eq. (13.69) shows, crucially, that the shock multiplier in that equation, $m_A$, is larger than 1. When financial institutions
have a concern over top tier capital ratios, the ultimate change in the asset value resulting
from an initial shock is larger than that we would have observed otherwise, say in Eq. (13.68).
For example, assuming a price impact $\lambda = 0.05$ implies a multiplier $m_A = 2.4$. Naturally, these
effects become less important as the market becomes more liquid, and do not matter anymore
in the limit case where the market is perfectly liquid, $\lambda = 0$.
What is the amount of asset sales resulting from this loop? Replacing Eq. (13.69) into Eq.
(13.66) yields:
$$\frac{\Delta_s}{A} = m_s\,\epsilon, \qquad m_s \equiv \frac{1-\kappa}{\kappa - \lambda(1-\kappa)}. \tag{13.70}$$
Note that the feedback loop might exert quite substantial effects on the amount of asset sales.
Assuming $\kappa = 0.08$, the multiplier $m_s$ would equal 11.5 in the absence of feedback, $\lambda = 0$, as
previously noted. The same multiplier more than doubles in the presence of feedbacks and a
price impact of just $\lambda = 0.05$, attaining a value $m_s = 27.1$.
This model thus formalizes the idea that even a small shock affecting the risky assets held by
financial institutions might lead to large sale adjustments and price corrections, similarly to
the developments inherent in the subprime events described in the previous section. In the model,
the concerns financial institutions have about top tier capital ratios lead them to substantial
asset sales in response to a shock, which are even more amplified in the presence of feedback
effects induced by liquidity frictions.
13.6.3.2 Multiple equilibria and market break-ups
In the model we analyze, an equilibrium exists under parameter restrictions that are independent of the realization of the initial shock, $\epsilon$. As noted, we simply need that the denominators
of the multipliers $m_A$ and $m_s$ in Eqs. (13.69) and (13.70) be strictly positive. We now present a
variant of the model, where an equilibrium fails to exist when the initial shock is sufficiently
large. We simply assume that the price pressure is nonlinear, differently from the linearity
assumption underlying Eq. (13.67). It is:
$$\frac{\Delta A}{A} = \epsilon + \lambda \left(\frac{\Delta_s}{A}\right)^2. \tag{13.71}$$
The quadratic term in Eq. (13.71) formalizes the idea that the price impact of asset sales does
not matter too much when the asset sales are limited, but becomes disproportionately high
when the asset sales are at a large scale. This convexity translates into a non-linearity of the
resulting feedback loop. Replacing Eq. (13.71) into Eq. (13.66), we find that in any equilibrium,
the amount of asset sales, $S \equiv \frac{\Delta_s}{A}$, is solution to the following quadratic equation
$$0 = f(S) \equiv S - \lambda\,\frac{1-\kappa}{\kappa}\,S^2 - \frac{1-\kappa}{\kappa}\,\epsilon. \tag{13.72}$$
We hypothesize the market is hit by a series of shocks. Initially, we assume that $\epsilon = 0$, such
that two equilibria are possible: one, where the sell-off is just zero, $S = 0$; and another, where
the sell-off is $S = \frac{\kappa}{(1-\kappa)\lambda}$. We assume the sell-off is zero. Then, we assume that a first positive shock
hits the market, with $\epsilon = 1\%$. The solid line in the next picture is the graph of the function $f(S)$ in
Eq. (13.72) in this case, with $S \equiv \frac{\Delta_s}{A}$ denoting the sell-off. There are two equilibria, corresponding to
$S : f(S) = 0$. We assume that the market
coordinates towards the leftmost one.
[Figure: graph of $f(S)$ against the sell-off, $S$.]
The three curves depict the graph of $f(S)$ in Eq. (13.72), when the initial shock is
$\epsilon = 1\%$ (solid line), $\epsilon = 2\%$ (dashed line), and $\epsilon = 5\%$ (dotted line), obtained assuming
a top tier capital ratio against risky assets $\kappa = 8\%$, and a price impact $\lambda = 0.05$. The
equilibrium sales are those where the curves intersect the horizontal axis, if any.
As the risky asset value is hit by one additional shock, say with $\epsilon = 2\%$, the graph of $f(S)$
shifts to South-East, to the dashed line, and the asset sales increase as a result, still being the
leftmost zero of $f(S)$. The market collapses when the shock is $\epsilon = 5\%$, leading the graph of $f(S)$
to a further shift to South-East, where no equilibria are left at all.
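The three cases can be verified by solving the quadratic equilibrium condition directly. The sketch below assumes the condition takes the form lam g S^2 - S + g eps = 0, with g = (1 - kappa)/kappa, which follows from combining a linear sale rule with a quadratic price impact; `eps`, `kappa`, `lam` and the function names are illustrative labels, and lam must be strictly positive.

```python
import math


def equilibrium_sales(eps, kappa=0.08, lam=0.05):
    """Real roots S of lam*g*S**2 - S + g*eps = 0, with g = (1 - kappa)/kappa."""
    g = (1.0 - kappa) / kappa
    a, b, c = lam * g, -1.0, g * eps
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []                      # no equilibrium: the market breaks down
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


def critical_shock(kappa=0.08, lam=0.05):
    """Largest shock for which an equilibrium still exists (zero discriminant)."""
    g = (1.0 - kappa) / kappa
    return 1.0 / (4.0 * lam * g * g)


for eps in (0.0, 0.01, 0.02, 0.05):
    roots = equilibrium_sales(eps)
    print(f"eps = {eps:.2f}: equilibria = {[round(s, 4) for s in roots]}")
print(f"critical shock = {critical_shock():.4f}")
```

Under these parameter values the leftmost equilibrium moves from 0 to about 0.124 and then 0.273 as the shock rises from 0 to 1% and 2%, and no real root is left at 5%; the threshold shock is about 3.8%.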
13.6.3.3 Deleveraging
The assumption made so far is that following a shock affecting the assets in the balance sheet,
financial institutions sell additional assets to the extent their top tier capital ratios are restored.
This section investigates additional adjustments, aiming to preserve leverage ratio targets. Denote the leverage ratio as
$L \equiv D/E$. The timing of the shock and banks' reaction is as in Section
13.6.3.1, with the exception that we now have one additional constraint: at time 3, financial
institutions also wish to call portions of their debt, $\Delta_l$ say, so as to comply with leverage ratio
targets, and achieve the following balance sheet.
Balance sheet, Time 3
under a deleveraging scenario with deep liquidity buffers
  Assets                                               Liabilities
  Risky assets        $A - \Delta A - \Delta_s$        Debt      $D - \Delta_l$
  Cash and reserves   $C + \Delta_s - \Delta_l$        Equity    $E - \Delta A$
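The term $\Delta_s$ is the usual sell-off needed to maintain top tier capital targets. The financial
institutions also target an amount of deleveraging $\Delta_l$: $\frac{D - \Delta_l}{E - \Delta A} = L$, or:
$$\frac{\Delta_l}{A} = L\,\frac{\Delta A}{A}, \tag{13.73}$$
and we initially assume that while doing so, they do not exhaust their liquidity buffers,
$$\Delta_l \leq C + \Delta_s. \tag{13.74}$$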
Note that under the condition in (13.74), deleveraging does not have a price impact because
it would imply banks are simply using cash to repay portions of their debt. In this case, $\Delta_s$ is
the same as that in Eq. (13.70). If, instead, $C + \Delta_s$ is not large enough, financial institutions
would have to sell additional assets, $\hat{\Delta}_s$ say, so as to have sufficient cash with which to meet
leverage ratio targets. Precisely, if the inequality in (13.74) does not hold, we need to have that,
at least,
$$\hat{\Delta}_s : \quad \hat{\Delta}_s = \Delta_l - (\Delta_s + C), \tag{13.75}$$
such that the balance sheet faced by the financial institutions at time 3 would be as in the
following alternative scenario:
Balance sheet, Time 3
under a deleveraging scenario leading to exhausting liquidity buffers
  Assets                                                                  Liabilities
  Risky assets        $A - \Delta A - \Delta_s - \hat{\Delta}_s$          Debt      $D - \Delta_l$
  Cash and reserves   $C + \Delta_s + \hat{\Delta}_s - \Delta_l$          Equity    $E - \Delta A$
To determine the feedback effects of the asset sales, $\Delta_s$ and $\hat{\Delta}_s$, replace the top tier capital
ratio condition, Eq. (13.66), and the leverage condition, Eq. (13.73), into Eq. (13.75), leaving,
$$\frac{\hat{\Delta}_s}{A} = \frac{(1+L)\kappa - 1}{\kappa}\,\frac{\Delta A}{A} - \frac{C}{A}. \tag{13.76}$$
Note that this expression is positive by assumption: we are assuming that over the deleveraging
process, banks would exhaust their liquidity buffers to the extent the condition in (13.74) does
not hold. Such a situation arises precisely due to a high leverage, $L$. For example, assuming
$\kappa = 0.08$, the loading for $\frac{\Delta A}{A}$ in Eq. (13.76) is positive only when $L > 11.5$. We would need to
observe values of $L$ larger than 11.5 (and state-dependent, i.e. depending on the realization of
$\epsilon$), in order for the condition in (13.74) to break down.
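We determine an equilibrium for the feedback loop in this market. Replace Eq. (13.76) into
Eq. (13.67), evaluated when the asset sales amount to $\hat{\Delta}_s$, i.e. $\frac{\Delta A}{A} = \epsilon + \lambda\,\frac{\hat{\Delta}_s}{A}$, and solve for
both the asset value drop and sales,
$$\frac{\Delta A}{A} = \hat{m}_A \left(\epsilon - \lambda\,\frac{C}{A}\right), \qquad \frac{\hat{\Delta}_s}{A} = \hat{m}_s\,\epsilon - \hat{m}_A\,\frac{C}{A}, \tag{13.77}$$
and the constants $\hat{m}_A \equiv \frac{\kappa}{\kappa - \lambda((1+L)\kappa - 1)}$ and $\hat{m}_s \equiv \frac{(1+L)\kappa - 1}{\kappa - \lambda((1+L)\kappa - 1)}$. To ensure an equilibrium exists,
we need that $\lambda$, the slope of the line $\frac{\hat{\Delta}_s}{A} \mapsto \frac{\Delta A}{A}$ in Eq. (13.67), be less than $\frac{\kappa}{(1+L)\kappa - 1}$, the slope of
the line $\frac{\hat{\Delta}_s}{A} \mapsto \frac{\Delta A}{A}$ in Eq. (13.76), or that the denominators of $\hat{m}_A$ and $\hat{m}_s$ be strictly positive.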
Finally, note that there is a third possibility available to banks: to sell additional assets even
when the condition in (13.74) holds. This possibility is relevant when the financial institutions
also have concerns over maintaining a certain level of liquidity buffers, some of them possibly
being mandatory. For example, if the liquidity target is at least $C$, we replace the inequality
in (13.74) with the stricter inequality, $\Delta_s - \Delta_l \geq 0$. The solutions for the asset value drop and
sales are the same as those in Eqs. (13.77), but with the terms involving $C$ being dropped,
$\frac{\Delta A}{A} = \hat{m}_A\,\epsilon$, and $\frac{\hat{\Delta}_s}{A} = \hat{m}_s\,\epsilon$, where $\hat{m}_A$ and $\hat{m}_s$ are the constants in Eqs. (13.77). The next picture depicts the two multipliers of $\epsilon$ for $\frac{\Delta A}{A}$ and
$\frac{\hat{\Delta}_s}{A}$ arising in this case, assuming the top tier capital ratio target is $\kappa = 0.08$, and the price
impact of asset sales is $\lambda = 0.05$.
[Figure, two panels: the multiplier of the initial shock for the asset value drop (panel "Shock") and for asset sales (panel "Sales"), each plotted against leverage, L, for L between 12 and 28.]
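The dependence of the two multipliers on leverage can be tabulated with a few lines of Python. This is a sketch under stated assumptions: the multipliers are taken to be kappa / (kappa - lam g) for the asset value and g / (kappa - lam g) for the additional sales, with g = (1 + L) kappa - 1, as implied by the existence condition discussed in the text; the function and variable names are hypothetical.

```python
def deleveraging_multipliers(leverage, kappa=0.08, lam=0.05):
    """Multipliers of the initial shock: (asset value drop, additional sales)."""
    g = (1.0 + leverage) * kappa - 1.0   # positive only if leverage > 1/kappa - 1
    if g <= 0:
        raise ValueError("additional sales arise only when (1 + L) * kappa > 1")
    denom = kappa - lam * g
    if denom <= 0:
        raise ValueError("no equilibrium: price impact too large for this leverage")
    return kappa / denom, g / denom


for leverage in range(12, 29, 4):
    m_value, m_sales = deleveraging_multipliers(leverage)
    print(f"L = {leverage:2d}: value multiplier = {m_value:5.2f}, "
          f"sales multiplier = {m_sales:6.2f}")
```

With kappa = 0.08 and lam = 0.05, the value multiplier rises from about 1.03 at L = 12 to about 5.7 at L = 28, and the sales multiplier from about 0.5 to about 94, in line with the ranges of the two panels. Under these assumptions an equilibrium ceases to exist once L reaches 31.5.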
Loans make deposits! The well-known mechanics underlying the creation of money relies on the
standard money multiplier, whereby new deposits made available to the banking system are
partially used to extend new loans, which generate further deposits, and so on. Mathematically,
the supply of money, say M1 aggregates, includes cash held by the public plus deposits,
$M = C^p + \mathit{Dep}$, with straightforward notation. Instead, the monetary base, or high-powered money,
is made of cash held by the public plus banks' cash and reserves,
$H = C^p + C$, where
$C$ denotes cash and reserves held by banks, as in the previous sections. Note the leakage banks
create over the circuit of money creation: because banks face liquidity needs possibly arising
in the short-term (vis-à-vis their clients and/or other banks), and possibly need to maintain
mandatory reserves with the central banks of the countries where they operate, they hoard $C$,
which escapes the loans-make-deposits loop.
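We assume that the ratios of cash held by the public to deposits, $C^p/\mathit{Dep}$, and of banks' cash and reserves to deposits, $C/\mathit{Dep}$, are constant and equal to $c$ and $r$, respectively, such
that money supply equals a money multiplier times monetary base, viz
$$M = \frac{1+c}{c+r}\,H. \tag{13.78}$$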
The value of the reserve ratio, $r$, depends on a variety of factors, such as: (i) the discount rate at which a
central bank lends money to banks; (ii) the interbank rates such as the LIBOR; (iii) the level
of mandatory reserves banks have to keep with their central banks; (iv) the interbank rate
for resources to allocate to mandatory reserves (the Federal Funds rate in the US); (v) the
risk/return tradeoff prevailing in other markets. Clearly, the value of $r$ increases with the
values of the items from (i) to (iv), and decreases when the tradeoff in (v) compares more
favourably to banks.
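The loans-make-deposits loop and its closed-form multiplier can be compared in a short simulation. The sketch below assumes that the public keeps cash equal to a fraction `c` of deposits and that banks hoard a fraction `r` of deposits as cash and reserves, lending out the rest; `c`, `r` and the function names are labels chosen here for illustration.

```python
def money_multiplier(c, r):
    """Closed-form ratio of money supply to monetary base."""
    return (1.0 + c) / (c + r)


def simulate_money_creation(base, c, r, rounds=500):
    """Follow an injection of base money through repeated rounds of lending."""
    cash_public = deposits = reserves = 0.0
    funds = base                              # money reaching the public
    for _ in range(rounds):
        new_deposits = funds / (1.0 + c)      # keeps cash / deposits equal to c
        cash_public += funds - new_deposits
        deposits += new_deposits
        new_reserves = r * new_deposits       # leakage: hoarded by banks
        reserves += new_reserves
        funds = new_deposits - new_reserves   # the rest is lent out again
    money_supply = cash_public + deposits
    monetary_base = cash_public + reserves
    return money_supply, monetary_base


money, base_check = simulate_money_creation(base=100.0, c=0.10, r=0.10)
print(f"simulated money supply = {money:.2f}, "
      f"closed form = {money_multiplier(0.10, 0.10) * 100.0:.2f}, "
      f"base recovered = {base_check:.2f}")
```

With c = r = 0.10 the multiplier is 1.1 / 0.2 = 5.5, so 100 units of base money support 550 units of money supply; a higher `r` shrinks each lending round and lowers the multiplier.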
13.6.4.2 Policy actions
Conventional policy strategies consist of actions aiming to affect the value of the monetary base, $H$, and of the reserve ratio, $r$. For
example, central banks can expand $H$ through open market operations, by purchasing short-term Government bonds. Note that this action likely affects $r$ as well, as the opportunity costs of
holding excess reserves increase as markets are flooded with more and more liquidity. There are,
obviously, limits to this action, arising when short-term interest rates get close to zero. Consider
the 2007 subprime events. We know that following a shock affecting the banks' books, the overall
adjustments can be quite substantial. Consider, for example, the model in the previous section,
where banks have concerns over the top tier capital ratio. After the shock takes place, banks
aim at a shrinkage in the asset value equal to $\Delta_s$. Moreover, the shrinkage can be even more
substantial, $\Delta_s + \hat{\Delta}_s$, in markets where banks are highly levered, and concerned about not
increasing their leverage even more as a result of the shock. The model is silent about which
particular assets (liquid or not) or loans are involved in the shrinkage plans, $\Delta_s$ or $\Delta_s + \hat{\Delta}_s$.
We define a credit crunch as the situation where banks decide to cut back on corporate loans and
bonds: they hold more reserves instead of lending money to the real sector.
A quite mechanical response to a credit crunch is to increase the monetary base, $H$, through, say, open market
operations. Put simply, a credit crunch entails a higher value of the reserves ratio, $r$, among other things. Monetary
aggregates, $M$, are destroyed as a result, but can be restored through an injection of
high-powered money. Precisely, by Eq. (13.78), the expansion of monetary base needed to maintain
the same supply of money, $M$, is $\Delta H = \frac{M}{1+c}\,\Delta r$, where $\Delta r$ is the increase in $r$ determined by
the credit crunch. This policy action is quite fundamental as it helps keep interest rates low, yet
it may not be enough when the credit crunch is so severe as to lead to very substantial
shrinkages in economic activity, as we now explain.
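To fix ideas, the offsetting injection implied by Eq. (13.78) can be checked numerically. The following is a minimal sketch in Python. The symbol names (`H` for the monetary base, `c` for the currency-to-deposits ratio, `r` for the reserves-to-deposits ratio) and all the numbers are illustrative assumptions, not values taken from the text.

```python
def money_supply(H, c, r):
    """Money supply as a multiplier times the monetary base, as in Eq. (13.78)."""
    return (1.0 + c) / (c + r) * H


def base_expansion_needed(M, c, dr):
    """Increase in the monetary base that keeps M unchanged when r rises by dr."""
    return M * dr / (1.0 + c)


# Hypothetical pre-crunch values (assumptions for illustration only).
H, c, r = 100.0, 0.10, 0.05
M = money_supply(H, c, r)

# A credit crunch raises the reserves ratio by dr.
dr = 0.03
M_crunch = money_supply(H, c, r + dr)        # money supply without any policy response
dH = base_expansion_needed(M, c, dr)         # offsetting injection of high-powered money
M_after = money_supply(H + dH, c, r + dr)    # money supply after the injection

print(f"M before: {M:.2f}, after crunch: {M_crunch:.2f}, after injection: {M_after:.2f}")
print(f"Required expansion of the monetary base: {dH:.2f}")
```

The offset is exact rather than approximate because, for a given money supply, the monetary base implied by Eq. (13.78) is linear in the reserves ratio.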
The effects of a credit crunch on the real sector of the economy are quite obvious, with an
increase in the cost of capital and a subsequent shrinkage in economic activity. For example,
the recession following the subprime events was spectacular, with industrial production falling
by approximately 13% on a yearly basis in March 2009, the largest drop since World War II.
The policy action was equally impressive: in less than two years after the subprime events, the
Fed was capable of pushing short-term rates close to zero although, during those periods, it
was already clear that this policy would not be likely to prevent an even deeper recession. All
in all, even if the Federal Funds rate and short-term rates on safe assets were close to zero, the
cost of capital firms had to bear was quite substantial as a result of the credit crunch.23 Note
that at that time, the credit crunch was also exacerbated by a freeze in the interbank lending
market, arising from concerns financial institutions had about counterparty risk.24
The events following the 2007 turmoil can be described as those of a liquidity trap, where
banks hold abundant liquidity and short-term rates are close to zero. Note that the nature
of this liquidity trap is different from the standard Keynesian liquidity trap, as formalized in
the Appendix of Chapter 1: the Keynesian trap arises when money demand is flat as a result
of the expectations investors have that future interest rates can only increase. In this case,
agents simply absorb any liquidity injections made by the monetary authority, and interest
rates remain trapped at some minimum rate, coinciding with the lowest, shadow interest
rate below which no investor is ready to bet on a further decrease. The liquidity trap we are
analyzing is different, and stems from the mere and mechanical circumstance that money supply
is so abundant as to have pushed short-term rates close to zero in the first place. However, in both
cases, the economy is trapped in that a further increase in the monetary base would have no
effects on short-term interest rates.
What was the initial policy reaction to this liquidity trap? Note that a further issue arising
within the case we analyze in this section is that the liquidity trap is accompanied by a surge in
corporate spreads, as a result of the credit crunch. Quantitative easing is a policy action that
aims to reverse the shrinkage in credit supply, and possibly reduce these corporate spreads, so as
to mitigate the adverse effects of the credit crunch on real economic activity. Consider the
following balance sheets. The balance sheet on the left-hand side is that arising after financial
institutions have reacted to a shock in their asset value, as formalized in previous sections. The
portion of s that includes corporate loans and bonds is what we are terming credit crunch.
[Figure: Balance sheets, Time 3 and beyond]
The effects of quantitative easing can be seen through the balance sheet on the right-hand
side. Banks can purchase new corporate bonds or extend new loans (securitized loans), which
the central bank immediately purchases, leading to the left arrow of the previous diagram.
The ideal, spontaneous, resolution of a credit crunch is when financial institutions are willing to
extend corporate loans or purchase corporate bonds, to restore their credit shrinkage, at least
partially, by some amount s . However, we know this resolution cannot occur as long as the value
of the assets remains depressed. The central bank could then step in to purchase the assets
the banking system dislikes, to an extent equal to at least s , so as to leave banks with the
excess reserves they wish and comply with their top tier capital ratio targets. This action is
the essence of quantitative easing, with assets typically involved being ABS and even long-term
bonds. Its effects include both (i) an increase in the monetary base, equal to the extent of the liquidity injection,
and (ii) higher incentives given to banks to refinance the real sector, due to the liquidity buffers
supplied by the central bank.

23 A high cost of capital to firms might occur during a credit crunch, as a result of one additional effect: a credit crunch leads to
a contraction in aggregate demand, which makes defaults more likely.
24 A shrinkage in economic activity suggests a natural extension of the models in the previous section, with one additional
element of procyclicality: as the real economy plummets as a result of the credit crunch, the value of the assets decreases even more,
thereby deepening the credit crunch, over a vicious feedback loop. Note that this feedback mechanism would be, quite naturally,
part of the financial accelerator hypothesis, although distinct from the mechanism mentioned at the beginning of this section, and
surveyed by Bernanke, Gertler and Gilchrist (1999). In the version we suggest, the credit crunch is determined by concerns financial
intermediaries have about their top tier capital ratios and leverage exposures, rather than agency problems occurring over their
relationships with clients. We do not examine this additional source of procyclicality to keep the analysis as simple as we can.
+E
(1
( )=E
0
(13A.1)
where the random time appearing in the expectations is the time at which the firm is liquidated. Eq. (13A.1) simply says that the value of debt
equals the expected coupon payments plus the expected liquidation value of the bond. We have:
Z
;
( )
(13A.2)
=
E
0
Z
= E
E
0
0
Z Z
;
=
0
0
Z
1
=
;
0
(1
( ))
(13A.3)
+ (Rec
( )) =
=0
with
( )=
where the first term, ( ) , reflects the change in the bond price arising from the mere passage of time,
and (Rec
( )) is the expected change in the bond price, arising from the event of default, i.e. the
probability of a sudden default arrival, , times the consequent jump in the bond price, Rec
( ).
The solution to the previous equation is,
Z
Rec
+
(0) =
| {z }
0
=Pr{Default at }
Rec 1
1
ln
( )=
With
= 1, and Rec =
1
ln
( )=
or equivalently,
( )=
Therefore, if
ln
, then, lim
, we have,
1
+
( ) = , and if
ln
ln
, lim
( )= .
+1
)
) for
(
), and write,
1+
=
6=
(13A.4)
(13A.5)
=1 6=
)=
( )= ( ) =
+
pieces, so to have
. We have,
For large $n$,
$$P(T) = \exp(\Lambda T), \qquad (13A.6)$$
the matrix exponential, defined as $\exp(\Lambda T) \equiv \sum_{k=0}^{\infty} \frac{(\Lambda T)^k}{k!}$.
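The matrix exponential in Eq. (13A.6) can be illustrated with a small numerical sketch. The generator matrix below (two ratings plus an absorbing default state) and the five-year horizon are purely hypothetical, and the function names are not from the text. The sketch compares the truncated power series with the product obtained by splitting the horizon into many small pieces, which is presumably the limiting argument used above.

```python
def identity(n):
    """n x n identity matrix as a list of lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(A, terms=60):
    """Truncated power series: sum over k < terms of A^k / k!."""
    n = len(A)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def product_limit(A, squarings=12):
    """(I + A/n)^n with n = 2**squarings pieces, computed by repeated squaring."""
    n_pieces = 2 ** squarings
    size = len(A)
    M = [[(1.0 if i == j else 0.0) + A[i][j] / n_pieces for j in range(size)]
         for i in range(size)]
    for _ in range(squarings):
        M = mat_mul(M, M)
    return M


# Hypothetical generator: rows sum to zero, the last state (default) is absorbing.
generator = [[-0.10, 0.08, 0.02],
             [0.05, -0.15, 0.10],
             [0.00, 0.00, 0.00]]
horizon = 5.0
scaled = [[g * horizon for g in row] for row in generator]

P_series = mat_exp(scaled)        # transition probabilities over the horizon
P_limit = product_limit(scaled)   # approximation with 4096 pieces

for row in P_series:
    print(["%.4f" % p for p in row])
```

Each row of the resulting matrix sums to one, and the default row stays at (0, 0, 1), as expected for an absorbing state.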
To evaluate derivatives written on states, we proceed as follows. Consider the price of a derivative
in each of the states (ratings) the Markov chain can visit. Suppose the Markov chain is the only source of uncertainty relevant for
the evaluation of this derivative. Then,
+[
where the states range over all the ratings of the chain, with the usual conditional probabilities. In words, the instantaneous change
in the derivative value is the sum of two components: one related to the mere passage of
time, and the other related to the discrete change arising from a change in the rating.
Suppose that = 0. Then,
(
=0=
)=
=1
6=
(
0
0=
=
)=
), for all
X
0
+
{1
)+1
}. Naturally, we have
[ (
6=
6=
)]
0
6=
That is,
(13A.6).
)=
X
6=
X
6=
6=
, which solved through the appropriate boundary conditions, yields precisely Eq.
R
0
0
E0
E0
Rec ( )
(13A.7)
(
)=
+
0
|
{z
}
=Pr{Default (
+ )}
The term indicated inside the integral of the second term, is indeed the density of default time at ,
because,
R
0
(
)
=
1
E
default by time
such that differentiating with respect to time yields, under the appropriate regularity conditions, that
the probability of default over the next instant is just the term indicated in Eq. (13A.7). So Eq. (13.38) follows. Naturally,
Pr{Default
LGD 1
+ Rec
)} =
surv (
surv (
(1
surv (
LGD)
)
surv (
where the second equality follows by integration by parts and the assumption of constant recovery
rates. Setting = 0, produces Eq. (13.39).
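The link between the survival probability and the density of the default time can be checked in the simplest possible case. The sketch below assumes a constant default intensity, so that the survival probability is exp(-lam * t). This is a simplifying assumption made only for illustration, since the text allows for a stochastic intensity. All names and numbers are hypothetical.

```python
import math


def survival(t, lam):
    """Survival probability up to time t under a constant intensity lam."""
    return math.exp(-lam * t)


def default_density(t, lam):
    """Density of the default time at t: intensity times survival probability."""
    return lam * math.exp(-lam * t)


def numerical_density(t, lam, h=1e-6):
    """Central finite difference of the default probability 1 - survival(t)."""
    return ((1.0 - survival(t + h, lam)) - (1.0 - survival(t - h, lam))) / (2.0 * h)


def default_probability_by(T, lam, steps=1000):
    """Midpoint-rule integral of the default density from 0 to T."""
    dt = T / steps
    return sum(default_density((k + 0.5) * dt, lam) * dt for k in range(steps))


lam = 0.02   # hypothetical constant default intensity
t = 3.0
analytic = default_density(t, lam)
numeric = numerical_density(t, lam)
cumulative = default_probability_by(5.0, lam)

print(f"density at t={t}: analytic {analytic:.6f}, numerical {numeric:.6f}")
print(f"Pr(default by 5y): integral {cumulative:.6f}, closed form {1.0 - survival(5.0, lam):.6f}")
```

Differentiating the default probability recovers the density, and integrating the density recovers the default probability, which is the logic used in the proof.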
ln
)=
ln [1
LGD
LGD
LGD (1
surv (
1
4
surv (
1
P4
surv (
=1
))]
surv (
= 4 CDS0 ( )
Second, we show that, approximately, bond and CDS spreads are bounded by a function
decreasing in time to maturity once the current default intensity is close to its long-term average under the risk-neutral probability. We illustrate this property while relying on arguments similar to those utilized
in Chapter
12 to address a related topic (see Section 12.3.4). For bond spreads, since E0 ( ) =
, we have, approximately, that:
+
1
ln
)=
LGD (1
1
ln 1 LGD 1
1
ln 1 LGD 1
ln [1
LGD
= LGD
0 E0 (
surv (
E0
))]
0 E0 (
) 1
Therefore, even if the current default intensity equals its long-term average, bond spreads are bounded by a function decreasing in time to maturity. Naturally, this property does not mechanically imply that bond spreads are decreasing in time to maturity
too, although
the existence of such a bounding function helps this happen. As for the CDS spreads, we have,
approximately that:
4 CDS0 ( ) = LGD
1
4
1
P4
surv
=1
surv (
)
)
LGD
surv (
surv (
)
)
ln
surv
surv (
E0
and rescaled by . Regularity conditions under which we can perform this differentiation can be found
in a related context developed in Mele (2003). Eqs. (13.41)-(13.42) follow.
As for Eq. (13.43), the proof follows the same lines of reasoning as that in Appendix 3 of Chapter
12. That is, we can define a density process,
R
R
0
)
surv (
F
R
( )=
) E
surv (
0
E
It is easy to show that the drift of
surv
( )
=
( )
is
surv (
)))
where,
Vol (
surv (
))
surv (
surv (
) p
=
surv
is
1X
=1
= LGD
1X
=1
I{Surv
I{Surv
at
}E
at } I{Def
I{Def
)}
)}
= LGD
I{Def
)}
(13A.8)
where the last equality follows by the definition of the outstanding notional value in Eq. (13.56), and
the fact that the expectation in the first equality is the same for each name , due to the assumption
that the index names have the same credit quality. Summing over the reset dates, = 1 4 ,
delivers the first term in Eq. (13.55). The second term in Eq. (13.55) follows by elaborating the time
expectation of the second term in Eq. (13.53),
E
=E
I{Surv
at
I{Surv
at } I{ Surv at
|Surv at }
= I{Surv
at } E
I{ Surv
at
|Surv at }
and summing over the reset dates and all names, and using the fact that all names have the same
default intensities.
Proof of Eq. (13.57). The derivation of Eq. (13A.8) relies on default events occurring after the
swap origination, i.e. over the reset dates, after = 0 . In evaluating the front-end protection, we need
to price securities that pay off over defaults possibly occurring over the life of the swaption, i.e. before
time = 0 . We have,
F
=E
1
= LGD E
1
= LGD D (
= LGD
D(
X
I{Def
=1
) + LGD
1X
I{Surv
( (
=1
)+
)} + I{Surv
at
at } E
def
} I{Def
))
)}
I{ Surv
at
|Surv at }
where the third equality holds by the assumption that the names have the same credit quality, and
( + )
) and def (
)=E (
). Note that the first term in the brackets
(
)=E (
of the second equality is, obviously, always zero, when the timing of possible defaults does not overlap
with the evaluation horizon, as for Eq. (13A.8).
Proofs regarding the survival contingent probability. We show that sc in Eq. (13.59)
F
in Eq. (13.58), is a martingale under
does integrate to one, and that CDX ( ) = LGD 01 +
1
1X
)=E
=1
1X
=1
I{Surv
I{Surv
at
at
}E
h
E I{ Surv
at
|Surv at
} F
For example, regarding the survival contingent probability sc , we have that, under regularity
conditions,
R
E
=E E
1
1 F
R
( + )
=
E
1
=
where the second equality follows by Eq. (13.50) and the third by the definition of sc in Eq. (13.51).
As for the martingale property of CDX ( ) under sc , let E sc ( ) be the time conditional expectation
operator under the survival contingent probability sc . We have, using the definition of sc in Eq.
(13.59),
F
0
sc
sc
sc
E (CDX ( )) = LGD E
+E
1
1
R
R
F
1
0
1
sc
+E
= LGD E
1
1
1
1
R
1
1
F
E
E
+
= LGD
0
1
= LGD
0
1
+
1
where the last equality follows by the Law of Iterated Expectations and Eq. (13.52),
R
=E E
E
0
0 F
R
( + )
=E
0
=
$$F_i(y_i) = \Phi_i(x_i), \quad i = 1, 2, \qquad (13A.8)$$
where $F_i$ are the cumulative marginal distributions of $Y_i$, and $\Phi_i$ are the cumulative marginal distributions
of $X_i$. That is, for each $y_i$, we look for the value of $x_i$ such that the percentiles arising through the
mapping in Eq. (13A.8) are the same. Then, we may assume that $X_1$ and $X_2$ have a joint distribution
and model the correlation between $Y_1$ and $Y_2$ through the correlation between $X_1$ and $X_2$. This indirect
way to model the correlation between $Y_1$ and $Y_2$ is particularly helpful. It might be used to model the
correlation of default times, as in the main text of this chapter. We now explain.
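The percentile-to-percentile mapping of part A can be sketched as follows. The example assumes, purely for illustration, that the original variable is a default time with an exponential marginal distribution and that the target marginal is standard normal. The function names and the intensity value are hypothetical.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()   # standard normal: mean 0, standard deviation 1


def F_exp(y, lam):
    """Assumed marginal distribution of the default time: exponential with intensity lam."""
    return 1.0 - math.exp(-lam * y)


def to_normal(y, lam):
    """Value x of the normal variable sitting at the same percentile as y."""
    return std_normal.inv_cdf(F_exp(y, lam))


def from_normal(x, lam):
    """Inverse mapping: default time sitting at the same percentile as x."""
    return -math.log(1.0 - std_normal.cdf(x)) / lam


lam = 0.05                            # hypothetical default intensity
y_median = math.log(2.0) / lam        # median of the exponential distribution
x_median = to_normal(y_median, lam)   # should be the median of the normal, i.e. zero
y_back = from_normal(to_normal(7.0, lam), lam)

print(f"median default time {y_median:.4f} maps to x = {x_median:.6f}")
print(f"round trip for y = 7.0 gives {y_back:.6f}")
```

Once default times are mapped into normal variables in this way, dependence can be imposed on the normal variables, which are easier to correlate.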
B. Copulae functions
We begin with the simple case of two random variables. This simple case shall be generalized to the
multivariate one with a mere change in notation. Given two uniform random variables $U_1$ and $U_2$,
consider the function $C(u_1, u_2) = \Pr(U_1 \le u_1, U_2 \le u_2)$, which is the joint cumulative distribution of
the two uniforms. A copula function is any such function $C$, with the property of being capable to
aggregate the marginals $F_i$ into a summary of them, in the following natural way:
$$\begin{aligned}
C(F_1(y_1), F_2(y_2)) &= \Pr(U_1 \le F_1(y_1),\ U_2 \le F_2(y_2)) \qquad (13A.9) \\
&= \Pr(F_1^{-1}(U_1) \le y_1,\ F_2^{-1}(U_2) \le y_2) \\
&= \Pr(Y_1 \le y_1,\ Y_2 \le y_2) \\
&= F(y_1, y_2). \qquad (13A.10)
\end{aligned}$$
That is, a copula function evaluated at the marginals $F_1(y_1)$ and $F_2(y_2)$ returns the joint cumulative distribution
$F(y_1, y_2)$. In fact, Sklar (1959) proves that, conversely, any multivariate distribution function can
be represented through some copula function.
The best-known copula function is the Gaussian copula, which has the following form:
$$C(u_1, u_2) = \Phi\left(\Phi_1^{-1}(u_1), \Phi_2^{-1}(u_2)\right), \qquad (13A.11)$$
where $\Phi$ denotes the joint cumulative Normal distribution, and $\Phi_i$ denotes marginal cumulative Normal
distributions. So we have,
$$F(y_1, y_2) = C(F_1(y_1), F_2(y_2)) = \Phi\left(\Phi_1^{-1}(F_1(y_1)), \Phi_2^{-1}(F_2(y_2))\right), \qquad (13A.12)$$
where the first equality follows by Eq. (13A.10) and the second equality follows by Eq. (13A.11).
As an example, we may interpret $Y_1$ and $Y_2$ as the times by which two names default. A simple
assumption is to set:
$$F_i(y_i) = \Phi_i(x_i), \quad i = 1, 2, \qquad (13A.13)$$
for two random variables $X_i$ that are stretched as explained in Part A of this appendix. By replacing
Eq. (13A.13) into Eq. (13A.12),
$$F(y_1, y_2) = \Phi(x_1, x_2).$$
This reasoning can be easily generalized to the
(
)=
1 ( 1)
( )=
( )
)) =
where
:
We use this approach to model default correlation among names, as explained in the main text, and
in the next appendix.
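A Gaussian copula for two default times can be simulated along the lines just described. The sketch below is only an illustration under stated assumptions: exponential marginals for the default times, a single correlation parameter `rho` between the underlying normal variables, and hypothetical intensities and horizon. None of these values comes from the text.

```python
import math
import random
from statistics import NormalDist


def simulate_default_times(rho, lam1, lam2, n_sims, seed=0):
    """Draw pairs of default times linked by a Gaussian copula with correlation rho."""
    rng = random.Random(seed)
    phi = NormalDist().cdf
    scale = math.sqrt(1.0 - rho * rho)
    times = []
    for _ in range(n_sims):
        # Correlated standard normal variables.
        x1 = rng.gauss(0.0, 1.0)
        x2 = rho * x1 + scale * rng.gauss(0.0, 1.0)
        # Map to uniforms, capped slightly below one to keep the logarithm finite.
        u1 = min(phi(x1), 1.0 - 1e-12)
        u2 = min(phi(x2), 1.0 - 1e-12)
        # Invert the assumed exponential marginals to obtain default times.
        times.append((-math.log(1.0 - u1) / lam1, -math.log(1.0 - u2) / lam2))
    return times


def default_frequencies(times, horizon):
    """Frequency of default of the first name, and of both names, by the horizon."""
    n = len(times)
    first = sum(1 for t1, _ in times if t1 <= horizon) / n
    both = sum(1 for t1, t2 in times if t1 <= horizon and t2 <= horizon) / n
    return first, both


lam, horizon, n_sims = 0.05, 5.0, 50_000   # hypothetical inputs
p_marginal = 1.0 - math.exp(-lam * horizon)

first_corr, both_corr = default_frequencies(
    simulate_default_times(0.6, lam, lam, n_sims, seed=1), horizon)
first_ind, both_ind = default_frequencies(
    simulate_default_times(0.0, lam, lam, n_sims, seed=2), horizon)

print(f"marginal default probability: exact {p_marginal:.4f}, simulated {first_corr:.4f}")
print(f"joint default probability: rho=0.6 -> {both_corr:.4f}, rho=0 -> {both_ind:.4f}")
```

The marginal default probabilities are unaffected by the correlation parameter, whereas the probability of joint default rises with it. This is the sense in which the copula separates the marginals from the dependence structure.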
where
0 10 )
10%
where is the cumulative distribution of a standard normal variable. That is, by time
defaults any time that,
1
(10%)
0 10
, each firm
Def ) 100
P1
= min max
0
=1
where
= 140,
= 90,
= 70).
1X
Price Mezzanine =
=1
1X
=1
Price Junior =
1X
=1
Note that the previous computations have to be performed under the risk-neutral probability. Using
the physical probability in the previous algorithm can, at best, only lead to something useful for risk-management
and VaR calculations.
Note that this model can be generalized to a multifactor model where,
p
=
+ 1
1 1 + +
1
with obvious notation.
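A simulation of this kind can be sketched for a one-factor Gaussian copula. The sketch is hedged in several ways: the portfolio (100 names, a loss of 3 per default), the 10% default probability, the correlation of 0.3 and the tranche sizes (70, 90 and 140) are hypothetical inputs chosen for illustration, and the code only computes expected tranche losses, not discounted prices. As noted above, a pricing exercise would require risk-neutral inputs.

```python
import math
import random
from statistics import NormalDist


def tranche_losses(portfolio_loss, tranches):
    """Allocate a portfolio loss to tranches given (attachment point, size) pairs."""
    return [min(max(portfolio_loss - attach, 0.0), size) for attach, size in tranches]


def expected_tranche_losses(n_names, loss_per_default, default_prob, rho,
                            tranches, n_sims, seed=0):
    """Monte Carlo expected losses per tranche in a one-factor Gaussian copula model."""
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(default_prob)   # default if the latent variable is below this
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    totals = [0.0] * len(tranches)
    total_loss = 0.0
    for _ in range(n_sims):
        factor = rng.gauss(0.0, 1.0)                 # common factor shared by all names
        n_defaults = sum(
            1 for _name in range(n_names)
            if a * factor + b * rng.gauss(0.0, 1.0) <= threshold
        )
        loss = n_defaults * loss_per_default
        total_loss += loss
        for k, x in enumerate(tranche_losses(loss, tranches)):
            totals[k] += x
    return [t / n_sims for t in totals], total_loss / n_sims


# Hypothetical capital structure: junior, mezzanine and senior tranches.
tranches = [(0.0, 70.0), (70.0, 90.0), (160.0, 140.0)]
expected, mean_loss = expected_tranche_losses(
    n_names=100, loss_per_default=3.0, default_prob=0.10, rho=0.3,
    tranches=tranches, n_sims=5000, seed=42)
loss_rates = [el / size for el, (_, size) in zip(expected, tranches)]

for name, el, rate in zip(["Junior", "Mezzanine", "Senior"], expected, loss_rates):
    print(f"{name}: expected loss {el:.2f} ({100 * rate:.1f}% of tranche size)")
```

Tranche losses add up to the portfolio loss path by path, and the expected loss rate falls with seniority. Raising the correlation shifts expected losses from the junior tranche towards the senior ones.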
References
Amato, J. D. (2005): Risk Aversion and Risk Premia in the CDS Market. BIS Quarterly
Review, September, 55-68.
Anderson, R. W. and S. Sundaresan (1996): Design and Valuation of Debt Contracts. Review
of Financial Studies 9, 37-68.
Artzner, P., F. Delbaen, J.-M. Eber, and D. Heath (1999): Coherent Measures of Risk.
Mathematical Finance 9, 203-228.
Bernanke, B. S., M. Gertler and S. Gilchrist (1999): The Financial Accelerator in a Quantitative Business Cycle Framework. In J. B. Taylor and M. Woodford (Eds.): Handbook of
Macroeconomics, Vol. 1C, Chapter 21, 1341-1393.
Berndt, A., R. Douglas, D. Duffie, M. Ferguson and D. Schranz (2005): Measuring Default
Risk-Premia from Default Swap Rates and EDFs. BIS Working Papers no. 173.
Black, F. (1976): The Pricing of Commodity Contracts. Journal of Financial Economics 3,
167-179.
Black, F. and J. Cox (1976): Valuing Corporate Securities: Some Effects of Bond Indenture
Provisions. Journal of Finance 31, 351-367.
Black, F. and M. Scholes (1973): The Pricing of Options and Corporate Liabilities. Journal
of Political Economy 81, 637-659.
Borio, C., C. Furfine, and P. Lowe (2001): Procyclicality of the Financial System and Financial
Stability: Issues and Policy Options. Bank for International Settlements working paper
no. 1.
Broadie, M., M. Chernov and S. Sundaresan (2007): Optimal Debt and Equity Values in the
Presence of Chapter 7 and Chapter 11. Journal of Finance 62, 1341-1377.
Christoffersen, P. F. (2003): Elements of Financial Risk Management. Academic Press.
Cox, J. C., J. E. Ingersoll and S. A. Ross (1985): A Theory of the Term Structure of Interest
Rates. Econometrica 53, 385-407.
Danielsson, J., H. S. Shin and J.-P. Zigrand (2011): Balance Sheet Capacity and Endogenous
Risk. Working paper, London School of Economics and Princeton.
Duffie, D. and D. Lando (2001): Term Structure of Credit Spreads with Incomplete Accounting Information. Econometrica 69, 633-664.
Fender, I. and P. Hördahl (2007): Overview: Credit Retrenchment Triggers Liquidity Squeeze.
BIS Quarterly Review (September), 1-16.
Fisher, I. (1933): The Debt-Deflation Theory of Great Depressions. Econometrica 1, 337-357.
Friedman, M. and A. J. Schwartz (1963): A Monetary History of the United States: 1867-1960.
Princeton, NJ: Princeton University Press.
Greenspan, A. (1998): The Role of Capital in Optimal Banking Supervision and Regulation.
FRBNY Economic Policy Review, October, 163-168.
Hull, J. C. (2007): Risk Management and Financial Institutions. Pearson Education International.
Ingersoll, J. E. (1977): A Contingent-Claims Valuation of Convertible Securities. Journal of
Financial Economics 5, 289-321.
International Monetary Fund, (2008): Global Financial Stability Report. April 2008.
Jamshidian, F. (1989): An Exact Bond Option Pricing Formula. Journal of Finance 44,
205-209.
Jarrow, R. A., D. Lando and S. M. Turnbull (1997): A Markov Model for the Term-Structure
of Credit Risk Spreads. Review of Financial Studies 10, 481-523.
Jorion, Ph. (2008): Value at Risk. New York: McGraw Hill.
Lando, D. (2004): Credit Risk Modeling: Theory and Applications. Princeton: Princeton
University Press.
Leland, H. E. (1994): Corporate Debt Value, Bond Covenants and Optimal Capital Structure. Journal of Finance 49, 1213-1252.
Leland, H. E. and K. B. Toft (1996): Optimal Capital Structure, Endogenous Bankruptcy,
and the Term Structure of Credit Spreads. Journal of Finance 51, 987-1019.
Lopez, J. (2004): The Empirical Relationship Between Average Asset Correlation, Firm Probability of Default and Asset Size. Journal of Financial Intermediation 13, 265-283.
McDonald, R. L. (2006): Derivatives Markets, Boston: Pearson International Edition.
Mele, A. (2003): Fundamental Properties of Bond Prices in Models of the Short-Term Rate.
Review of Financial Studies 16, 679-716.
Merton, R. C. (1974): On the Pricing of Corporate Debt: The Risk-Structure of Interest
Rates. Journal of Finance 29, 449-470.
Modigliani, F. and M. Miller (1958): The Cost of Capital, Corporation Finance and the
Theory of Investment. American Economic Review 48, 261-297.
Morini, M. and D. Brigo (2011): No-Armageddon Arbitrage-Free Equivalent Measure for
Index Options in a Credit Crisis. Mathematical Finance 21, 573-593.
Rutkowski, M. and A. Armstrong (2009): Valuation of Credit Default Swaptions and Credit
Default Index Swaptions. International Journal of Theoretical and Applied Finance 12,
1027-1053.
Schönbucher, Ph. J. (2003): Credit Derivatives Pricing Models: Models, Pricing and Implementation.
Chichester, UK: Wiley Finance.
Shin, H. S. (2010): Risk and Liquidity. Clarendon Lectures in Finance, Oxford University
Press.
Sklar, A. (1959): Fonctions de Répartition à n Dimensions et Leurs Marges. Publications de
l'Institut de Statistique de l'Université de Paris 8, 229-231.
Vasicek, O. (1987): Probability of Loss on Loan Portfolio. Working paper KMV, published
in: Risk (December 2002) under the title "Loan Portfolio Value".