Train Control

Analysis of necessary conditions for the
optimal control of a train
by
XUAN VU
B.Info.Tech. (Hons)
Thesis submitted for the degree of
Doctor of Philosophy (Mathematics)
School of Mathematics and Statistics

Division of Information Technology Engineering and the Environment
University of South Australia
July 2006
Contents
Table of Contents i
List of Figures viii
List of Tables xiii
Summary xiv
Declaration xvii
Acknowledgements xviii
I The train control problem 1
1 Introduction 2
1.1 Project aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 History of optimal train control . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Uniqueness of the optimal control strategy . . . . . . . . . . . . . . . . . . 9
1.4 Parameter estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.5 Thesis structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Optimal journeys 12
2.1 Problem formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
i
Contents ii
2.2 Necessary conditions for an optimal journey . . . . . . . . . . . . . . . . . 14

2.2.1 State and adjoint equations for an optimal journey . . . . . . . . . 19
2.2.2 Possible control transitions . . . . . . . . . . . . . . . . . . . . . . 21
2.3 Example journeys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.4 Phase diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3 Steep hills and limiting speeds 29

3.1 Steep sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2 Limiting speed at a constant power on a constant gradient . . . . . . . . . 32
3.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4 Uniqueness of the optimal strategy on a non-steep track 34

4.2 Uniqueness of the final coast–brake phase . . . . . . . . . . . . . . . . . . 37
4.3 Uniqueness of the optimal journey . . . . . . . . . . . . . . . . . . . . . . 42
4.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
II A new method for finding optimal driving strategies 50
5 Hold-power-hold for a single steep uphill gradient 51

5.2 Necessary conditions for an optimal power phase . . . . . . . . . . . . . . 53
5.3 An alternative necessary condition . . . . . . . . . . . . . . . . . . . . . . 54
5.3.1 Key equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.4 Existence and uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.4.1 Geometric approach . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.4.2 Algebraic approach . . . . . . . . . . . . . . . . . . . . . . . . . . 66
ii
Contents iii
5.5 Numerical solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

5.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6 Hold-power-hold on an uphill section with two steep gradients 75

6.2 Necessary conditions for an optimal strategy . . . . . . . . . . . . . . . . . 78
6.3 Existence and uniqueness of an optimal solution . . . . . . . . . . . . . . . 82
6.3.1 Existence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.3.2 Uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
7 Hold-power-hold on an uphill section with piecewise constant gradient 91

7.2.1 Calculating f . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.2.2 Finding the optimal power phase . . . . . . . . . . . . . . . . . . . 98
7.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
7.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
8 Hold-coast-hold for a single steep downhill gradient 105

8.2 Necessary conditions for an optimal coast phase . . . . . . . . . . . . . . . 107
8.3 An alternative necessary condition . . . . . . . . . . . . . . . . . . . . . . 107
8.4 Key equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
8.5 Existence and uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
8.5.1 Geometric approach . . . . . . . . . . . . . . . . . . . . . . . . . 112
iii
Contents iv
8.5.2 Algebraic approach . . . . . . . . . . . . . . . . . . . . . . . . . . 114

8.5.3 Numerical solution . . . . . . . . . . . . . . . . . . . . . . . . . . 120
8.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
8.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
9 Hold-coast-hold on a downhill section with two steep gradients 124

9.2 Necessary conditions for an optimal strategy . . . . . . . . . . . . . . . . . 127
9.3 Existence and uniqueness of an optimal solution . . . . . . . . . . . . . . . 131
9.3.1 Existence of an optimal solution . . . . . . . . . . . . . . . . . . . 131
9.3.2 Uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
9.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
9.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
10 Hold-coast-hold on a steep downhill section with many gradients 139

10.1.1 Numerical solution . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
10.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
III Phase trajectories 148
11 Phase trajectories on a steep uphill section 149

11.1 An exact integral for the adjoint equation . . . . . . . . . . . . . . . . . . 150
11.2 Necessary conditions for an optimal power phase . . . . . . . . . . . . . . 152
11.2.1 A single steep gradient . . . . . . . . . . . . . . . . . . . . . . . . 153
11.2.2 Two steep gradient sections . . . . . . . . . . . . . . . . . . . . . 156
11.3 Properties of η(v) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
iv
Contents v
11.4 Phase diagram for (v, η) . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

11.5 Phase diagrams of combinations of long and short steep sections . . . . . . 170
11.5.1 γ11 = 0.3, L11 = 500, γ12 = 0.2, L12 ∈ {400, 1000, . . . , 3400} . . 171
11.5.2 γ11 = 0.3, L11 = 1500, γ12 = 0.2, L12 ∈ {200, 600, . . . , 2200} . . . 172
11.5.3 γ11 = 0.3, L11 ∈ {200, 600, . . . , 2200}, γ12 = 0.2, L12 = 1000 . . . 173
11.5.4 γ11 = 0.3, L11 ∈ {700, 1200, . . . , 3200}, γ12 = 0.2, L12 = 1600 . . 174
11.5.5 γ11 = 0.2, L11 = 1200, γ12 = 0.3, L12 ∈ {200, 600, . . . , 2200} . . . 175
11.5.6 γ11 = 0.2, L11 = 4000, γ12 = 0.3, L12 ∈ {200, 600, . . . , 2200} . . . 176
11.5.7 γ11 = 0.2, L11 ∈ {600, 1400, . . . , 4600}, γ12 = 0.3, L12 = 400 . . . 177
11.5.8 γ11 = 0.2, L11 ∈ {200, 600, . . . , 2200}, γ12 = 0.3, L12 = 1200 . . . 178
11.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
12 Phase trajectories on a steep downhill section 180

12.1 An exact integral for the adjoint equation . . . . . . . . . . . . . . . . . . 180
12.2 Necessary conditions for an optimal coast phase . . . . . . . . . . . . . . . 181
12.2.1 A single steep gradient section . . . . . . . . . . . . . . . . . . . . 182
12.2.2 A section of two steep gradients . . . . . . . . . . . . . . . . . . . 182
12.3 Properties of η(v) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
12.4 Phase diagram for (η, v) . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
12.5 Phase diagrams for combinations of long and short steep downhill gradients. 189
12.5.1 γ11 = 0.2, L11 = 1200; γ12 = 0.4, L12 ∈ {800, 4000, . . . , 23200}. . 190
12.5.2 γ11 = 0.2, L11 = 3000; γ12 = 0.4, L12 ∈ {1600, 3200, . . . , 8000}. . 191
12.5.3 γ11 = 0.2, L11 ∈ {18000, 19200, . . . , 24000} ; γ12 = 0.4, L12 = 400. 192
12.5.4 γ11 = 0.2, L11 ∈ {12000, 18000, . . . , 42000} , γ12 = 0.4, L12 = 1600. 193
12.5.5 γ11 = 0.4, L11 = 1800; γ12 = 0.2, L12 ∈ {2400, 4800, . . . , 14400}. 194
12.5.6 γ11 = 0.4, L11 = 3200; γ12 = 0.2, L12 ∈ {2400, 4800, . . . , 14400}. 195
12.5.7 γ11 = 0.4, L11 ∈ {1800, 5400, . . . , 19800}; γ12 = 0.2, L12 = 800. . 196
12.5.8 γ11 = 0.4, L11 ∈ {5600, 9400, 13000, 19800}; γ12 = 0.2, L12 = 5600. 197
12.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
v
Contents vi
13 Coasting and braking at the end of a journey 199

13.1 Coasting and braking on a constant gradient . . . . . . . . . . . . . . . . . 200
13.2 Final coasting and braking phases on a piecewise constant gradient . . . . . 202
13.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
13.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
IV Parameter estimation 214
14 Parameter estimation 215

14.2 Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
14.3 Extended Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
14.4 Nonlinear Projection Filter . . . . . . . . . . . . . . . . . . . . . . . . . . 220
14.5 Unscented Kalman Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
14.6 H∞ Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
14.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
15 Nonlinear Projection Filter 225

15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
15.2 A simple example without diffusion . . . . . . . . . . . . . . . . . . . . . 226
15.2.1 Exact solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
15.2.2 Approximation using a Fast Fourier Transform . . . . . . . . . . . 228
15.2.3 Solution using a Fourier Cosine Series . . . . . . . . . . . . . . . 234
15.2.4 Approximation using Discrete Cosine Transform . . . . . . . . . . 241
15.3 A simple example with diffusion . . . . . . . . . . . . . . . . . . . . . . . 249
15.4 A simple train example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
15.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
16 Unscented Kalman Filter 254
vi
Contents vii
16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254

16.1.1 The Unscented Transformation . . . . . . . . . . . . . . . . . . . . 255
16.2 Estimating train parameters with an UKF . . . . . . . . . . . . . . . . . . 256
16.2.1 Define the problem . . . . . . . . . . . . . . . . . . . . . . . . . . 256
16.2.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
16.2.3 The role of the covariance matrix . . . . . . . . . . . . . . . . . . 261
16.2.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
17 Conclusions 267
A First variation equation 269

A.1 Existence and uniqueness of solution for a system of ordinary differential
equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
A.2 The variational equation for a normal system . . . . . . . . . . . . . . . . 271
Bibliography 273
vii
List of Figures
1.1 FreightMiser screen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1 Diagram of possible transitions 1 . . . . . . . . . . . . . . . . . . . . . . 21

2.2 Diagram of possible transitions 2 . . . . . . . . . . . . . . . . . . . . . . 22
2.3 Speed profile and adjoint variable profile 1 . . . . . . . . . . . . . . . . . 23
2.4 Speed profile and adjoint variable profile 2 . . . . . . . . . . . . . . . . . 24
2.5 Speed profile of a final coast and brake phase . . . . . . . . . . . . . . . 25
2.6 Phase diagram for a steep uphill and a steep downhill sections . . . . . . 25
2.7 Zoom into a part of phase diagram . . . . . . . . . . . . . . . . . . . . . 26
2.8 Phase diagram for power phase in non-optimal cases . . . . . . . . . . . 27
2.9 Phase diagram for a coast phase in non-optimal cases . . . . . . . . . . . 27
3.1 Speed decreases on a steep uphill section . . . . . . . . . . . . . . . . . . 30

3.2 Speed increases on a steep downhill section . . . . . . . . . . . . . . . . 30
3.3 Limiting speeds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.1 Optimal journeys for different hold speeds . . . . . . . . . . . . . . . . . 36

4.2 Perturbed speed profile for coasting and braking . . . . . . . . . . . . . . 39
4.3 Speed profiles of two optimal journeys on a non-steep track . . . . . . . . 42
4.4 Speed profiles of two optimal journeys on a non-steep track 3 . . . . . . . 44
viii
4.7 Speed profiles of power–coast–brake journeys . . . . . . . . . . . . . . . 45
4.8 Optimal journeys for different hold speeds . . . . . . . . . . . . . . . . . 47
4.9 Speed profiles of journeys with different hold speeds . . . . . . . . . . . 48
4.10 Speed profiles of journeys with different hold speeds . . . . . . . . . . . 48
5.1 Optimal speed profile for a single steep gradient . . . . . . . . . . . . . . 53

5.2 Plot of cost function J . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3 Diagram of a convex curve and its tangent . . . . . . . . . . . . . . . . . 66
5.4 Speed profile when v(b) = v̄b . . . . . . . . . . . . . . . . . . . . . . . . 67
5.5 Result of Example 5.6.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.1 Speed profile of γ12 < γ11 track . . . . . . . . . . . . . . . . . . . . . . . 77

6.2 Speed profile of γ11 < γ12 track . . . . . . . . . . . . . . . . . . . . . . . 77
6.3 Calculation squence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.4 Result of example 6.5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
7.1 Track gradient profile . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

7.2 Calculation sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
7.3 Speed decreases on a non-steep section . . . . . . . . . . . . . . . . . . . 96
7.4 Speed increases on a steep section . . . . . . . . . . . . . . . . . . . . . 97
7.5 Optimal speed profile for example 7.3.1 . . . . . . . . . . . . . . . . . . 99
7.7 Function f . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.8 Speed decreases on a non-steep section . . . . . . . . . . . . . . . . . . . 103
8.1 Track profile and speed profile . . . . . . . . . . . . . . . . . . . . . . . 106

8.2 Cost J for various starting locations . . . . . . . . . . . . . . . . . . . . . 108
8.3 Illustration of the geometric proof . . . . . . . . . . . . . . . . . . . . . 115
8.4 Illustration of domain of f 1 . . . . . . . . . . . . . . . . . . . . . . . . . 116
8.5 Illustration of domain of f 2 . . . . . . . . . . . . . . . . . . . . . . . . . 116
ix
9.1 Speed profile of two steep sections 1 . . . . . . . . . . . . . . . . . . . . 126

9.2 Speed profile of two steep sections 2 . . . . . . . . . . . . . . . . . . . . 126
9.3 Calculation sequence for hold-coast-hold . . . . . . . . . . . . . . . . . . 136
9.4 f for example 9.5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
10.1 Gradient change points for a general steep downhill section . . . . . . . . 140
10.2 Calculation sequence for f . . . . . . . . . . . . . . . . . . . . . . . . . 144
10.3 Results of Example 10.2.1. . . . . . . . . . . . . . . . . . . . . . . . . . 146
11.1 Speed and elevation profiles with one steep gradient . . . . . . . . . . . . 154
11.2 Speed and elevation profile with two steep gradients . . . . . . . . . . . . 157
11.3 Phase diagram of a uphill section . . . . . . . . . . . . . . . . . . . . . . 162
11.4 Phase plots 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
11.5 Phase diagrams obtained using the traditional method and the new method. 167
11.6 Phase diagrams of optimal and non-optimal strategies . . . . . . . . . . . 168
11.7 Phase diagrams and critical curves 1 . . . . . . . . . . . . . . . . . . . . 169
11.8 Phase diagrams of different journeys 2 . . . . . . . . . . . . . . . . . . . 171
11.9 Phase diagrams of long-steep-normal-less-steep sections . . . . . . . . . 172
11.10 Phase diagrams of long-steep-short-less-steep sections . . . . . . . . . . 173
11.11 Phase diagrams of short-steep-long-less-steep sections . . . . . . . . . . 174
11.12 Phase diagrams of short-steep-normal-steeper sections . . . . . . . . . . 175
11.13 Phase diagrams of long-steep-normal-steeper-hills . . . . . . . . . . . . . 176
11.14 Phase diagrams of normal-steep-short-steeper sections . . . . . . . . . . 177
11.15 Phase diagrams of normal-steep-long-steeper sections . . . . . . . . . . . 178
12.1 Track profile and speed profile . . . . . . . . . . . . . . . . . . . . . . . 182

12.2 Optimal speed profile and track profile of a downhill section . . . . . . . 183
12.3 Phase diagram of a downhill section . . . . . . . . . . . . . . . . . . . . 185
x
12.4 Phase diagrams for coast phase . . . . . . . . . . . . . . . . . . . . . . . 187
12.5 Optimal adjoint variable profiles . . . . . . . . . . . . . . . . . . . . . . 188
12.6 Optimal and non-optimal adjoint variable profiles . . . . . . . . . . . . . 188
12.7 Phase diagrams of short steep and various, steeper sections . . . . . . . . 190
12.8 Phase diagrams of long steep and various steeper sections . . . . . . . . . 191
12.9 Phase diagrams of long various and short steep sections . . . . . . . . . . 192
12.10 Phase diagrams for long various and short steeper section . . . . . . . . . 193
12.11 Phase diagrams of long steep and various-less-steep sections . . . . . . . 194
12.12 Phase diagrams of long steep and long-various-less-steep sections . . . . 195
12.13 Phase diagrams of long various steep and short-less-steep-sections . . . . 196
12.14 Phase diagrams of long various steep and long-less-steep sections . . . . 197
13.1 Track diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

13.2 Track and speed profiles . . . . . . . . . . . . . . . . . . . . . . . . . . 205
13.3 Track and speed profiles . . . . . . . . . . . . . . . . . . . . . . . . . . 206
13.4 Speed profiles of different holding speed journeys 2 . . . . . . . . . . . . 212
15.1 Kolmogorov solution using FFT without diffusion at t = 0 . . . . . . . . 231

15.2 Kolmogorov solution using FFT without diffusion after one iteration . . . 231
15.3 Kolmogorov solution using FFT without diffusion after two iterations . . 232
15.4 Example 1 results using FFT without diffusion at t = 0 . . . . . . . . . . 232
15.5 Example 1 results using FFT without diffusion at time t = x j − xm . . . 233
15.6 Example 1 results using FFT without diffusion at time t = x j − xm . . . 233
15.7 Example 1 using FCS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
15.8 Example 1 results using DCT after one time step . . . . . . . . . . . . . 248
15.9 Example 1 using DCT . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
15.10 Solution to Kolmogorov’s equation using FFT with diffusion . . . . . . . 251
15.11 Train example using KE 1 . . . . . . . . . . . . . . . . . . . . . . . . . 252
15.12 Train example using KE 2 . . . . . . . . . . . . . . . . . . . . . . . . . 252
xi
16.1 Estimations on flat track with no process noise . . . . . . . . . . . . . . . 259
16.2 Estimations on flat track no process noise, P 1 . . . . . . . . . . . . . . . 260
16.3 Mean filter error parallel coordinates . . . . . . . . . . . . . . . . . . . . 264
16.4 Estimation on flat track no process noise 1 . . . . . . . . . . . . . . . . . 264
16.5 Estimations on flat track no process noise 2 . . . . . . . . . . . . . . . . 265
16.6 Evolutions of covariances of initial covariance matrix . . . . . . . . . . . 266
xii
List of Tables
2.1 Summary of optimal control modes . . . . . . . . . . . . . . . . . . . . . 19

2.2 Summary of optimal control modes with modified adjoint variable . . . . . 20
5.1 Experimental results of hold-power-hold on a single steep uphill. . . . . . . 73
6.1 Experimental results of hold-power-hold on a double steep uphill section,

Example 6.5.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.1 Optimal speeds and limiting speeds for Example 7.3.2. . . . . . . . . . . . 101
7.2 Track data for the example of hold-power-hold on multi steep uphill section. 102
8.1 Experimental results of hold-coast-hold on a single steep downhill section. . 123
9.1 Experimental results of an optimal journey on a double steep downhill sec-

tion, Example 9.5.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
10.1 Track data and results of hold-coast-hold on a multi steep downhill section. 145
11.1 Eight combinations of long and short steep sections. . . . . . . . . . . . . . 170
12.1 Eight combinations of long and short steep sections. . . . . . . . . . . . . . 189
16.1 Relative errors for different values of the initial covariance matrix . . . . . 263
xiii
Summary
The Scheduling and Control Group at the University of South Australia has been
studying the optimal control of trains for many years, and has developed in-cab
devices that help drivers stay on time and minimise energy use.
Previous work formulated the problem of minimising the energy required to

drive a long-haul freight train over an undulating track with speed restrictions,
and developed necessary conditions for an optimal control. It is possible to
construct complete train journeys that satisfy these optimal conditions. However,
the previous work did not prove that these solutions were always unique.
In this thesis, we re-examine the optimal control theory for the train control prob-
lem. In particular, we study the optimal control around steep sections of track.
This work is introduced in the first part of the thesis along with some preliminary
analysis. An uphill section is said to be ‘steep’ if the train has insufficient power
to maintain a desired constant holding speed on the hill. In such cases the opti-
mal control requires that full power is applied at some point before the steep hill
is reached and is maintained until some point beyond the steep hill. The speed
increases above the holding speed before the steep section commences and then
falls to a level below the holding speed while the train traverses the steep uphill
section before finally increasing and eventually returning to the holding speed
after the train leaves the steep uphill section.
Similarly, a downhill section is said to be steep if the speed of the train increases
even when no power is supplied. In such cases the optimal control requires that
the train coasts at some point before the steep downhill section; speed decreases
before the section, increases on the steep section, then decreases again after the
xiv
steep section and finally returns to the holding speed at some point beyond the
steep downhill section.
A critical part of constructing an optimal control strategy is determining where

the control should be changed. These switching points are normally determined
[24, 33] from necessary conditions derived via the Pontryagin’s Maximum Prin-
ciple. In the second part of this thesis we use perturbation analysis to develop
an alternative set of necessary conditions for an optimal driving strategy near a
steep section. The new necessary condition indicates that the optimal strategy
involves a specific local trade-off between the energy used and time taken on
each steep section that the train traverses. We also develop a new numerical
procedure for calculating the optimal switching points.
For a simple steep section, we are able to show that the optimal control strategy
is unique. Proving uniqueness is more difficult for steep sections with two or
more steep gradients and more work is required before these issues can be fully
resolved.
In the third part of the thesis we use phase trajectories to study optimal control
strategies. The optimal control for a journey is determined by the evolution of
an adjoint variable, η. The state of the journey can be characterised by the speed
v and the adjoint variable η. Note there are two different but equivalent adjoint
variables θ and η = θ − 1. We use θ for final coast and brake and η other-
wise. The evolution of the state (v, η) is a phase trajectory. By constructing
phase trajectories for various scenarios, we are able to understand the relation-
ship between η and v for an optimal strategy. In the case of a piecewise constant
gradient we give an explicit formula for η as a function of v.
To calculate an optimal driving strategy, we need a realistic model of train per-

formance. In particular, we need to know a coefficient of rolling resistance and
xv
a coefficient of aerodynamic drag. In practice, these coefficients are different
for every train and difficult to predict. In the fourth and final part of the the-
sis, we study the use of mathematical filters to estimate model parameters from
observations of actual train performance.
xvi
Declaration
I declare that this thesis does not incorporate without acknowledgment any ma-
terial previously submitted for a degree or diploma in any university; and that to
the best of knowledge it does not contain any materials previously published or
written by another person except where due reference is made in the text.
Xuan Vu
xvii
Acknowledgements
Many people have made a contribution, whether small or large, to this project.
Most especially and foremost I would like to thank my Principal Supervisor,
Professor Phil Howlett, for helping me to obtain an APA scholarship for my PhD
study. His patient guidance, tireless support and invaluable advice have helped
me stay strong to go to the end of my PhD journey. From him I have learnt not
only mathematics but also what is required to become a good researcher. He has
been a continual source of ideas and potential solution procedure.
I am very grateful to my Associate Supervisor, Dr Peter Pudney, for his constant

help, advice and encouragement. He is exceedingly generous in answering my
endless, annoying questions about train technology. His constructive feedback,
helpful discussion over the thesis contributed a big part in making it possible. I
am also thankful for his sustenance, inspiration and unique sense of humor that
have helped me through my PhD years.
I would like to thank Mr Basil Benjamin who extensively reviewed the thesis
and suggested many improvements. His sincere encouragement helped me con-
fidently complete this project.
I wish to express my sincere appreciation to the Rail CRC for providing finan-
cial support and opportunities to attend conferences and workshops during my
doctoral studies. Working in a Rail CRC project has given me great experience
and enriched my research career. I also would like to send special thanks to Dr
Anna Thomas for her heartfelt advice and encouragement.
My deep appreciations and thanks belong to all staff members and postgraduate
xviii
students in the School of Mathematics and Statistics at the University of South
Australia (you know who you are), to the Head of School, Associate Professor
David Panton and to the Director of the CIAM, Associate Professor Vladimir
Gaitsgory, for their help, support, encouragement and friendship.
Finally I am indebted to my family for their unconditional love and support

during my studies. They are sources of strength for me to be able to complete
the project. They are always there for me whenever I need help, patiently listen
to all my sorrows, incessantly believe in me even when I sometimes lose belief
in myself.
xix
Part I
The train control problem
1
Chapter 1
Introduction
Trains are an efficient form of transport—the energy per kilometre required to transport
a given mass by rail is one third that required by even the most efficient road transport.
Nevertheless, energy cost is a significant part of the operating cost of a railway, and so
minimising energy use is an important problem. Researchers at the University of South
Australia and elsewhere have been studying the problem of determining optimal control
strategies for many years. At the University of South Australia, this research has led to the
development of in-cab equipment that helps drivers stay on time and reduce energy use.
Although much is known about optimal train control, there are still some gaps in this knowl-
edge. Howlett and Pudney have shown that an optimal strategy always exists. Their argument
is given in a paper [23] on optimal control of a solar powered racing car, but can be easily
modified to apply to a train. The practical driving strategies used in our in-cab devices are
based on necessary conditions for an optimal strategy. We have not yet been able to prove
that these conditions are sufficient for an optimal journey but this will follow if it can be
shown that the strategies are uniquely defined by the necessary conditions.
2
Chapter 1. Introduction 3
1.1 Project aims
The work in this thesis is part of a Rail Cooperative Research Centre (CRC) project to min-
imise fuel consumption and improve the timekeeping of trains. The main and significant
output of the Rail CRC project is an in-cab advice system called FreightMiser, which mon-
itors the progress of a journey, calculates an optimal control strategy, and displays advice to
the driver.
The first aim of this PhD project was to find a constructive proof that the optimal driving
strategy for a journey is unique. Such a proof could lead to more efficient methods for
calculating optimal driving strategies.
The second aim of the PhD project was to develop a method for estimating train parameters
from real-time observations of performance. To calculate an optimal driving strategy, it is
reasonable to assume that FreightMiser will perform best when it uses an accurate model of
train performance. Train parameters such as rolling resistance coefficients vary significantly
between trains and journeys, and even within a journey. If we can estimate train parameters
in real time, we can calculate more accurate optimal driving strategies.
1.2 History of optimal train control
In 1970, Figuera [13] showed that on short track sections without significant gradients or
speed restrictions, the optimal driving strategy for a train has three phases: power, coast and
brake. If the train departs on time and always follows the same speed profile, coasting can
begin at a predetermined time from the start of the run. However, if the train falls behind
schedule then the coast phase should be shortened to recover the lost time and reach the
destination on time.
An early paper by Kokotovic [34] presents a second-order nonlinear model of an electric trac-
tion system, and uses Green’s theorem to approximate the solution to the minimum energy
control problem. Kokotovic constructed a family of speed trajectories that is bounded below
by the minimum-energy, maximum-time trajectory and bounded above by a minimum-time,
maximum energy trajectory. The solution for a particular initial condition and final condition
can always be obtained by choosing the appropriate trajectory in this family.
Hoang et al [17] used two approaches to find the optimal trajectories that minimise energy for
a metro network: a rigorous optimal control formulation and a heuristic method employing
a direct search algorithm. The first approach was formulated rigorously but not solved.
The second approach was developed on a simplified model and applied successfully on the
Montreal Metro subway system, reducing energy use by about 8%.
In his 1980 PhD thesis, Aspects of Automatic Train Control, Milroy included some novel
work on optimal driving strategies for trains. In 1982 he formed the Transport Control Group
at the South Australian Institute of Technology to work on a project to develop a system for
achieving fuel savings on suburban trains. The first part of the project involved calculating
efficient speed profiles for various sections of track on the Adelaide metropolitan rail network
in South Australia. The second part of the project was to design and build a device that could
compute optimal driving strategies in real time, and display driving advice to a train driver.
This device would later become known as Metromiser.
The aim of Metromiser was to calculate driving advice that would help the driver arrive at
each station on time, with minimal fuel consumption. The driving strategy between each pair
of stops was to apply maximum power at the beginning of the section, then switch off power
as soon as possible and coast and brake into the stop. Trains have very low rolling resistance,
and can coast for long distances without slowing significantly. The device monitored the
progress of the train compared to the timetable and calculated a precise point at which the
driver could switch from power to coast on each journey section. Metromiser was first eval-
uated in Adelaide in 1985 and then later in Melbourne, Brisbane and Toronto, and achieved
average fuel savings of more than 15% and significant improvements in timekeeping.
The theoretical basis for the work was extended during the period 1982–1985. In 1984,
Howlett [19] showed that when the problem is formulated in an appropriate function space
an optimal strategy does indeed satisfy the Pontryagin conditions. Howlett gave a rigorous
justification of the work. He also showed that a speed-holding phase should be incorporated
into the optimal strategy if the journey time is relatively large, and found a special rela-
tionship between the holding speed and the speed at which braking should begin for a flat
track.
A similar approach was independently developed by Anis et al [2] in 1985. They also used
the Pontryagin principle to find necessary conditions on an optimal control strategy and
showed that these conditions could be applied to determine an optimal driving strategy.
Howlett used a similar approach but simplified the analysis by using the Pontryagin prin-
ciple to reformulate the problem as a finite dimensional constrained optimisation in which
the variables are the unknown switching times where the level of control is changed. Both
Anis et al and Howlett assumed in this early work that the level of acceleration is controlled.
This assumption has now been replaced by a more realistic assumption that is motive power
controlled, or that there are speed-dependent constraints on acceleration. Throughout the
thesis we refer to switching times or switching points to denote the times or positions where
the control is changed.
A paper by Strobel and Horn [45] presented a uniform approach to the solution of the energy
optimum train control problem for both electric and diesel locomotives on a level track. Their
results are similar to those of Howlett [19, 20]. They also conclude that for long journeys
the optimal strategy includes a speedhold phase, but for short journeys the optimal strategy
is power–coast–brake.
Prime et al [41] consider the problem for an electric train. Once again, the journeys are short
and the role of the coast phase is emphasized. For a single station-to-station run, minimum
energy can be achieved by modifying the coast position as a function of average supply
voltage.
In 1986 the Transport Control Group embarked on a second project to develop a fuel conser-
vation system for long-haul freight trains. On long journeys, with many changes of gradient
and many speed restrictions, the driving strategy is much more complicated than the simple
power-coast-brake strategy for short suburban journeys. A more detailed train model, which
took into account the relationship between locomotive power, train speed and fuel flow rates,
was developed by Benjamin et al [4]. The initial work used a heuristic method to insert
coasting into the driving strategy. Eventually, theoretical work was developed to determine
necessary conditions for an optimal driving strategy and to construct optimal journeys. Pre-
liminary work was presented by Howlett and Cheng in 1989 at the Conference of Australian
Institutes of Transport Research. Since then, the model has been studied systematically on
level track by Cheng and Howlett [9, 10], on level track with speed limits by Pudney and
Howlett [43], on track with piecewise constant gradient by Howlett [14], on track with con-
tinuously varying gradient by Howlett and Cheng [21], on track with piecewise constant
gradient with speed limits by Davydova, Cheng, Howlett and Pudney [11] and on track with
non-zero gradient without speed limits by Cheng in his PhD thesis [8].
In 2000, Khmelnitsky [33] provided a comprehensive analysis of train control on a track

with continuously varying gradient using continuous control. Khmelnitsky proved that under
certain conditions there exists a unique optimal solution. However, it is not clear that these
conditions are always true. Khmelnitsky also developed a numerical algorithm for finding
linking trajectories between sections of speedholding that satisfy the necessary conditions for
an optimal journey. Khmelnitsky’s necessary conditions are equivalent to those of Howlett
et al, and his numerical method is similar. However, while Khmelnitsky’s method is suitable
for journeys with occasional short sections of steep track, it may not be the best method for
heavy-haul journeys where only small portions of the journey are spent speedholding.
My PhD work is mainly based on the results described by Howlett in [20]. This paper
discussed an optimal driving strategy for a train with continuous control and for a train with
discrete control. For the continuous control problem the Pontryagin principle is used to find
necessary conditions for an optimal strategy. It is shown that these conditions yield key
equations that determine the optimal switching points. In the case of discrete control, the
Karush-Kuhn-Tucker equations are used to derive an optimal strategy for each fixed control
sequence by finding the optimal switching times.
An interesting result for a discrete control problem with finite control sets was presented by
Howlett and Leizarowitz [22]. They obtained an algebraic equation, rather than a differential
equation, from the Euler-Lagrange equation in some intervals, and show that the optimal
control is a combination of two types of control: pure control and chattering control.
Recently Liu and Golovitcher [39] also tackled the optimal train control problem, and ob-
tained results similar to those of Howlett et al and Khmelnitsky.
The theoretical work on train control was summarised by Howlett and Pudney in a mono-
graph [25]. In this summary, Howlett and Pudney view the optimal strategy as speed-holding
interrupted by occasional steep sections. Shortly after the monograph was published they re-
alised that for many practical journeys the train spends most of its time reacting to steep
sections and very little time travelling at a constant speed. Although this does not change the
theoretical interpretation it does mean that the numerical algorithm needs to be more robust.
This required development of a new numerical algorithm for calculating optimal journey
profiles [18]. This new algorithm was used to develop an in-cab system, called FreightMiser,
for calculating optimal driving strategies for long-haul journeys. The system determines
the location and speed of the train from GPS measurements and then calculates the optimal
strategy to arrive on time at the next key location.
Figure 1.1 shows the prototype FreightMiser screen. The bottom part of the display shows
the elevation profile of the track over the next 6km. The top part of the display shows the
ideal speed profile. The orange lines are speed limits. The colour of the ideal speed profile
indicates the power required to follow the profile. The two white triangles indicate the current
speed of the train.
Figure 1.1: FreightMiser screen.
The FreightMiser system has been tested on in-service freight trains running between Ade-
laide and Melbourne. Using FreightMiser, drivers were able to achieve average fuel savings
of 12%.
A fascinating vehicle control problem is the corresponding problem for solar-powered racing
cars. An optimal driving problem for solar cars is very similar to optimal driving for trains.
One possible objective is to maximize the distance travelled during a fixed period of time.
In one model used by Howlett and Pudney [23, 24] there are two holding speeds. The upper
holding speed is maintained if solar power is high and the lower holding speed is maintained
if solar power is low. These two holding speeds are related by a simple formula.
The solar car problem for a level road is studied by Howlett et al [24] and for an undulating
road [23]. Pudney [42] developed optimal control strategies for a range of battery models,
motor models and solar irradiance scenarios.
1.3 Uniqueness of the optimal control strategy
The main aim of my PhD work was to show that the optimal control strategy for any specific
journey is unique. Howlett and Cheng [10] prove uniqueness of the optimal strategy for the
simple case of a flat track, and suggested that this argument could be extended to a track with
no steep grades, but they did not find a uniqueness proof when speed-holding is interrupted
by steep gradients. We have used a perturbation analysis to show that on a non-steep track
the optimal strategy is unique.
Khmelnitsky [33] presents a proof that the optimal journey is unique provided a certain
condition is met. Unfortunately, this condition cannot be tested until after an optimal journey
has been constructed. It is not clear that this condition will be true for all practical optimal
journeys. Our aim was to find an alternative proof of uniqueness.
We have shown that when a hold phase is interrupted by a single steep uphill gradient then
there is a uniquely determined optimal power phase required for the steep section. Similarly,
when a hold phase is interrupted by a single steep downhill gradient then there is a uniquely
determined optimal coast phase required for the steep section.
For more complicated tracks, we have not been able to prove that the power phases re-
quired for a steep uphill sections and the coast phases required for steep downhill sections
are uniquely defined. However, we have developed alternative necessary conditions that can
be used to calculate the relevant switching points , and developed a new numerical method
for calculating optimal power and coast phases. The new necessary conditions provide addi-
tional insight and may eventually lead to the desired comprehensive proof of uniqueness.
The optimal control of a train is discussed in Part I – III of this thesis.

1.4 Parameter estimation
The second aim of my PhD work was to develop a method for estimating the parameters
of our train performance model. These parameters include a rolling resistance coefficient
and an aerodynamic drag coefficient. In practice, it is difficult to predict values for these
parameters for a freight train because the length and mass of the train will vary from one
journey to the next, and the aerodynamic drag on the train will depend on prevailing winds
and track conditions which can change even during the journey.
One way to overcome this difficulty is to estimate the parameter values in real-time from
observations of the actual train performance, and then use the estimated parameter values
in the journey optimisation calculations to give a driving strategy tailored to the prevailing
conditions.
We have successfully applied the Unscented Kalman Filter to example problems of estimat-
ing train parameters. This filter is easy to implement, and fast enough to be used in real
time.
Parameter estimation is discussed in Part IV of the thesis.
1.5 Thesis structure
There are four parts to this thesis: The Train Control Problem, A New Method For Finding
Optimal Driving Strategies, Phase Trajectories, and Parameter Estimation.
Part I, The Train Control Problem, comprises three chapters. Chapter 1 is this chapter. In
Chapter 2 we give a detailed problem formulation and review the use of Pontryagin’s Maxi-
mum Principle to derive necessary conditions for an optimal journey. We also introduce the
concept of a phase diagram, which will be used extensively later in the thesis. In Chapter
3, ‘steep’ gradients are defined and the concept of a limiting speed for a fixed gradient is
introduced. In Chapter 4 we show that the optimal strategy is unique for non-steep track.
In Part II , A New Method For Finding Optimal Driving Strategies, we develop new necessary
conditions for an optimal strategy and a new method for finding an optimal strategy. In
Chapter 5 we consider a power phase on a single steep uphill gradient, and introduce a new
method for deriving alternative necessary conditions. Using these conditions, we develop a
new numerical procedure for calculating optimal strategies. We also show that the optimal
strategy is unique. In Chapter 6 we extend the analysis to a steep uphill section with two
gradients. In this case, however, we are not able to prove uniqueness. In Chapter 7 we
extend the analysis again, this time to the case of a steep uphill section with many gradients.
In Chapters 8–10 the analysis of Chapters 5–7 is repeated for a coast phase on a steep down-
hill section.
In Part III, Phase Diagrams, we develop an analytical relationship between the key adjoint
variable and train speed for a power phase on a steep uphill section in Chapter 11, for a coast
phase on a steep downhill section in Chapter 12 , and for a final coasting and braking in
Chapter 13.
In Part IV, Parameter Estimation of the thesis we study the parameter estimation problem.
In Chapter 14 the problem is formulated and we discuss some candidate numerical filters
that could be used for parameter estimation, including the Extended Kalman Filter and H-∞
filter. In Chapter 15 we discuss the Nonlinear Projection Filter which, although it should
do the job, has some serious practical difficulties. We use Chapter 16 to discuss the simpler
Unscented Kalman Filter, and demonstrate how the filter can be “tuned” to give good results
for many example problems.
Finally, we summarise the thesis and propose further research in Chapter 17.
Chapter 2
Optimal journeys
In this chapter we first formulate the train control problem, and then use Pontryagin’s Prin-
ciple to derive necessary conditions for an optimal control.
The results in this chapter are not new and can be found in [19, 20]. However, since it
is necessary to understand and use these results throughout the thesis, it is convenient to
present them here.
2.1 Problem formulation
The motion of a train along a track can be described by the state equations
dv 1 p
m = − q − r(v) + g(x) (2.1)
dx v v
and
dt 1
= (2.2)
dx v
where v(x) is the speed of the train at location x and t(x) is the time at which the train is at
location x.
12
Chapter 2. Optimal journeys 13
In this model, m is the mass of the train, p ≥ 0 is the tractive power applied at the wheels,
q ≥ 0 is the braking force, v ≥ 0 is the speed of the train, r is the resistance force acting
against the train and g is the gradient force acting on the train. The gradient force g is positive
for downhill gradients. The train is controlled by controlling power p and braking force q.
For the remainder of this thesis, we will assume that m = 1, and scale the other parameters
accordingly.
For diesel-electric trains, the power generated is approximately proportional to the fuel flow
rate. In this thesis we will consider the problem of minimising mechanical energy. For diesel-
electric trains this is almost equivalent to minimising fuel consumption, and for electric trains
this is almost equivalent to minimising electrical energy use.
The mechanical work done by the locomotive as the train drives from x = 0 to x = X is
X
p
J= dx.
0 v
More generally we can define x
p
J(x) = dξ
0 v
in which case
dJ p
= . (2.3)
dx v
We ignore the (negative) work done by the brakes since this energy is not recovered. We
wish to drive from location x = 0 to location x = X in time T with minimum mechanical
energy. That is, we want to minimise the cost function J subject to the state equations (2.1)
and (2.2). The boundary conditions for the problem are
v(0) = v(X) = 0
and
t(0) = 0, t(X) = T.
These equations are for a point mass train. Howlett and Pudney [25] show that the motion
of a train with distributed mass on a given gradient profile is the same as motion of a point
mass train on a modified gradient profile. The motion of a long train with length S and mass
M can be modelled as
S
dv p 1
v = − q − r(v) + ρ(s)g(x − s)ds (2.4)
dx v M 0
where ρ(s) is the mass per unit length at distance s from the front of the train. If we define
the modified gradient acceleration as
S
1
ḡ(x) = ρ(s)g(x − s)ds
M 0
then the equation of motion is

dv p
v = − q − r(v) + ḡ(x), (2.5)
dx v
which takes the same form as that for a point mass train. From now on, we will consider
only point mass trains. By using the modified gradient profile we can see that all results can
be applied to trains with distributed mass.
2.2 Necessary conditions for an optimal journey
We can use Pontryagin’s Maximum Principle to find necessary conditions for an optimal
journey. Pontryagin’s Maximum Principle is used to find an extreme of a function under
consideration of a set of optimal control variables that is critical for the control problem.
It is carried out with the help of Hamiltonian function which is a set of differential equa-
tions of adjoint variables. The details about Hamiltonian function can be found in [35]. By
Pontryagin’s Maximum Principle finding optimal solutions of a cost function is equivalent
to maximising its Hamiltonian equation. We can find good descriptions and examples on
Pontryagin’s Maximum Principle in [40].
To apply Pontryagin’s Maximum Principle to minimise the cost function, we first define the
Hamiltonian function
p α p β
H=− + − q + g(x) − r(v) + (2.6)
v v v v
where α and β are adjoint variables. The adjoint variables evolve according to the equations
dα ∂H p α β
=− = − 2 + 3 (2p − qv + g(x)v − r(v)v + r (v)v 2 ) + 2 (2.7)
dx ∂v v v v
and
dβ ∂H
=− = 0. (2.8)
dx ∂t
We want to maximise H subject to
0≤p≤P
and
0 ≤ q ≤ Q,
where P is the maximum available driving power, and Q is the maximum available braking
force. Thus we define
1 α α α
H= − 1 p − q + (g(x) − r(v)) + λp + µ(P − p) + ρq + σ(Q − q) (2.9)
v v v v
where µ, λ, ρ, σ are non-negative Lagrange multipliers. In order to maximize H we apply
the Karush-Kuhn-Tucker conditions
∂H 1 α
= −1 +λ−µ =0 (2.10)
∂p v v
and
∂H α
= − +ρ−σ =0 (2.11)
∂q v
with the complementary slackness conditions
λp = µ(P − p) = ρq = σ(Q − q) = 0. (2.12)
There are two critical values of α: α = v and α = 0. So we must consider the Hamiltonian
in five cases: α > v, α = v, 0 < α < v, α = 0 and α < 0.
Case 1: α > v
If α > v then
1 α
− 1 > 0.
v v
But λ ≥ 0, µ ≥ 0 and, from (2.12), λ and µ cannot both be positive. If λ > 0 and µ = 0
then P = p = 0, which is inconsistent. So we must have λ = 0 and, from (2.10),
1 α
µ= −1 > 0
v v
and λ = 0. Note that λ > 0 and µ = 0 is not possible. Using µ(P − p) = 0 gives P = p.
Since α/v > 0, (2.11) implies ρ = α/v and σ = 0. The complementary slackness
conditions require q = 0. So if α > v the optimal control is given by p = P and q = 0. We
call this driving mode power.
Case 2: α = v
If α = v over some non-trivial interval then
dα dv
=
dx dx
and hence
1 p p α β
− q − r(v) + g(x) = − 2 + 3 (2p − qv − r(v)v + g(x)v + r (v)v 2 ) + 2
v v v v v
=⇒ v 2 r (v) + β = 0
=⇒ ψ(v) + β = 0 (2.13)
where
ψ(v) = v 2 r (v). (2.14)
Lemma 2.2.1 Let

ϕ(v) = vr(v). (2.15)
If ϕ is strictly convex then ψ(v) is monotone increasing.
Proof:
We have
ψ (v) = 2vr (v) + v 2 r (v) = v(2r (v) + vr (v)).
Since ϕ(v) is convex then
ϕ (v) = 2r (v) + vr (v) > 0.
So
ψ (v) = vϕ (v) > 0.
That is, ψ(v) is monotone increasing.
From Lemma 2.2.1, it follows that (2.13) has a unique solution v = V , a constant. We call
this driving mode speed holding or simply hold. To maintain this singular driving mode
requires
dv
= 0 =⇒ p = V [r(V ) − g(x)].
dx
During speed holding, with v = V , we have α = v = V and (2.13) gives
ϕ(V ) − β = 0
and so the adjoint variable β, which from (2.8) is constant for the entire journey, takes the
value β = −V 2 r (V ) < 0.
Speed holding can occur only if
0 ≤ V [r(v) − g(x)] ≤ P.
If this condition is not met then it is not possible to satisfy the condition v = V on the interval
and the track is said to be steep uphill at (x, V ). We will present a systematic definition of
steepness in the next chapter. Since α/v = 1 in this speed holding case , (2.10) =⇒ λ = µ.
Using (2.12) we must have
λ = µ = 0.
From (2.11) with α = v we have

ρ = σ + 1.
From (2.12) we have either
q = Q =⇒ σ > 0 =⇒ ρ > 0 =⇒ q = 0,
which is a contradiction, or else
q < Q =⇒ σ = 0 =⇒ ρ = 1 =⇒ q = 0.
So speed holding requires partial power and no braking.
Case 3: 0 < α < v
If 0 < α < v then

α
− 1 < 0.
v
From (2.10) and (2.12) we must have λ > 0 and µ = 0. Since α/v > 0, then from (2.11),
σ = 0 and ρ = α/v > 0. Using the complementary slackness we have p = q = 0. Thus
there is no power applied and no braking. We call this driving mode coast.
Case 4: α = 0
If α = 0 then because we cannot have both λ > 0 and µ > 0 it follows that
(2.10) =⇒ λ = µ + 1/v =⇒ λ = 1/v, µ = 0 =⇒ p = 0
and since we cannot have both ρ > 0 and σ > 0 we also deduce that
(2.11) =⇒ ρ − σ = 0 =⇒ ρ = σ = 0.
Since α = 0 then
dα β
= 0, (2.7) =⇒ 2 = 0.
dx v
But β = −V 2 r (V ) = 0 (by Lemma 2.2.1) and so α = 0 cannot be sustained on a non-zero
length control interval. The case β = 0 corresponds to V = 0 which means infinite time.
Case 5: α < 0
If α < 0 then
α
− 1 < 0.
v
Equations (2.12) and (2.10) are satisfied when
λ > 0 and µ = 0.
From (2.11) we have

α
ρ−σ = < 0.
v
If σ = 0 and ρ > 0 then, from (2.12), q = Q = 0, which is inconsistent. But ρ and σ cannot
both be positive, and so σ > 0 and ρ = 0. Now using complementary slackness, we must
have p = 0 and q = Q. We call this driving mode braking.
The optimal controls are summarised in the table below:
adjoint mode control

α>v power p = P, q = 0
α=v hold p = r(v) − g(x), q = 0
0<α<v coast p = 0, q = 0
α<0 brake p = 0, q = Q
Table 2.1: Summary of optimal control modes
2.2.1 State and adjoint equations for an optimal journey
We could analyse the optimal trajectory using α but it is convenient to introduce a modified
adjoint variable
α
θ= . (2.16)
v
The modified necessary conditions are
adjoint mode control

θ>1 power p = P, q = 0
θ=1 hold p = r(v) − g(x), v = V
0<θ<1 coast p = 0, q = 0
θ<0 brake p = 0, q = Q
Table 2.2: Summary of optimal control modes with modified adjoint variable
The modified adjoint equation is
dθ 1 dα α dv
= − 2 .
dx v dx v dx
From (2.5) and (2.7) we have the general form of the modified adjoint equation

dθ 1 p α 2 β
= − 3 + 3 (2p − qv + g(x)v − r(v)v + r (v)v ) + 2
dx v v v v

α p q r(v) g(x)
− 2 2− − +
v v v v v
p α
α r (v) β
= 3 −1 + + 3
v v v v v
p r (v) ψ(V )
= 3 (θ − 1) + θ− 3 . (2.17)
v v v
For a power phase, with p = P from (2.17) we have
dθ P + ψ(v) P + ψ(V )
− 3
θ = (−1) . (2.18)
dx v v3
For the brake phase, with p = 0 from (2.17) we have
dθ ψ(v) ψ(V )
− 3 ·θ =− 3 . (2.19)
dx v v
An optimal solution has speed v satisfying (2.1), adjoint variable θ satisfying (2.17) and
controls p and q as given in table 2.2.
2.2.2 Possible control transitions
The adjoint variable θ is continuous and so only some transitions are possible. The possible
control transition are illustrated in Figure 2.1. The control can be changed to or from hold
only if v = V . It is possible to change from power to coast or coast to power if v = V.
Power Hold Coast Brake

T t1 T 1, v V 0 dT d1 T d0
Figure 2.1: Diagram of possible transitions.
We can find further restrictions on transitions by analysing the adjoint equation

dθ P + ψ(v) P + ψ(V )
− 3
θ = (−1) .
dx v v3
A change to power occurs at θ = 1. For the power mode to be maintained, we require
dθ
≥0
dx
at the point where the control changes. For this condition to be met we require v ≥ V . So
the control can be changed to power only if v ≥ V and θ = 1.
A change to coast also requires θ = 1. However to maintain the coast mode we require
dθ
≤0
dx
at the point where the control changes. Since the adjoint equation in the coast mode at the
point θ = 1 is
dθ ψ(v)θ − ψ(V )
= , (2.20)
dx v3
we require v ≤ V . So we can change to coast only if v ≤ V and θ = 1.
If speed v ≥ V during a power phase, and the control is changed to power at speed v ≥ V ,
then θ will continue to increase and no further changes will be possible if v stays above V .
If speed v ≤ V during coasting, and the control is changed to coast at speed v ≤ V , then θ
will continue to decrease and the only other control change will be a change to brake.
Once the brakes have been applied, v < V and θ < 0. Then from (2.20) we have dθ/dx < 0.
So v and θ will always decrease. Hence no further control changes are possible.
On a non-steep track, where speed increases during a power phase and decreases during
coasting and braking, only two control sequences are possible: power – hold – coast – brake,
and power – coast – brake. The possible control transitions are illustrated in Figure 2.2.
Power Hold Coast Brake

T t1 T 1, v V 0 dT d1 T d0
Figure 2.2: Diagram of possible transitions on a non-steep track.
2.3 Example journeys
The optimal control sequence on a flat track is power, speedhold, coast and brake.
Figure 2.3 shows the speed profile (top) and adjoint variable profile (bottom) for an optimal
strategy with V = 20. The journey starts at x = 1000 and finishes at x = 40000. (In
all examples, quantities are in SI units—distance is in metres, and speed is in metres per
second). The adjoint variable θ is greater than 1 during power phase as seen in Figure 2.3.
If the track has steep sections then the optimal strategy is more complicated. If the track has
a steep uphill section then it is necessary to start power before reaching the steep section and
maintain power beyond the steep section until the speed returns to the holding speed. The
30
25
20
v 15
10
0
0 0.5 1 1.5 2 2.5 3 3.5 4
x x 10
4
1.5
θ
0.5
−0.5
0 0.5 1 1.5 2 2.5 3 3.5 4
x x 10
4
Figure 2.3: Speed profile and θ profile of a basic optimal strategy.
train will slow down on the steep section and then increase speed back to the holding speed
V after the steep section. The adjoint variable increases from one and then decreases back to
one.
For a steep downhill section then we should start coasting before the steep section and keep
coasting beyond the steep downhill section until the speed returns to the holding speed again.
This time the speed increases on the steep section and slows down after the steep section until
it comes back to V . The adjoint variable decreases below one and then increases up to one
again.
Figure 2.4 shows the optimal speed and adjoint variable profiles for a journey with a steep
uphill section from 5000 to 6000 and a steep downhill section from 11000 to 12000. The
brown line indicates the elevation profile of the track. Figure 2.5 shows the final coast and
brake phases for this journey.
30
25
20
v
15
10
track gradient
5
0
0 0.2 0.4 0.5 0.6 0.8 1 1.1 1.2 1.4 1.6 1.8 2
x 4
x 10
1.4
1.3
θ 1.2
1.1
1
0.9
0.8
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
x 4
x 10
Figure 2.4: Speed profile and θ profile of a journey with one steep uphill and one steep downhill
section.
2.4 Phase diagrams
A useful way to view the system of state and adjoint equations is a phase diagram that plots
θ against v. Figure 2.6 is the phase diagram for the journey shown in Figure 2.4. Figure
2.7 shows more detail of the phase profile on the steep sections. The phase diagram can be
divided into six regions, separated by v = V , θ = 0 and θ = 1. A journey on a non-steep
track starts powering from v = 0 to v = V and from θ > 1 to θ = 1, then coasts from v < V
to v = U and θ = 0, then brakes from v = U to v = 0 with θ < 0.
For a steep uphill section the state changes from hold at v = V, θ = 1 to power at some
point before the start of the steep section. The variables v and θ both increase until the train
reaches the bottom of the steep section. Speed then starts decreasing on the steep section.
The adjoint variable θ still increases till some point and then start decreasing. At the top of
the steep section, speed starts increasing but θ continues to decrease. If the starting point for
30
25
20
v 15
10
5 track gradient
0
3 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4 4.1
4
x x 10
1.5
θ
0.5
−0.5
3 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4 4.1
4
x x 10
Figure 2.5: Coasting and braking at the end of a journey on a level track.
1.5
θ
0.5
−0.5
0 5 10 15 20 25 30
v
Figure 2.6: Phase diagram for a journey with a steep uphill section and a steep downhill section.
1.05
0.95
0.9
0.85
0.8
14 16 18 20 22 24 26
v
Figure 2.7: Details of the phase plot for the steep steep sections.
the power phase was chosen correctly, the (v, θ) trajectory will finish at (V, 1). If the power
phase is started too soon or too late, the trajectory will not return to v = V and θ = 1, as
shown in Figure 2.8, and the train will not be able to resume speed holding.
Figure 2.9 shows two incorrect trajectories for a coast phase on a steep downhill section. If
the train starts coasting at the wrong location then the trajectory will not have θ = 1 when v
returns to V , and so the train will not be able to resume speed holding.
In part III of the thesis, we will study phase diagrams for optimal profiles in more detail on
steep sections.
1.025
1.02
1.015
θ
1.01
1.005
1
17.5 18 18.5 19 19.5 20 20.5 21 21.5 22
1.015
1.01
1.005
θ 1
0.995
0.99
0.985
0.98
17.5 18 18.5 19 19.5 20 20.5 21 21.5
v
Figure 2.8: Phase diagram for power phases that start too late and too soon for a steep uphill section.
1
0.95
θ
0.9
0.85
17 18 19 20 21 22 23 24 25 26
1.05
θ
0.95
0.9
0.85
17 18 19 20 21 22 23 24 25 26
v
Figure 2.9: Phase diagram for a coast phase that starts too late or too soon for a steep downhill section.
2.5 Summary
We have used Pontryagin’s principle to show that an optimal journey has only four distinct
driving modes: power, hold, coast and brake. Furthermore, the necessary conditions for an
optimal journey limit the possible transitions between driving modes. For simple journeys,
the sequence of driving modes is power-hold-coast-brake. For more complicated journeys,
with many steep sections, it is not obvious that there is a unique optimal control strategy.
In Part II of the thesis, we will examine the necessary conditions and the uniqueness for an
optimal journey in more detail.
We can use the (v, θ) phase plot to visualise optimal journeys. In Part III of the thesis we
will examine the relationship between v and θ in more detail.
Chapter 3
Steep hills and limiting speeds
On a flat track, the optimal strategy has four phases: power, hold, coast and brake. This
simple strategy is also optimal on undulating track, provided that the track does not contain
gradients that are so steep that the train cannot maintain the holding speed V . In this chapter
we will define what is meant by steep track. We will also introduce the idea of a limiting
speed for a constant gradient and a constant power level. In Parts II and III of the thesis
we will show that if a holding phase is interrupted by a steep section of track, the optimal
strategy must be modified.
3.1 Steep sections
A non-level track gradient exerts a longitudinal force on the train. In much of this thesis,
we will assume that the track gradient is a piecewise constant function of distance, and so
the force per unit mass exerted by the track and the corresponding gradient acceleration, are
also a piecewise constant function of distance. Gradient acceleration is positive for downhill
gradients, and negative for uphill gradients.
29
Chapter 3. Steep hills and limiting speeds 30
Suppose the optimal hold speed for a journey is V . If the train is unable to maintain speed V
on a section of track even if full power is applied we say the section is steep uphill at speed
V.
V V
Figure 3.1: Speed decreases on a steep uphill section.
If train speed increases above the optimal speed V on a section of track even when no power
is applied we say the section is steep downhill relative to V .
V V
Figure 3.2: Speed increases on a steep downhill section.
Definition 3.1.1 A gradient acceleration γ is said to be steep uphill at speed V if
ϕ(V ) − γV > P. (3.1)
From (2.1), for a power phase, we have
dv P
v = + γ − r(v).
dx v
If the track is steep at speed V the train speed decreases. That is,
P
+ γ − r(V ) < 0 =⇒ V r(V ) − γV > P.
V
Writing ϕ(v) = vr(v) we have

ϕ(V ) − γV > P.
Note that steepness of a section depends on the holding speed as well as on the gradient.
A gradient that is steep uphill at some speed V 1 may not be steep uphill at a lower speed
V2 < V1 . Note also that for vary high speeds, a downhill gradient could be classified as
‘steep uphill’.
Definition 3.1.2 A gradient acceleration γ is said to be steep downhill at speed V if
ϕ(V ) − γV < 0. (3.2)
The motion of a train when coasting is given by the equation
dv
v = γ − r(v).
dx
If the track is steep downhill at speed V then train speed increases. That is,
γ − r(V ) > 0 =⇒ γV − ϕ(V ) > 0 =⇒ ϕ(V ) − γV < 0.
A gradient that is steep downhill at some speed V 1 may not be steep downhill at a higher
speed V2 > V1 .
On a track with piecewise constant gradient, a track interval [x1 , x2 ] of constant gradient is
said to be steep at speed V if the gradient acceleration γ on the interval is steep at speed V .
3.2 Limiting speed at a constant power on a constant gra-

dient
Definition 3.2.1 The limiting speed VL = VL (p0 , γ) of a train driving with constant power
p0 ∈ [0, P ] on a gradient γ is determined by the equation
ϕ(VL) + γVL = p0 .
Limiting speed
v
Speed profile
Figure 3.3: Limiting speeds
Lemma 3.2.2 Let VL (p0 , γ) be the limiting speed for a train applying power p 0 on a section
[a, b] of track with constant gradient acceleration γ.
If v(a) > VL (p0 , γ) then v(x) > VL (p0 , γ) throughout the interval and v(x) decreases toward
the limiting value. If v(a) < VL (p0 , γ) then v(x) < V )L(p0 , γ) throughout the interval and
v(x) increases towards the limiting value.
The speed of a train on an interval [a, b] with constant gradient acceleration γ and power
p = p0 approaches the limiting speed VL (p0 , γ).
We can show that it is not possible for any phase with power p 0 to cross the limiting speed
on a track interval [a, b] with gradient acceleration γ by calculating the distance required to
reach the limiting speed. If

v(c) < VL (p0 , γ)
for some c ∈ [a, b], then v

v 2 dξ
x(v) − xc = .
vc p0 − ϕ(v) + γv
Assume that p0 − ϕ(v) + γv > 0. As v ↑ VL (p0 , γ) we have x(v) ↑ +∞. Hence
v(x) < VL (p0 , γ)
for x ∈ [c, b]. Similarly if

v(c) > VL (p0 , γ)
for some c ∈ [a, b], then v

v 2 dξ
x(v) − xc = .
vc ϕ(v) − γv − p0
As v ↓ VL we have x(v) ↑ +∞. This means that
v(x) > VL (p0 , γ)
for all x ∈ [c, b].
3.3 Conclusion
A gradient acceleration γ is said to be steep uphill at speed V if the train has insufficient
power to maintain speed V on the gradient. A gradient is said to be steep downhill at speed
V if v increases while coasting from speed V . The limiting speed V L (p0 , γ) for a constant
power and a constant gradient acceleration γ is the speed at which the applied power p 0 gives
dv/dx = 0.
Chapter 4
Uniqueness of the optimal strategy on a

non-steep track
In this chapter we show that the optimal driving strategy on a non-steep track is unique.
First we show that for each optimal final coast phase there is a unique holding speed V .
Next we consider the optimal control of a train as it coasts and then brakes to a stop on
a non-steep track, and prove the uniqueness of the optimal coast and brake phases using
an infinitesimal perturbation analysis. A proof of uniqueness for level tracks was given by
Cheng and Howlett in [10]. Khmelnitsky [33] argues that the optimal strategy is unique
under more general conditions. However it is not clear that these conditions are always true
and his method has not been useful for numerical calculations in real time control.
We then show that on a non-steep track the entire optimal strategy is completely determined
by the holding speed, and that increasing the holding speed implies increasing the speed at
every point. This means that for a fixed total journey time there is only one corresponding
strategy of optimal type that arrives at the final destination on time. It follows that the optimal
strategy is unique for each holding speed.
34
Chapter 4. Uniqueness of the optimal strategy on a non-steep track 35
The equation of motion of a train is
dv P
v = − r(v) + g(x) (4.1)
dx v
for full power P during a power phase,
dv
v = −r(v) + g(x) (4.2)
dx
for coast and

dv
v = −Q − r(v) + g(x) (4.3)
dx
for full brake Q during a brake phase. We assume that the track gradient is non-steep, so
that the speed of the train is strictly increasing during a power phase and strictly decreasing
during coast and brake phases.
In Section 2.2.2 we showed that for a journey on non-steep track starting at x = 0 and v = 0
and finishing at x = X and v = 0, there are only two control sequences that can satisfy the
necessary conditions for an optimal journey: power–coast–brake and power–hold–coast–
brake. Figure 4.1 shows optimal journeys for hold speeds V ∈ {15, 20, 25, 30, 40, . . . , 120}
on a non-steep track. For each of the journeys, the power phase starts at x = 0 and the brake
finishes at X = 12000. For journeys with V ∈ {15, 20, 25} we can find a coast phase that
starts with v = V and θ = 1 and finishes on the braking curve with θ = 0. For journeys with
higher hold speeds, no such coast phase can be found and instead, the coast phase must start
from the power phase with v < V and θ = 1 and finish on the braking curve with θ = 0. For
both types of journeys, as the hold speed increases, the length of the coast phase decreases
and the journey time decreases.
In the more usual case of a power–hold–coast–brake strategy with a given hold speed, V ,
the train powers from speed v = 0 to the hold speed, v = V . Two control decisions remain:
when to change from hold to coast, and when to change from coast to brake. In Section 4.2
40
35
30
25
v 20
15
10
0
0 2000 4000 6000 8000 10000 12000 14000
x
Figure 4.1: Optimal journeys for different hold speed V on a non-steep track.
below, we will show that there is only one coast–brake trajectory that starts with v = V and
finishes at x = X with v = 0 and satisfies the necessary conditions for an optimal journey.
Since the only two decisions are when to start coasting and when to start braking, the optimal
journey for a power–hold–coast– brake strategy with hold speed V is unique.
For sufficiently short journey times, a power–hold–coast–brake strategy is not feasible. In-
stead, we must use a power–coast–brake strategy. There is only one power–coast–brake
journey that satisfies the distance and time constraints, and so we do not need to use the ad-
joint equations and the necessary conditions. However, the power–coast–brake journey will
have θ = 1 at the start to the coast phase and θ = 0 at the end of the coast phase for some
hold speed V that is greater than the speed at which coasting begins.
For journeys where the required journey time is less than the time required by a power–brake
strategy, no feasible strategy exists.
4.2 Uniqueness of the final coast–brake phase
From (2.19) in Chapter 2, the adjoint equation for the system during coasting and braking is
dθ ψ(v) ψ(V )
= 3 θ− 3 (4.4)
dx v v
where V is the holding speed for the journey. The necessary conditions for an optimal
journey require the coast phase to start with θ = 1 and finish with θ = 0. The brake phase
starts with θ = 0 and finishes with v = 0.
We will assume that the final coast phase for the optimal journey starts with v = V C and
θ = 1, and finishes with v = VB and θ = 0. The following theorem states that there is a
unique hold speed V associated with an optimal final coast phase.
Theorem 4.2.1 For a given optimal coast phase that starts with x = x C , v = VC and θ = 1,
and finishes with v = VB and θ = 0, there is a unique hold speed V associated with that
optimal coast phase.
Proof:
Let xC be the starting point for the coast phase with v(x C ) = VC and θ(xC ) = 1, and let xB
be the finishing point for the coast phase with v(x B ) = VB and θ(xB ) = 0. Integrating the
adjoint equation (4.4) from xC to xB gives
xB
ψ(v) ψ(V )
−1 = θ − 3 dx. (4.5)
xC v3 v
If V = 0 then the right hand side of (4.5) is positive. If V = ∞ then the right hand side of
(4.5) is negative.
Let
ψ(v) ψ(V )
f (v, V ) = 3
θ− 3 .
v v
Since θ > 0 and ∂f /∂V < 0 for all v then there must exist a unique value of V ∈ [0, ∞)
that satisfies equation (4.5).
We now wish to show that the final coast and brake phases for an optimal power–hold–coast–
brake journey are unique. In this case the coast phase starts with V C = V .
Theorem 4.2.2 On a non-steep track, for given hold speed V there is exactly one coast–
brake strategy for which the coast phase starts with speed V and θ = 1 and finishes with
θ = 0, and for which the brake phase finishes at x = X with v = 0. In other words, if two
optimal speed profiles v and w have holding speeds V < W then they also have braking
speeds VB < WB .
Proof:
Consider the uniquely defined braking curve v = vB (x) for x ≤ X that satisfies the equation
dv
v = −Q − r(v) + g(x)
dx
with v(X) = 0. Choose a speed VB > 0 and define xB < X such that vB (xB ) = VB . The
point xB is unique because v decreases during braking. Let v = vC (x) for x ≤ xB be the
uniquely defined coasting curve satisfying the equation
dv
v = −r(v) + g(x)
dx
with v(xB ) = VB . Finally suppose that for some V > VB we can find a point xC such that
vC (xC ) = V and vC (x) < V when xC < x ≤ xB . The point xC is also unique since speed
decreases during coasting. We define a composite speed profile v = v (x) by setting
⎧
⎪
⎪ v (x) for xC ≤ x ≤ xB
⎪
⎨ C
v (x) =
⎪
⎪
⎪
⎩ v (x) for x ≤ x ≤ X.
B B
The composite speed profile is shown in Figure 4.2. If the composite speed profile is the
v v + G v
V
W
v
VB G VB
VB
xC xW xVB +G VB xB x
Figure 4.2: Perturbed speed profile for coasting and braking.
coast–brake phase for an optimal strategy then θ = θ (x) must satisfy the equation
dθ r (v (x)) V 2 r (V )
− θ = (−1)
dx v (x) v (x)3
with θ(xC ) = 1 and θ(xB ) = 0. Now consider an infinitesimal perturbation analysis. Sup-
pose VB is increased to VB + δVB and define xVB +δVB < xB such that vB (xVB +δVB ) =
VB + δVB . Let v = [vC + δvC ](x) for x ≤ xVB +δVB be the uniquely defined coasting curve
satisfying the equation
dv
v = −r(v) + g(x)
dx
with v(xVB +δVB ) = VB + δVB . Finally, define a composite speed profile v = [v + δv](x) by
setting ⎧
⎪
⎪ [v + δvC ](x) for x ≤ xVB +δVB
⎪
⎨ C
[v + δv](x) =
⎪
⎪
⎪
⎩ v (x)
B for xVB +δVB ≤ x ≤ X.
Now we define θ = [θ + δθ](x) as the unique solution to the differential equation
d r ([v + δv](x)) ψ(V + δV )

(θ∗ + δθ) − θ = (−1)
dx [v + δv](x) [v + δv](x)3
with θ(xVB +δVB ) = 0. By applying a first order Maclaurin series approximation this equation
can be rewritten as

d r (v ) r (v ) r (v ) r (v∗ )
[θ + δθ] − + − δv θ − δθ
dx v v v 2 v∗

ψ(V ) ψ (V )δV 3ψ(V )
= (−1) 3
+ 3
− δv
v∗ v∗ v∗4
and since
dθ r (v (x)) ψ(V )
− θ = (−1)
dx v (x) v (x)3
it follows that

d δθ r (v ) r (v ) r (v ) 3ψ(V ) ψ (V )
− δθ = − θ (x) + δv(x) − δV (4.6)
dx v v v 2 v 4 v∗3
for xC < x ≤ xVB +δVB . Assume W ≤ V so we can form a contradiction. Let xW ≥ xC be

the unique solution to (v ∗ + δv)(x) = W , as shown in Figure 4.2. If we write

r (v ) r (v∗ ) 3ψ(V ) ψ (V )
m(x) = θ − θ∗ + δv − δV (4.7)
v v 2 v∗4 v∗3
then because ψ(v) = v 2 r (v) is an increasing function of v and v∗ (x) ≤ V and θ (x) < 1 in
the region xC < x ≤ xVB +δVB it follows that

r (v∗ ) 2ψ(V ) ψ (V )
m(x) ≥ θ∗ + δv − δV
v∗ v∗4 v∗3

r (v∗ ) 2ψ(v∗ ) ψ (V )
≥ θ∗ + δv − δV
v∗ v∗4 v∗3
1
≥ 3 [ψ (v∗ )θ∗ δv − ψ (V )δV ]
v∗
> 0 (4.8)
for xC < x ≤ xVB +δVB if δV < 0. The integrating factor for (4.6) is
xV +δV
B B r (v (ξ))

I(x) = exp (−1) dξ
x v∗ (ξ)
and so (4.6) can be rewritten as
d
[I(x)δθ(x)] = I(x)m(x)
dx
and hence, by integrating over [x, xVB +δVB ], we obtain

xVB +δVB
I(xVB +δVB )δθ(xVB +δVB ) − I(x)δθ(x) = I(ξ)m(ξ)dξ.
x
Since δθ(xVB +δVB ) = 0 it follows that

xVB +δVB
I(x)δθ(x) = (−1) I(ξ)m(ξ)dξ
x
and hence that δθ(x) < 0 for xC ≤ x ≤ xVB +δVB . We conclude that [θ + δθ](x) < 1 for
xC ≤ x ≤ xVB +δVB and so the optimal control associated with the trajectory [θ + δθ] does
not change in the interval [xC , xVB +δVB ]. In particular, it cannot change at xW , and so we
have a contradiction. Hence if [v + δv](x) is part of an optimal strategy then it must be asso-
ciated with a holding speed W > V . In other words, if the optimal braking speed increases
then the optimal holding speed also increases. The linear perturbation analysis is justified by
the uniform convergence of δv(x) → 0 on xW ≤ x ≤ xVB +δVB as δVB approaches 0.
For a power–coast–brake journey the final coast phase does not start with v = V but with
v = VC < V . From Section 2.2.2 we know that on non-steep track a change to coast
requires VC ≤ V . In this case, we want to show that if the braking speed V B increases then
the coasting speed VC also increases.
Theorem 4.2.3 On a non-steep track, for a given holding speed V there is exactly one coast-
brake strategy for which the coast phase starts with speed V C ≤ V and θ = 1 and finishes
with θ = 0, and for which the brake phase finishes at x = X with v = 0.
The proof of this theorem is similar to the proof of Theorem 4.2.2. Since V C < V and
v∗ (x) < VC on a non-steep track then the statement (4.8) in the proof of Theorem 4.2.2 is
still valid for the case when the train starts coasting from V C < V . Thus we can obtain a
similar conclusion to Theorem 4.2.2, that is if [v ∗ + δv](x) is part of an optimal strategy then
it must be associated with a coasting speed V C + δVC > VC . Hence if VB increases then VC
increases.
V WB
VB
a x1 x2 x3 x 4 x5 x6 b
Figure 4.3: Speed profiles of two optimal journeys on a non-steep track.
4.3 Uniqueness of the optimal journey
Using the proof from the previous section, we can show that the optimal journey is unique.
Lemma 4.3.1 Let V and W be holding speeds of optimal driving strategies v(x) and w(x)
on a non-steep track over the interval [a, b]. If
V <W
then
v(x) ≤ w(x)
for all x ∈ [a, b].
Proof of Lemma 4.3.1
There are two types of optimal journey: power–hold–coast–brake for long journeys, and
power–coast–brake for short journeys. We need to show that w(x) ≥ v(x) for the power,
coast and brake phases.
First we consider a power–hold–coast–brake journey. Suppose we have the speed profiles

for two optimal journeys with hold speeds V and W as shown in Figure 4.3. For the power
phases, v(x) is the solution of

dv 1 P
= − r(v) + g(x) (4.9)
dx v v
with v(a) = 0 and w(x) is the solution of

dw 1 P
= − r(w) + g(x)
dx w w
with w(a) = 0. Since v(a) = w(a) then by Picard’s uniqueness theorem the solutions of the
two differential equations are the same. At the end of the respective power phases we have
v(x1 ) = V and v(x2 ) = W
where x1 < x2 . On [a, x1 ] we have v(x) = w(x). On [x1 , x2 ] we have v(x) = V ≤ w(x). So
v(x) ≤ w(x) on [a, x2 ].
Now consider the brake phases of the two journeys. The equation of motion during a brake
phase is
dv
v = −Q − r(v) + g(x). (4.10)
dx
Let VB and WB be the respective braking speeds of the two strategies. Since the two strate-
gies stop at the same location x = b, this means v(b) = w(b) = 0. By Picard’s uniqueness
theorem the two brake phases follow the same braking curve. If we have VB ≥ WB then by
Theorem 4.2.2 we must have V ≥ W . This contradicts our assumption that V < W , and so
we must have VB < WB .
Now consider the coast phase. The speed profile is the solution of the differential equation
dv
v = −r(v) + g(x). (4.11)
dx
We know that for a power–hold–coast–brake journey the two strategies start coasting at
speeds V and W respectively and change into braking at V B and WB respectively. Since
V < W then VB < WB , and so x5 < x6 where x6 and x5 are the locations where the train
starts braking for the strategy of V and of W respectively.
W v( x3 )
v( x4 )
V
WB
VB
x4 x3 x5 x6 b
Figure 4.4: Speed profiles of two optimal journeys on a non-steep track, coast and brake phases.
w( x3 )
W w( x4 )
V
v( x4 ) WB
VB
x3 x4 x5 x6 b
Figure 4.5: Speed profiles of two optimal journeys on a non-steep track, coast and brake phases.
Let x4 and x3 be locations where coast phase starts for the optimal journey with hold speeds
V and W respectively.
If x3 > x4 then it is obvious (see Figure 4.4) that v(x) < w(x) in [x 3 , x4 ]. So by Picard’s
theorem we also have v(x) < w(x) in [x3 , x6 ]. Hence v(x) < w(x) during the coast phase.
If x3 < x4 then in the interval [x3 , x4 ], we may have two cases. If V < w(x4 ) (see Figure 4.5)
then using the argument similar to the one above, we have v(x) < w(x) for x ∈ [x 3 , x6 ).
If V ≥ w(x4 ) (see Figure 4.6) then by Picard’s theorem for the interval [x4 , x5 ] we must
have v(x5 ) ≥ UW which leads to UV > UW . That contradicts Theorem 4.2.2. Therefore
v(x) < w(x) in [x3 , x6 ].
W
V
w( x4 )
VB
WB
x3 x4 x5 x6 b
Figure 4.6: Speed profiles of two optimal journeys on a non-steep track, coast and brake phases. This
situation, where the two speed profiles cross, cannot occur.
WC
VC
WB
VB
0 x1 x2 x3 x4 X
Figure 4.7: Speed profiles of power–coast–brake journeys.
We now consider a power–coast–brake journey. Let VB and WB be the braking speeds, VC

and WC be the coasting speeds, and v(x) and w(x) be the speed profiles of the journeys with
holding speeds V < W respectively. The speed profiles of the two journeys are illustrated in
Figure 4.7.
Let x3 be the unique location with w(x3 ) = WB and let x4 be the unique location with
v(x4 ) = VB . By Theorem 4.2.2 V < W =⇒ VB < WB . Since there is a unique brake
curve which v = 0 at x = X and speed decreases during a brake phase then x3 < x4 .
Because v(x4 ) = w(x4 ) and dv/dx > dw/dx on [x3 , x4 ], it follows that v(x) ≤ w(x) on
[x3 , x4 ].
Similarly, let x1 be the unique location with w(x2 ) = WC and let x1 be the unique location
with v(x1 ) = VC . By Theorem 4.2.3 if VB < WB then VC < WC . Since there is also a
unique power curve with v = 0 at x = 0 and speed increases during a power phase then
x1 < x2 . Because v(x1 ) = w(x1 ) and dv/dx < dw/dx we have v(x) ≤ w(x) on [x1 , x2 ].
By Picard’s theorem, v(x) < w(x) for x ∈ [x2 , x3 ] because v(x2 ) < w(x2 ). Therefore
v(x) < w(x) at every point in the journey.
We have shown that v(x) ≤ w(x) if v and w are power–hold–coast–brake strategies, and that
v(x) ≤ w(x) if v and w are power–coast–brake strategies. Consider Figure 4.8, where each
optimal speed profile vi has corresponding holding speed V i . We have shown that V1 < V2
and that V4 < V5 . The critial speed profile v3 with holding speed V3 is both a power–coast–
brake strategy with V3 < V4 and a power–hold–coast–brake strategy with zero length hold
phase and with V2 < V3 . Hence V2 < V3 < V4 .
4.4 Examples
Example 4.4.1 A train travels on a non-steep track with different holding speeds V ∈
{20, 22, 24} and stops at X = 12000. The track elevation profile is defined by function
1
y= 100
(cos (x/1000) + sin (x/800)). Power P = 3 and braking force Q = 0.3.
To find an optimal journey we need to find an optimal point x h where the train switches to
speedhold and an optimal point x c where the train changes to coast. To find the optimal point
xh we solve equation
dx v2
=
dv P − v[r(v) + g(x)]
35
v5
30 v
4
25 v
3
20
v
2
v1
15
10
0
0 2000 4000 6000 8000 10000 12000 14000
Figure 4.8: The optimal speed profile v3 has holdspeed phase length of 0.
from v = 0 to v = V . We use the midpoint method to find x c for which the coast and brake
phases start with v = V , θ = 1 and finish with v = 0 at x = X. Figure 4.9 shows the optimal
speed profiles for the three different holding speeds, and hence three different journey times.
Example 4.4.2 We find speed profiles of three short journeys on a track with continuous
non-steep gradient function and stopping at X = 6000. The gradient function and train
parameters used in this example are the same as those in Example 4.4.1. The holding speeds
are {20, 22, 24} although these journeys do not have a speedhold phase.
For power–coast–brake journeys, we do not know speeds at which coasting should com-
mence. We use the midpoint method to find the point (x c , vc ) on the power curve for which
the optimal coast and brake phases finish at x = X. Figure 4.10 shows optimal speed profiles
for three different hold speeds.
24
22
20
15
v
10
track elevation profile

5
0
0 2000 4000 6000 8000 10000 12000
x
Figure 4.9: Speed profiles of journeys with different hold speeds {20, 22, 24}.
25
20
15
v
10
track elevation profile

5
0
0 1000 2000 3000 4000 5000 6000
x
Figure 4.10: Speed profiles of journeys with different hold speeds {20, 22, 24}.
4.5 Conclusion
We have used perturbation analysis to prove the uniqueness of the final coast and brake
phases of an optimal journey for a train on a non-steep track. We then used this result to
show that there is a unique optimal power–hold–coast–brake strategy for long journeys and a
unique optimal power–coast–brake strategy for short journeys. Finally, we have shown that
if v is the optimal speed profile with holding speed V and w is the optimal speed profile with
holding speed W , then V ≤ W implies v ≤ w.
In parts II and III of the thesis we will consider changes to the form of the optimal strategy
when there are steep hills.
Part II
A new method for finding optimal

driving strategies
50
Chapter 5
Hold-power-hold for a single steep uphill

gradient
The optimal journey for a long journey on a non-steep track has four phases: an initial power
phase, a long speed holding phase, a coast phase and a final brake phase. The majority of the
journey is speed holding.
On a track with steep gradients, it becomes necessary to vary the strategy around steep sec-
tions of track because it is not possible to hold a constant speed on steep track. In this chapter
we consider the optimal control when the speed holding phase is interrupted by a single steep
uphill section. For simplicity we assume the track gradient is piecewise constant, and com-
prises a non-steep gradient, a steep uphill gradient, and another non-steep gradient.
It is not possible to hold speed on the steep uphill section. Instead, we must interrupt the
speed hold phase with a power phase that starts somewhere before the steep section and
finishes somewhere beyond the steep section. The aim of this chapter is to show that there
is a unique power phase that satisfies the necessary conditions for an optimal journey. We
formulate the problem of finding the optimal power phase, and present a new condition for
51
Chapter 5. Hold-power-hold for a single steep uphill gradient 52
optimality of the power phase that minimises the energy used during the power phase subject
to a weighted time penalty. We then derive key necessary conditions for an optimal power
phase, and prove that the optimal hold-power-hold phase exists and is unique.
In later chapters we will consider tracks where the non-steep and steep sections comprise
more than one gradient.
The equation of motion for a train in full power is

dv P
v = − r(v) + g(x) (5.1)
dx v
where P is the maximum power per unit mass, r(v) is the resistance force per unit mass and
g(x) is the gradient force per unit mass. The adjoint equation of the system, as defined in
equation (2.18) in Chapter 2, is
dθ ψ(v) + P P + ψ(V )
− 3
θ = (−1) (5.2)
dx v v3
where
ψ(v) = v 2 r (v) (5.3)
and v = v(x) is the solution to (5.1). From Chapter 2, we know that for an optimal journey
a change from hold to power and a change from power to hold each requires θ = 1. For
convenience, we define η = θ − 1, so that power starts and finishes at η = 0. From (5.2), the
modified adjoint equation is
dη ψ(v) + P ψ(v) − ψ(V )
− η = . (5.4)
dx v3 v3
Suppose the optimal holding speed for the entire journey is V . Howlett [20] and Howlett and
Leizarowitz [22] show that when the hold phase for an optimal journey is interrupted by a
steep uphill section, the optimal control requires a power phase that starts before the start of
V V
p b c q
Figure 5.1: Optimal speed profile for a single steep gradient.
the steep section and finishes beyond the steep section. During this power phase, the speed
of the train increases from the hold speed V before the start of the steep section, decreases to
below speed V on the steep section, and returns to speed V after the steep section. Intuitively,
we want to keep the ”average“ speed of the train during the power phase the same as the
holding speed for the overall journey. Figure 5.1 indicates the problem: we want to find
an optimal point p at which to start the power phase so that speed increases before the start
of the steep section at b, decreases through speed V on the steep uphill interval [b, c], and
increases back to speed V at some point q beyond the steep section.
5.2 Necessary conditions for an optimal power phase
As mentioned in Chapter 2, for an optimal journey we need
θ=1⇔η=0
and
dθ dη
=0⇔ =0
dx dx
at x = p and x = q. Let v0 (x) be the optimal speed profile. Integrating the modified adjoint
equation
dη ψ(v) + P ψ(v0 ) − ψ(V )
− 3
η= (5.5)
dx v0 v03
over [p, q] gives q
ψ(v0 ) − ψ(V )
Ip (x)dx = 0 (5.6)
p v03
where the integrating factor Ip (x) is defined for x ∈ [p, q] by the formula
x
ψ(v0 ) + P
Ip (x) = C exp − dξ . (5.7)
p v03
Equation (5.6) for p < b < c < q was used by Howlett [20] as a standard necessary condition
to specify the optimal power phase on a steep uphill section. We will use a variational
argument to find an alternative necessary condition.
5.3 An alternative necessary condition
If the speed changes from the optimal profile v0 (x) by an infinitesimal increment to a new
profile (v0 + δv)(x) then the equation for the new profile is
d(v0 + δv) P
(v0 + δv) = − r(v0 + δv) + g(x). (5.8)
dx v0 + δv
By applying a Maclaurin series expansion, subtracting the original equation (5.5) for v 0 and
neglecting second and higher order terms we obtain the perturbation equation

dv0 dδv P
δv + v0 = (−1) 2 + r (v0 ) δv. (5.9)
dx dx v0
We call δv a first order variation of the speed. The equation for the first order variation is
derived rigorously on page 163 of the book by Birkhoff and Rota [5]. For convenience the
relevant details are presented in Appendix A. If we rewrite (5.9) in the form

d P + ψ(v0 )
(v0 δv) = (−1) (v0 δv). (5.10)
dx v03
and integrate using the initial condition v 0 (p) = V we obtain
(v0 δv)(x) = V δv(p)Ip (x)
where Ip (x) is given by (5.7). By substitution into (5.6) we have

q
ψ(v0 ) − ψ(V )
· δv · dx = 0
p v02
or q
−ψ(V )
+ r (v0 ) · δv · dx = 0 (5.11)
p v02
where δv = δv(x) is the first order variation. This is an alternative necessary condition for
the optimal power phase on a steep uphill section. The expression (5.11) takes the form of a
first order variation for an integral cost function. Suppose we define
q
ψ(V )
J0 (v) = + r(v) dx. (5.12)
p v
We have the following result.
Theorem 5.3.1 Let v(x) be a solution to (5.1) and define

q
ψ(V )
J0 (v) = + r(v) dx
p v
where p < b < c < q are chosen so that v(p) = v(q) = V . A necessary condition for a
minimum of J0 is
q
−ψ(V )
2
+ r (v) · δv · dx + (δq − δp)ϕ (V ) = 0

p v
where δv is the first order variation and where δp and δq are chosen to satisfy
[v + δv](p + δp) = [v + δv](q + δq) = V.
Proof:
At v = v0 we have v0 (p0 ) = v0 (q0 ) = V and

q0
ψ(V )
J0 (v0 ) = + r(v0 ) dx. (5.13)
p0 v0
If we consider a small variation δv(x) to the optimal speed profile and let v = v 0 + δv, we
obtain
q0 +δq
ψ(V )
J0 (v0 + δv) = + r(v0 + δv) dx (5.14)
p0 +δp v0 + δv
where δp and δq are chosen so that the value of the function v0 + δv at the points p0 + δp and
q0 + δq is equal to V . Using a Taylor series expansion we can write (5.14) in form
q0 +δq
ψ(V )
J0 (v0 + δv) = + r(v0 ) dx
p0 +δp v0
q0 +δq
∂ ψ(V )
+ + r(v0 ) δv · dx + O(δv 2 )
p0 +δp ∂v v0
q0 q0 +δq
ψ(V ) ψ(V )
= + r(v0 ) dx + + r(v0 ) dx
p0 v0 q0 v0
p0 +δp q0
ψ(V ) ∂ ψ(V )
− + r(v0 ) dx + + r(v0 ) δv · dx
p0 v0 p0 ∂v v0
q0 +δq
ψ(V )
+ + r(v0 ) δv · dx
q0 v0
p0 +δp
ψ(V )
− + r(v0 ) δv · dx + O(δv 2 )
p0 v0
= J0 (v0 ) + (δq − δp)ϕ (V )
q0
−ψ(V )
+ 2
+ r (v0 ) δv · dx + O(δv 2 )

(5.15)
p0 v0
where we have rewritten ψ(v)/v + r(v) = ϕ (v). By subtracting (5.13) and neglecting the
second order quantities, we have the perturbed equation
q0
ψ(V )
δJ0 (v0 ) = (δq − δp)ϕ (V ) + − 2 + r (v0 ) · δv · dx (5.16)
p0 v0
for the infinitesimal variation of the functional J. In order to have (5.13) as a local minimum
of (5.14) we need to have
δJ0 = 0. (5.17)
However what we really wanted was to find a functional where the minimum is defined by
condition (5.11). Thus we modify our original proposal (5.12) for the cost functional.
Theorem 5.3.2 Let v(x) be a solution to (5.1) and define
J(v) = J0 (v) − (q − p)ϕ (V )
where p < b < c < q are chosen so that v(p) = v(q) = V . A necessary condition for a
minimum of J is q
−ψ(V )
+ r (v) · δv · dx = 0
p v2
where δv is the first order variation of v.
Proof:
For the modified functional we have
J(v0 + δv) = J0 (v0 + δv) − [(q0 + δq) − (p0 + δp)]ϕ (V )

q0 +δq
ψ(V )
= + r(v0 + δv) dx − [(q0 + δq) − (p0 + δp)]ϕ (V ).
p0 +δp v0 + δv
Note that δp and δq are chosen in the same way as the previous proof. If we write v = v0 +δv,
p = p0 + δp and q = q0 + δq then we have
q
ψ(V )
J(v) = + r(v) − ϕ (V ) dx. (5.18)
p v
Using a Taylor expansion for (5.18) and proceeding as before we have
δJ(v0 ) = J0 (v0 + δv) − J0 (v0 )
q0 +δq
ψ(V )
= + r(v0 + δv) − ϕ (V ) dx
p0 +δp v0 + δv
q0
ψ(V )
− + r(v0 ) − ϕ (V ) dx
p0 v0
q0
ψ(V )
= − 2 + r (v0 ) · δv · dx + O(δv 2).

p0 v0
q
From equation (5.11) it follows that p00 [−ψ(V )/v02 + r (v0 )] · δv · dx = 0. Hence the
optimal speed profile v0 (x) that satisfies the necessary condition for an optimal journey also
minimises the functional J 0 (v).
We can rewrite (5.18) as:

q
ψ(V ) ψ(V )
J(v, V ) = + r(v) − + r(V ) dx
p v V
q q
ψ(V ) ψ(V )
= dx − (q − p) + [r(v) − r(V )]dx
p v V p
q
q−p
= ψ(V ) (tq − tp ) − + [r(v) − r(V )] dx. (5.19)
V p
Equation (5.19) allows us to interpret the meaning of J. The term (t q − tp ) − q−p

V
on the right
hand side of (5.19) is the time taken by a train with speed profile v = v(x) to power from p to
q
q, less the time taken to travel from p to q at constant speed V . The term p [r(v) − r(V )] dx
is the extra energy spent to overcome resistance in the power mode compared to the speed
hold mode. So J is a combined measure of the time taken and the energy spent during the
power mode. If the time taken is low, energy used will be high. Conversely, if the energy used
is low, time taken will be high. The optimal power phase that minimises J is a compromise
between time and energy.
Note that the overall problem is to minimise energy usage subject to a time constraint. For
an individual steep incline the local condition that defines the power phase is to minimise
energy usage subject to an appropriately weighted time penalty.
Our next theoretical task is to prove that the minimum value of J is unique. Before we do
that, we illustrate the results with a numerical calculation.
In our example we use P = 3, driving over a constant gradient hill g = −0.2. The steep
section starts at 5000 and ends at 6000. The resistance parameters are
r0 = 6.75 × 10−3 ; r1 = 0; r2 = 5 × 10−5 .
Previously, Howlett [20] has used a shooting method to find a strategy that meets the neces-
sary conditions (2.18) described in Chapter 2. Using this method, the optimal starting point
is calculated as 4158. Calculating J for various starting points confirms that a minimum of
J occurs at point 4158 as seen in Figure 5.2.
5.3.1 Key equations
Suppose a train is powering on a steep uphill section [b, c]. The gradient function is defined
as ⎧
⎪
⎪
⎪
⎪ γ0 if x ∈ [p, b)
⎪
⎨
g(x) = γ1
⎪ if x ∈ [b, c] (5.20)
⎪
⎪
⎪
⎪
⎩γ2 if x ∈ (c, q] .
The gradient accelerations γ0 and γ2 could be either positive or negative but we assume they
are not steep. We assume γ1 is steep at hold speed V .
Theorem 5.3.3 The necessary conditions for minimising the cost function (5.18) for a train
travelling on the section defined in (5.20) are
[P − ϕ(vb ) + γ0 vb ]µ = [ϕ(vb ) − ϕ (V )vb + ψ(V )](γ0 − γ1 )

1.4932
1.4932
1.4932
1.4932
1.4932
1.4931
4152 4154 4156 4158 4160 4162 4164
starting points
Figure 5.2: Plot of J against starting points
and
[P − ϕ(vc ) + γ2 vc ]µ = [ϕ(vc ) − ϕ (V )vc + ψ(V )](γ2 − γ1 )
where µ(v) > 0 defined by

µ = λ − (ϕ (V ) − γ1 ).
Proof:
The equation of the motion of the train is
dv P
v = − r(v) + g(x) (5.21)
dx v
with v(p) = v(q) = V . We choose the starting point p with p < b and v(p) = V and then
find q < c such that v(q) = V . By separating the variables in (5.21) and integrating we
obtain vb
v 2 dv
p=b− (5.22)
V P − v[r(v) − γ0 ]
where vb is the speed of the train at the bottom of the steep section, and
V
v 2 dv
q =c+ (5.23)
vc P − v[r(v) − γ2 ]
where vc is the speed of the train at the crest of the steep section. If we integrate (5.21) from
b to c then we have vb
v 2 dv
c−b= . (5.24)
vc v[r(v) − γ1 ] − P
Integrating both sides of (5.21) from p to b gives
b b
1 2 1 2 P
v − V = dx − r(v)dx + γ0 (b − p). (5.25)
2 b 2 p v p
Integrating from b to c gives

c c
1 2 1 2 P
v − v = dx − r(v)dx + γ1 (c − b) (5.26)
2 c 2 b b v b
and integrating from c to q gives

q q
1 2 1 2 P
V − vc = dx − r(v)dx + γ2 (q − c). (5.27)
2 2 c v c
By combining (5.25)–(5.27) and rearranging, we get

q q
P
r(v)dx = dx + γ0 (b − p) + γ1 (c − b) + γ2 (q − c). (5.28)
p p v
From the previous section, the cost function (5.18) is defined by

q
ψ(V )
J0 = + r(v) − ϕ (V ) dx.
p v
By substituting (5.28) into (5.14) we obtain

q
dx
J0 = [ψ(V ) + P ] + γ0 (b − p) + γ1 (c − b)
p v (5.29)

+ γ2 (q − c) − ϕ (V )(q − p).
From (5.21) we can write

⎧
⎪
⎪
vdv
if x ∈ [p, b)
⎪
⎪
⎪
⎪ P − v[r(v) − γ0 ]
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎨
1 −vdv
dx = if x ∈ [b, c] (5.30)
v ⎪
⎪ v[r(v) − γ1 ] − P
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎩
vdv
if x ∈ (c, q].
P − v[r(v) − γ2 ]
From (5.30) we define
q
dx
A(vb , vc ) = dv
v
p vb vb
vdv vdv
= +
V P − v[r(v) + γ0 ] vc v[r(v) + γ1 ] − P
V
vdv
+ . (5.31)
vc P − v[r(v) + γ2 ]
Hence using (5.22) and (5.23) we obtain
J(vb , vc ) = [ψ(V ) + P ]A(vb , vc )

vb V
v 2 dv v 2 dv
+ γ0 + γ1 (c − b) + γ2
V P − v[r(v) − γ0 ] vc P − v[r(v) − γ2 ]
(5.32)
vb V
v 2 dv v 2 dv
− ϕ (V ) c − b + + .
V P − v[r(v) − γ0 ] vc P − v[r(v) − γ2 ]
We need to minimise J(vb , vc ) subject to
vb
v 2 dv
c−b= .
vc v[r(v) − γ1 ] − P
We define the Lagrangian function
vb
v 2 dv
J (vb , vc ) = J(vb , vc ) + λ c − b − (5.33)
vc v[r(v) − γ1 ] − P
where λ is the Lagrange multiplier. Applying the Karush-Kuhn-Tucker conditions we have
∂J ∂J
= 0 and =0
∂vb ∂vc
and the complementary slackness conditions

vb
v 2 dv
λ c−b− = 0.
vc v[r(v) − γ1 ] − P
If we weaken the equality constraint (5.24) to

vb
v 2 dv
c−b ≤ (5.34)
vc v[r(v) − γ1 ] − P
then we can also guarantee that λ is non-negative and our solution is unchanged because the
control that minimises energy will not travel further than the required distance c − b.
So we have

∂J 1 1
= [ψ(V ) + P ]vb −
∂vb P − vb [vb (r) − γ0 ] P − vb [vb (r) − γ1 ]

ϕ (V ) − γ0 λ
−vb2 −
P − vb [vb (r) − γ0 ] P − vb [vb (r) − γ1 ]
vb2 [ψ(V ) + P ](γ1 − γ0 )

=
(P − vb [vb (r) − γ0 ])(P − vb [vb (r) − γ1 ])

[ϕ (V ) − γ0 ](P − ϕ(vb ) + γ1 vb ) − λ(P − ϕ(vb ) + γ0 vb )
−vb2
(P − vb [vb (r) − γ0 ])(P − vb [vb (r) − γ1 ])
= [ψ(V ) + P ](γ1 − γ0 ) − [ϕ (V ) − γ0 ](P − ϕ(vb ) + γ1 vb )
+λ(P − ϕ(vb ) + γ0 vb ) (5.35)
and hence ∂J /∂vb = 0 gives
(P − ϕ(vb ) + γ0 vb )λ = (ψ(V ) + P )(γ0 − γ1 ) − (ϕ (V ) − γ0 )(ϕ(vb ) − γ1 vb − P ). (5.36)
Similarly with ∂J /∂vc = 0 we have
(P − ϕ(vb ) + γ2 vc )λ = (ψ(V ) + P )(γ2 − γ1 ) − (ϕ (V ) − γ2 )(ϕ(vc ) − γ1 vc − P ). (5.37)

Let
µ = λ − (ϕ (V ) − γ1 ). (5.38)
Then we can rewrite (5.36) and (5.37) as
[P − ϕ(vb ) + γ0 vb ]µ = [ϕ(vb ) − LV (vb )](γ0 − γ1 ) (5.39)
and
[P − ϕ(vc ) + γ2 vc ]µ = [ϕ(vc ) − LV (vc )](γ2 − γ1 ) (5.40)
where LV (v) = ϕ(V ) + ϕ (V )(v − V ). The line y = LV (v) is the tangent to the convex
curve y = ϕ(v). Equations (5.24), (5.39) and (5.40) are necessary conditions for an optimal
solution.
We now wish to show that µ > 0. Since γ0 and γ2 are non-steep gradient accelerations, we
have
P − ϕ(vb ) + γ0 vb > 0 (5.41)
and
P − ϕ(vc ) + γ2 vc > 0. (5.42)
Since ϕ(v) is convex and and LV (v) is the tangent to ϕ(v) at v = V it follows that
ϕ(v) − LV (v) ≥ 0.
Since γj − γ0 for j = 0, 2, we can use (5.39), (5.41) and (5.42) to conclude that µ is positive.
5.4 Existence and uniqueness
5.4.1 Geometric approach
We can write (5.39) and (5.40) as
ϕ(vb ) = Lµ,γ0 (vb ) (5.43)

and
ϕ(vc ) = Lµ,γ2 (vc ) (5.44)
where

(ϕ (V ) − γi)(γi − γ1 ) (ψ(V ) + P )(γi − γ1 )
Lµ,γi (v) = γi + vb + P −
µ + γi − γ1 µ + γi − γ1
for i = 0, 2. The right hand sides of (5.43) and (5.44) are linear functions. The straight line
y = Lµ,γ0 (v) passes through the fixed point

P + ψ(V ) [P + ψ(V )]γ0
P0 = ,P +
ϕ (V ) − γ0 ϕ (V ) − γ0
and the straight line y = Lµ,γ2 (v) passes through the fixed point

P + ψ(V ) [P + ψ(V )]γ2
P2 = ,P + .
ϕ (V ) − γ2 ϕ (V ) − γ2
Since ϕ(v) is convex, equations (5.43) and (5.44) each have at most two solutions for v.
If we let µ = 0, the lines become
y = ϕ (V )v − ψ(V ) = ϕ (V )(v − V ) + ϕ(V ) = LV (v).
The line y = LV (v) is the tangent to the curve y = ϕ(v) at the point v = V . Thus (5.43) and
(5.44) imply that vb = vc = V and the fixed points are on the common tangent.
Figure 5.3 shows the tangent y = LV (v) in purple, the line Lµ,γ0 in red and the line Lµ,γ2 in
green.
When µ > 0 the lines Lµ,γj (v) for each j = 0, 2 cut the curve y = ϕ(v) at two points v1,γi
and v2,γi with v1,γi < V < v2,γi for each i = 0, 2. Since vc < V < vb there is only one
possible solution to each equation.
Consider the slope of the lines (5.43) and (5.44)
(γi − γ1 )ϕ (V ) + γi µ
si (µ) = , for i = 0, 2.
γi − γ1 + µ
y y M ( v) y LV (v)
x P
0
xP
2
vc V vb v
Figure 5.3: Illustration of the geometric proof.
Since γi − γ1 > 0 and γi < 0 for then we can easily see that si is a monotone decreasing
function. So if µ increases the slopes of the two lines y = L µ,γ0 (v) and y = Lµ,γ2 (v)
decrease. It follows that the solution v b to the equation ϕ(vb ) = Lµ,γ0 (vb ) increases and the
solution vc to the equation ϕ(vc ) = Lµ,γ2 (vc ) decreases as shown in Figure 5.3. However
from the constraint (5.24) we can see if vb increases then vc also increases. Therefore there
is precisely one value of µ for which the necessary conditions (5.39), (5.40) and (5.24) are
satisfied. Thus the solution to equations (5.39), (5.40) and (5.24) is unique.
5.4.2 Algebraic approach
Existence of the solution
For an optimal strategy we must satisfy the conditions (5.24), (5.39) and (5.40). It is not easy
to solve this system explicitly so we use a numerical iteration. Given a value of v b we can
use (5.24) to calculate vc . Now we can calculate µ = M(vc ) from (5.40)
[ϕ(vc ) − LV (vc )](γ2 − γ1 )

M(vc ) = . (5.45)
P − ϕ(vc ) + γ2 vc
If vb , vc and µ are optimal they must satisfy (5.39). That is, we require f (v b ) = 0 where
f (vb ) = [P − ϕ(vb ) + γ0 vb ]M(vc ) − [ϕ(vb ) − LV (vb )](γ0 − γ1 ). (5.46)

Theorem 5.4.1 A solution to the equation
f (vb ) = 0
where f is defined by (5.46) and vb is the speed at x = b, exists in its domain (V, v̄ b ).
Proof:
Since vc < V < vb , the possible upper bound v̄b for vb can be determined by setting vc = V
and using (5.24) to calculate the corresponding v b . Thus vb = v̄b is the solution to the
equation vb
v 2 dv
c−b= .
V v(r(v) − γ1) − P
The minimum possible value for v b is V and the domain of f (vb ) is (V, v̄b ). The speed profile
when vb = V is illustrated in Figure 5.4.
vb
V V
b c
Figure 5.4: Speed profile when v(b) = v̄b
We can prove the existence by observing that f is a continuous function of v b , and showing
that either f (V ) < 0 and f (v̄b ) > 0 or f (V ) > 0 and f (v̄b ) < 0 . First we need to check the
sign of the two ends of the range of vb . At vb = V we have
ϕ(vb ) − LV (vb ) = ϕ(vb ) − [ϕ(V ) + ϕ (V )(vb − V ) = 0 (5.47)
when vb = V . That is,

f (V ) = [P − ϕ(V ) + γ0 V ]M(vc ).
Since γ0 is non-steep at speed V then
P − ϕ(V ) + γ0 V > 0.
We now check the sign of M(vc ). We have, as in (5.45),

[ϕ(vc ) − LV (vc )](γ2 − γ1 )
M(vc ) = .
P − ϕ(vc ) + γ2 vc
Recall that γ2 is also non-steep gradient acceleration at speed v ≤ V . Since vc < V ,
P − ϕ(vc ) + γ2 vc > 0.
That means M(vc ) > 0 for all v < V and so f (V ) > 0.
When vc = V , vb = v̄b and (5.40) gives M(vc ) = 0. Then
f (v̄b ) = −[ϕ(v̄b ) − LV (v̄b )](γ1 − γ0 ) < 0. (5.48)
Note that the maximum speed at b, v̄ b might be greater than the limiting speed at b, v L (P, γ0).
If that is the case then we need to set v̄b = vL (P, γ0). Since
ϕ(vL (P, γ0)) + γ1 vL (P, γ0) − P = 0
then
f (vL (P, γ0)) = −[ϕ(vL (P, γ0)) − LV (vL (P, γ0))](γ1 − γ0 ) < 0.
Therefore there exists at least one solution to f (v b ) = 0 in the interval vb ∈ (V, v̄b ).
Uniqueness of the solution
If we can prove f is monotonic decreasing then we can prove the solution of f (v b ) = 0 is

unique. Consider
df (vb ) dM(vc ) dvc
= [P −ϕ(vb ) + γ0 vb ] · −(ϕ (vb ) −γ0 )M(vc ) −(γ0 −γ1 )(ϕ (vb ) −ϕ (V )).
dvb dvc dvb
(5.49)
By differentiating (5.45) we have

dM(vc ) ϕ (vc )[P + ψ(V ) − ϕ (V )vc ] − ϕ (V )[P − ϕ(vc )] − γ2 [ψ(V ) − ψ(vc )]
= (γ2 −γ1 ) .
dvc [P − ϕ(vc ) + γ2 vc ]2
(5.50)
The numerator N of (5.50) is
N = ϕ (vc )[P + ψ(V )] − ϕ (V )[P − ϕ(vc ) + ϕ (vc )vc ] − γ2 [ψ(V ) − ψ(vc )]
= ϕ (vc )[P + ψ(V )] − ϕ (V )[P + ψ(vc )] − γ2 [ψ(V ) + P − ψ(vc ) − P ]
= (ϕ (vc ) − γ2 )[P + ψ(V )] − (ϕ (V ) − γ2 )[P + ψ(vc )].
Let
π(v) = (ϕ (v) − γ2 )(ψ(V ) + P ) − (ϕ (V ) − γ2 )(ψ(v) + P ). (5.51)
So now we can write (5.50) as

dM π(vc )(γ2 − γ1 )
= . (5.52)
dvc [P − ϕ(vc ) + γ2 vc ]2
Since
ψ (v) = vϕ (v)
and
P > v[r(v) − γ2 ] = ϕ(v) − vγ2
and
vϕ (v) = ψ(v) + ϕ(v)
then we have
π (v) = ϕ (v)(ψ(V ) + P ) − (ϕ (V ) − γ2 )ψ (v)
= ϕ (v)(ψ(V ) + P ) − (ϕ (V ) − γ2 )vϕ (v)
> ϕ (v) [ψ(V ) + ϕ(v) − γ2 v − (ϕ (V ) − γ2 )v]
= ϕ (v) [ψ(V ) + ϕ(v) − vϕ (V )] (5.53)
≥ ϕ (v) [ψ(V ) + ϕ(V ) + ϕ (V )(v − V ) − vϕ(V )]
= ϕ (v)[ψ(V ) + ϕ(V ) − V ϕ (V )]
= 0.
Since π (v) > 0 and, from (5.51), π(V ) = 0 we can conclude that π(v) > 0 for v > V and
π(v) < 0 for v < V . Since vc < V then π(vc ) < 0. Hence using (5.52) we obtain
dM(vc )
< 0.
dvc
By differentiating (5.24) with respect to vc we have
vc2 vb2 dvb
= ·
ϕ(vc ) − γ1 vc − P ϕ(vb ) − γ1 vb − P dvc
and hence 2
dvc vb ϕ(vc ) − γ1 vc − P
= · > 0.
dvb vc ϕ(vb ) − γ1 vb − P
So the first term of (5.49) is
dM(vc ) dvc
[P − ϕ(vb ) + γ0 vb ] · < 0.
dvc dvb
Now consider
[ϕ(vc ) − LV (vc )](γ2 − γ1 )
M(vc ) = .
P − ϕ(vc ) + γ2 vc
Since ϕ(v) is convex then ϕ(vc ) − LV (vc ) > 0 . Hence M(vc ) > 0, and therefore the second
term of (5.49) is negative. Since vb > V and ϕ(v) is convex then
ϕ (vb ) > ϕ (V )
and so the last term of (5.49) is also negative. So from (5.49) we have
df (vb )
< 0.
dvb
Therefore, there is only one solution to the equation f (v b ) = 0.
5.5 Numerical solution
We want to solve equations (5.24), (5.39) and (5.40) for v b , vc and µ. For convenience we
write them here again. They are
[P − ϕ(vb ) + γ0 vb ]µ = [ϕ(vb ) − LV (vb )](γ0 − γ1 ),

[P − ϕ(vc ) + γ2 vc ]µ = [ϕ(vc ) − LV (vb )](γ2 − γ1 )
and vb
v 2 dv
c−b=
vc v[r(v) − γ1 ] − P
respectively. From (5.39) we define the function f (vb ) by the formula
f (vb ) = [P − ϕ(vb ) − γ0 vb ]µ − [ϕ(vb ) − LV (vb )](γ1 − γ0 ). (5.54)
From the previous section, we know that f (V ) > 0 and f (v̄ b ) < 0, and that f is monotonic
decreasing. We can use a numerical method such as the Bisection Method [7] to find the
solution to f (vb ) = 0. For each candidate value of vb , we must calculate vc and µ before
we can evaluate f . The value for vc can be found using a numerical DE solver to solve the
equation of motion (5.24) forwards from (x = b, v = vb ) to x = c. The value for µ is
calculated using formula (5.45) which is derived from (5.40).
Evaluating f (v) requires many calculations. We could speed up the method by using the
regula falsi method or Brent’s method [7] to reduce the number of evaluations of f required.
5.6 Example
The non-steep and steep gradient accelerations are chosen based on limiting speeds as dis-
cussed in Chapter 3. The gradient acceleration γV is the gradient on which the train will
approach a limiting speed V under power. That is,
P
− r(V ) + γV = 0.
V
Therefore, we have
ϕ(V ) − P
γV = .
V
In our examples we use holding speed V = 20 and P = 3. The gradient acceleration that
gives speed V as a limiting speed is γ V = −0.1233. If γ < γV then the track is steep uphill.
Example 5.6.1 A single constant gradient steep section.
In this example we simulate a train powering over a constant gradient uphill section. This
section starts at x = 5000 and ends at x = 6000. The gradient of the track is
⎧
⎪
⎪
⎪
⎪ −0.075 if x < 5000
⎪
⎨
g(x) = −0.2 if x ∈ [5000, 6000] (5.55)
⎪
⎪
⎪
⎪
⎪
⎩−0.09 if x > 6000 .
First, we plotted f (vb ) for various vb ∈ [20, 24]. The left side of Figure 5.5 shows the result.
We then used the Bisection Method to find the solution to f (v b ) = 0. The optimal speed
profile is shown on the right of Figure 5.5.
24 24
23 23
22 22
21 21
v
b
20 20
19 19
18 18
17 17
−10 −5 0 5 4000 5000 6000 7000
f(vb) −3
x 10 distance
Figure 5.5: Result of Example 5.6.1

Table 5.1: Experimental results of hold-power-hold on a single steep uphill.

p vb J f (vd )
3904.15 22.000000 1.5435219085 -0.00080804
4527.70 21.000000 1.6125591341 0.00120800
4238.99 21.500000 1.4985675595 0.00026211
4078.01 21.750000 1.4983069060 -0.00025739
4160.02 21.625000 1.4931503194 0.00000624
4119.40 21.687500 1.4943599371 -0.00012460
4139.81 21.656250 1.4934189882 -0.00005893
4149.94 21.640625 1.4932013580 -0.00002628
4154.98 21.632812 1.4931551063 -0.00001000
4157.50 21.628906 1.4931475411 -0.00000187
4158.76 21.626953 1.4931476388 0.00000218
4158.13 21.627929 1.4931472669 0.00000015
4157.82 21.628417 1.4931473232 -0.00000086
4157.97 21.628173 1.4931472749 -0.00000035
4158.05 21.628051 1.4931472658 -0.00000010
4158.09 21.627990 1.4931472651 0.00000002
The sequence of estimates for the optimal vb are shown in the table 5.1. Recall that J is the
objective function for our problem, from (5.18).
5.7 Conclusion
For a track with a single steep uphill section, we have used an algebraic argument and a
geometry argument to show that there is a unique optimal location before the start of the
steep gradient at which the power phase should begin. We have developed a new set of
necessary conditions for an optimal power phase for a steep uphill section by minimising a
cost function which is a compromise of energy used and time taken. We have also developed
a new method for calculating the optimal power phase for the steep uphill section. This
method converges quickly to the unique solution.
In the next chapter we consider a more complicated uphill section where the steep section
has two gradients.
Chapter 6
Hold-power-hold on an uphill section

with two steep gradients
In the previous chapter we developed an optimal strategy for a steep uphill section with a
single constant gradient. In this chapter, we present the same analysis for a steep uphill
section with two gradients. We will formulate the problem, present a new condition for
optimality and derive key necessary conditions. The additional gradient section introduces
significant difficulties to the analysis. We are able to prove the existence but not uniqueness
of a solution. However, we can demonstrate uniqueness in our numerical examples.
75
Chapter 6. Hold-power-hold on an uphill section with two steep gradients 76
In this chapter we consider a track with four gradients, of which the middle two are steep
uphill. The acceleration due to gradient is
⎧
⎪
⎪ if x ∈ (−∞, b)
⎪
⎪ γ0
⎪
⎪
⎪
⎪
⎨γ11 if x ∈ (b, d)
g(x) = (6.1)
⎪
⎪
⎪
⎪ γ12 if x ∈ (d, c)
⎪
⎪
⎪
⎪
⎩γ if x ∈ (c, ∞).
2
We assume that
P
γ0 > − + r(V )
V
and
P
γ2 > − + r(V ).
V
That is, the track is non-steep at speed V for x ∈
/ [b, d]. We also assume that
P
γ1j < − + r(V )
V
for each j = 1, 2. Thus the track is steep at speed V for x ∈ (b, d). This is illustrated in
Figure 6.1.
The equation of motion and adjoint equation are the same as in Chapter 5. As before, we
wish to find the optimal interval [p, q] for the power phase. If γ 12 < γ11 < 0 (i.e. the second
section is more steeply uphill) then the limiting speed on the second slope will be less than
the speed at d, and so speed will decrease on the second slope, as illustrated in Figure 6.1.
On the other hand, if γ11 < γ12 < 0 then v(d) may be above or below the limiting speed on
the second slope, and so speed can either decrease or increase on the second slope. Figure
6.2 shows the case where speed increases on the second slope.
Limiting speeds
vb
V V
vd
vc
J2
J 12
J 11
J0
b d c
Figure 6.1: Speed profile on a steep uphill section with γ12 < γ11 < 0.
Limiting speeds
vb
V V
vd
vc
J2
J 12
J 11
J0
p b d c q
Figure 6.2: Speed profile on a steep uphill section with γ11 < γ12 < 0.
6.2 Necessary conditions for an optimal strategy
Lemma 6.2.1 The necessary conditions for minimising the cost function (5.18) of a train
travelling on a section described in (6.1) are
(ϕ(vb ) − γ0 vb − P )µ1 = (ϕ(vb ) − LV (vb ))(γ11 − γ0 ),
(ϕ(vd ) − γ11 vd − P )µ2 − (ϕ(vd ) − γ12 vd − P )µ1 = [ϕ(vd ) − LV (vd )](γ12 − γ11 )
and
(ϕ(vc ) − γ0 vc − P )µ2 = (ϕ(vc ) − LV (vc ))(γ12 − γ2 )
where
LV (v) = ϕ (V )(v − V ) + ϕ(V )
and µ1 , µ2 are non-negative constants.
Proof:
We have three fixed points b < d < c where the gradient acceleration changes. For a given
power phase, the speeds of the train at these points are vb , vc and vd . Using a similar argument
to the previous chapter, the starting point of the power phase is determined by
vb
v 2 dv
p=b− (6.2)
V P − v[r(v) − γ0 ]
and the final point of the power phase is determined by

V
v 2 dv
q =c+ . (6.3)
vc P − v[r(v) − γ2 ]
The speed profile must also satisfies the conditions

vb
v 2 dv
d−b= (6.4)
vd v[r(v) − γ11 ] − P
and vd
v 2 dv
c−d = . (6.5)
vc v[r(v) − γ12 ] − P
We use the same procedure that we used in Chapter 5 to obtain the cost function. This time
the cost function J depends on vb , vc and vd and has the form
vb
v 2 dv
J(vb , vd , vc ) = [ψ(V ) + P ]A(vb , vd , vc ) + γ0 + γ11 (d − b)
V P − v[r(v) − γ0 ]
V
v 2 dv
+γ12 (c − d) + γ2 − ϕ (V )(q − p)
vc P − v[r(v) − γ2 ]
where
q vb vd
dx vdv vdv
A(vb , vd , vc ) = = +
p v V P − v[r(v) − γ0 ] vb P − v[r(v) − γ11 ]
vc V
vdv vdv
+ +
vd P − v[r(v) − γ12 ] vc P − v[r(v) − γ2 ]
and
vb vb
v 2 dv v 2 dv
q−p = +
V P − v[r(v) − γ0 ] vd v[r(v) − γ11 ] − P
vd V
v 2 dv v 2 dv
+ + .
vc v[r(v) − γ12 ] − P vc P − v[r(v) − γ2 ]
In order to find the optimal speed profile we need to minimise J(v b , vc , vc ) with respect to
the two distance constraints (6.4) and (6.5). We define a Lagrangian function
vb
v 2 dv
J (vb , vc , vd ) = J(vb , vc ) + λ1 d − b −
vd v[r(v) − γ11 ] − P
vd
v 2 dv
+λ2 c − d −
vc v[r(v) − γ12 ] − P
where λ1 and λ2 are Lagrange multipliers. The Karush-Kuhn-Tucker conditions are
∂J ∂J ∂J
= 0 and = 0 and = 0.
∂vb ∂vd ∂vc
We have

∂J 1 1
= [ψ(V ) + P ]vb −
∂vb P − vb [r(vb ) − γ0 ] P − vb [r(vb ) + γ11 ]

ϕ (V ) − γ0 λ1
−vb2 −
P − vb [r(vb ) + γ0 ] P − vb [r(vb ) + γ11 ]
= [ψ(V ) + P ](γ11 − γ0 ) − [P − ϕ(vb )](ϕ (V ) − γ0 − λ1 )
+vb [λ1 γ0 − γ11 (γ0 + ϕ (V ))] .
The optimality condition ∂J /∂v b = 0 gives, after some algebra,
(P − ϕ(vb ) + γ0 vb )λ1 = (ψ(V ) + P )(γ0 − γ11 ) − (ϕ (V ) − γ0 )(ϕ(vb ) − γ11 vb − P ). (6.6)
Similarly, when ∂J /∂vc = 0 we have
(P − ϕ(vc ) + γ2 vc )λ2 = (ψ(V ) + P )(γ2 − γ12 ) − (ϕ (V ) − γ2 )(ϕ(vc ) − γ12 vc − P ) (6.7)
and J /∂vd = 0 gives

∂J 1 1
= vd [ψ(V ) + P ] −
∂vd P − vb [r(vb ) − γ11 ] P − vb [r(vb ) − γ12 ]

λ1 λ2
−vd2 +
P − vb [r(vb ) − γ0 ] P − vb [r(vb ) − γ11 ]
= [ψ(V ) + P ](γ12 − γ11 ) − (λ1 − λ2 )(P − ϕ(vd )) − vd (λ1 γ12 − λ2 γ11 ) = 0
which gives
(ϕ(vd ) − γ12 vd − P )λ1 − (ϕ(vd ) − γ11 vd − P )λ2 = (γ11 − γ12 )(ψ(V ) + P ). (6.8)
Let
µi = λi − ϕ (V ) + γ1i . (6.9)
By substituting (6.9) into (6.6) we have
(P −ϕ(vb )+γ0 vb )(µ1 +ϕ (V )−γ11 ) = (ψ(V )+P )(γ0 −γ11 )−(ϕ (V )−γ0 )(ϕ(vb )−γ11 vb −P )
which can be rearranged as
(P − ϕ(vb ) + γ0 vb )µ1 = (P − ϕ(vb ) + γ0 vb )(γ11 − ϕ (V )) + (ψ(V ) + P )(γ0 − γ11 )
−(ϕ (V ) − γ0 )(ϕ(vb ) − γ11 vb − P )
= (ϕ(vb ) − LV (vb ))(γ0 − γ11 ). (6.10)
Similarly, substituting (6.9) into (6.7) gives
(P − ϕ(vc ) + γ2 vc )µ2 = (ϕ(vc ) − LV (vc ))(γ2 − γ12 ). (6.11)
For equation (6.8) we have
(ϕ(vd ) − γ12 vd − P )(µ1 + ϕ (V ) − γ11 ) − (ϕ(vd ) − γ11 vd − P )(µ2 + ϕ (V ) − γ12 )
= (γ11 − γ12 )(ψ(V ) + P ).
So,
(ϕ(vd ) − γ12 vd − P )µ1 − (ϕ(vd ) − γ11 vd − P )µ2
= (ϕ(vd ) − γ12 vd − P )(γ11 − ϕ (V ))
−(ϕ(vd ) − γ11 vd − P )(γ12 − ϕ (V )) + (γ11 − γ12 )(ψ(V ) + P )
= [ϕ(vd ) − LV (vd )](γ12 − γ11 ). (6.12)
The equations (6.4), (6.5), (6.10), (6.11) and (6.12) are necessary conditions for minimising
the cost function.
Now we wish to prove µ1 and µ2 are positive. Because γ0 is non-steep uphill gradient
acceleration then P − ϕ(vb ) + γ0 vb > 0 and also γ0 > γ11 =⇒ γ0 − γ11 > 0. Hence, from
(6.10), we have µ1 > 0. Similarly, from (6.11), we also have µ2 > 0.
We want to solve equations (6.4), (6.5), (6.10) and (6.11) for v b , vd , vc , µ1 and µ2 . Suppose
we have a given value of vb . We can calculate vd , vc , M1 (vb ) and M2 (vc ) using equations
(6.4), (6.5), (6.10) and (6.11) respectively where
[ϕ(vb ) − LV (vb )](γ0 − γ11 )
µ1 = M1 (vb ) = (6.13)
P − ϕ(vb ) + γ0 vb
which is derived from (6.10) and
[ϕ(vc ) − LV (vc )](γ2 − γ11 )
µ2 = M2 (vc ) = (6.14)
P − ϕ(vc ) + γ2 vc
which is derived from (6.11). There is one remaining equation to be satisfied—equation
(6.12). If the calculated values satisfy this equation, they are the optimal values. If we define
f (vd ) = (ϕ(vd ) − γ11 vd − P )M2 (vc ) − (ϕ(vd )
−γ12 vd − P )M1 (vb ) − [ϕ(vd ) − LV (vd )](γ12 − γ11 )
where vb and vc are functions of vd , then (6.12) can be written as
f (vd ) = 0. (6.15)
This is a necessary condition for an optimal strategy on an uphill section with two adjoining
steep gradients.
6.3 Existence and uniqueness of an optimal solution
6.3.1 Existence
To prove the existence of an optimal solution we will show that the values of the function f at
the two ends of its domain differ in sign. The possible maximum value of v d can be obtained
by setting vc = V and the possible minimum values of v d can be obtained by setting vb = V .
We consider two cases: γ11 < γ12 < 0 and γ12 < γ11 < 0.
Case γ12 < γ11
When vb = V then from (6.13) we have M1 (vb ) = 0. Since vd < V then
f = (ϕ(vd ) − γ11 vd − P )M2 (vc ) + [ϕ(vd ) − LV (vd )](γ11 − γ12 ).
Since γ11 is steep uphill gradient acceleration then
dv P
v = − r(v) + γ11 < 0.
dx v
That is,
P − ϕ(v) + γ11 v < 0 or ϕ(v) − γ11 v − P > 0.
Recall that ϕ(v) is convex and LV (v) is a tangent to the curve y = ϕ(v) at v = V and so
ϕ(v) − LV (v) > 0. As defined in (6.14) we have
(ϕ(vc ) − LV (vc ))(γ2 − γ12 )

M2 (vc ) =
P − ϕ(vc ) + γ2 vc
which is positive because P − ϕ(vc ) + γ2 vc > 0 for a non-steep uphill gradient acceleration
γ2 . Hence,
f (vd ) = (ϕ(vd ) − γ11 vd − P )M2 (vc ) + [ϕ(vd ) − LV (vd )](γ11 − γ12 ) > 0.
When vc = V , from (6.11) we have µ2 (vc ) = 0, and so vd > V then
f (vd ) = (ϕ(vd ) − γ12 vd − P )M1 (vb ) − [ϕ(vd ) − LV (vd )](γ12 − γ11 ). (6.16)
As defined in (6.13), we have
(ϕ(vb ) − LV (vb ))(γ0 − γ11 )

M1 (vb ) = .
P − ϕ(vb ) + γ0 vb
By differentiating (6.13) we obtain
dM1 (vb ) ϕ (vb )[P + ψ(V ) − ϕ (V )vb ] − ϕ (V )[P − ϕ(vb )] − γ0 [ψ(V ) − ψ(vb )]
= (γ0 −γ11 ) .
dvb [P − ϕ(vb ) + γ0 vb ]2
(6.17)
dM (vc )
Recall that ψ(V ) = V 2 r (V ). Using a procedure similar to that used to prove dvc
< 0 in
Section 5.4.2 we can obtain
dM1 (vb )
< 0.
dvb
So with vd < vb we have M1 (vd ) > M1 (vb ), and so (6.16) can be written as
f (vd ) < (ϕ(vd ) − γ12 vd − P )M1 (vd ) − [ϕ(vd ) − LV (vd )](γ12 − γ11 ).
Using (6.13) we obtain
(ϕ(vd ) − γ12 vd − P )[ϕ(vd ) − LV (vd )](γ0 − γ11 )

f (vd ) <
P − ϕ(vd ) + γ0 vd
(P − ϕ(vd ) + γ0 vd )[ϕ(vd ) − LV (vd )](γ11 − γ12 )
− .
P − ϕ(vd ) + γ0 vd
Now check the numerator:
N = (ϕ(vd ) − γ12 vd − P )[ϕ(vd ) − LV (vd )](γ0 − γ11 )
+(P − ϕ(vd ) + γ0 vd )[ϕ(vd ) − LV (vd )](γ11 − γ12 )
= (P − ϕ(vd ) + γ11 vd)[ϕ(vd ) − LV (vd )](γ0 − γ12 )
< 0
because P − ϕ(vd ) + γ11 vd < 0 and ϕ(vd ) − LV (vd ) > 0 as proved before, and so we have
γ12 < γ0 < 0 =⇒ γ0 − γ12 > 0.
So, vb = V =⇒ f (vd ) > 0 and vc = V =⇒ f (vd ) < 0. The function f (vd ) is continuous
so it has at least one solution on its domain.
Case γ11 < γ12
When vb = V from (6.13) we have M1 (vb ) = 0. Since vd < V then
f (vd ) = (ϕ(vd ) − γ11 vd − P )M2 (vc ) + [ϕ(vd ) − LV (vd )](γ11 − γ12 )
> (ϕ(vd ) − γ11 vd − P )M2 (vd ) + [ϕ(vd ) − LV (vd )](γ11 − γ12 ) (6.18)
because we can prove similarly M2 (vc ) is a monotone decreasing function as in the previous
case and vc < vd . As seen in (6.14) M2 (vc ) is
(ϕ(vc ) − LV (vc ))(γ2 − γ12 )

M2 (vc ) = .
P − ϕ(vc ) + γ2 vc
By substituting (6.14) into (6.18) we obtain
(ϕ(vd ) − γ11 vd − P )[ϕ(vd ) − LV (vd )](γ2 − γ12 )

f (vd ) >
P − ϕ(vd ) + γ2 vd
(P − ϕ(vd ) + γ2 vd )[ϕ(vd ) − LV (vd )](γ11 − γ12 )
+ .
P − ϕ(vd ) + γ2 vd
Now consider the numerator:
N = ϕ(vd ) − γ11 vd − P )[ϕ(vd ) − LV (vd )](γ2 − γ12 )
+(P − ϕ(vd ) + γ2 vd )[ϕ(vd ) − LV (vd )](γ11 − γ12 )
= [ϕ(vd ) − LV (vd )](γ2 − γ11 )(ϕ(vd ) − P )
−γ11 vd (γ2 − γ12 )[ϕ(vd ) − LV (vd )]
+γ2vd (γ11 − γ12 )[ϕ(vd ) − LV (vd )]
= [ϕ(vd ) − LV (vd )](ϕ(vd ) − γ12 vd − P )(γ2 − γ11 ) > 0.
Therefore f (vd ) > 0.
When vc = V from (6.14) we have M2 (vc ) = 0. Since γ12 is a steep uphill gradient acceler-
ation then ϕ(vd ) − γ12 − P > 0 and γ12 − γ11 > 0 then
f (vd ) = −(ϕ(vd ) − γ12 − P )M1 (vb ) − [ϕ(vd ) − LV (vd )](γ12 − γ11 ) < 0.
So when γ11 > γ12 the function f (vd ) has at least one solution on its domain.
Note that we need to take into account the limiting speed at point d, v L (P, γ11). If the possible
maximum speed at d is greater than vL (P, γ11 ), we need to set the possible maximum speed
at d to this limiting speed. We can easily see that the new upper bound of v d will not change
the result of our proof above.
6.3.2 Uniqueness
To prove the uniqueness of the solution to f (v d ) = 0 in its domain we sought to prove f (v d )

is either strictly increasing, strictly decreasing, convex or concave. However we were unable
to prove any of these hypotheses. However, we can use numerical examples to demonstrate
that the solution is unique in many situations.
By integrating the equation of the motion of the train
dv P
v = − r(v) + g(x)
dx v
on the intervals [b, d] and [d, c] we obtain the two constraints

vb
v 2 dv
d−b=
vd v[r(v) − γ11 ] − P
as defined in (6.4) and vd

v 2 dv
c−d=
vc v[r(v) − γ12 ] − P
as defined in (6.5). As shown in the previous section, the necessary conditions for an optimal
power phase defined in (6.10), (6.11) and (6.12) are
(ϕ(vb ) − γ0 vb − P )µ1 = [ϕ(vb ) − LV (vb )](γ11 − γ0 ),
(ϕ(vd ) − γ11 vd − P )µ2 − (ϕ(vd ) − γ12 vd − P )µ1 = [ϕ(vd ) − LV (vd )](γ12 − γ11 ),
and
(ϕ(vc ) − γ0 vc − P )µ2 = [ϕ(vc ) − LV (vc )](γ12 − γ2 ).
We want to solve the two constraints and the three necessary conditions shown above for v b ,
vc , vd , µ1 and µ2 . Since it is not straightforward to solve these equations analytically, we
vb vd vc
P1 f P2
Figure 6.3: Variable dependencies in the optimisation algorithm.
developed a numerical algorithm to find the solution and to demonstrate that the solution is
unique.
Since µ1 and µ2 can be determined by vb and vc respectively then if we define
f (vd ) = (ϕ(vd ) − γ11 vd − P )µ2 − (ϕ(vd ) − γ12 vd − P )µ1
−[ϕ(vd ) − LV (vd )](γ12 − γ11 )
then equation (6.12) for the optimal values of v b , vc , vd , µ1 and µ2 can be rewritten as
f (vd ) = 0.
If we know speed vb , we can calculate the other variables vd , vc , µ1 and µ2 . We can then use
a numerical method such as the Bisection Method [7] to find the solution to f (v d ) = 0. For
each candidate value of vb , we must calculate vd , then vc , µ2 and µ1 before we can evaluate
f . The value for vd can be found using a numerical DE solver to solve the equation of motion
(6.4) forwards from (x = b, v = vb ) to x = d. The value for vc is calculated in a similar way
to vd using equation (6.5). The values for µ1 and µ2 are calculated from (6.13) and (6.14)
respectively.
Figure 6.3 shows the variable dependencies in the algorithm; the variable at the head of each
arrow depends on the variable at the tail.
6.5 Example
In this section we use our numerical algorithm to calculate optimal power phases for two
examples. In the example, the second part of the steep section is steeper than the first.
Example 6.5.1 In this example we have hold speed V = 20, and γ12 < γ11 < 0. The points
at which the gradient acceleration changes are
b = 5000 d = 5600 c = 6000.
The gradients are
γ0 = −0.075, γ11 = −0.2, γ12 = −0.225, γ2 = −0.09.
To find the optimal journey, we must first find values of v b that give f < 0 and f > 0. If we
set vb = V then we get f > 0. By trial and error, we find that setting vb = 24 gives f < 0.
We now use the Bisection method to find the optimal solution. They are
vb = 21.77, vd = 19.39, vc = 17.52.
Figure 6.4 shows how f varies with vb . We can see that f = 0 when vb = 21.77. The
corresponding speed profile is shown on the right hand side of the figure. The detailed
results obtained from the iteration are shown in Table 6.1. Recall that J is our objective
function (5.18), to be minimised.
6.6 Conclusion
For a steep uphill section with two gradients we are unable to find a theoretical proof of the
uniqueness of the optimal power phase but we are able to prove the existence of the solution.
It is quite easy to extend the numerical calculation scheme developed in Chapter 5 to handle
Table 6.1: Experimental results of hold-power-hold on a double steep uphill section, Example 6.5.1.
p vb J f
3510.69 22.500000 1.9157012441 -0.00611095
4388.49 21.250000 1.7754693693 0.00239745
3992.79 21.875000 1.6890111592 -0.00065734
4199.87 21.562500 1.6998701516 0.00109906
4098.81 21.718750 1.6856190813 0.00028531
4046.44 21.796875 1.6850097169 -0.00016888
4072.78 21.757812 1.6847509847 0.00006236
4059.65 21.777343 1.6847379085 -0.00005220
4066.23 21.767578 1.6847090363 0.00000534
4062.94 21.772460 1.6847145949 -0.00002336
4064.58 21.770019 1.6847095994 -0.00000899
4065.41 21.768798 1.6847087642 -0.00000182
4065.82 21.768188 1.6847087619 0.00000176
4065.61 21.768493 1.6847087284 -0.00000002
the additional gradient. The calculation method still converges quickly to the solution. We
have run many example journeys, of which two are given. In every example, the equation
f (vd ) = 0 only has one solution.
In the next chapter, we expand the analysis to handle any number of gradient acceleration
changes before, on, and after the steep uphill section.
25 25
24 24
23 23
22 22
v
b
21 21
20 20
19 19
18 18
17 17
−0.06 −0.04 −0.02 0 0.02 4000 5000 6000 7000
f(v ) distance
d
Figure 6.4: Result of example 6.5.1

Chapter 7
Hold-power-hold on an uphill section

with piecewise constant gradient
In the previous chapters we considered tracks with an initial non-steep gradient followed by
one or two steep uphill gradients, then a final non-steep gradient. In this chapter, we consider
the optimal control strategy when a holding phase is interrupted by a steep uphill section on
a track with many gradient changes before the steep section, on the steep section and after
the steep section. We formulate the problem and derive the key necessary conditions for an
optimal power phase. As in Chapter 6 we are unable to prove analytically the uniqueness
of an optimal solution. However, we are able to develop a numerical scheme for calculating
power phases that satisfy the necessary conditions for an optimal strategy, and demonstrate
the algorithm using an example.
91
Chapter 7. Hold-power-hold on an uphill section with piecewise constant gradient 92
steep part
J1 J n 1
f p1 p2 pg 1 pg pr ps ph ph 1 pn 1 pn f
Figure 7.1: Track gradient changes on the steep uphill section.
Suppose the train is travelling on a piecewise constant gradient comprising a sequence of

gradients that are non-steep at the hold speed V , a sequence of gradients that are steep at
speed V , and a final sequence of gradients that are non-steep at speed V .
Let
• p1 , . . . , pn be the locations in the interval (−∞, ∞) where the gradient changes;
• v1 , . . . , vn be the velocities of the train at the points p 1 , . . . , pn ;
• γ0 , γ1 , γ2 , . . . , γn−1 , γn be the respective gradient of the segments (−∞, p1 ),
[p1 , p2 ), . . . , [pn , ∞);
• pr and ps be the starting point and ending point of the steep part and 1 ≤ r, s ≤ n;
• a and b be the start and finish points of the power phase;
• pg−1 ≤ a ≥ pg+1 where pg is the first gradient change point during the power phase;
and
• ph ≤ b ≥ ph+1 where ph is the last gradient change point during the power phase.
The notation is illustrated in Figure 7.1.

We use the same cost function as defined in (5.18):

b
ψ(V )
J= + r(v) − ϕ (V ) dx. (7.1)
a v
The equation of motion for the power phase is
dv P
v = − r(v) − γj
dx v
for x ∈ (pj , pj+1 ). By integrating the equation of motion as we did in Chapters 5 and 6, we
have
b b
h−1
P
r(v)dx = dx + γi (pi+1 − pi ) + γg (pg − a) + γh (b − ph ). (7.2)
a a v i=g+1
The cost function can be written as a function of (vg , . . . , vh ) as

vg
v 2 dv
J(vg , . . . , vh ) = [ψ(V ) + P ]A(vg , . . . , vh ) + γ1
V P − v[r(v) − γg ]

h−1 V
v 2 dv
+ γi−1 (pi+1 − pi ) + γh
i=g+1 vh P − v[r(v) − γh ]
−ϕ (V )B(vg , . . . , vh )
where
vg V
vdv vdv
A(vg , . . . , vh ) = +
V P − v[r(v) − γg ] vh P − v[r(v) − γh ]
g−1 vi+1
vdv
+
i=h+1 vi
P − v[r(v) − γi]
and
r−1
vi+1 h−1 vi+1

v 2 dv v 2 dv
B(vg , . . . , vh ) = ps − pr + +
i=g vi
P − v[r(v) − γi] i=s vi P − v[r(v) − γi ]
vg V
v 2 dv v 2 dv
+ + .
V P − v[r(v) − γg ] vh P − v[r(v) − γh ]
The optimal velocities at each pair of adjacent points p i and pi+1 must satisfy the constraint
vi+1
v 2 dv
pi+1 − pi = . (7.3)
vi P − [ϕ(v) − vγi]
To minimise J(v1 , v2 , . . . , vn ) subject to (7.3) we define the Lagrangian function
J (v1 , v2 , . . . , vn ) = J(v1 , v2 , . . . , vn )

s−1 vi+1
v 2 dv
+ λi−1 pi+1 − pi − (7.4)
i=r+1 vi P − v[r(v) − γi ]
where λi−1 are Lagrange multipliers. In order to minimise J we apply the Karush-Kuhn-
Tucker conditions as we did before. That is we put
∂J
=0 (7.5)
∂vi
where i = g, . . . , h. Evaluating equation (7.5) at each point i = g + 1, . . . , h−1, we obtained
an equation in vi , µi and µi−1
[ϕ(vi ) − γi vi − P ]µi − [ϕ(vi ) − γi+1 vi − P ]µi−1 = [ϕ(vi ) − LV (vi )](γi+1 − γi ) (7.6)
where
µi = λi − ϕ (V ) + γi+1 .
At any point pi where i ∈ {g, h} we have
[P − ϕ(vi ) + γivi ]µi = [ϕ(vi ) − LV (vi )](γi − γi+1 ). (7.7)
Speed vi can be found by solving the equation (7.3) forward from v i−1 or backward from
vi+1 using a numerical DE solver. Given an initial guess for the speed v r at the bottom of the
steep section, we can calculate vi by solving equation (7.3) from v r forward if r < i ≤ h or
backward if g ≤ i < r. The values of µi can be computed using the formula
[ϕ(vi ) − LV (vi )][γi+1 − γi ] + [ϕ(vi ) − γi+1 vi − P ]µi−1
µi = Mi (vi ) = (7.8)
ϕ(vi ) − γi vi − P
derived from (7.6) for i = g + 1, . . . , h − 1 or
[ϕ(vi ) − LV (vi )](γi+1 − γi )
µi = Mi (vi ) = (7.9)
P − ϕ(vi ) − γi vi
from (7.7) for j = g, h. During the calculation we use all of the equations (7.3), (7.6) and
(7.7) at every point except the equation (7.6) at point r + 1. This remaining equation can be
written as f (vr+1 ) = 0, where
f (vr+1 ) = [ϕ(vr+1 ) − γivr+1 − P ]µr+1 − [P − ϕ(vr+1 ) + γr+1 vr+1 ]µr
−[ϕ(vr+1 ) − LV (vr+1 )](γr+1 − γr ). (7.10)
The values of vr+1 , µr+1 and µr that satisfy the equation f (vr+1 ) = 0 are optimal. The
sequence of calculations is illustrated in Figure 7.2.
vg v g 1 vr vr 1 vr 2 vh1 vh
Pg P g 1 Pr f Pr+1 Ph 2 Ph 1
Figure 7.2: The calculation sequence for the new method.
In the next section we discuss a numerical algorithm for finding v r+1 satisfying f (vr+1 ) = 0.
Before we can search for the value of vr+1 that satisfies the equation f (vr+1 ) = 0, we must
find a suitable interval [v r , v̄r ] in which to search. As in previous Chapter, we evaluate f at
vr+1 , but search for the optimal value of vr .
We require a lower bound vr for which f (vr+1 ) > 0 and an upper bound v̄r for which
f (vr+1 ) < 0. In previous chapters we were able to prove that such bounds could be found.
We have not proved that such bounds could be found in the more general case. However, our
numerical experiments show that we can find suitable bounds.
vJ 1
V vJ 2 V
Figure 7.3: Speed can sometimes decrease on a non-steep section before the start of the steep section.
We cannot start the steep section with a speed that is less than the hold speed V . Setting
vr = V gives a lower bound.
A suitable upper bound v̄r is the minimum of
• the maximum speed that can be achieved at the bottom of the hill; and
• the speed v(pr ) of the speed profile v that finishes at the top of the steep section at the
hold speed; that is, with v(p s ) = V .
The maximum speed that can be achieved at the beginning of a steep section depends on the
gradients before the steep section, and is not necessarily the limiting speed of the gradient
immediately before the steep section. For example, Figure 7.3 shows an example where the
optimal speed of the train at the beginning of the steep part is greater than the limiting speed
of the gradient immediately before the start of the steep section.
To find the maximum speed achievable at the beginning of the steep section, we:
• calculate the limiting speed vL (P, γ0 ) for gradient acceleration γ0 ;
• calculate the power speed profile v(x) with v(p1 ) = vL (P, γ0 ).
The maximum speed at the beginning of the steep section is v(p r ).

V V
vJ 2
vJ 1
Figure 7.4: Speed can sometimes increase on a steep section.
7.2.1 Calculating f
Let γV be the gradient acceleration of the steepest gradient on which the train can hold speed
V . It can be determined from
P − ϕ(V )
ϕ(V ) + γV V = P =⇒ γV = . (7.11)
V
Algorithm Calculating f for a given speed vr

Input: [p1 , p2 , . . . , pn ]; [γ1 , γ2, . . . , γn+1 ]; holding speed V ; r, index of the point where steep
part begins; vr .
Output: f (vr+1 )
1. Calculate forward vr+1 , . . . , vh using equation (7.3).
2. Calculate backward vr−1 , . . . , vg using equation (7.3).
3. Calculate µh−1 using equation (7.7).
4. Calculate µg using equation (7.7).
5. Calculate µi−1 using equation (7.6) and µi for i = h, . . . , r + 1.
6. Calculate µi+1 using equation (7.6) and µi for i = g, . . . , r.
7. Calculate f (vr+1 ) using vr+1 , µr and µr+1.
8. return f (vr+1 ).
7.2.2 Finding the optimal power phase
The algorithm for calculating the optimal power phase is defined below.
Algorithm Calculate optimal power phase

Input: [ph , . . . , pg ];[γg , . . . , γh ];
Output: all optimal velocities vg , . . . , vh ; all optimal Lagrange multipliers λ i .
1. Calculate γV , the maximum gradient of the steep section where the train can hold speed
V using (7.11).
2. Find the point r where the steep part begins by comparing the gradient acceleration to
γV . The first gradient change point whose gradient acceleration is greater than γ V is
point r.
3. Find the acceptable range [v̄r , vr ] for vr as discussed in Section 7.2.
4. Apply the Bisection method to find optimal value of v r where f (vr+1 ) = 0.
5. Calculate all optimal speed vi for i = g, . . . , h using optimal value of vr .
6. return all optimal speeds.
7.3 Examples
We will use the method described above to solve three example problems. In the first exam-
ple, the train slows on the approach to the section that is steep uphill at the hold speed V . In
the second example, the train speeds up on the part of the section that is steep uphill at the
hold speed V . The final example has many gradients before, on and after the steep section.
In the first example we consider a train with two non-steep gradients before the steep section,
one steep gradient on the steep section, and one non-steep gradient after the steep section.
The gradient immediately before the steep section has a limiting speed that is only slightly
greater than the holding speed V . The gradient acceleration that gives a limiting speed of
22
21.5
21
20.5
20
v 19.5
19
18.5
18
17.5
17
3500 4000 4500 5000 5500 6000 6500 7000 7500
x
Figure 7.5: Optimal speed profile for example 7.3.1.
V = 20 is γ = −0.12325.
Example 7.3.1 The gradient changes at locations
p1 = 4600, p2 = 5000, p3 = 5600, p4 = 6000.
The gradient accelerations on the intervals (−∞, p 1 ], (p1 , p2 ), (p2 , p3 ),

(p3 , p4 ) and (p4 , ∞) are
γ0 = −0.075, γ1 = −0.012, γ2 = −0.2, γ3 = −0.225, γ4 = −0.09.
At hold speed V = 20, the track is steep from p 2 to p4 .
The optimal speed profile is shown in Figure 7.5. The track section [4600, 5000] is non-steep
at hold speed V = 20. However, speed decreases on this section because the limiting speed
for the section, indicated in red in Figure 7.5, is less than v(4600).
In the second example, we consider a track consisting of one non-steep gradient before the
steep section, four steep gradients on the steep section, and one non-steep gradient after the
23
22
21
20
19
v 18
17
16
15
14
13
3000 4000 5000 6000 7000 8000 9000
x
Figure 7.6: Optimal speed profile for example 7.3.2.
steep section. The third of the steep gradients is steep at speed V = 20, but the commencing
speed is less than the limiting speed on this section, and so the speed of the train increases
on this section.
Example 7.3.2 The gradient changes at locations
p1 = 5000, p2 = 5600, p3 = 6000, p4 = 6500, p5 = 6800.
The gradient accelerations are
γ0 = −0.075, γ1 = −0.22, γ2 = −0.27, γ3 = −0.15, γ4 = −0.2, γ5 = −0.09.
At hold speed V = 20, the track is steep from p 1 to p5 .
The optimal speed profile is shown in Figure 7.6. The limiting speed on the interval from
6000 to 6500 is 17.45, which is greater than v(6000) = 16.79, and so the speed of the train
increases on this track interval. Speed decreases on the next interval because the limiting
speed on the interval is 13.85 which is less than v(6500) = 16.98.
Table 7.1: Optimal speeds and limiting speeds for Example 7.3.2.
p γ v vL
−∞ -0.075 25.976
5000 -0.22 22.627 12.77
5600 -0.27 19.60 10.623
6000 -0.15 16.79 17.445
6500 -0.2 16.98 13.865
6800 -0.09 16.26 23.927
∞ -0.09
The final example has fifteen gradients: three non-steep gradients before the steep section,
nine steep gradients on the steep section, and three non-steep gradients after the steep section.
The gradient change points, gradients, optimal speeds and limiting speeds are given in table
7.2. Function f (v4 ) is shown in Figure (7.7), which shows the convexity of the function f
and confirms the uniqueness of the optimal solution in this case. The zero of the function is
at v3 = 22.50. The optimal speed profile is shown in Figure 7.8. Note that speed decreases
on the second gradient, before the steep hill, and increases on the second-to-last gradient on
the steep section.
Table 7.2: Track data for the example of hold-power-hold on multi steep uphill section.
p γ v vL
−∞ -0.075 25.978
4600 -0.120 22.39 20.34
5000 -0.080 22.09 25.28
5400 -0.200 22.50 13.87
6000 -0.250 20.02 11.39
6400 -0.150 17.59 17.45
6900 -0.230 17.55 12.29
7200 -0.190 16.25 14.48
7700 -0.220 15.59 12.77
8100 -0.260 14.57 10.99
8400 -0.170 13.27 15.85
8900 -0.225 14.35 12.52
9200 -0.080 13.76 25.28
9600 -0.110 16.39 21.46
10100 -0.090 17.69 23.93
∞
When the limiting speed is less than the speed at the beginning of the interval, speed de-
creases even on a non-steep section. If limiting speed is greater than the speed at the begin-
ning of the interval, speed will increase even on a steep section. The two cases where speed
increases on a particular gradient section in this example can be seen in Figure 7.8 and are
shown in bold in table 7.2.
0.1
−0.1
−0.2
−0.3
f
−0.4
−0.5
−0.6
−0.7
−0.8
20 20.5 21 21.5 22 22.5 23 23.5 24 24.5 25
v
3
Figure 7.7: Function f
23
22
21
20
19
v 18
17
16
15
14
13
3000 4000 5000 6000 7000 8000 9000 10000 11000 12000
x
Figure 7.8: Speed decreases on a non-steep section.

7.4 Conclusion
The new numerical method for finding an optimal power phase for a steep uphill track can be
extended to handle any number of gradients. Although we have not been able to prove that
the solution found using our method is unique, we have used our numerical method to plot
the function f for many examples; in each example f = 0 appears to have only one solution
on the domain of feasible speeds.
Chapter 8
Hold-coast-hold for a single steep

downhill gradient
In this chapter we consider the optimal strategy when a holding phase is interrupted by a
single steep downhill gradient section. We want to show that an optimal speed profile exists,
and that it is unique.
We first formulate the problem and present the necessary conditions for an optimal strategy
obtained by Howlett [20]. We then derive alternative necessary conditions using an argument
similar to that used in Chapter 5. Next we prove the existence and uniqueness of the optimal
strategy. Finally we calculate an optimal strategy with an example.
For a steep downhill section, the train must start coasting before the steep section and main-
tain the coast phase beyond the end of the steep section. Speed will decrease before the steep
section, increase on the steep section, then decrease back to the hold speed after the steep
section.
105
Chapter 8. Hold-coast-hold for a single steep downhill gradient 106
vc
V V
J0
vb
J1
J2
p b c q
Figure 8.1: Track profile and speed profile
The equation of motion of a train during coasting is
dv
v = −r(v) + g(x). (8.1)
dx
The modified adjoint equation of the system during coasting is
dθ ψ(v) ψ(V )
= 3 θ− .
dx v v3
As in previous chapters, we let η = θ − 1 so that the optimal coast phase starts and finishes
with η = 0. The adjoint equation becomes
dη ψ(v) ψ(v) − ψ(V )

− 3 η= . (8.2)
dx v v3
Suppose the track is steep downhill on the interval [b, c], and non-steep everywhere else. It
is known from Howlett [20] that the switch from speed hold to coast must occur at some
point p before the steep section [b, c] begins, and the switch back to speed hold must occur at
some point q after the steep section finishes, as shown in Figure 8.1. We want to find optimal
switching points p and q.
8.2 Necessary conditions for an optimal coast phase
Howlett [20] has shown that an optimal coast phase must have
dη
η= = 0 and v = V (8.3)
dx
at the start p and the end q.
Let v0 (x) be the optimal speed profile. Integrating the adjoint equation over [p, q] gives
q
ψ(v) − ψ(V )
I0 (x)dx = 0 (8.4)
p v3
where the integrating factor I0 (x) is defined by
x
ψ(v0 )
I0 (x) = C exp dξ . (8.5)
p0 v03
Equation (8.4) is a necessary condition for an optimal coast phase on a steep downhill sec-
tion. It is similar to the condition used in Chapter 5. We want to find an alternative necessary
condition using a variational argument.
8.3 An alternative necessary condition
We will use the same cost function and the same variation as we used in Chapter 5, because
the derivative of J is the same. The cost function is
q
ψ(V )
J(v) = + r(v) − ϕ (V ) dx. (8.6)
p v
Using a similar argument to that used in Chapter 5 we have the perturbation formula
q0
ψ(V )
δJ = − 2 + r (v0 ) δv · dx
p0 v0
where v0 is the optimal trajectory and δv is an allowable variation. From the alternative
optimality condition given in equation (5.11) it follows that δJ = 0 for the optimal speed
profile. Hence the optimal speed profile v0 (x) is the one that minimises the functional J(v).
9.1778
9.1778
9.1778
9.1778
9.1778
J
9.1778
9.1778
9.1778
9.1778
9.1778
1060 1062 1064 1066 1068 1070 1072
Coasting points
Figure 8.2: Cost function J for various coasting locations.
Example 8.3.1 Suppose the desired holding speed is V = 20 and we have track with gradi-
ent acceleration ⎧
⎪
⎪
⎪
⎪γ0 = 0.01, x < b,
⎪
⎨
g(x) = γ1 = 0.2, x ∈ [b, c], (8.7)
⎪
⎪
⎪
⎪
⎪
⎩γ2 = 0.012, x > c
where b = 5000 is the start of the steep downhill section and c = 6000 is the end of the steep
section. The train resistance parameters are r 0 = 6.75 × 10−3 , r1 = 0, and r2 = 5 × 10−5 .
For any start location p < b we calculate J by first calculating a coasting profile v starting
with v(p) = V and finishing at some location q > c with v(q) = V . We then calculate J for
the speed profile v using (8.6). The result is shown in Figure 8.2.
Alternatively, we can find the optimal solution by using a numerical differential equation
solver to calculate v and θ trajectories for various starting points, and then use the bisection
method to find the starting point that gives v and θ trajectories that satisfy the necessary
conditions (8.3). This is essentially the method used by Howlett et al [24]. Using this
method, we find that the optimal starting point for the coast phase is p = 1065. The plot of
function J shown in Figure 8.2 illustrates that J is minimised at this point.
8.4 Key equation
In this section we analyse the optimal coast phase when a hold phase is interrupted by a
single steep downhill gradient and derive alternative necessary conditions for the optimal
coast phase.
Suppose the hold speed for the journey is V . The gradient acceleration is
⎧
⎪
⎪
⎪
⎪ γ0 if x < b
⎪
⎨
g(x) = γ1 if x ∈ [b, c] (8.8)
⎪
⎪
⎪
⎪
⎪
⎩γ2 if x > c .
We assume γ0 and γ2 are non-steep gradient accelerations at speed V , and that the gradient
acceleration γ1 is steep downhill at speed V . That is, γ 0 − r(V ) < 0, γ1 − r(V ) > 0, and
γ2 − r(V ) < 0.
We proceed as we did in section 5.3.1. We wish to minimise the cost function

q
ψ(V )
J(v) = + r(v) − ϕ (V ) dx.
P v
Using separation of variables for the equation of motion of the train (8.1), we obtain
V
vdv
p=b− (8.9)
vb r(v) − γ0
where vb is the speed of the train at the top of the steep section, and
vc
vdv
q =c+ (8.10)
V r(v) − γ2
where vc is the speed of the train at the bottom of the steep section. Integrating (8.1) from b
to c gives vc
vdv
c−b= . (8.11)
vb γ1 − r(v)
Integrating (8.1) from p to b we have
b b vb
vb2 − V 2 vdv
=− r(v)dx + γ0 (b − p) = − r(v)dx + γ0 . (8.12)
2 p p V γ0 − r(v)
From b to c we have c
vc2 − vb2
=− r(v)dx + γ1 (c − b). (8.13)
2 b
From c to q, we obtain
q q vc
V 2 − vc2 vdv
=− r(v)dx + γ2 (q − c) = − r(v)dx + γ2 . (8.14)
2 c c V γ2 − r(v)
By adding (8.12), (8.13) and (8.14) and then rearranging the result, we obtain
q vb vc
vdv vdv
r(v)dx = γ0 + γ1 (c − b) + γ2 . (8.15)
p V γ0 − r(v) V γ2 − r(v)
From (8.1) we can write

⎧
⎪
⎪
dv
if x ∈ [p, b)
⎪
⎪
⎪
⎪ γ0 − r(v)
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎨
1 dv
dx = if x ∈ [b, c] (8.16)
v ⎪
⎪ γ1 − r(v)
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎪
⎩
dv
if x ∈ (c, q].
γ2 − r(v)
By substituting (8.15) and (8.16) into the cost functional
q
ψ(V )
J(v) = + r(v) − ϕ (V ) dx
p v
we obtain
V vc vc
dv dv dv
J(vb , vc ) = ψ(V ) + +
vb r(v) − γ0 vb γ1 − r(v) V r(v) − γ2
vb V
vdv vdv
+ γ0 + γ1 (c − b) + γ2
V γ0 − r(v) vc γ2 − r(v)
vc V
vdv vdv
− ϕ (V ) c − b + + . (8.17)
V r(v) − γ2 vb r(v) − γ0
We want to minimise J subject to (8.11). We define the Lagrangian function
vc
vdv
J (vb , vc ) = J(vb , vc ) + λ c − b − . (8.18)
vb γ1 − r(v)
where λ is Lagrange multiplier. Applying the Karush-Kuhn-Tucker conditions we have

∂J ∂J
= 0 and =0
∂vb ∂vc
and the complementary slackness condition
vc
vdv
λ c−b− = 0.
vb γ1 − r(v)
If we weaken condition (8.11) to
vc
vdv
c−b− ≤0 (8.19)
vb γ1 − r(v)
the solution remain the same because to minimise fuel consumption we will not travel further
than the required distance [b, c]. For the weakened problem we can guarantee that λ ≥ 0. We
now have

∂J 1 1 γ0 − ϕ (V ) λ
= ψ(V ) − + vb [ +
∂vb γ0 − r(v) γ1 − r(v) γ0 − r(v) γ1 − r(v)
= ψ(V )(γ1 − γ0 ) + vb [(γ0 − ϕ (V ))(γ1 − r(v)) + λ(γ0 − r(v))] .
With ∂J /∂vb = 0 we obtain
[ϕ(vb ) − γ0 vb ] λ = [ϕ (V ) − γ0 ][ϕ(vb ) − γ1 vb ] + ψ(V )(γ1 − γ0 ). (8.20)

Similarly, from ∂J /∂vc = 0 we obtain
[ϕ(vc ) − γ2 vc ] λ = [ϕ (V ) − γ2 ][ϕ(vc ) − γ1 vc ] + ψ(V )(γ1 − γ2 ). (8.21)
Let
µ = λ − ϕ (V ) + γ1 .
Equations (8.20) and (8.21) become
[ϕ(vb ) − γ0 vb ] µ = [ϕ(vb ) − LV (vb )] (γ1 − γ0 ) (8.22)
and
[ϕ(vc ) − γ2 vc ] µ = [ϕ(vc ) − LV (vc )] (γ1 − γ2 ) (8.23)
where y = LV (v) is the tangent to y = ϕ(v) at v = V . That is
LV (v) = ϕ(V ) + ϕ (V )(v − V ). (8.24)
Equations (8.22) and (8.23) along with the constraint (8.11) are necessary conditions for
minimising cost function (8.6) for a train coasting on a single steep downhill section.
8.5 Existence and uniqueness
We now wish to show that an optimal coast phase exists, and is unique. As before, we use
two approaches: a geometric approach, and an algebraic approach.
8.5.1 Geometric approach
From equation (8.22) we have
γ0 µ − ϕ (V )(γ1 − γ0 ) ψ(V )(γ1 − γ0 )

ϕ(vb ) = vb +
µ − γ1 + γ0 µ − γ1 + γ0
which can be rewritten as

(ϕ (V ) − γ0 )(γ1 − γ0 ) ψ(V )(γ1 − γ0 )
ϕ(vb ) = γ0 − vb + .
µ − γ1 + γ0 µ − γ1 + γ0
Similarly, from equation (8.23) we have

(ϕ (V ) − γ2 )(γ1 − γ2 ) ψ(V )(γ1 − γ2 )
ϕ(vc ) = γ2 − vc + .
µ − γ1 + γ2 µ − γ1 + γ2
Define

(ϕ (V ) − γ0 )(γ1 − γ0 ) ψ(V )(γ1 − γ0 )
Lγ0 ,γ1 ,µ (vb ) = γ0 − vb + . (8.25)
µ − γ1 + γ0 µ − γ1 + γ0
and

(ϕ (V ) − γ2 )(γ1 − γ2 ) ψ(V )(γ1 − γ2 )
Lγ2 ,γ1 ,µ (vc ) = γ2 − vc + . (8.26)
µ − γ1 + γ2 µ − γ1 + γ2
We can see the functions (8.25) and (8.26) are linear functions of vb and vc respectively.
The functions (8.25) and (8.26) go through the points

ψ(V ) ψ(V )
P1 = , γ0 (8.27)
ϕ (V ) − γ0 ϕ (V ) − γ0
and
ψ(V ) ψ(V )
P2 = , γ2 (8.28)
ϕ (V ) − γ2 ϕ (V ) − γ2

respectively, as shown in Figure 8.3. By substituting these two points into (8.24), we can see
that these two points also lie on the tangent line L V , as shown in brown in Figure 8.3. Since
γ0 is a non-steep gradient acceleration, we have
dv
v = −r(v) + γ0 < 0.
dx
So
ϕ(V ) > γ0 V.
Similarly,
ϕ(V ) > γ2 V.
We can see that the two fixed points P1 and P2 have the form (W0 , W0 γ0 ) and (W2 , W2 γ2 )
where
ψ(V ) ψ(V )
W0 = and W2 = .
) − γ0
ϕ (V ϕ (V ) − γ2
This means that the points (8.27) and (8.28) are below the curve y = ϕ(v) as shown in the
Figure 8.3.
When µ = 0 the two lines (8.25) and (8.26) become
L(v) = ϕ (V )(v − V ) + ϕ(V )
which is the tangent to the curve y = ϕ(v) at v = V . The slope

(ϕ (V ) − γi )(γ1 − γi)
Si (µ) = γi −
µ − γ1 + γi
of each line is a monotone increasing function of µ, so when µ increases, the slope also
increases. Note that vb < V < vc . When µ increases, the line (8.25) rotates anticlockwise
around point (8.27), and so vb decreases as shown in Figure 8.3. Meanwhile the line (8.26)
also rotates anticlockwise around point (8.28) but v c increases. However, by the constraint
(8.11), if vb decreases then vc also decreases. That implies there is only one value of µ that
satisfies condition (8.11). Therefore there is only one optimal solution to the system.
8.5.2 Algebraic approach
We want to find an optimal speed profile that satisfies equations (8.11), (8.22) and (8.23).
To solve these equations for vb , vc and µ, we use a method similar to that used in Chapter 5.
Note that µ can be calculated using equation (8.23
[ϕ(vc ) − LV (vc )](γ1 − γ2 )
µ = M(vc ) = . (8.29)
ϕ(vc ) − γ2 vc
Using (8.22), we define
f (vb ) = [ϕ(vb ) − γ0 vb ] M(vc ) − [ϕ(vb ) − LV (vb )] (γ1 − γ0 ). (8.30)

y M ( v) y LV (v)
y
xP
2
P1 x
vb V vc v
Figure 8.3: Illustration of the geometric proof.
Note that vc can be expressed as a function of vb . The optimal solutions v b , vc and M(vc )
must satisfy the equation
f (vb ) = 0. (8.31)
Existence of an optimal solution
To prove the existence of the optimal solution, we want to show that function f is monotone
in the range of vb . We know that the maximum value v b can have is the holding speed V .
There are two cases for the minimum bound v b , illustrated in Figures 8.4 and 8.5. If there
exists a speed profile v on [b, c] with v(c) = V and v(b) ≥ 0 then v b = v(b) is a lower
bound for the domain of f as illustrated in Figure 8.4. In this case we can find v b by solving
equation (8.11) backwards from (x, v) = (c, V ). Any speed profile with v(b) < v b will have
v(c) < V, and the speed profile will not return to the holding speed after the steep section.
If the speed profile v on [b, c] with v(b) = 0 has v(c) ≥ V the v b = 0 is a lower bound for
the domain of f (vb ) as illustrated in Figure 8.5.
v
V
vb
0 b c x
Figure 8.4: Diagram of minimum vb > 0.
v(c) > V
V
0 b c x
Figure 8.5: Diagram of minimum vb = 0.

We now check the signs of the function at the two ends of its domain.
When vb = V , we have
f (V ) = [ϕ(V ) − γ0 V ]M(vc ).
The gradient acceleration γ0 is non-steep at speed V , and so
ϕ(V ) − γ0 V > 0.
Now we need to check the sign of M(vc ). It is defined in (8.29) as
[ϕ(vc ) − LV (vc )](γ1 − γ2 )

M(vc ) = .
ϕ(vc ) − γ2 vc
Recall that γ1 > γ2 and gradient acceleration γ2 is non-steep and so
ϕ(vc ) − γ2 vc > 0.
The curve y = ϕ(v) is convex and y = LV (v) is the tangent at v = V and so we have
ϕ(v) − LV (v) > 0. Hence M(vc ) > 0 and so f (V ) > 0.
We now consider f at v b . When vc = V , from (8.29) we have M(vc ) = 0 and v(b) = v b .

Then
f (vb ) = −[ϕ(v b ) − LV (v b )](γ1 − γ0 ).
We know
ϕ(v b ) − LV (v b ) > 0
because LV (v) is tangent to the convex function y = ϕ(v), and since γ 1 > γ0 then f (v b ) < 0.
If v b = 0, we have
f (0) = −(γ1 − γ0 )ψ(V ) < 0.
The function f is positive when v(b) = V and negative when v(b) = v b and is continuous.
Therefore there exists at least one solution to f (v b ) = 0 in the domain of f , [v b , V ].
Uniqueness of an optimal solution
To prove uniqueness we need to show that the function f from (8.30) is a strictly increasing
function. We will prove that the first derivative of the function is positive. We have
df dM(vc ) dvc
= [ϕ(vb ) − γ0 vb ] . +[ϕ (vb ) − γ0 ] M(vc )+[ϕ (V )−ϕ (vb )](γ1 −γ0 ). (8.32)
dvb dvc dvb
Consider ϕ (v) − γ0 . Recall that ϕ(v) = vr(v). Because r(v) − γ0 > 0 we have
ϕ (v) − γ0 = r(v) − γ0 + vr (v) > 0.
Recall that, from the result of the previous section, M(v c ) > 0 and so then the second term
[ϕ (vb ) − γ0 ] M(vc ) > 0. The third term is also positive because V > v b and ϕ(v) is convex.
We now consider dM(vc )/dvc . Taking derivative of (8.29) we have
dM(vc ) (γ1 − γ2 )
= · A(vc ) (8.33)
dvc (ϕ(vc ) − γ2 vc )2
where
A(vc ) = [ϕ (vc ) − ϕ (V )][ϕ(vc ) − γ2 vc ] − [ϕ(vc ) − ϕ(V ) − ϕ (V )(vc − V )][ϕ (vc ) − γ2 ].
Rearranging A(vc ) gives
A(vc ) = [ϕ (vc ) − ϕ (V )][ϕ(vc ) − γ2 vc ]
−[ϕ(vc ) − ϕ(V ) − ϕ (V )(vc − V )][ϕ (vc ) − γ2 ]
= −ϕ (vc )γ2 vc + γ22 vc − ϕ (V )[ϕ(vc ) − γ2 vc ] + ϕ(vc )γ2
−γ22 vc + [ϕ(V ) + ϕ (V )(vc − V )][ϕ (vc ) − γ2 ]
= [ϕ (vc ) − γ2 ][ϕ(V ) + ϕ (V )(vc − V ) − γ2 vc ] − [ϕ (V ) − γ2 ][ϕ(vc ) − γ2 vc ].
Let
π(v) = [ϕ (v) − γ2 ][ϕ(V ) + ϕ (V )(v − V ) − γ2 v] − [ϕ (V ) − γ2 ][ϕ(v) − γ2 v] (8.34)
=⇒ π (v) = ϕ (v)[ϕ(V ) − V ϕ (V ) + vϕ (V ) − γ2 v].

When π (v) = 0, we have
[ϕ (V ) − γ2 ]v + ϕ(V ) − V ϕ (V ) = 0.
The solution to this equation is
V ϕ (V ) − ϕ(V )
v0 = .
ϕ (V ) − γ2
Because γ2 is non-steep downhill then
ϕ(V ) > γ2 V or − ϕ(V ) < −γ2 V
and so
V ϕ (V ) − ϕ(V ) < V ϕ (V ) − γ2 V = V (ϕ (V ) − γ2 ).
Since ϕ (V ) − γ2 > 0 it follows that
V ϕ (V ) − ϕ(V )
V > = v0 .
ϕ (V ) − γ2
We have π (v) > 0 for vc > v0 . From (8.34) we have
π(V ) = 0.
Because vc > V > in coast phase, we have
π(vc ) > π(V ) = 0.
That means A(vc ) > 0 and hence from (8.33) we have
dµ(vc )
> 0.
dvc
Now consider dvb /dvc . By differentiating (8.11) with respect to vb we obtain
vb vc dvc
= ·
γ1 − r(vb ) γ1 − r(vc ) dvb
and so
dvc vb γ1 − r(vc )
= · > 0.
dvb vc γ1 − r(vb )
Hence the first term of (8.32) is positive, and so
df
> 0.
dvb
That means f is a monotone increasing function of v b . Therefore the solution to the equation
f (vb ) = 0
is unique.
8.5.3 Numerical solution
To find an optimal coast phase we need to solve equations (8.22), (8.23) and (8.11) for v b , vc
and µ. The equations are
[ϕ(vb ) − γ0 vb ] µ = [ϕ(vb ) − LV (vb )] (γ1 − γ0 ),
[ϕ(vc ) − γ2 vc ] µ = [ϕ(vc ) − LV (vc )] (γ1 − γ2 )
and vc
vdv
c−b=
vb γ1 − r(v)
respectively. These equations cannot be solved analytically. Instead we define a function as
in (8.30). It is
f = [ϕ(vb ) − γ0 vb ] µ(vc ) − [ϕ(vb ) − LV (vb )] (γ1 − γ0 ).
The domain of the function is [v b , V ] as discussed in Section 8.5.2. We proved in the previous
section that the values of the function at the two ends of the domain are different in sign, and
that there is only one solution in the interval, and so we can use the Bisection method to find
the solution, as we have done in Chapters 5, 6 and 7.
8.6 Example
Example 8.6.1 Coasting on a single steep downhill gradient.
The gradient acceleration of the track is

⎧
⎪
⎪
⎪
⎪ 0.01, x < 5000
⎪
⎨
g(x) = 0.2 x ∈ [5000, 6000] (8.35)
⎪
⎪
⎪
⎪
⎪
⎩0.012 x > 6000.
The hold speed is V = 20. The gradient accelerations 0.01 and 0.012 are non-steep; 0.2 is
steep at the holding speed V.
We use a calculation scheme similar to that used in Chapter 5 for v b ∈ [13, 20]. In Figure 8.6
the optimal speed profile is displayed on the right hand side and function f is shown on the
left. The optimal speed at the top of the steep section is 17.06 which is also the zero of the
function f as seen in the Figure.
The results of the numerical iteration are shown in Table 8.1. J is the objective function to
be minimised.
8.7 Conclusion
For a train approaching a steep downhill section with a single gradient, there is a unique
optimal location at which to start coasting. We can use a numerical procedure similar to
that developed in the previous chapters to calculate the optimal speed profile for the steep
downhill section. The calculation scheme again converges quickly to the unique solution.
In the next chapter, we will consider a steep downhill section with two gradients.
26 26
25 25
24 24
23 23
22 22
vb21 21
20 20
19 19
18 18
17 17
16 16
15 15
−0.1 −0.05 0 0.05 0.1 0 5000 10000
f(v ) distance
b
Figure 8.6: Optimal speed profile calculated from example 8.6.1

Table 8.1: Experimental results of hold-coast-hold on a single steep downhill section.

p vb J f (vd )
197.95 16.500000 9.5838272984 -0.00280125
2770.89 18.250000 10.6524352617 0.00545279
1534.68 17.375000 9.2928020903 0.00150987
880.01 16.937500 9.1960529377 -0.00059866
1210.61 17.156250 9.1888353298 0.00046724
1046.15 17.046875 9.1779579444 -0.00006278
1128.58 17.101562 9.1798505852 0.00020295
1087.42 17.074218 9.1780084420 0.00007026
1066.79 17.060546 9.1777580628 0.00000378
1056.47 17.053710 9.1778015735 -0.00002948
1061.63 17.057128 9.1777657291 -0.00001284
1064.21 17.058837 9.1777583760 -0.00000452
1065.50 17.059692 9.1777573397 -0.00000036
1066.15 17.060119 9.1777574813 0.00000170
1065.83 17.059906 9.1777573555 0.00000066
1065.67 17.059799 9.1777573339 0.00000014
1065.58 17.059745 9.1777573333 -0.00000011
1065.63 17.059772 9.1777573327 0.00000002
Chapter 9
Hold-coast-hold on a downhill section

with two steep gradients
In this chapter we consider the optimal strategy when a holding phase is interrupted by a steep
downhill section with two constant gradients. The structure of this chapter is similar to that
of Chapter 8: we formulate the problem, derive necessary conditions for an optimal control,
prove the existence of an optimal solution, develop a numerical procedure for calculating an
optimal control, and give some examples. As in Chapter 6, the second gradient in the steep
section increases the complexity the problem and means that we can no longer prove that
the optimal coast phase is unique. However, we develop the new necessary conditions for an
optimal coast phase and use them to construct the new numerical method for calculating an
optimal coast phase.
124
Chapter 9. Hold-coast-hold on a downhill section with two steep gradients 125
The track compromises two steep downhill gradients between two non-steep gradients. The
gradient acceleration is given by
⎧
⎪
⎪
⎪
⎪ γ0 if x < b
⎪
⎪
⎪
⎪
⎨γ11 if x ∈ [b, d)
g(x) = (9.1)
⎪
⎪
⎪
⎪ γ12 if x ∈ [d, c)
⎪
⎪
⎪
⎪
⎩γ if x > c
2
where assume that γ1j > r(V ) for each j = 1, 2 and γi < r(V ) for i = 0, 2.
The equation of motion of the train while coasting is
dv
v = −r(v) + g(x).
dx
Using η = θ − 1, from equation (2.19), the equation of the modified adjoint variable η is

− 3 η= .
dx v v3
We need to start coasting at a point p before the steep section and switch back to speed hold
at a point q after the steep section. We wish to find the optimal interval [p, q] for the coast
phase.
Figure 9.1 shows a typical optimal speed profile for two steep downhill gradients. However,
if the speed on the first steep gradient increases above the limiting speed of the second steep
gradient, speed will decrease on the second steep gradient, as shown in Figure 9.2.
vc
V V
vd
J0
vb
J 11
J 12
J2
p b d c q
Figure 9.1: Speed profile on a steep downhill section with two steep gradients.
Limiting speed
vd
vc
V Limiting speed V
J0 vb
J 11
J 12 J2
b d c
Figure 9.2: An alternative speed profile on a steep downhill section with two steep gradients.
9.2 Necessary conditions for an optimal strategy
We want to find necessary conditions for minimising the cost function (8.6) for a train coast-
ing on a downhill section with two steep gradients. We will use a method similar to that used
in Chapter 8.
By integrating equation (9.2) on [p, b] and [c, q] we obtain

V
vdv
p=b−
vb r(v) − γ0
and vc
vdv
q =c+
V r(v) − γ2
where p and q are the starting and ending points of the coast phase. By integrating equation
(9.9) on [b, d] and [d, c] we obtain the distance constraints
vd
vdv
d−b= (9.2)
vb γ11 − r(v)
and vc
vdv
c−d= . (9.3)
vd γ12 − r(v)
Integrating (9.2) from p to b we have
b b vb
vb2 − V 2 vdv
=− r(v)dx + γ0 (b − p) = − r(v)dx + γ0 , (9.4)
2 p p V γ0 − r(v)
from b to d we have d
vd2 − vb2
=− r(v)dx + γ11 (d − b), (9.5)
2 b
from d to c we have c
vc2 − vd2
=− r(v)dx + γ12 (c − d), (9.6)
2 d
from c to q, we obtain
q q vc
V 2 − vc2 vdv
=− r(v)dx + γ2 (q − c) = − r(v)dx + γ2 . (9.7)
2 c c V γ2 − r(v)
Adding (9.4), (9.5), (9.6) and (9.7) together, then rearranging, gives
q vb vc
vdv vdv
r(v)dx = γ0 + γ11 (d − b) + γ12 (c − d) + γ2 . (9.8)
p V γ0 − r(v) V γ2 − r(v)
From (9.2) we have ⎧

⎪
⎪ dv
⎪
⎪ if x < b
⎪
⎪ r(v) − γ0
⎪
⎪
⎪
⎪ dv
⎨ if x ∈ [b, d)
dx = r(v) − γ11
1
(9.9)
v ⎪
⎪ dv
⎪
⎪ if x ∈ [d, c)
⎪
⎪ r(v) − γ12
⎪
⎪
⎪
⎪ dv
⎩ if x > c.
r(v) − γ2
By substituting (9.8) and (9.9) into the cost function

q
ψ(V )
J(v) = + r(v) − ϕ (V ) dx
p v
we obtain
vb
vdv
J(vb , vd , vc ) = ψ(V )A(vb , vd , vc ) + γ0
V γ0 − r(v)
V
vdv
+γ11 (d − b) + γ12 (c − d) + γ2
vc γ2 − r(v)
vc V
vdv vdv
−ϕ (V ) c − b + + . (9.10)
V r(v) − γ2 vb r(v) − γ0
where
V vd vc vc
dv dv dv dv
A(vb , vd , vc ) = + + + .
vb r(v) − γ0 vb γ11 − r(v) vd γ12 − r(v) V r(v) − γ2
To minimise J subject to (9.2) and (9.3) we define the Lagrange function

vd
vdv
J (vb , vc ) = J(vb , vc ) + λ1 d − b −
vb γ11 − r(v)
vc
vdv
+ λ2 c − d − (9.11)
vd γ12 − r(v)
where λ1 and λ2 are Lagrangian multipliers. Applying the Karush-Kuhn-Tucker conditions,

we have
∂J ∂J ∂J
=0 , = 0 and =0
∂vb ∂vd ∂vc
and the complementary slackness conditions
vd
vdv
λ1 d − b − =0
vb γ11 − r(v)
and
vc
vdv
λ2 c − d − = 0.
vd γ12 − r(v)
As in the previous chapter, the Lagrange multipliers are still positive and our solution remains
unchanged when we weaken the conditions (9.2) and (9.3) to
vd
vdv
d−b− ≤0
vb γ11 − r(v)
and vc
vdv
c−d− ≤0
vd γ12 − r(v)
and hence we can guarantee that λ1 , λ2 ≥ 0. We have

∂J 1 1 γ0 − ϕ (V ) λ1
= ψ(V ) − + vb +
∂vb γ0 − r(v) γ11 − r(v) γ0 − r(v) γ11 − r(v)
= ψ(V )(γ11 − γ0 ) + vb [(γ0 − ϕ (V ))(γ11 − r(v)) + λ1 (γ0 − r(v))]
and so
∂J
=0
∂vb
can be rearranged as
[ϕ(vb ) − γ0 vb ] λ1 = [ϕ (V ) − γ0 ][ϕ(vb ) − γ11 vb ] + ψ(V )(γ11 − γ0 ). (9.12)
Similarly with ∂J /∂vc = 0 we obtain
[ϕ(vc ) − γ2 vc ] λ2 = [ϕ (V ) − γ2 ][ϕ(vc ) − γ12 vc ] + ψ(V )(γ12 − γ2 ). (9.13)

With ∂J /∂vd = 0 we have

∂J 1 1 λ1 λ2
= ψ(V ) − − vd −
∂vd γ11 − r(vd ) γ12 − r(vd ) γ11 − r(vd ) γ11 − r(vd )
= ψ(V )(γ12 − γ11 ) − vd [λ1 (γ12 − r(vd )) − λ2 (γ11 − r(vd ))] = 0
which can be rearranged to give
[ϕ(vd ) − γ11 vd ]λ2 − [ϕ(vd ) − γ12 vd ]λ1 = ψ(V )(γ12 − γ11 ). (9.14)
Let
µi = λi − ϕ (V ) + γ1i .
Then equations (9.12) – (9.14) become
[ϕ(vb ) − γ0 vb ] µ1 = [ϕ(vb ) − LV (vb )] (γ11 − γ0 ), (9.15)
[ϕ(vd ) − γ11 vd ]µ2 − [ϕ(vd ) − γ12 vd ]µ1 = (γ12 − γ11 )[ϕ(vd ) − LV (vd )] (9.16)
and
[ϕ(vc ) − γ2 vc ] µ2 = [ϕ(vc ) − LV (vc )] (γ12 − γ2 ) (9.17)
where y = LV (v) is the equation of the tangent to the curve y = ϕ(v) at v = V ,
LV (v) = ϕ(V ) + ϕ (V )(v − V ) = ϕ (V )v − ψ(V ).
Equations (9.15), (9.16) and (9.17) are necessary conditions for an optimal coast phase. By
deriving µ1 and µ2 from (9.15) and (9.17) as
[ϕ(vb ) − LV (vb )](γ11 − γ0 )

µ1 = M1 (vb ) = (9.18)
ϕ(vb ) − γ0 vb
and
[ϕ(vc ) − LV (vc )](γ12 − γ2 )
µ2 = M2 (vc ) = (9.19)
ϕ(vc ) − γ2 vc
then substituting into (9.16) we obtain
f (vd ) = [ϕ(vd )−γ11 vd ]M2 (vc )−[ϕ(vd )−γ12 vd ]M1 (vb )−(γ12 −γ11 )[ϕ(vd )−LV (vd )] (9.20)
where vb and vc can be expressed as functions of vd . An optimal solution v b , vc , vd , µ1 and

µ2 must satisfies the equation
f (vd ) = 0. (9.21)
This is a alternative necessary condition for an optimal coast phase on a steep downhill track
with two steep sections of constant gradient.
9.3 Existence and uniqueness of an optimal solution
In this section we want to prove existence and uniqueness of an optimal solution. We can
prove existence, but a proof of uniqueness eludes us.
9.3.1 Existence of an optimal solution
We want to find an optimal speed profile that satisfies equations (9.2), (9.3), (9.15), (9.16)
and (9.17). To solve these equations for vb , vc , vd , µ1 and µ2 , we use a method similar to that
used in Chapter 8. That is, for a given vb we can calculate vd and vc using (9.2) and (9.3)
respectively. µ1 and µ2 can be calculated using (9.18) and (9.19) respectively. If they belong
to an optimal coast phase they must satisfy equation (9.21). Hence, we define a function f
as seen in (9.20).
To prove the existence of the optimal solution, we show that function f has different signs
at the two ends of its domain and that f is continuous on its domain. The domain of the
function can be calculated using speeds vb and vc . Since we start coasting before reaching
the steep section then the maximum possible value of v b is V. Speed vd is also maximised
when vb = V . The minimum possible of v d can be calculated as follows: if the coast phase
with vb = 0 has vc > V then this coast phase defines the minimum value of v d , otherwise
the coast phase that finishes with vc = V defines the minimum value of vd .
Note that for coasting downhill we have

dv
v = γ − r(v)
dx
which is positive if γ is steep downhill and negative otherwise. So we have
γ11 − r(v) > 0 =⇒ γ11 v − ϕ(v) > 0 (9.22)
and
γ12 − r(v) > 0 =⇒ γ12 v − ϕ(v) > 0. (9.23)
For non-steep gradient accelerations γ0 and γ2 we have
γ0 v − ϕ(v) < 0 (9.24)
and
γ2 v − ϕ(v) < 0. (9.25)
We consider two cases: γ11 < γ12 and γ11 > γ12 .
Case 1: γ11 < γ12
When vb = V, M1 (vb ) = 0 and vd > V. Then
f = [ϕ(vd ) − γ11 vd ]M2 (vc ) − (γ12 − γ11 )[ϕ(vd ) − LV (vd )].
As seen in (9.19) we have

[ϕ(vc ) − LV (vc )](γ12 − γ2 )
M2 (vc ) =
ϕ(vc ) − γ2 vc
We know that γ12 < γ2 and ϕ(vc ) − γ2 vc < 0. Hence µ2 (vc ) > 0. Since ϕ(vd ) − γ11 vd < 0
then f (vd ) < 0.
When vc = V, M2 (vc ) = 0 and vd < V. Then
f = −[ϕ(vd ) − γ12 vd ]M1 (vb ) − (γ12 − γ11 )[ϕ(vd ) − LV (vd )].
We now investigate the characteristics of M1 (vb ). As seen in (9.18),
ϕ(vb ) − LV (vb )
M1 (vb ) = (γ11 − γ0 ) .
ϕ(vb ) − γ0 vb
Differentiating gives
dM1 γ11 − γ0
= A(vb )
dvb (ϕ(vb ) − γ0 vb )2
where
A(vb ) = [ϕ (vb ) − ϕ (V )][ϕ(vb ) − γ0 vb ] − [ϕ(vb ) − LV (vb )][ϕ (vb ) − γ0 ]
= −ϕ (vb )γ0 vb + γ02 vb − ϕ (V )[ϕ(vb ) − γ0 vb ]
+ϕ(vb )γ0 + LV (vb )[ϕ (vb ) − γ0 ] − γ02 vb
= [ϕ (vb ) − γ0 ][LV (vb ) − γ0 vb ] − [ϕ (V ) − γ0 ][ϕ(vb ) − γ0 vb ]
and
A (vb ) = ϕ (vb )[LV (vb ) − γ0 vb ].
Solving A (vb ) = 0 gives

ψ(V )
vb∗ = > 0.
) − γ0
ϕ (V
Since vb > vb∗ then A (vb ) > 0. Since A(V ) = 0 and vb < V then
A(vb ) < A(V ) = 0.
That means
dM1
< 0.
dvb
Hence M1 (vb ) is a monotone decreasing function. That means we have M 1 (vd ) < M1 (vb )
because vd > vb . Hence,
f > [γ12 vd − ϕ(vd )]M1 (vd ) − (γ12 − γ11 )[ϕ(vd ) − LV (vd )]
= (γ11 − γ0 )[ϕ(vd ) − LV (vd )][γ12 vd − ϕ(vd )]
−(γ12 − γ11 )[ϕ(vd ) − LV (vd )][ϕ(vd ) − γ0 vd ]
= [ϕ(vd ) − LV (vd )] [(γ11 − γ0 )[γ12 vd − ϕ(vd )] − (γ12 − γ11 )(ϕ(vd ) − γ0 vd )]
= [ϕ(vd ) − LV (vd )](γ12 − γ0 )[γ11 vd − ϕ(vd )] > 0
Therefore there exists at least one solution to f (v d ) = 0 in the domain.
Case γ11 > γ12
When vc = V , we have M2 (vc ) = 0 and vd < V . Then
f = [γ12 vd − ϕ(vd )]M1 (vb ) + (γ11 − γ12 )[ϕ(vd ) − LV (vd )].
We have
ϕ(vb ) − LV (vb )
M1 (vb ) = (γ11 − γ0 ) >0
ϕ(vb ) − γ0 vb
because γ11 > γ0 and ϕ(vb ) − γ0 vb > 0. Hence f (vd ) > 0 when vc = V.
When vb = V , we have M1 (vb ) = 0 and vd > V . Then
f = [ϕ(vd ) − γ11 vd ]M2 (vc ) + (γ11 − γ12 )[ϕ(vd ) − LV (vd )].
Now consider
ϕ(vc ) − LV (vc )
M2 (vc ) = (γ12 − γ2 ) .
ϕ(vc ) − γ2 vc
dM2 γ12 − γ2
= B(vc )
dvc (ϕ(vc ) − γ2 vc )2
where
B(vc ) = [ϕ (vc ) − ϕ (V )](ϕ(vc ) − γ2 vc ) − [ϕ(vc ) − LV (vc )](ϕ (vc ) − γ2 )
= −ϕ (vc )γ2 vc − ϕ (V )(ϕ(vc ) − γ2 vc ) + ϕ(vc )γ2 + LV (vc )(ϕ (vc ) − γ2 )
= −γ2 vc (ϕ (vc ) − γ2 ) − ϕ (V )(ϕ(vc ) − γ2 vc )
+γ2 (ϕ(vc ) − γ2 vc ) + LV (vc )(ϕ (vc ) − γ2 )
= (ϕ (vc ) − γ2 )(LV (vc ) − γ2 vc ) − (ϕ (V ) − γ2 )(ϕ(vc ) − γ2 vc )
B (vc ) = ϕ (vc )(LV (vc ) − γ2 vc ).
Let
m(v) = LV (v) − γ2 v
and hence
m(V ) = ϕ(V ) − γ2 V > 0
and
m (v) = ϕ (V ) − γ2 > 0.
So m(vc ) > m(V ) > 0 because vc > V . That means B (vc ) > 0. Since B(V ) = 0 then
B(vc ) > B(V ) = 0. Hence, dM2 /dvc > 0. Because V < vd < vc then M2 (vd ) < M2 (vc ).
Recall that ϕ(vd ) − γ11 vd < 0, then we can have
f < [ϕ(vd ) − γ11 vd ]M2 (vd ) + (γ11 − γ12 )[ϕ(vd ) − LV (vd )]
= (γ12 − γ2 )[ϕ(vd ) − LV (vd )][ϕ(vd ) − γ11 vd ]
+ (γ11 − γ12 )[ϕ(vd ) − LV (vd )](ϕ(vd ) − γ2 vd )
= [ϕ(vd ) − LV (vd )] [(γ12 − γ2 )[ϕ(vd ) − γ11 vd ] + (γ11 − γ12 )(ϕ(vd ) − γ2 vd )]
= [ϕ(vd ) − LV (vd )](γ11 − γ2 )(ϕ(vd ) − γ12 vd ) < 0.
Therefore in this case we also conclude that there exists at least one solution to f = 0 in the
domain.
vb vd vc
P1 f P2
Figure 9.3: Calculation sequence for f (vd ) in case of coasting over a 2 steep gradient section.
9.3.2 Uniqueness
Proving that the optimal solution is unique becomes much more complicated as the number
of steep gradients increases from one to two. We have not been able to construct a proof.
However, our numerical calculations indicate that the solution is unique.
We want to solve the system of equations (9.2), (9.3), (9.15), (9.16) and (9.17) for v b , vc , vd ,
µ1 and µ2 .
For convenience, we will search for the value of vb that gives f = 0, where vd is calculated
from vb by solving the differential equation (9.2) and where µ 1 = M1 (vb ) and µ2 = M2 (vc ).
First we need to find the feasible range for vb . The largest possible value of vb is the holding
speed V . The smallest value of vb is calculated by setting vc = V and solving (9.2) and (9.3)
from vc to vd and then to vb or vb = 0. For each candidate value of vb , we must calculate vd ,
then vc , µ2 and µ1 before we can evaluate f . The value for vd can be found using a numerical
DE solver to solve the equation of motion (9.2) forwards from (x = b, v = v b ) to x = d.
Then the value of vc is calculated by solving equation (9.3) forward from (x = d, v = v d )
to x = c. The value for µ1 and µ2 are calculated from (9.18) and (9.19) respectively. The
calculation sequence is illustrated in Figure 9.3.
9.5 Examples
Example 9.5.1 γ11 > γ12
The gradient acceleration of the track is

⎧
⎪
⎪
⎪
⎪ 0.01, x < 5000
⎪
⎪
⎪
⎪
⎨0.22, x ∈ [5000, 5600]
g(x) = (9.26)
⎪
⎪
⎪
⎪ 0.2, x ∈ [5600, 6000]
⎪
⎪
⎪
⎪
⎩0.012, x > 6000.
The train parameters and holding speed are the same as for previous problem.
The range of vb computed is [13.5, 20]. The zero of function f that is graphed against v b is
vb = 16.94. This value matches the optimal speed v b as shown in Figure 9.4. The optimal
speed profile and graph of function f are displayed on the right and the left respectively in
Figure 9.4. The graph of function f shows the uniqueness of solution of equation f = 0 in
its domain. The results obtained from the iteration are displayed in the table 9.5.
9.6 Conclusion
For a steep downhill section with two gradients, we can prove the existence of an optimal
solution but are unable to find a theoretical proof of the uniqueness of the optimal coast
phase. However, we were able to develop a numerical scheme for calculating optimal coast
phases. We have shown two examples. These and other examples indicate that the necessary
condition f = 0 has only one solution. In the next chapter we will extend the problem for a
train coasting downhill a track of multiple constant gradient accelerations.
26 26
25 25
24 24
23 23
22 22
vb
21 21
20 20
19 19
18 18
17 17
16 16
−1 −0.5 0 0.5 1 0 2000 4000 6000 8000 10000 12000
f distance
Figure 9.4: f and the optimal speed profile for Example 9.5.1
Table 9.1: Experimental results of an optimal journey on a double steep downhill section, Example
9.5.1.
p vb J f (vd )
1350.28 17.25000 11.0372147786 -0.03266703
297.16 16.562500 11.1077618762 0.04660625
832.23 16.906250 10.9211120895 0.00372951
1093.30 17.07812 10.9437761122 -0.01521615
963.29 16.992187 10.9233032185 -0.00593756
897.89 16.949218 10.9198845774 -0.00115354
865.09 16.927734 10.9199127615 0.00127548
881.50 16.938476 10.9197528800 0.00005786
889.70 16.943847 10.9197823565 -0.00054861
885.60 16.941162 10.9197585159 -0.00024557
883.55 16.939819 10.9197534212 -0.00009390
Chapter 10
Hold-coast-hold on a steep downhill

section with many gradients
In this chapter we consider the optimal control strategy when a holding phase is interrupted
by a steep downhill section with piecewise constant gradient where we allow many gradient
changes before the steep section, on the steep section, and after the steep section. We formu-
late the problem, derive the set of necessary conditions, then use them to develop a method
for numerical calculation of the optimal speed profile.
Finally, we present a numerical example.
Suppose the train is travelling on a piecewise constant gradient comprising a sequence of

gradients that are non-steep at the hold speed V , followed by a sequence of gradients that are
steep downhill at speed V , then a final sequence of gradients that are non-steep at speed V .
139
Chapter 10. Hold-coast-hold on a steep downhill section with many gradients 140
steep part
J1 J n 1
f p1 p2 pg 1 pg pr ps ph ph 1 pn 1 pn f
Figure 10.1: Gradient change points for a general steep downhill section.
Let
• p1 , . . . , pn be the locations in the interval (−∞, ∞) where the gradient changes;
• v1 , . . . , vn be the velocities of the train at the points p 1 , . . . , pn ;
• γ0 , γ1 , γ2 , . . . , γn−1 , γn be the respective gradients accelerations of segments [−∞, p1 ),

[p1 , p2 ), . . . , [pn , ∞);
• pr and ps be the starting point and ending point of the steep part and 1 ≤ r, s ≤ n;
• a be the point where coasting starts and b the point where speedhold resumes;
• pg−1 ≤ a ≤ pg where pg is the first gradient change point during the coast phase;
• ph ≤ b ≤ ph+1 be the last gradient change point during the coast phase;
The notation is illustrated in Figure 10.1.
The cost function is b

ψ(V )
J= + r(v) − ϕ (V ) dx. (10.1)
a v
The equation of motion for the coast phase is
dv
v = −r(v) + γj (10.2)
dx
for x ∈ (pj , pj+1 ). By integrating the equation of motion as we did in Chapters 8 and 9, we
obtain
b
h−1 vg vh
vdv vdv
r(v)dx = γi (pi+1 − pi ) + γg + γh . (10.3)
a i=g+1 V γg − r(v) V γh − r(v)
By substituting (10.3) into (10.1) we have

b
h−1
dx
J(vg , . . . , vh ) = ψ(V ) −γg (pg −a)− γi−1 (pi+1 −pi )−γh (b−ph )−ϕ (V )(b−a)
a v i=g+1
where
b V vh h−1 vi+1
dx dv dv dv
= + + ,
a v vg r(v) − γg V r(v) − γh i=g vi −r(v) + γi
vg
vdv
pg − a = ,
V −r(v) + γg
V
vdv
b − ph = ,
vh −r(v) + γh
and so
V vg
vdv vdv
b − a = p h − pg + + . (10.4)
vh −r(v) + γh V −r(v) + γg
Hence the cost function can be written as
vh h−1 vi+1

V
dv dv dv
J(vg , . . . , vh ) = ψ(V ) + +
vg r(v) − γg V r(v) − γh i=g vi −r(v) + γi
vg
h−1 V
vdv vdv
−γg − γi−1 (pi+1 − pi ) − γh
V −r(v) + γg i=g+1 vh −r(v) + γh
V vg
vdv vdv
−ϕ (V )(ph − pg + + ).
vh −r(v) + γh V −r(v) + γg
The optimal velocities at each pair of adjacent points p i and pi+1 must satisfy the constraint
vi+1
vdv
pi+1 − pi = . (10.5)
vi −v + γi
We want to minimise J(vg , vg+1 , . . . , vh ) subject to (10.5). First we define the Lagrange
function
J (vg , vg+1 , . . . , vh ) = J(vg , vg+1 , . . . , vh )
h−1
vi+1
vdv
+ λi pi+1 − pi − (10.6)
i=g vi −r(v) + γi
where λi are Lagrange multipliers. We now apply the Karush-Kuhn-Tucker conditions

∂J
=0
∂vi
and the complementary slackness condition
vi+1
vdv
λi pi+1 − pi − =0
vi γ1 − r(v)
for i = g + 1, . . . , h − 1 to minimise the cost function J. Note that our solution remains
unchanged if we weaken the conditions (10.5) to
vi+1
vdv
pi+1 − pi − ≤0 (10.7)
vi −v + γi
because in order to minimise fuel consumption we should not travel further than the required
distance pi+1 − pi . Thus we can guarantee that λi ≥ 0. By setting ∂J /∂vi = 0 as we did in
Chapter 8, we obtain at each point pi with g < i < h the equation
[ϕ(vi ) − γi vi ]µi − [ϕ(vi ) − γi+1 vi ]µi−1 = [ϕ(vi ) − LV (vi )](γi+1 − γi ) (10.8)
where
µi = λi − ϕ (V ) − γi+1 .
At the first and last gradient change points during the coast phase we get
[ϕ(vi ) − γi ]µi = [ϕ(vi ) − LV (vi )](γi+1 − γi) (10.9)
where i = g, h. Speed vi can be found from speed vi−1 by solving equation (10.5) forward
or from vi+1 by solving (10.5) backward. Given an initial guess for the speed v r at the top of
the steep downhill section, we can calculate vi and µi that satisfy all equations except one.
The remaining equation can be written as f (vr+1 ) = 0, where
f (vr+1 ) = [ϕ(vr+1 ) − γr vr+1 ]µr+1 − [ϕ(vr+1 ) − γr+1vr+1 ]µr
−[ϕ(vr+1 ) − LV (vr+1 )](γr+1 − γr ). (10.10)
Note that the remaining equation can be at any point j ∈ {g + 1, . . . , h − 1} because it

has the same form as (10.10). However for calculation consistency we choose the remaining
equation at the point r+1 where the steep section begins. The domain [v r+1 , v̄r+1 ] of function
f can be calculated based on the slowest and fastest possible coasting speed profiles. For
convenience and consistency we search for vr = v(pr ) that satisfies f (vr+1 ) = 0 in the
following section.
10.1.1 Numerical solution
Before we can search for the value of vr for which f (vr+1 ) = 0, we must find a suitable
interval [v r , v̄r ] in which to search.
For a coast phase, we cannot start the steep downhill section with a speed that is greater than
the hold speed V . Suppose we start the steep downhill section with v r = V . The limiting
speed on each of the gradient sections on the steep section is greater than V , and so the speed
profile v with v(pr ) = vr = V will have v(ps ) > V ; that is, speed will start at the hold speed
V at the top of the steep section and will finish above the hold speed V at the bottom of the
steep section. So v̄r = V is a valid upper bound.
The lower bound vr is found as follows:
• if the speed profile v with v(pr ) = 0 at the top of the steep section reaches the bottom
of the steep section with v(ps ) > V then the lower bound for the speed vr is vr = 0;
otherwise
• the lower bound for the speed vr is vr = v(pr ), where v is the the speed profile that
reaches the bottom of the hill with v(p s ) = V .
For a given value vr , we can calculate forward all speeds at the points pr+1 , . . . , ph and
backward all speeds from point pr−1 to point pg using equation (10.5). From equation (10.9)
we can calculate µh−1 and µg using vh and vg respectively. Then µg+1 can be calculated from
equation (10.8) using vg+1 and µg . We can calculate µg+2 , . . . , µr and µh−1 , . . . , µr+1 in a
vg v g 1 vr vr 1 vr 2 vh1 vh
Pg P g 1 Pr f Pr+1 Ph 2 Ph 1
Figure 10.2: Calculation sequence for f .
similar way. Finally we substitute v r+1 , µr and µr+1 into equation (10.10). The calculation
scheme is illustrated in Figure 10.2. We use the Bisection method to find v r that gives
f (vr+1 ) = 0.
10.2 Example
Example 10.2.1 In this example we use three gradients that are non-steep at speed V , then
thirteen gradients that are steep downhill at speed V , then three gradients that are non-steep
at speed V . The track data and optimal speed are shown in Table 10.1.
Figure 10.3 shows shows that f is monotonic and has a single minimum. The optimal speed
profile of Example 10.2.1 is displayed at the bottom of the figure. Notice that there is a non-
steep part where the speed increases and also some steep parts where the speed decreases.
The data of these cases are displayed in bold in Table 10.1.
Table 10.1: Track data and results of hold-coast-hold on a multi steep downhill section.
p γ v vL
−∞ 0.010 8.06225774829855
9900 0.020 17.16560752770548 8.06225774829855
10900 0.015 17.08320058289678 16.27882059609971
11500 0.050 16.86562787306989 12.84523257866513
12400 0.040 18.28706387803024 29.41088233970548
13500 0.070 19.20552524744577 25.78759391645525
14700 0.056 21.68382350711548 35.56683848755748
15600 0.030 22.68253608306716 31.38470965295043
16700 0.048 22.56859541912852 21.56385865284782
17600 0.060 23.16268231910735 28.72281323269014
18600 0.120 24.22400336900532 32.63433774416144
19500 0.075 27.04150185853915 47.59201613716318
20400 0.100 28.03193806777925 36.94590640382234
21200 0.050 29.47478912746264 43.18564576337837
22200 0.105 29.46871355930902 29.41088233970548
23100 0.040 31.02882053740335 44.32832051860300
23900 0.020 30.65766946886469 25.78759391645525
24900 0.010 29.59169082093850 16.27882059609971
26200 0.012 27.87192157428342 8.06225774829855
∞ 0.012 10.24695076595960
−3
x 10
3
f −1
−2
−3
−4
−5
16 16.5 17 17.5 18 18.5 19 19.5 20
v
3
32
30
28
26
v 24
22
20
18 steep downhill
16
0.5 1 1.5 2 2.5 3 3.5
x x 10
4
Figure 10.3: Plot of f (vr+1 ) (top) and speed profile of Example 10.2.1 (bottom).
10.3 Conclusion
We have shown that the new numerical method for finding an optimal coast phase for a steep
downhill track can be extended to handle any number of gradients. Although we have not
been able to prove that the solution found using our method is unique, we have used our
numerical method to plot the function f for many examples; in each example f = 0 appears
to have only one solution on the domain of feasible speeds.
Part III
Phase trajectories
148
Chapter 11
Phase trajectories on a steep uphill

section
In this chapter we take an alternative view of the necessary conditions for an optimal power
phase on a steep uphill section. By constructing (v, η) phase trajectories where η = θ −1 and
θ is the modified adjoint variable defined in Chapter 2, we are able to gain further insights
into optimal strategies.
We assume piecewise constant gradient and use an exact integration of the adjoint equation
to find an explicit relation between v and η. We then form alternative necessary conditions
for an optimal power phase by following this relation along the various sections of constant
gradient between the switching points, and consider properties of the relationship between v
and η. Finally, we investigate phase diagrams of v and η for various track configurations.
149
Chapter 11. Phase trajectories on a steep uphill section 150
11.1 An exact integral for the adjoint equation
The speed of a train powering along a section with constant gradient acceleration γ is given
by the equation
dv P r(v) − γ
= 2− . (11.1)
dx v v
The modified adjoint equation of a train in power mode, derived in Chapter 2, is
dθ P + ψ(v) P + ψ(V )
− · θ = (−1) .
dx v3 v3
By substituting η = θ − 1 we obtain an alternative adjoint equation
dη ψ(v) + P ψ(v) + P ψ(V ) + P
− 3
·η = 3
− . (11.2)
dx v v v3
Using η as the adjoint variable, an optimal power phase for a steep uphill section must start
and finish with η = 0.
We will find a direct relationship between the speed v and the modified adjoint variable η.
The integrating factor for the adjoint equation (11.2) is

ψ(v) + P
IF = exp − dx
v3
from which we deduce

d ψ(v) + P ψ(v) + P ψ(V ) + P ψ(v) + P
η. exp − dx = − exp − dx .
dx v3 v3 v3 v3
(11.3)
If we neglect the arbitrary constant of integration we can show that IF = vdv/dx where
v = v(x) is the solution to (11.1). This gives us a useful link between the equation of motion
and the adjoint equation. We start by differentiating (11.1), which gives

d2 v 2P r (v) r(v) − γ dv
= − 3 − + .
dx2 v v v2 dx
Dividing by dv/dx gives

d2 v/dx2 1 P r(v) − γ P r (v)
= − − − −
dv/dx v v2 v v3 v

1 dv P r (v)
= − · − 3− . (11.4)
v dx v v
Recall that ψ(v) = v 2 r (v). Equation (11.4) can be rewritten as
ψ(v) + P 1 dv d2 v/dx2
− = + .
v3 v dx dv/dx
Integrating both sides gives

ψ(v) + P dv dv
− 3
dx = ln v + ln = ln (v )
v dx dx
and we obtain
ψ(v) + P dv
exp − 3
dx = v . (11.5)
v dx
Since

d ψ(v) + P ψ(v) + P ψ(v) + P
exp − dx =− exp − dx
dx v3 v3 v3
then (11.5) shows us that

d dv ψ(v) + P ψ(v) + P
v =− exp − dx . (11.6)
dx dx v3 v3
Using (11.6) we can rewrite (11.3) as

d dv ψ(v) + P ψ(v) + P ψ(V ) + P dv
v ·η = 3
exp − 3
dx − 3
·v
dx dx v v v dx

ψ(v) + P ψ(v) + P ψ(V ) + P dv
= 3
exp − 3
dx − ·
v v v2 dx

d dv d ψ(V ) + P
= − v +
dx dx dx v
and hence
dv dv ψ(V ) + P
· η = −v
v + +C
dx dx v
P P ψ(V ) + P
=⇒ − r(v) + γ η = − + r(v) − γ + +C
v v v
=⇒ [P − ϕ(v) + γv] η = ϕ(v) − γv + ψ(V ) + Cv. (11.7)
Since
ψ(V ) = −ϕ(V ) + V ϕ (V ),
we have
[P − ϕ(v) + γv] η = ϕ(v) − [ϕ(V ) + ϕ (V )(v − V )] + (ϕ (V ) − γ + C)v
= ϕ(v) − LV (v) + (ϕ (V ) − γ + C)v (11.8)
where LV (v) = ϕ(V ) + ϕ (V )(v − V ). Hence
ϕ(v) − LV (v) + (ϕ (V ) − γ + C)v

η= . (11.9)
P − ϕ(v) + γv
Equation (11.9) shows the dependence of η on v for power phases on a track section with
constant gradient acceleration γ. By selecting different values for the constant C we can
select different curves (v, η). We will use this feature later to ensure that η is continuous at
gradient change points.
11.2 Necessary conditions for an optimal power phase
We can use (11.9) to develop necessary conditions for an optimal power phase for a steep
uphill section. Note that η is non-negative during an optimal power phase. If we let
Ci = ϕ (V ) − γi + C (11.10)
where γi is the gradient acceleration on the interval [xi , xi+1 ] then (11.9) can be rewritten on
this interval in the general form
ϕ(v) − LV (v) + Ci v εV (v) + Ci v

η= = (11.11)
P − ϕ(v) + γi v hi (v)
where
εV (v) = ϕ(v) − LV (v) (11.12)
and
hi (v) = P − ϕ(v) + γi v. (11.13)
The constant Ci will be different on different gradient sections, but η must be continuous at
points where the gradient changes. Note that dη/dv may have a jump discontinuity at these
points but will be continuous elsewhere.
Suppose a power phase starts on track section j with gradient acceleration γ j . We can use
this continuity property to calculate a sequence of values {C j , Cj+1 , Cj+2, . . .} that define
a feasible speed profile for the steep uphill section. Later, we will show that C j = 0 gives
a speed profile that satisfies the necessary condition η = 0 at the starting point x = p ∈
(pj , pj+1) for an optimal power phase.
11.2.1 A single steep gradient
First we consider a simple case of a track with three constant gradient sections. The first and
last are non-steep and the middle section is steep at speed V . Thus
P − ϕ(V ) + γiV > 0
for i = 0, 2 and
P − ϕ(V ) + γ1 V < 0.
The steep section starts at x1 and ends at x2 . The track is non-steep on (−∞, x1 ] and [x2 , ∞).
The train powers from point p < x1 to point q > x2 , as shown in Figure 11.1.
First consider the interval [p, x1 ). We rewrite (11.11) for i = 0 in the form
h0 (v)η = εV (v) + C0 v. (11.14)
For an optimal strategy, we must have v = V and η = 0 at x = p ∈ (−∞, x1 ]. Since

εV (V ) = 0 equation (11.14) shows that C 0 = 0. Hence (11.14) becomes
h0 (v)η = εV (v). (11.15)

v1
V V
v2
J2
J1
J0
p x1 x2 q
Figure 11.1: Speed and elevation profiles with one steep gradient.
At x = x1 , set η(x1 ) = η1 and v(x1 ) = v1 . Then we obtain
h0 (v1 )η1 = εV (v1 ). (11.16)
On the interval [x1 , x2 ] we rewrite (11.11) for i = 1 in the form
h1 (v)η = εV (v) + C1 v. (11.17)
At x = x1 we have
h1 η1 = εV (v1 ) + C1 v1 . (11.18)
Equation (11.16) gives η1 in terms of v1 and if we substitute this expression into (11.18) we
obtain
h1 (v1 )
· εV (v1 ) = εV (v1 ) + C1 v1 .
h0 (v1 )
Hence we express C1 in terms of v1 :

1 h1 (v1 )
C1 = − 1 εV (v1 ). (11.19)
v1 h0 (v1 )
At x = x2 we obtain
εV (v1 )
h1 (v2 )η2 = εV (v2 ) + (γ1 − γ0 ) · · v2 . (11.20)
h0 (v1 )
Now continue in the same way on the next interval to find C 2 . On the interval [x2 , ∞) using
(11.11) for i = 2 we have
h2 (v)η = εV (v) + C2 v. (11.21)
At x = x2 we have
h2 (v2 )η2 = εV (v2 ) + C2 v2 . (11.22)
From (11.20) we find η2 and then substitute into (11.22) to express C 2 in term of v2 , giving

h2 (v2 ) εV (v1 )
εV (v2 ) + (γ1 − γ0 ) · v2 = εV (v2 ) + C2 v2 .
h1 (v2 ) h0 (v1 )
εV (v2 ) h2 (v2 ) εV (v1 )

=⇒ C2 = (γ2 − γ1 ) · + (γ1 − γ0 ) · · . (11.23)
h1 (v2 ) h1 (v2 ) h0 (v1 )
An optimal strategy finishes at x = q with v = V and η = 0. At x = q we have ε V (V ) = 0,
and so from (11.21) we have C2 = 0.
εV (v2 ) εV (v1 )
(γ2 − γ1 ) · + (γ1 − γ0 ) = 0.
h2 (v2 ) h0 (v1 )
Therefore a necessary condition for an optimal strategy is
εV (v2 ) εV (v1 )
(γ2 − γ1 ) + (γ1 − γ0 ) = 0. (11.24)
h2 (v2 ) h0 (v1 )
We will now show that the necessary conditions (11.24) is equivalent to condition (5.46) in
Chapter 5. The condition (5.46) is
[P − ϕ(vb ) + γ0 vb ]µ(vc ) − [ϕ(vb ) − LV (vb )](γ0 − γ1 ) = 0.
Recall that
hi (v) = P − ϕ(v) + γiv
and
εV (v) = ϕ(v) − LV (v)
= ϕ(v) − [ϕ(V ) + ϕ (V )(v − V )]
= ϕ(v) − vϕ (V ) + ψ(V ).
By substituting these expressions into (11.24) we have

ϕ(v2 ) − v2 ϕ (V ) + ψ(V ) ϕ(v1 ) − v1 ϕ (V ) + ψ(V )
(γ2 − γ1 ) + (γ1 − γ0 ) = 0.
P − ϕ(v2 ) + γ2 v2 P − ϕ(v1 ) + γ1 v1
Multiplying through by P − ϕ(v 1 ) + γ1 v1 gives

ϕ(v2 ) − v2 ϕ (V ) + ψ(V )
(γ2 − γ1 )[P − ϕ(v1 ) + γ1 v1 ]
P − ϕ(v2 ) + γ2 v2
+ (γ1 − γ0 )[ϕ(v1 ) − v1 ϕ (V ) + ψ(V )] = 0. (11.25)
Note that v1 and v2 are equivalent to vb and vc used in Chapter 5. From (6.14) we have
ϕ(vc ) − ϕ (V )vc + ψ(V )
µ = (γ2 − γ1 )
P − ϕ(vc ) + γ2 vc
and hence (11.25) can be rewritten as
[P − ϕ(v1 ) + γ1 v1 ]µ + (γ1 − γ0 )[ϕ(v1 ) − v1 ϕ (V ) + ψ(V )] = 0.
Thus (11.24) is equivalent to equation (5.46) in Chapter 5.
In summary, for a single steep uphill gradient we have
C0 = C2 = 0 ⇐⇒ (11.24) ⇐⇒ (5.46).
Note that
h1 (v1 ) P + γ1 v1 − ϕ1 (v1 ) − P − γ0 v1 + ϕ(v1 ) (γ1 − γ0 )v1
−1 = =
h0 (v1 ) P + γ0 v1 − ϕ(v1 ) h0 (v1 )
and hence (11.19) shows that
(γ1 − γ0 )εV (v1 )
C1 = < 0.
h0 (v1 )
This is consistent with the fact that we cannot switch from power back to speedhold on the
steep section (since we cannot speedhold on the steep section).
11.2.2 Two steep gradient sections
In this section our aim is to find new necessary conditions for an optimal strategy on a track
that consists of two steep sections with constant gradient accelerations γ 1 and γ2 and two
v1
V v2 V
v3
J3
J2
J1
J0
p x1 x2 x3 q
Figure 11.2: Speed and elevation profile with two steep gradients.
non-steep sections having gradient accelerations γ 0 and γ3 . We assume that [x1 , x2 ] and
[x2 , x3 ] are steep at speed V and (−∞, x1 ] and [x3 , ∞) are non-steep. The track is shown in
Figure 11.2.
As in Section 11.2.1 we argue that on (p, x1 ) that C0 = 0 and that
h0 (v)η = εV (v).
On the interval (x1 , x2 ) we have
h1 (v)η = εV (v) + C1 v
and
εV (v1 )
C1 = (γ1 − γ0 ) (11.26)
h0 (v1 )
and v1 = v(x1 ). On (x2 , x3 ) we have
h2 (v)η = εV (v) + C2 v
and by substituting v = v 2 in the expressions for h1 (v) and h2 (v) we can see that
εV (v2 ) h2 (v2 )
C2 = (γ2 − γ1 ) + C1
h1 (v2 ) h1 (v2 )
εV (v2 ) h2 (v2 ) εV (v1 )
= (γ2 − γ1 ) + (γ1 − γ0 ) (11.27)
h1 (v2 ) h1 (v2 ) h0 (v1 )
and v2 = v(x2 ). On (x3 , x4 ) we have
h3 (v)η = εV (v) + C3 v (11.28)
and by substituting v = v 3 in the expressions for h2 (v) and h3 (v) we see that
εV (v3 ) h3 (v3 )
C3 = (γ3 − γ2 ) + C2
h2 (v3 ) h2 (v3 )
εV (v3 ) h3 (v3 ) εV (v2 )
= (γ3 − γ2 ) + (γ2 − γ1 ) ·
h2 (v3 ) h2 (v3 ) h1 (v2 )
h3 (v3 ) h2 (v2 ) εV (v1 )
+(γ1 − γ0 ) · · (11.29)
h2 (v3 ) h1 (v2 ) h0 (v1 )
and v3 = v(x3 ). As in Subsection 11.2.1, in order to obtain an optimal strategy we must have
v = V and η = 0 at x = q, and since εV (V ) = 0 it follows that from (11.28) that C 3 = 0.
Therefore a necessary condition for an optimal strategy is

εV (v3 ) h3 (v3 ) εV (v2 ) h3 (v3 ) h2 (v2 ) εV (v1 )
(γ3 −γ2 ) +(γ2 −γ1 ) · +(γ1 −γ0 ) · · = 0. (11.30)
h2 (v3 ) h2 (v3 ) h1 (v2 ) h2 (v3 ) h1 (v2 ) h0 (v1 )
As before, we show that condition (11.30) is equivalent to (6.15) in Chapter 6. Condition

(6.15) can be written as
[ϕ(vd ) − γ11 vd − P ]µ2 − [ϕ(vd ) − γ12 vd − P ]µ1 − [ϕ(vd ) − LV (vd )](γ12 − γ11 ) = 0. (11.31)
Using the notation of (11.12) and (11.13) and equations (6.10) and (6.11) we have
εV (v)
µ1 = (γ0 − γ1 )
h0 (v)
and
εV (v)
µ2 = (γ3 − γ2 )
h3 (v)
and so (11.31) is the same as (11.30).
Therefore in summary we have
C0 = C3 = 0 ⇐⇒ (11.30) ⇐⇒ (6.15)
and so C0 = C3 = 0 is a necessary condition for an optimal power phase.

11.3 Properties of η(v)
In this section we use the relationship (11.11) between v and η to show how η(v) behaves on
various parts of a power phase.
Lemma 11.3.1 Consider a track section with constant gradient acceleration γ. Let
ϕ(v) − LV (v) + Cv
η= (11.32)
P − ϕ(v) + γv
where C is a constant. Then for each fixed point C
1. the equation dη/dv = 0 has a unique solution;
dη
2. the solution to the equation dv
= 0 is given by v = V if and only if C = 0; and
3. the function η(v) has a maximum turning point on a steep uphill section of track with
P − ϕ(V ) + γV < 0 and a minimum turning point on a non-steep section of track with
P − ϕ(V ) + γV > 0.
Proof:
First, we show that dη/dv = 0 has a unique solution for each value of C.
Differentiating (11.32) gives
dη (ϕ (v) − γ)(ψ(V ) + P ) − (ϕ (V ) − γ − C)(ψ(v) + P )

= . (11.33)
dv [P − ϕ(v) + γv]2
Setting dη/dv = 0 gives

ϕ (v) − γ ϕ (V ) − γ − C
= (11.34)
ψ(v) + P ψ(V ) + P
ϕ (V ) − γ − C
f (v) − =0 (11.35)
ψ(V ) + P
where
ϕ (v) − γ
f (v) = . (11.36)
ψ(v) + P
Recall that
ϕ(v) = vr(v) =⇒ vϕ (v) = ϕ(v) + ψ(v)
and
ψ(v) = v 2 r (v) =⇒ ψ (v) = vϕ (v).
Differentiating (11.36) we have
ϕ (v) (ϕ (v) − γ)vϕ (v)

f (v) = −
ψ(v) + P (ψ(v) + P )2
ϕ (v)(ψ(v) + P ) − (ϕ (v) − γ)vϕ (v)
=
(ψ(v) + P )2
ϕ (v)[P − ϕ(v) + vϕ (v) − vϕ (v) + γv]
=
(ψ(v) + P )2
ϕ (v)[P + γv − ϕ(v)]
=
(ψ(v) + P )2
> 0 if P − ϕ(v) + γv > 0.
This means f (v) is a continuous monotone increasing function and so equation (11.34) has
only one solution for each value of C. Therefore, for any given value of C, there is only one
value of v that gives dη/dv = 0.
Now we show that dη/dv(V ) = 0 ⇐⇒ C = 0.
If v = V and C = 0 then by substituting into (11.33) we have dη/dv = 0.
If dη/dv = 0 then we obtain equation (11.34). We can easily see that equation (11.34) is
satisfied with v = V and C = 0. So dη/dv = 0 if and only if v = V and C = 0.
We now check the nature of the optimal switching point (η, v) = (0, V ).
The second derivative of (11.32) is

dη
d2 η ϕ (v)(ψ(V ) + P ) − (ϕ (V ) − γ − C)vϕ (v) 2 dv · (ϕ (v) − γ)
= +
dv 2 [P − ϕ(v) + γv]2 P − ϕ(v) + γv
dη
ϕ [(P − ϕ(v) + γv) + ϕ(v) − LV (v) + Cv] 2 dv · (ϕ (v) − γ)

= +
(P − ϕ(v) + Cv)2 P − ϕ(v) + γv
ϕ (v)[η + 1] + 2 dη
dv
· (ϕ (v) − γ)
= . (11.37)
P − ϕ(v) + γv
But dη/dv = 0 at v = V , and so
d2 η ϕ (V )[η + 1]
(V ) = . (11.38)
dv 2 P − ϕ(V ) + γV
Recall that η > 0 for a power phase and ϕ(v) is a convex function.
d2 η
If dη/dv = 0 at v = V on a steep section with P − ϕ(V ) + γV < 0 then dv2
< 0 and
we have a maximum turning point. If dη/dv = 0 at v = V on a non-steep section with
d2 η
P − ϕ(V ) + γV > 0 then dv2
> 0 and we have a minimum turning point.
We can use the results of Lemma 11.3.1 to analyse the behaviour of η(v) during a power
phase on a steep uphill section.
The power phase must start on a non-steep section with v = V . From the Lemma, we know
that this implies that C = 0 and dη/dv = 0. In Section 11.2.1 we showed that C = 0 is a
necessary condition for the start of the power phase, and so dη/dv = 0 is also a necessary
condition for the start of the power phase.
Similarly, the power phase must start on a non-steep section with v = V , C = 0 and
dη/dv = 0.
Figure 11.3 shows η(v) for the start (blue) and finish (green) of an optimal power phase. To
get from the start of the power phase to the end of the power phase we require an η curve as
shown in red. Speed must decrease on this section, and so the gradient must be steep uphill.
Furthermore, part 3 of the Lemma says that this η curve has a maximum turning point, as
shown in the diagram.
−3
x 10
20
15
10
−5
17.5 18 18.5 19 19.5 20 20.5 21 21.5 22
v
Figure 11.3: Phase diagram of a power phase on a single steep uphill section.
11.4 Phase diagram for (v, η)
To help us to construct optimal strategies, we want to plot the curves η = η(v) for all possible
values of C.
To understand the behaviour of η, we first consider (11.11) as the speed v approaches the
limiting speed for the gradient. When x → ∞ we have v → W , the limiting speed of the
train on the gradient γ where P = ϕ(W ) − γW . Note that since γ is a non-steep gradient
acceleration then W > V . If η has a limit as v → W then because
ϕ(v) − LV (v) + Cv
η=
P − ϕ(v) + γv
and because the denominator is zero when v = W the numerator must also be zero. Thus
ϕ(W ) − LV (v) + CW = 0
1
=⇒ C = [LV (W ) − ϕ(W )].
W
We denote this critical value C ∗ . In this critical case we have

v
ϕ(v) − LV (v) − [ϕ(W ) − LV (W )] · W
η= . (11.39)
γ(v − W ) + ϕ(W ) − ϕ(v)
Since
v v
LV (v) − LV (W ) = [ϕ(V ) + ϕ (V )(v − V )] − [ϕ(V ) + ϕ (V )(W − V )] ·
W W
v v
= ϕ(V )(1 − ) + ϕ (V ) v − V − (W − V )
W W
v v
= ϕ(V )(1 − ) − V ϕ (V )(1 − )
W W
v
= −ψ(V )(1 − )
W
then (11.39) becomes

1
W
[(W ϕ(v) − vϕ(W )) + ψ(V )(W − v)]
η =
γ(v − W ) + ϕ(W ) − ϕ(v)

1 W ϕ(v)−vϕ(W )
W
· W −v
+ ψ(V )
= ϕ(W )−ϕ(v)
. (11.40)
W −v
−γ
Now we consider the limit of this part of the numerator of (11.40). Using L’Hopitals rule we
have
1 W ϕ(v) − vϕ(W ) 1 W ϕ (v) − ϕ(W ) ψ(W )

lim · = lim · =− (11.41)
v→W W W −v v→W W −1 W
and the limit of a part of the denominator is
ϕ(W ) − ϕ(v) −ϕ (v)

lim = lim = ϕ (W ).
v→W W −v v→W −1
So we have
ψ(V ) − ψ(W )
lim η = . (11.42)
v→W W (ϕ (W ) − γ)
If γ is a non-steep gradient acceleration then V < W and ϕ (W ) − γ > 0, and so the limit
is negative. If γ is a steep gradient acceleration then W > V and ϕ (W ) − γ < 0, and hence
the limit is positive.
We now want to consider η when C takes values on either side of the critical value C ∗ . When
C = C ∗ equation (11.34) becomes
ϕ (v) − γ ϕ (V ) − γ − C ∗
= . (11.43)
ψ(v) + P ψ(V ) + P
If
LV (W ) − ϕ(W )
C < C∗ =
W
then from (11.43) we obtain
ϕ (v) − γ ϕ (V ) − γ − C
<
ψ(v) + P ψ(V ) + P
ϕ (v) − γ ϕ (V ) − γ − C
=⇒ − < 0,
ψ(v) + P ψ(V ) + P
and hence from (11.33) we have
dη
< 0.
dv
1
Substituting C = C ∗ − ε = W
[LV (W ) − ϕ(W )] − ε into (11.32) we have
ϕ(v) − LV (v) + [ LV (WW

)−ϕ(W )
− ε]v
η= . (11.44)
P + γv − ϕ(v)
When v → W we have
P = ϕ(W ) − γW
and the numerator of (11.44) can be written as

ψ(V ) ϕ(W )
N = ϕ(W ) − LV (W ) + ϕ (V ) − γ − − + γ W − εW
W W
= −εW.
So as v → W − , P − ϕ(v) + γv = ϕ(W ) − γW − ϕ(W − ) + γW − > 0 then η ↓ −∞ and

when v → W + , P − ϕ(v) + γv = ϕ(W ) − γW − ϕ(W + ) + γW + > 0 then η ↑ ∞.
If
LV (W ) − ϕ(W )
Cγ > C ∗ =
W
then using a similar argument as above we have η ↑ ∞ as v → W − and η ↓ −∞ as v → W + .

For both steep and non-steep gradient accelerations γ we have
ψ(V )
η=
P
when v = 0 and the limit of η when v tends to infinity can be obtained by applying L’Hopitals
rule
ϕ(v) − LV (v) + Cv
lim η = lim
v→∞ v→∞ P + γv − ϕ(v)
ϕ (v) − ϕ (V ) + C
= lim
v→∞ γ − ϕ (v)
ϕ (v)
= lim = −1. (11.45)
v→∞ −ϕ (v)
The behaviour of η on non-steep and steep gradients is illustrated in Figure 11.4. In both
cases, the η curves move up with increasing values of C when v < W and move down when
v > W. For example, when γ = −0.075 and P = 3, the limiting speed is 25.9765 and the
critical value C ∗ = −0.0045. For γ = −0.2 and P = 3, the limiting speed is 13.8656 and
the critical value C ∗ = −0.0073. The curves passing through v = W are the critical curves
with C = C ∗ .
Recall that dη/dv = 0 when v = V and C = 0. Hence, from (11.38) we have
d2 η
<0
dv 2
because P + γV − ϕ(V ) < 0 for a steep gradient. Therefore η has its maximum value of 0
at v = V when γ is a steep gradient acceleration. Similarly η has a minimum of 0 at v = V
for a non-steep gradient acceleration γ.
Figure 11.5 shows the phase diagram of η against speed v of a train powering over a steep
uphill section comprising two constant slopes with gradient accelerations γ 11 = 0.5 and
1.5
C= − 0.05
C= − 0.025
C=0
C=0.025
1 C=0.05
C=C*
0.5
η
0
−0.5
−1
−1.5
0 5 10 15 20 25 30 35 40 45 50
v
1.5
C= − 0.01
C= − 0.005
C=0
C=0.005
1 C=0.01
C=C*
0.5
η
0
−0.5
−1
−1.5
0 5 10 15 20 25 30
v
Figure 11.4: Phase plot with different values of C for the case of non-steep gradient (top) and steep
gradient (bottom).
0.14
0.12
0.1
0.08
η
0.06
0.04
0.02
−0.02
10 15 20 25
v
Figure 11.5: Phase diagrams of an optimal strategy obtained obtained using the traditional method
and the explicit formula of η
γ12 = 0.2. The holding speed is V = 20. The optimal power phase plotted in this diagram
was calculated by solving the equation of motion (11.1) and the adjoint equation (11.2)
simultaneously using a Runga-Kuta method and then using a shooting method to find the
starting point of the power phase which satisfies the necessary conditions at the switching
points.
The thin lines on each edge of the diagram are plots of η calculated using (11.32); these lines
coincide with the optimal phase plot on each track section.
0.15
0.1
η
0.05
−0.05
12 14 16 18 20 22 24
v
Figure 11.6: Phase diagrams for one optimal and two non-optimal strategies
Figure 11.6 shows phase plots for one optimal and two non-optimal profiles. The thick blue
line is optimal. The green profile starts powering too soon, and returns to v = V with η > 0.
The red profile starts powering too late, and returns to v = V with η < 0.
The critical curve of a gradient acceleration γ is calculated using the formula (11.32) with
the limiting speed W of that gradient and the critical value of C
LV (W ) − ϕ(W )
C∗ = .
W
Figure 11.7 shows the critical curves for each of the steep uphill gradients in the previous
example. The blue curve is the original phase plot for the optimal power phase. The green
curve is the phase plot for a steep section where the first gradient is lengthened by a factor
of two. The pink curve is the phase plot for a steep section where the second gradient is
lengthened by a factor of four.
0.3
0.25
critical curve for γ12 critical curve for γ11

0.2
0.15
η
0.1
0.05
limiting speed for γ limiting speed for γ12

−0.05 11
−0.1
5 10 15 20 25 30
v
Figure 11.7: Phase diagrams and critical curves

11.5 Phase diagrams of combinations of long and short steep

sections
In this section we systematically vary the lengths L 11 and L12 of steep uphill sections to see
the effect on the optimal strategy. We consider eight cases, summarised in table 11.5. In each
case, we have γ0 = γ2 = 0.075. For each of the examples, we calculate the phase trajectories
Table 11.1: Eight combinations of long and short steep sections.

Figure γ11 γ12 L11 L12
11.8 0.3 0.2 500 {400, 1000, 1600, . . . , 3400}
11.9 0.3 0.2 1500 {200, 600, . . . , 2200}
11.10 0.3 0.2 {200, 600, . . . , 2200} 1000
11.11 0.3 0.2 {700, 1200, . . . , 3200} 1600
11.12 0.2 0.3 1200 {200, 600, . . . , 2200}
11.13 0.2 0.3 4000 {200, 600, . . . , 2200}
11.14 0.2 0.3 {600, 1400, . . . , 4600} 400
11.15 0.2 0.3 {200, 600, . . . , 2200} 1200
by finding the optimal power phase using Runge-Kutta method.
In the figure for each example, the trajectory on the first gradient is shown in blue and the
trajectory on the second gradient is shown in red.
11.5.1 γ11 = 0.3, L11 = 500, γ12 = 0.2, L12 ∈ {400, 1000, . . . , 3400}
Figure 11.8 shows phase trajectories where the length of the second, less steep section is
varied. As the length of the second section is increased:
• the trajectory on the second steep section (red) approaches the critical curve;
• the speed of the optimal trajectory increases in the first section; and
• the starting location for the optimal power phase occurs further before the start of the
steep sections.
0.3
0.25
0.2 critical curve of the first slope
0.15
η
0.1 critical curve of the second slope
0.05
limiting speed of the first slope

−0.05
limiting speed of the second slope

−0.1
5 10 15 20 25 30
v
Figure 11.8: Phase diagrams for optimal power phases for tracks with γ11 = 0.3, L11 = 500,
γ12 = 0.2, L12 ∈ {400, 1000, . . . , 3400}.
11.5.2 γ11 = 0.3, L11 = 1500, γ12 = 0.2, L12 ∈ {200, 600, . . . , 2200}
In Figure 11.9, the first steep section is longer than in the previous example. As the length
of the second steep section is increased:
• the speed of the optimal trajectory increases on both sections; and
steep sections.
0.3
0.25
0.2
critical curve of the first slope

0.15
η
0.1
0.05
0
critical curve of the second slope
−0.05
−0.1
5 10 15 20 25 30
v
Figure 11.9: Phase diagrams for optimal power phases with tracks γ11 = 0.3, L11 = 1500, γ12 = 0.2,
L12 ∈ {200, 600, . . . , 2200}.
11.5.3 γ11 = 0.3, L11 ∈ {200, 600, . . . , 2200}, γ12 = 0.2, L12 = 1000
In Figure 11.10, the length of the first steep gradient varies while the length of the second
steep gradient is constant. As the length of the first steep gradient is increased:
• the trajectory on the first steep section (blue) approaches the critical curve;
• the speed of the optimal trajectory increases at the start of the first steep section, but
decreases at the end of this section;
• the speed of the optimal trajectory decreases at both ends of the second steep section;
and
steep sections.
0.3
0.25
0.2
0.15
η
0.1
0.05
0
−0.05
−0.1
5 10 15 20 25 30
v
Figure 11.10: Phase diagrams for optimal power phases for tracks with γ11 = 0.3, L11 ∈
{200, 600, . . . , 2200}, γ12 = 0.2, L12 = 1000.
11.5.4 γ11 = 0.3, L11 ∈ {700, 1200, . . . , 3200}, γ12 = 0.2, L12 = 1600
Figure 11.11 shows phase trajectories of optimal power phases where the length of the second
steep section is longer. As the length of the first steep gradient is increased:
• the trajectory on the first steep section (blue) approaches the critical curve;
decreases at the end of the first steep section;
• the speed of the optimal trajectory decreases at both ends of the second steep section;
and
steep sections.
0.3
0.25
0.2
0.15
η
0.1
0.05
0
−0.05
−0.1
5 10 15 20 25 30
v
{700, 1200, . . . , 3200}, γ12 = 0.2, L12 = 1600.
11.5.5 γ11 = 0.2, L11 = 1200, γ12 = 0.3, L12 ∈ {200, 600, . . . , 2200}
In Figure 11.12, as the length of the second steep section is increased:
• the speed of the optimal trajectory increases at the both ends of the first section;
• the speed of the optimal trajectory increases at the start of the second steep section, but
decreases at the end of the second steep section; and
steep sections.
0.3

0.25
0.2
0.15
η
0.1
0.05
0

−0.05
−0.1
5 10 15 20 25 30
v
γ12 = 0.3, L12 ∈ {200, 600, . . . , 2200}.
11.5.6 γ11 = 0.2, L11 = 4000, γ12 = 0.3, L12 ∈ {200, 600, . . . , 2200}
In Figure 11.13, the first steep section is longer. As the length of the second steep section is
increased:
• the speed of the optimal trajectory increases on the first section;
• the speed of the optimal trajectory increases at the start of the second section, and
decreases at the end of the second section; and
steep sections.
0.3

0.25
0.2
0.15
η
0.1
0.05
0

−0.05
−0.1
5 10 15 20 25 30
v
γ12 = 0.3, L12 ∈ {200, 600, . . . , 2200}.
11.5.7 γ11 = 0.2, L11 ∈ {600, 1400, . . . , 4600}, γ12 = 0.3, L12 = 400
In Figure 11.14, the length of the first steep gradient varies while the length of the second
steep gradient is constant. As the length of the first steep gradient is increased:
• the trajectory on the first steep section approaches the critical curve;
decreases at the end of the first steep section;
• the speed of the optimal trajectory decreases at the end of the second steep section;
steep sections.
0.3

0.25
0.2
0.15
η
0.1
0.05
0

−0.05
−0.1
5 10 15 20 25 30
v
{600, 1400, . . . , 4600}, γ12 = 0.3, L12 = 400.
11.5.8 γ11 = 0.2, L11 ∈ {200, 600, . . . , 2200}, γ12 = 0.3, L12 = 1200
Figure 11.15 displays the results when the second steep gradient is steeper and the length
of the first is allowed to vary. In this case, something unexpected happens as the length of
the first gradient is increased. Initially, the optimal speed at the bottom of the steep section
increases with the length of this part, and the optimal trajectories (blue) of the first gradient
move away from its critical curve. But as the length of the first gradient increases further,
the trajectories regress towards the critical curve. That is, the length of the run-up for the
steep sections initially increases with the length of the first gradient, but then decreases as
the length of the first gradient continues to increase. It is not obvious why this happens.
0.3

0.25
0.2
0.15
η
0.1
0.05
0

−0.05
−0.1
5 10 15 20 25 30
v
{200, 600, . . . , 2200}, γ12 = 0.3, L12 = 1200.
11.6 Conclusion
Integrating the adjoint equation gives a direct relationship between speed v and adjoint vari-
able η on any gradient. The (v, η) trajectory followed by the train depends on a constant
Ci that can be chosen for each gradient section i. To construct an optimal power phase for
a steep uphill section, we must find a sequence of constants C i starting and finishing with
Ci = 0, such that speed v and adjoint variable η are continuous at each gradient change
point.
By studying the optimal power phase trajectories, we can gain further insights into the nature
of an optimal journey.
Chapter 12
Phase trajectories on a steep downhill

section
In this chapter we repeat the analysis of Chapter 11, but this time for an optimal coast phase
on a steep downhill section of track.
12.1 An exact integral for the adjoint equation
The speed of a train coasting on a track section with constant gradient acceleration γ is given
by the equation
dv r(v) − γ
=− . (12.1)
dx v
As seen in Chapter 2, the modified adjoint equation is
dθ ψ(v) ψ(V )
− 3 ·θ =− 3 . (12.2)
dx v v
By substituting η = θ − 1, we obtain the alternative adjoint equation
− 3 ·η = . (12.3)
dx v v3
180
Chapter 12. Phase trajectories on a steep downhill section 181
Proceeding as we did in Section 11.1, we can show that the integrating factor for (12.2) is

r (v) dv
IF = exp − dx = v (12.4)
v dx
and the modified adjoint variable η is an expression in terms of v, similar to the one we found
in Chapter 11
ϕ(v) − LV (v) + (ϕ (V ) − γ + C)v
η= . (12.5)
γv − ϕ(v)
12.2 Necessary conditions for an optimal coast phase
If we define
Ci = ϕ (V ) − γi + C (12.6)
then (12.5) can be rewritten in the general form

ϕ(v) − LV (v) + Ci v εV (v) + Ci v
η= = (12.7)
γi v − ϕ(v) hi (v)
where
εV (v) = ϕ(v) − LV (v) (12.8)
and
hi (v) = γiv − ϕ(v). (12.9)
Equation (12.9) is the same as (11.13), with P = 0 because we are coasting.
If the coast phase starts on track section i, we need to have v = V and η = 0. Because v
and η are continuous, given Ci we can calculate the sequence of values {Ci, Ci+1 , Ci+2 , . . . }.
The values of the sequence {C}k+1
i can be used as an alternative condition for an optimal
strategy.
As in Chapter 11, suppose a coast phase starts on track section i with gradient acceleration
γi . We wish to find a sequence of values {Ci , Ci+1 , . . .} that define a continuous speed profile
for the coast phase.
v2
V V
J0
v1
J1
J2
p x1 x2 q
Figure 12.1: Coast phase for a single steep downhill gradient.
12.2.1 A single steep gradient section
First, we consider a track with a non-steep gradient acceleration γ0 on the interval (−∞, x1 ),
a single steep downhill gradient acceleration γ 1 on the interval [x1 , x2 ], and a non-steep
gradient acceleration γ2 on the interval (x2 , ∞). The train starts coasting at p < x1 and
finishes coasting at q > x2 , as shown in Figure 12.1.
Using the same argument as Subsection 11.2.1, we can show that the necessary condition for
an optimal coast phase is
εV (v2 ) εV (v1 )
(γ2 − γ1 ) + (γ1 − γ0 ) =0 (12.10)
h2 (v2 ) h0 (v1 )
which is equivalent to condition (8.31),
[ϕ(vb ) − γ0 vb ] µ − [ϕ(vb ) − LV (vb )] (γ1 − γ0 ) = 0.
As in Chapter 11, the optimal coast phase has C 0 = C2 = 0.
12.2.2 A section of two steep gradients
We now consider a track with two steep gradient accelerations γ1 and γ2 lying between two
non-steep gradient accelerations γ0 and γ3 as seen in Figure 12.2. The coast phase starts at p
v3
V V
v2
J0 v1
J1
J2
J3
p x1 x2 x3 q
Figure 12.2: Coast phase for a track with two steep downhill gradients.
and ends at q.
Using a similar argument to that of Subsection 11.2.1, we can show that the necessary con-
dition for an optimal coast phase is
εV (v3 ) εV (v2 ) h3 (v3 ) εV (v1 ) h2 (v2 ) h3 (v3 )
(γ3 −γ2 ) + (γ2 −γ1 ) · + (γ1 −γ0 ) · · = 0 (12.11)
h2 (v3 ) h1 (v2 ) h2 (v3 ) h0 (v1 ) h1 (v2 ) h2 (v3 )
which is equivalent to the necessary condition (9.21), (9.18) and (9.19)
[ϕ(vd ) − γ11 vd ]M2 (vc ) − [ϕ(vd ) − γ12 vd ]M1 (vb ) − (γ12 − γ11 )[ϕ(vd ) − LV (vd )] = 0
obtained by minimising the cost function J as seen in Chapter 9. Once again, the optimal
coast phase has C0 = C3 = 0.
12.3 Properties of η(v)
In this section we observe that an optimal coast phase starts and finishes on a non-steep
section only and at the switching points, C = 0. The following lemma is more or less
identical to the corresponding lemma in Chapter 11.
Lemma 12.3.1 If
ϕ(v) − LV (v) + Cv
η= (12.12)
γv − ϕ(v)
then
1. The equation dη/dv = 0 has a unique solution.
dη
2. dv
(V ) = 0 ⇐⇒ C = 0.
3. The function η(v) has a minimum turning point when dη/dv = 0 at v = V on a steep
downhill track with γV − ϕ(V ) > 0, and a maximum turning point when dη/dv = 0
at v = V on non-steep gradients with γV − ϕ(V ) < 0.
The proof of this Lemma is similar to the proof of Lemma 11.3.1. We can also use the results
of Lemma 12.3.1 to analyse the behaviour of η(v) during a coast phase on a steep downhill
section, similar as we did in Chapter 11.
The coast phase must start on a non-steep section with v = V . From the Lemma, we know
that this implies that C = 0 and dη/dv = 0. In Section 12.2.1 we showed that C = 0 is a
necessary condition for the start of the power phase, and so dη/dv = 0 is also a necessary
condition for the start of the coast phase. Similarly, the coast phase must start on a non-steep
section with v = V , C = 0 and dη/dv = 0.
Figure 11.3 shows η(v) for the start (blue) and finish (green) of an optimal coast phase. To
get from the start of the coast phase to the end of the coast phase we require an η curve
as shown in red. Speed must increase on this section, and so the gradient must be steep
downhill. Furthermore, part 3 of the Lemma says that this η curve has a minimum turning
point, as shown in the diagram.
−0.02
−0.04
−0.06
η
−0.08
−0.1
−0.12
−0.14
−0.16
17 18 19 20 21 22 23 24 25 26
v
Figure 12.3: Phase diagram of a coast phase on a single steep downhill section.
12.4 Phase diagram for (η, v)
We now consider the function η(v) for different values of C. As in Chapter 11, we find that
there is a critical value
LV (W ) − ϕ(W )
C∗ =
W
for which the adjoint variable η approaches a limit as speed v approaches the limiting speed
W for the gradient.
For both non-steep and steep gradient accelerations γ, if C < C ∗ then η ↓ −∞ as v → W −

and η ↑ +∞ as v → W + ; if C > C ∗ , η ↑ +∞ as v → W − and η ↓ −∞ as v → W + . Note
that since
ϕ(v) − LV (v) + Cv
η=
γv − ϕ(v)
then
lim η = +∞.
v→0
By applying L’Hopital’s rule we have
ϕ(v) − LV (v) + Cv + εv
lim η = lim
v→∞ v→∞ γv − ϕ(v)
ϕ (v) − ϕ (V ) + C + ε

= lim
v→∞ γ − ϕ (v)
ϕ (v)
= lim = −1. (12.13)
v→∞ −ϕ (v)
The characteristics of η are illustrated in Figure 12.4. The η curves have been plotted for
various values of C. The critical curve of η is the one obtained when v = W and the
corresponding C ∗ . This critical curve is shown in pink. The black vertical line is the limiting
speed, W . The dashed horizontal line is the limit of η when v → ∞.
Figure 12.5 shows the optimal coast phase for an example track with two steep downhill
gradients. The blue curve is calculated by solving the state and adjoint equations using a
numerical solver, then using a shooting method to find initial conditions which satisfy the
necessary conditions for an optimal coast phase. The thin curves are calculated using (12.5).
Figure 12.6 shows an example phase plots of an optimal coast phase and two non-optimal
coast phases: one starts coasting too late and one starts coasting too early. The phase plot
of the coast phase that starts too late is shown in red. On the steep section, the η curve of
this strategy is below than the optimal η curve. At the point where v = V , η < 0 and so
we cannot switch back to speed hold. The η curve of the coast phase that starts too early is
shown in green. On the steep section the η curve is above than the optimal η curve. Hence
the train cannot switch back to speed hold when v = V because η > 0 at that moment.
C= − 0.3
140 C= − 0.2
C=−0.1
C=0
120 C=0.1
C=0.2
C=0.3
C=C*
100
80
η
60
40
20
0 W
−20
0 5 10 15 20 25 30
v
6
C= − 0.3
C= − 0.2
C=−0.1
4 C=0
C=0.1
C=0.2
C=0.3
2 C=C*
0
W
η
−2
−4
−6
−8
−10
0 20 40 60 80 100 120
v
Figure 12.4: Adjoint variable plot with different values of C

−0.2
−0.4
η
−0.6
−0.8
−1
−1.2
10 20 30 40 50 60
v
Figure 12.5: An example phase plot calculated using shooting method and using (12.5).
0.02
−0.02
−0.04
−0.06
η
−0.08
−0.1
−0.12
−0.14
−0.16
16 17 18 19 20 21 22 23 24 25 26
v
Figure 12.6: Optimal and non-optimal profiles η. Too early: green, too late: red.
12.5 Phase diagrams for combinations of long and short

steep downhill gradients.
We consider eight cases:
Table 12.1: Eight combinations of long and short steep sections.

Figure γ11 γ12 L11 L12
12.7 0.2 0.4 1200 {800, 4000, . . . , 23200}
12.8 0.2 0.4 30000 {1600, 3200, . . . , 8000}
12.9 0.2 0.4 {18000, 19200, . . . , 24000} 400
12.10 0.2 0.4 {12000, 18000, . . . , 42000} 1600
12.11 0.4 0.2 1800 {2400, 4800, . . . , 14400}
12.12 0.4 0.2 32000 {2400, 4800, . . . , 14400}
12.13 0.4 0.2 {1800, 5400, . . . , 19800} 800
12.14 0.4 0.2 {5600, 9400, 13000, 198000} 8000
In all eight figures the blue curve indicates the adjoint variable values on the first steep section
and the red part indicates the one on the second steep section. The green part presents the
remainder of the coast phase. The limiting speeds and critical curves of the two slope are
presented in the same colour code as above. In each case, we have γ0 = γ2 = 0.01.
12.5.1 γ11 = 0.2, L11 = 1200; γ12 = 0.4, L12 ∈ {800, 4000, . . . , 23200}.
Since γ11 < γ12 then the limiting speed of the first slope is less than the limiting speed of the
second slope. When we increase the length of the second slope
• the trajectory of the first slope goes past its critical curve even all speed on this slope
is far from its liming speed.
• The speed at the start of the second slope slightly decreases but the speed at the end of
the second slope quickly increases.
• The trajectory of the second slope approaches the critical curve.
• the speed decreases at the start but increases at the end of the second slope.
• the starting location for the optimal coast phase occurs further before the start of the
steep sections.
−0.1
−0.2
−0.3
−0.4
η
−0.5
−0.6
−0.7
critical curves
−0.8
−0.9
−1
10 20 30 40 50 60 70 80 90
v
Figure 12.7: Phase diagrams for optimal coast phases for tracks with γ11 = 0.2, L11 = 1200,
γ12 = 0.4, L12 ∈ {800, 4000, . . . , 23200}.
12.5.2 γ11 = 0.2, L11 = 3000; γ12 = 0.4, L12 ∈ {1600, 3200, . . . , 8000}.
In Figure 12.8 the length of the first slope is longer and kept constant while the length of the
second slope is varied and longer than that represented by Figure 12.7. Since the length of
the first slope is very long then the trajectory on this slope approaches the critical curve(in
blue). When we increase the length of the second slope
• the speed decreases at the start but increases at the end of the first slope, however the
rate of change is very small.
• the speed increases at the start of the second slope and also increases at the end of the
slope but the rate of change is much larger.
−0.1
limiting speed for the first slope
limiting speed for the second slope
−0.2
−0.3
−0.4
η
−0.5
−0.6
−0.7
−0.8
−0.9 critical curves
−1
10 20 30 40 50 60 70 80 90 100
v
Figure 12.8: Phase diagrams for optimal coast phases for tracks with γ11 = 0.2, L11 = 3000,
γ12 = 0.4, L12 ∈ {1600, 3200, . . . , 7200}.
12.5.3 γ11 = 0.2, L11 ∈ {18000, 19200, . . . , 24000} ; γ12 = 0.4, L12 =
400.
In this case we vary the length of the first slope and keep the length of the second slope
constant. When the length of the first slope is changed
• the speed increases at the start of the first slope and also increases but faster at the end
of this slope.
• the speed increases at both ends of the second slope at nearly the same rate.
−0.1
−0.2
−0.3
−0.4
η
−0.5
−0.6
−0.7
−0.8
−0.9 critical curves of the two slopes
−1
10 20 30 40 50 60 70 80 90 100
v
Figure 12.9: Phase diagrams for an optimal coast phase for tracks with γ11 = 0.2, L11 ∈
{18000, 19200, . . . , 24000} , γ12 = 0.4, L12 = 400.
12.5.4 γ11 = 0.2, L11 ∈ {12000, 18000, . . . , 42000} , γ12 = 0.4, L12 =
1600.
In Figure 12.10 we made the length of the second slope longer compared to the one in Figure
12.9. The trajectories appear to be similar to those in Figure 12.9.
−0.1
−0.2 limiting speed of the second slope
−0.3
−0.4
η
−0.5
−0.6
−0.7
−0.8
−0.9 critical curve
−1
10 20 30 40 50 60 70 80 90 100
v
Figure 12.10: Phase diagrams for optimal coast phases for tracks with γ11 = 0.2,L11 ∈
{12000, 18000, . . . , 42000} , γ12 = 0.4, L12 = 1600.
12.5.5 γ11 = 0.4, L11 = 1800; γ12 = 0.2, L12 ∈ {2400, 4800, . . . , 14400}.
In Figure 12.11 the length of the first slope is constant while the length of the second slope
is varied. When we increase the length of the second slope
• the speed decreases at the start and the end of the first slope at slow rates.
• the speed decreases at the start but increases at the end of the second slope at a faster
rate.
−0.1
−0.2
−0.3
−0.4
η
−0.5
−0.6
−0.7
−0.8
−1
10 20 30 40 50 60 70 80
v
Figure 12.11: Phase diagrams for optimal coast phases for tracks with γ11 = 0.4, L11 = 1800 ,
γ12 = 0.2, L12 ∈ {2400, 4800, . . . , 14400}.
12.5.6 γ11 = 0.4, L11 = 3200; γ12 = 0.2, L12 ∈ {2400, 4800, . . . , 14400}.
In Figure 12.12 we make the length of the first slope very long compared to the one in Figure
12.11. Hence the trajectories of the first slope go past the limiting speed of the second slope.
Recall that we now have limiting speed of the first slope is bigger than that of the second
slope. When the length of the first slope increases
• the speed decreases at the start and the end of the first slope at slow rates.
• the speed at the end of the second slope decreases faster than at the start of the slope.
−0.1
−0.2
−0.3
−0.4
η
−0.5
−0.6
−0.7
−0.8
−0.9
critical curves
−1
10 20 30 40 50 60 70 80 90 100
v
Figure 12.12: Phase diagrams for optimal coast phases for tracks with γ11 = 0.4, L11 = 3200 ,
γ12 = 0.2, L12 ∈ {2400, 4800, . . . , 14400}.
12.5.7 γ11 = 0.4, L11 ∈ {1800, 5400, . . . , 19800}; γ12 = 0.2, L12 = 800.
We now vary the length of the first slope and keep the length of the second slope unchanged.
When we increase the length of the first slope
• the speed increases at the start of the second slope and also increases at the end of the
second slope but at a very much faster rate.
• the speed decreases at the start of the first slope and increases very much faster at the
end of the slope.
−0.1 limiting speed of the second slope
−0.2
−0.3
−0.4
η
−0.5
−0.6
−0.7
−0.8
−1
10 20 30 40 50 60 70 80 90 100
v
Figure 12.13: Phase diagrams for optimal coast phases for tracks with γ11 = 0.4, L11 ∈
{1800, 5400, . . . , 19800} , γ12 = 0.2, L12 = 800.
12.5.8 γ11 = 0.4, L11 ∈ {5600, 9400, 13000, 19800}; γ12 = 0.2, L12 =
5600.
This time we make the length of the second slope longer than to that in Figure 12.13 but still
keep it unchanged. The trajectories in Figure 12.14 are similar to those in Figure 12.13.
−0.1
limiting speed of the second section
−0.2
−0.3
−0.4
η
−0.5
−0.6
−0.7
−0.8
critical curves
−0.9
−1
10 20 30 40 50 60 70 80 90 100
v
Figure 12.14: Phase diagrams for optimal coast phases for tracks with γ11 = 0.4, L11 ∈
{5600, 9400, 13000, 19800} , γ12 = 0.2, L12 = 5600.
12.6 Conclusion
Constructing the direct relationship of v and η by integrating the adjoint variable gives an
alternative view to how to obtain an optimal strategy and reveals some new properties of
this relationship. This new approach leads us to a new set of necessary conditions that is
equivalent to those conditions obtained in Chapters 8 and 9. To construct an optimal coast
phase for a steep downhill section, we must find a sequence of constants {C j }k+1
j starting
with Ci = 0 and finishing with Ck+1 = 0, given that speed v and adjoint variable η are
continuous at each gradient change point.
Chapter 13
Coasting and braking at the end of a

journey
In Chapters 11 and 12 we found a direct relationship between train speed v and the adjoint
variable η = θ − 1 for a power phase on a steep uphill section and for a coast phase on a
steep downhill section. We also showed how this direct relationship can be used to determine
necessary conditions for optimal journey segments.
In this chapter we find a relationship between the train speed v and the adjoint variable θ
for the final coast and brake phases of a journey on a track with piecewise constant gradient,
and show how this relationship can be used to derive necessary conditions for optimal coast
and brake phases. Using that relationship we construct a nice relationship between coasting
speed VC and braking speed VB of a journey on a track with a constant non-steep gradient.
This result extends the result obtained by Howlett [20] for level tracks.
Recall from Chapter 2 that an optimal journey finishes with a coast phase followed by a brake
phase. It is convenient to use θ rather than η because the optimal control changes from coast
to brake at θ = 0. The coast phase starts with v = VC and θ = 1 and finishes with v = VB
199
Chapter 13. Coasting and braking at the end of a journey 200
and θ = 0; the brake phase starts with v = VB and θ = 0 and finishes with v = 0. In this
chapter we will consider the final coast phase and find a recursive relationship between the
speed where the final coast phase begins and the speeds at which gradients changes occur
and the speed where the final coast phase ends.
13.1 Coasting and braking on a constant gradient
In this section we consider the relationship between v and θ as the train coasts and brakes on
a track with constant gradient acceleration γ. We assume that the speed of the train decreases
during the brake phase, but may increase or decrease during the coast phase. The equation
of motion during the brake phase is
dv
v = −Q + γ − r(v) (13.1)
dx
and the adjoint equation is
dθ ψ(v) ψ(V )
− 3 ·θ =− 3 . (13.2)
dx v v
By arguing as we did in Section 11.1 we can show that

dv ψ(v)
v = exp − dx (13.3)
dx v3

d dv ψ(V ) dv d ψ(V )
=⇒ v θ =− 3 v = (13.4)
dx dx v dx dx v
dv ψ(V )
=⇒ v θ= + C. (13.5)
dx v
From (13.1), (13.2) and (13.3) it follows that
[−Qv + γv − ϕ(v)]θ = ψ(V ) + Cv (13.6)
on the braking interval. Let VB be the speed at which braking starts. Since θ = 0 at the start
of braking, we have
−ψ(V )
ψ(V ) + CVB = 0 =⇒ C = .
VB
The equation of motion of the train when coasting on a track with constant gradient acceler-
ation γ is
dv
v = γ − r(v). (13.7)
dx
By using the same technique as we did in Section 11.1 and in (13.3)–(13.5), we get
dv ψ(V )
v ·θ = +C
dx v

ψ(V )
[γ − r(v)]θ = + C. (13.8)
v
At the end of the coast phase, we have v = VB and θ = 0, and so
−ψ(V )
C= ,
VB
which is consistent with the constant C obtained for the adjoint equation (13.6) of the brake
phase. Hence,
1 1
[γ − r(v)]θ = ψ(V ) − . (13.9)
v VB
At the point where coasting begins, we have v = V C and θ = 1, and so

1 1
γ − r(VC ) = ψ(V ) − . (13.10)
VC VB
This gives a relationship between the coasting speed V C and the braking speed VB for optimal
final coast and brake phases on a constant gradient.
The following lemma shows that for the typical case where the final coast phase starts at the
hold speed V , as V increases so does VB .
Lemma 13.1.1 Let V and VB be the holding speed and braking speed for an optimal driving
strategy of a train travelling on a track with constant gradient acceleration γ. If the track is
non-steep and the final coast phase starts at the hold speed V then as the holding speed V
increases, so does the braking speed V B .
Differentiating both sides of (13.10) we have

1 dVC 1 r (VC ) dVC [γ − r(VC )]ψ (V ) dV
− = − − −
VC2 dVB VB2 ψ(V ) dVB [ψ(V )]2 dVB
which can be rearrange to give

ψ(V ) − ψ(VC ) dVC 1 [γ − r(VC )]ψ (V ) dV
= + . (13.11)
VC2 ψ(V ) dVB VB2 [ψ(V )]2 dVB
Since the train starts coasting from the hold speed V then (13.11) becomes

[γ − r(V )]ψ (V ) dV 1
− = 2.
[ψ(V )]2 dVB VB
Since γ − r(V ) < 0 then dV /dVB > 0.
In the next section we consider coasting and braking on a track with piecewise constant
gradient acceleration.
13.2 Final coasting and braking phases on a piecewise con-

stant gradient
Suppose now we have piecewise constant gradient. The gradient changes at locations x 1 , x2 ,
. . . , xn . The train is required to stop at point x = X where X ∈ [xn , xn+1 ]. The gradient
acceleration on (xi , xi+1 ) is γi . The gradient sections are shown in Figure 13.1.
Let Vn = v(xn ) and θn = θ(xn ) where v is the optimal speed trajectory and θ is the corre-
sponding adjoint trajectory. From (13.9) we have

1 1
[γn − r(Vn )]θn = ψ(V ) − . (13.12)
Vn VB
J0 J1 Jn2 Jn1 Jn
x1 x2 x3 xn2 xn1 xn xB X xn1
Figure 13.1: Track diagram.
On the interval [xn−1 , xn ], with gradient acceleration γn−1, (13.8) gives

ψ(V )
[γn−1 − r(Vn )]θn = + Cn−1 (13.13)
Vn
at point xn and at point xn−1 we have
ψ(V )
[γn−1 − r(Vn−1)]θn−1 = + Cn−1 . (13.14)
Vn−1
From (13.13) and (13.12) we obtain
ψ(V )
Cn−1 = [γn−1 − r(Vn )]θn −
Vn

1 1
ψ(V ) Vn
− VB ψ(V )
= [γn−1 − r(Vn )] −
γn − r(Vn ) Vn

γn−1 − r(Vn ) 1 1 1
= ψ(V ) − − .
γn − r(Vn ) Vn VB Vn
Substituting into (13.14) gives

γn−1 − r(Vn ) 1 1 1 1
[γn−1 − r(Vn−1 )]θn−1 = ψ(V ) − + −
γn − r(Vn ) Vn VB Vn−1 Vn

1 1 γn−1 − r(Vn ) 1 1
= ψ(V ) − + ψ(V ) · −
Vn−1 Vn γn − r(Vn ) Vn VB

1 1
= ψ(V ) − + [γn−1 − r(Vn )]θn . (13.15)
Vn−1 Vn
By applying the same formulae we used in (13.13) and (13.14) on the interval [x n−2 , xn−1 ]
at the point xn−1 we obtain
ψ(V )
[γn−2 − r(Vn−1 )]θn−1 = + Cn−2 (13.16)
Vn−1
and at point xn−2 we have

ψ(V )
[γn−2 − r(Vn−2)]θn−2 = + Cn−2 . (13.17)
Vn−2
By similar arguments to those above
ψ(V )
Cn−2 = [γn−2 − r(Vn−1)] · ·
γn−1 − r(Vn−1)

γn−1 − r(Vn ) 1 1 1 1
− + −
γn − r(Vn ) Vn VB Vn−1 Vn

γn−2 − r(Vn−1 ) γn−1 − r(Vn ) 1 1
= ψ(V ) · −
γn−1 − r(Vn−1 ) γn − r(Vn ) Vn VB

γn−2 − r(Vn−1 ) 1 1 ψ(V )
+ψ(V ) − − (13.18)
γn−1 − r(Vn−1 ) Vn−1 Vn Vn−1
and

1 1
[γn−2 − r(Vn−2)]θn−2 = ψ(V ) − + [γn−2 − r(Vn−1)]θn−1 . (13.19)
Vn−2 Vn−1
In general, on the interval [xn−j , xn−j+1] we have

1 1
[γn−j − r(Vn−j )]θn−j = ψ(V ) − + [γn−j − r(Vn−j+1)] θn−j+1 (13.20)
Vn−j Vn−j+1
and finally on [xn−j−1 , xn−j ], where the train starts coasting, we have

1 1
[γn−j−1 − r(V )] = ψ(V ) − + [γn−j−1 − r(Vn−j )]θn−j . (13.21)
V Vn−j
We can use these equations to calculate an optimal final phase. First, we pick a braking
speed VB , then use (13.9) to calculate Vn . We then apply (13.20) to calculate Vn−1 , Vn−2 , . . . ,
Cn−1 , Cn−2 , . . . and θn−1 , θn−2 , . . . until we reach an interval on which we can have θ = 1.
We can then calculate the coasting speed VC . If we require VC = V then we may need to use
a shooting method to find V B that gives VC = V .
Lemma 13.2.1 Let V and VB be the holding speed and braking speed for an optimal driving
strategy of a train on a piecewise constant gradient section comprising a sequence of gradi-
ent accelerations {γ1, γ2 , . . . , γn−1, γn }. Assume that the train starts coasting from holding
speed V on a section γn−1 and starts and finishes braking on the interval [x n , xn−1 ] as shown
in Figure 13.2. If we increase the holding speed then the braking speed also increases; that
is
dV
> 0.
dVB
Let v be the optimal speed trajectory and θ be the corresponding adjoint trajectory. The final
coast phase starts at xC with v(xC ) = VC = V . Speed decreases on the non-steep interval
[xC , xn ]. Speed increases on the interval [xn , xB ] as shown in Figure 13.2 if γn is a steep
downhill gradient acceleration. If γn is non-steep then speed continues decreasing on the last
section as shown in Figure 13.3. The final braking phase starts at x B with speed v(xB ) = VB .
The braking phase finishes at X with v(X) = 0.
VB
V
Vn
J n 1
Jn
v 0
xC xn xB X
Figure 13.2: Track and speed profile for the case γn is steep.
On [xC , xn ] the adjoint variable θ is given by
ψ(V )
[γn−1 − r(v)]θ = + Cn−1
v
where
ψ(V )
Cn−1 = [γn−1 − r(Vn )]θn − .
Vn
Therefore

1 1
[γn−1 − r(v)]θ = ψ(V ) − + [γn−1 − r(Vn )]θn .
v Vn
V
Vn
VB
J n 1
Jn v 0
xC xn xB X
Figure 13.3: Track and speed profile for the case γn is non-steep.
At xn , (13.12) gives
1 1
ψ(V ) Vn
− VB
θn =
γn − r(Vn )
where VB is the speed at which braking begins, and so

[γn−1 − r(Vn )]ψ(V ) 1
− 1
1 1 Vn VB
[γn−1 − r(v)]θ = ψ(V ) − +
v Vn γn − r(Vn )

1 γn−1 − r(Vn ) 1 γn−1 − γn 1
= ψ(V ) − · + · .
v γn − r(Vn ) VB γn − r(Vn ) Vn
For all points in this interval we have γ n−1 < r(v). At the start of the coast phase θ = 1 and
so
1 γn−1 − r(Vn ) 1 γn−1 − γn 1
[γn−1 − r(V )] = ψ(V ) − · + ·
V γn − r(Vn ) VB γn − r(Vn ) Vn
which we rearrange to obtain
1 γn−1 − r(V ) γn−1 − r(Vn ) 1 γn−1 − γn 1

− = · − ·
V ψ(V ) γn − r(Vn ) VB γn − r(Vn ) Vn

1 γn−1 − γn 1 1
= + · − .
VB γn − r(Vn ) VB Vn
Differentiating with respect to VB gives
1 dV r (V ) dV γn−1 − r(V ) dV
− + + · ψ (V ) · =
2
V dVB ψ(V ) dVB [ψ(V )]2 dVB

1 γn−1 − γn dVn 1 1 γn−1 − γn 1 1 dVn
− 2+ · r (Vn ) · · − + − 2+ 2
VB [γn − r(Vn )]2 dVB VB Vn γn − r(Vn ) VB Vn dVB
and so we have
r(V ) − γn−1 dV γn−1 − r(Vn ) 1

· ψ (V ) · = ·
[ψ(V )]2 dVB γn − r(Vn ) VB2

γn−1 − γn 1 1 γn − r(Vn ) dVn
− · r (Vn ) − + · . (13.22)
[γn − r(Vn )]2 VB Vn Vn2 dVB
At the braking point xB we have
dVB
VB = −Q − r(VB ) + γn . (13.23)
dxB
The speed profile on the interval [xn , xB ] must satisfy the condition
VB
vdv
= xB − xn .
Vn γn − r(v)
Differentiating both sides with respect to x B gives
VB dVB
dxB
dVn
Vn dx
− B
= 1.
γn − r(VB ) γn − r(Vn )
Dividing by dVB /dxB and using (13.23) gives

dV
VB Vn dVBn −VB
− =
γn − r(VB ) γn − r(Vn ) Q + r(VB ) − γn
and hence
dVn
Vn dV QVB
B
=
γn − r(Vn ) [γn − r(VB )](Q + r(VB ) − γn )
from which
dVn QVB (γn − r(Vn ))
= .
dVB Vn [γn − r(VB )](Q + r(VB ) − γn )
Let
Q
C= . (13.24)
Q + r(VB ) − γn
Hence we have
dVn CVB (γn − r(Vn ))
= . (13.25)
dVB Vn (γn − r(VB ))
Consider equation (13.22). Let

1 1 γn − r(Vn )
A(VB , Vn ) = r (Vn ) − +
VB Vn Vn2
ψ(Vn ) − VB Vn r (Vn ) + VB (γn − r(Vn ))
=
VB Vn2
ψ(Vn ) + VB γn − VB ϕ (Vn )
=
VB Vn2
VB γn − LVn (VB )
=
VB Vn2
VB (γn − r(VB ))
>
VB Vn2
γn − r(VB )
> . (13.26)
Vn2
By substituting (13.25) and (13.26) into (13.22) we have
r(V ) − γn−1 dV γn−1 − r(Vn ) 1 γn − γn−1 CVB

· ψ (V ) · > · 2+ · . (13.27)
[ψ(V )]2 dVB γn − r(Vn ) VB γn − r(Vn ) Vn3
If γn is a steep gradient acceleration then γn −r(B) > 0 and so from (13.24) we have C > 1.
We can write (13.27) in the form

r(V ) − γn−1 dV γn − γn−1 1 γn − γn−1 CVB
· ψ (V ) · > 1− 2
+ ·
[ψ(V )]2 dVB γn − r(Vn ) VB γn − r(Vn ) Vn3

γn − γn−1 CVB 1 1
= − 2
+ .
γn − r(Vn ) Vn3 VB VB2
We have
CVB VB Vn 1
> > 3 = 2
Vn3 Vn3 Vn Vn
because Vn < VB , and so

CVB 1
3
− 2 > 0.
Vn VB
So we have
r(V ) − γn−1 dV 1 dV
· ψ (V ) · > 2 > 0 =⇒ >0 (13.28)
[ψ(V )]2 dVB VB dVB
because r(V ) − γn−1 > 0 and ψ (V ) > 0. So for the case γn−1 is non-steep and γn is steep,
dV
we have proved dVB
> 0.
We now consider the case where γn is non-steep. We use a perturbation analysis similar to
that in Chapter 4. Suppose v∗ is an optimal strategy on a non-steep track with piecewise
constant gradient and suppose that the point x C at which coasting begins lies in the interval
(xn−1 , xn ) and the point xB at which braking begins lies in the interval (x n , ∞). Now let

ψ(V ) 1 1
θ∗ = · −
r(v∗ ) − γn VB v∗
for v∗ < Vn and

ψ(V ) 1 1 r(Vn ) − γn−1 1 1
θ∗ = − + −
r(v∗ ) − γn−1 Vn v∗ r(Vn ) − γn VB Vn
for v∗ > Vn . By the same perturbation analysis as we used in Chapter 4 we have

d ψ(v∗ ) ψ (v∗ ) 3ψ(v∗ ) 3ψ(V ) ψ (V )
(δθ) − δθ = − θ∗ + δv − δV
dx v∗3 v∗3 v∗4 v∗4 v∗3
where we note that

ψ (v∗ ) 3ψ(v∗ ) 3ψ(V ) ψ (V )
m = − θ∗ + δv − δV
v∗3 v∗4 v∗4 v∗3
ψ (v∗ ) ψ (V )
≥ δv − δV
v∗3 v∗3
> 0
if δV ≤ 0. Similar to Chapter 4, we can show that if δV ≤ 0 then we cannot have a control

change on the perturbed profile. The integrating factor for the DE
d ψ(v∗ )
(δθ) − (δθ) = m
dx v∗3
is

ψ(v∗ )
I(x) = e− v∗ dx
dv∗
= v∗
dx
⎧
⎪
⎨−[r(v∗ ) − γn ] for x > xn ,
=
⎪
⎩−[r(v∗ ) − γn−1] for x < xn−1
from which we deduce that

d
[I(x)δθ] = I(x)m.
dx
Although we have written I = I(x), δθ = δθ(x) and m = m(x), we can regard all of these
expressions as functions of v∗ . Thus
⎧
⎪
⎨−[r(v∗ ) − γn ] for v∗ < Vn ,
I(v∗ ) =
⎪
⎩−[r(v∗ ) − γn−1 ] for v∗ > Vn
and
ψ (v∗ ) 3ψ(v∗ ) 3ψ(V ) ψ (V )
m(v∗ ) = − θ∗ + δv − δV
v∗3 v∗4 v∗4 v∗3
where θ∗ is given by the expression above. Now we have
d dx
[I(v∗ )δθ(v∗ )] = I(v∗ ) · m(v∗ ) · .
dv∗ dv∗
But v∗ dv
dx
∗
= I(v∗ ) and so
d
[I(v∗ ) · δθ(v∗ )] = v∗ m(v∗ ).
dv∗
By integrating from v∗ = VB and δθ(VB ) = 0 to v∗ = VC we have
VC
I(VC )δθ(VC ) = v∗ m(v∗ )dv∗ > 0.
VB
Since I(VC ) = (−1)[r(VC ) − γn−1 ] < 0 it follows that δθ(VC ) < 0. Hence it is not possible
to have [θ + δθ](V + δV ) = 1 if δV ≤ 0, and so θ∗ (x) must be associated with holding speed
dV
W > V . This means dVB
> 0.
13.3 Example
Example 13.3.1 Braking on steep downhill section.
In this example we consider a track with a non-steep gradient acceleration γ 0 = 0.01 for
x < 4000 and steep gradient acceleration γ 1 = 0.22 for x > 4000. The train must start at
X0 = −10000 and stop on the steep section at X = 6000.
For a given hold speed V , we choose braking speed VB , then solve (13.7) to get V1 = v(x1 ).
We then use (13.12) to find

1 1
ψ(V ) V1
− VB
θ1 = θ(x1 ) = .
γ1 − r(V 1)
Next we use (13.13) to calculate
ψ(V )
Cn−1 = [γ0 − r(V1 )]θ1 − .
V1
Finally we use (13.14) to calculate

ψ(V )
Vn−1
+ C1
θ = θ(xC ) =
γ0 − r(V )
and check if θ = 1. If not, we adjust VB and repeat the procedure. The result is shown
in Figure 13.4. The trips with holding speed V ∈ {20, 22, 25} have coasting points at
−4766.81, −4527.08, and −4277.20 respectively.
30
25
20
v 15
10
0
−10000 −8000 −6000 −4000 −2000 0 2000 4000 6000
x
Figure 13.4: Speed profiles of the journeys of three different holding speeds {20, 22, 25}.
13.4 Conclusion
For a track with a constant gradient acceleration γ there is a direct relationship between the
speed VC at which the final coast phase starts, and the speed VB at which the final coast
phase finishes. If the final coast phase starts from the hold speed V C = V then VB increases
as V increases. For a track with piecewise constant gradient acceleration we have derived a
recursive relationship between the final coasting speed VC and the braking speed VB . For the
case where we start coasting with VC = V and has one gradient change before braking, we
can show that VB increase as V increases. More general cases are more difficult to prove.
Part IV
Parameter estimation
214
Chapter 14
Parameter estimation
The University of South Australia and TMG International have developed an in-cab system,
called FreightMiser, which calculates optimal driving strategies for long-haul trains and dis-
plays driving advice to help the driver stay on time and minimise energy use. The location
and speed of the train are determined from GPS measurements. The system uses a shooting
method to find a driving strategy from the current location and speed to the next scheduled
stop that satisfies the necessary conditions described in subsection 2.2.1 in Chapter 2. If the
train strays from the ideal speed profile, the strategy is recalculated. The information dis-
played to the driver is shown in Figure 1.1 in Chapter 1. In trials of the system conducted
between Adelaide and Melbourne, a distance of over 800km, drivers using FreightMiser
achieved fuel savings of about 12%.
For the FreightMiser system to provide the best possible driving advice, it needs the best pos-
sible estimates for the parameters in the train performance model. These parameters include
a rolling resistance coefficient and an aerodynamic drag coefficient. Estimating values for
these parameters is difficult because they depend on the length and mass of the train, which
will vary from one journey to the next. Furthermore, the aerodynamic drag on the train will
depend on prevailing winds, which can change even during a journey.
215
Chapter 14. Parameter estimation 216
One way to overcome this difficulty is to estimate the parameter values in real-time from
observations of the actual train performance, and then to use the estimated parameter values
in the journey optimisation calculations to give a driving strategy tailored to the prevailing
conditions. When we calculate an optimal control profile we assume that the estimated
parameters will remain constant for the remainder of the journey, because we do not have
any information about how the parameters might change in the future. However, the optimal
speed profiles are recalculated regularly, each time using the most recent estimate of the
resistance parameter values. In this chapter we formulate the parameter estimation problem,
and introduce some mathematical filters that could be used to solve the problem.
The motion of the train is given by the differential equation

dv P
m = − r(v) + g(x)
dt v
where m is the mass of the train, v is the speed of the train, t is time, P is the power of the
locomotives, r(v) is the resistance to motion at speed v, and g(x) is the force on the train
due to the track gradient at location x. Train mass and locomotive power are known before
the train departs. Speed and location can be determined with good accuracy using GPS, and
gradient force can be determined from location and route data. Resistance, however, is more
difficult to predict.
Resistance is usually modelled by a quadratic
r(v) = r0 + r1 v + r2 v 2 . (14.1)
If there is no wind, then the coefficient r0 depends mainly on the mass of the train and on the
rolling resistance of the wheels, the coefficient r1 depends mainly on the weight on each axle
and the type, design and lubrication of the wheel bearings, and the coefficient r 2 depends
mainly on the aerodynamic drag of the train.
One way to determine rolling resistance coefficients is to allow the train to coast from high
speed down to low speed on a long, level track with no wind and to record a sequence of
(t, v) pairs. In this simple case, the equation of motion is
dv
m = −[r0 + r1 v + r2 v 2 ].
dt
We can estimate dv/dt at each instant t by fitting a low-degree polynomial to surrounding

(t, v) points and differentiating. The coefficients r0 , r1 and r2 can then be found using a
least-squares method.
In practice, the term r1 v is small compared to the other two terms, and so will be ignored.
Clearly, it is not practical to do a coast-down test on every train before its journey. Instead,
we must estimate parameter values from observations of the performance of the train during
the journey, and refine our estimates with each new observation. We can do this using a
mathematical filter.
Consider the simplest case where we assume that r1 = 0, and wish to estimate coefficients
r0 and r2 from observations of speed while the train is coasting. The state variables are v, r 0
and r2 . Train mass m and gradient force g(x) are known. Furthermore, we will assume that
the mass is m = 1. The state equations are
dv = [g(x) − r0 − r2 v 2 ]dt +dϕ1

αdr0 = 0 dt +dϕ2
dr2 = 0 dt +dϕ3
where ϕ1 represents the error in our equation of motion, and ϕ 2 and ϕ3 represent the errors
in our assumption that r 0 and r2 are constants. Our observation of the system at time t is
y(t) = v(t) + ν(t)
where v(t) is the true speed of the train at time t, and ν(t) is the observation noise at time t.
The discrete-time versions of the state equations are

tk+1
v(k + 1) = v(k) + tk
[g(x) − r0 − r2 v 2 ]dt +ϕ1 (k)
r0 (k + 1) = r0 (k) +ϕ2 (k)
r2 (k + 1) = r2 (k) +ϕ3 (k)

xk+1 = f (xk ) + ϕk (14.2)
where x = [v, r0 , r2 ]T is the state vector and ϕ = [ϕ1 , ϕ2 , ϕ3 ]T is the process noise vector.
Similarly, the observation equation can be written as
yk = g(xk ) + νk (14.3)
where y is the observation vector and ν is the observation noise vector.
The state of the system is not known precisely, and so is represented, in general, by a prob-
ability density function. A particular mathematical filter may use a simpler approximation,
such as a mean and covariance, to represent the state. The general filter process is:
1. Apply the process function f to the state distribution x k to obtain x̂k+1 , the predicted
state at time k + 1.
2. Predict the observation ŷk+1 = g(x̂k+1 ).
3. Use the difference between the actual observation yk+1 and the predicted observation
ŷk+1 to form a correction to x̂k+1 , giving the corrected state estimate xk+1 .
14.2 Kalman Filter
The Kalman Filter [31, 32] is an algorithm for calculating a sequence of state estimates {x k }
from a sequence of observations {yk } when the process model and observation model are
both linear. The state and observation equations are
xk+1 = F xk + ωk
yk = Hxk + νk .
The process noise ω and the observation noise ν are assumed to be Gaussian with probability
distributions
p(ω) ∼ N(0, Q),
p(ν) ∼ N(0, R).
In practice, the process noise covariance matrix Q and measurement noise covariance matrix
R might change with each time step or measurement. However, they are are often assumed
to be constant.
The Kalman filter is not directly applicable to a nonlinear problem. However, many nonlinear
filters are based on the Kalman filter.
The Kalman filter operates as follows. The state probability distribution at time k is repre-
sented by a mean xk and covariance matrix Pk . The predicted next state is given by
x̂k+1 = F xk
and the predicted covariance by
P̂k+1 = E[(xk − x̂k+1 )(xk − x̂k+1 ) ].
After taking a measurement yk+1 we perform a correction to give the corrected state
xk+1 = x̂k+1 + Kk (yk − H x̂k+1 )
and the corrected covariance

Pk+1 = (I − Kk H)P̂k+1
where the Kalman gain matrix Kk is given by
Kk = Pk H (HPk H + R)−1 .
14.3 Extended Kalman Filter
For a nonlinear system with Gaussian noise, an extended Kalman filter (EKF) is often used.
The EKF is a Kalman filter that linearizes about the current mean and covariance.
Suppose the nonlinear stochastic system has state x k ∈ Rn with state equation
xk+1 = f (xk , uk+1 , ωk )
and measurement yk ∈ Rm with

yk = h(xk , νk )
where ωk+1 and νk+1 are Gaussian random variables representing the process and measure-
ment noise respectively. The predicted next state and predicted measurement are given by
the linearised equations
x̂k+1 = Axk
and
ŷk+1 = H x̂k+1
where A and H are the Jacobian matrix of partial derivatives of f and h with respect to x
respectively. The correction step is the same as with the Kalman filter.
The operation of the EKF is similar to the Kalman filter, but the mathematical justification is
questionable.
14.4 Nonlinear Projection Filter
The Nonlinear Projection Filter (NPF) was first proposed by Beard et al in 1999 [3].
The state of a stochastic dynamical system is a conditional probability density function that
evolves between observations according to Kolmogorov’s forward equation. At each mea-
surement, the state function is updated using Bayes formula to incorporate the information
provided by the new measurement.
The NPF uses Galerkin’s method [6, 38] to approximate the evolution of the state function
between measurements. The state density function is approximated by a finite cosine se-
ries. The evolution of the state between measurements can then be found by straightforward
solution of a system of ordinary differential equations. Galerkin’s method is also used to cal-
culate the measurement update. In practice, one would expect the filter to be implemented
using Fast Fourier Transforms.
The NPF is described in more detail in Chapter 15. However, the method is computationally
expensive, and difficult to implement for systems with more than one state variable.
14.5 Unscented Kalman Filter
The Unscented Kalman Filter (UKF) for nonlinear systems was developed by Julier and
Uhlmann [28] in 1997. According to Julier and Uhlmann [30], the UKF is far superior to the
EKF, both in theory and practice. The key idea of the UKF is that the state density function
can be approximated by a few well-chosen points that have the same mean and variance (and
possibly higher-order moments) as the true density function. The evolution of the state can
be calculated by simply applying the nonlinear state equation to each of the points. The
UKF does not have the linearisation errors of the EKF, nor does it require the computation
of Jacobians, and it is easy to implement [28, 30].
The UKF has been shown to be superior to the EKF in many state estimation, dual estimation
and parameter estimation problems [47, 48]. It has been used in applications, such as state
estimation for road vehicles [29], induction motors [1], quaternion motion of human orien-
tation [26], visual contour tracking [37], and parameter estimation for time series modelling
[47] and neural network training [15, 46].
Lefebvre et al [36] note that the UKF ‘sigma points’, used to approximate the state distribu-
tion, correspond to the regression points in their Linear Regression Kalman filter (LRKF),
and that the UKF is a special case of the LRKF. The UKF is able to approximate Gaussian
state functions with third-order accuracy, and non-Gaussian state functions to at least second
order [48].
The Unscented Kalman Filter is used to solve the train problem in Chapter 16.
14.6 H∞ Filter
The H∞ filter is also called the minimax filter [44]. Its aim is to minimise the maximum of
the estimation error. In other words, the minimax filter minimises the maximum value of the
transfer function from the system noise to the estimation error. Unlike the Kalman filter and
EKF, the H∞ filter does not require the noise to have any particular structure.
Assume we have a linear dynamic system
xk+1 = Axk + Buk + ωk
with observations
yk = Cxk + νk
where A, B and C are known matrices, u is the known input to the system, and ω and ν are
process noise and observation noise respectively. The H ∞ filter finds the state estimate x̂
that solves
min max J.
x̂ ω,ν
Suppose we are given the worst possible values of ν and ω, we want to find an estimate x̂
that will minimise the maximum estimation error caused by ν and ω. Hence, the H ∞ filter is
sometimes called the minimax filter. The function J is defined as
ave xk − x̂k Q
J=
ave wk W + ave vk V
where ave xk − x̂k Q is the average taken over all time samples k and is defined as
1
n
ave xk − x̂k Q = (xk − x̂k )Q.
n
k=1
To minimise the maximum estimate error, the matrices Q, W and V are chosen for use in the
weighted norms so that the desired trade-off is obtained. This problem is not easy to solve.
Instead the related problem
1
find x̂k such that J <
γ
where γ is some chosen constant is solved. That is, we wish to find a state estimate such that
no matter what the values of noise ω and ν, the maximum of J is always less than 1/γ. The
H∞ filter equations are
Lk = (I − γQPk + C T V −1 CPk )−1
Kk = APk Lk C T V −1
x̂k+1 = Ax̂k + Buk + Kk (yk − C x̂k )
Pk+1 = APk Lk AT + W
where x̂k is the estimated mean of the state at step k, P k is the estimated covariance of the
state at time k, and Kk is the H∞ gain matrix applied to the observation error at time k.
The initial state covariance matrix P0 must be carefully chosen to obtain acceptable filter
performance, and the constant γ should be set such that all eigenvalues of the P matrix
have magnitudes less than one. If we choose very large γ then the solution of the H ∞
problem does not exist. So while the H ∞ seems easy to use and promises good performance,
we encountered some difficulties associated with ‘tuning’ the filter when we tried to apply
the filter for our train problem. The filter performance depends on the initial value of a
covariance matrix for each particular application. This tuning work is very time consuming
and quite insensitive compared to the tuning of UKF. Thus we preferred the UKF for our
problem.
14.7 Conclusion
To calculate optimal driving strategies for a train, we need to identify the rolling resistance
and aerodynamic drag coefficients of the train as each journey progresses. One way to do
this is to use a mathematical filter. The train performance model is nonlinear in the param-
eters, and so we require a nonlinear state estimator. The Nonlinear Projection Filter and the
Unscented Kalman Filter were thought to be promising approaches, and will be discussed in
more detail in the next two chapters.
Chapter 15
Nonlinear Projection Filter
15.1 Introduction
In the previous chapter we described the general filter process, and some of the methods used
to solve the filtering problem. In this chapter we consider the use of the Nonlinear Projection
Filter (NPF) [3], which is designed to estimate the state of a continuous nonlinear dynamic
system with discrete nonlinear observations.
For a linear dynamic system, the probability density function used to describe the state of
the system is Gaussian and so can be described by just two parameters: the mean and the
standard deviation. For nonlinear dynamic systems, however, the state probability density
function may not be able to be described by a finite number of parameters.
As discussed in Chapter 14, the filter process comprises two stages: prediction and cor-
rection. The NPF uses a partial differential equation, Kolmogorov’s forward equation, to
describe the evolution of the state probability density function between observations. Bayes
formula is then used to describe how the probability density function is modified by the in-
formation from each observation. The NPF uses Galerkin’s method to solve Kolmogorov’s
225
Chapter 15. Nonlinear Projection Filter 226
forward equation, expanding the exact solution to an infinite sum of basis functions. An
approximate solution is found by projecting the true solution into the space spanned by a
truncated, finite set of basis functions. The algorithm and more details of the filter can be
found in [3].
Because of the generally stable nature of projection we felt that this filter may provide a
useful technique for our parameter estimation problems. Unfortunately, as the discussion
below will show, we were unable to convince ourselves that the prediction stage of the filter
could be implemented.
15.2 A simple example without diffusion
The NPF uses Kolmogorov’s forward equation in the prediction stage to determine the evolu-
tion of the state probability density function between measurements. Kolmogorov’s forward
equation is
∂p ∂ σ2 ∂2
= − (f p) + (p) (15.1)
∂t ∂x 2 ∂x2
where p is the probability density function of the state, f is the known process operator
and σ is the covariance of the process noise. The Kolmogorov equation is a well-known
partial differential equation but it is not easy to solve. In this section we consider a simple
dynamic system where the true solution is known, and use three different numerical methods
to solve Kolmogorov’s forward equation: a Fast Fourier Transform (FFT), a Discrete Cosine
Transform (DCT), and a Fourier Cosine Series (FCS). Our main purpose was to consider
equations where an analytic solution was known to see how well the proposed method would
work. In retrospect we found the method was not sufficiently specified for it to be used on
equations where the solution was not known beforehand.
The system we are considering is a wave travelling with unit velocity (f = 1) and no diffu-
sion, (σ = 0),
dx = dt. (15.2)
15.2.1 Exact solution
Kolmogorov’s forward equation (KFE) for our example is
∂p ∂p
=− (15.3)
∂t ∂x
which is derived from (15.1) with f = 1 and σ = 0.
We consider the Kolmogorov equation (15.3) in the region x ∈ (−π, π) and t ∈ [0, ∞). We
use a Fourier series to find the solution. Substitute

p(x, t) = ck (t)eikx (15.4)
k∈Z
where π
1
ck (t) = p(x, t)eikx dx
2π −π
into the partial differential equation (15.3) to give

ck
[ + ikck (t)]eikx ] = 0
k∈Z
dt
from which we deduce that

ck (t) = ck (0)e−ikt (15.5)
where π
1
ck (0) = p(x, 0)e−ikx dx. (15.6)
2π −π
We now have from (15.5) and (15.6)

p(x, t) = ck (0)eik(x−t)
k∈Z
= p(x − t, 0). (15.7)
The solution is a travelling wave with the same shape as the initial probability. Note that
the induced periodicity means that the portion of the wave that appears to exit from the right
hand of the interval simultaneously appears to enter from the left hand end. The condition
π
p(x, 0)dx = 1
−π
with (15.6) implies that

1
c0 (0) =
2π
which, in turn, from (15.5) means that
1
c0 (t) =
2π
and hence ensures that π
p(x, t)dx = 1.
−π
15.2.2 Approximation using a Fast Fourier Transform
In this section we will use a Fast Fourier Transform (FFT) to approximate the solution to the
KFE for our simple example. The solution to (15.3) can be approximated by
1
N
p̂(xj , t) = ck (t)e2πi(j−1)(k−1)/N (15.8)
N
k=1
on the region x ∈ (−π, π) where j = 1, 2, . . . , N,

(2j − 1)π
xj = −π +
N
and where ck (t) is Fourier coefficients. For convenience and ease of calculating we rewrite
j in terms of xj to give
N(xj + π) + π
j=
2π
and substitute this into (15.8) to obtain
1
N
p̂(xj , t) = ck (t)ei[xj +(1−1/N )π](k−1) . (15.9)
2N k=1
Substituting (15.9) into the partial differential equation (15.3) gives
N
ck
[ + i(k − 1)ck (t)]ei[xj +(1−1/N )π](k−1) = 0
k=1
dt
which can be simplified to a simple linear first-order differential equation
ck
+ i(k − 1)ck (t) = 0 (15.10)
dt
and hence we can deduce that ck (t) = ck (0)e−i(k−1)t . If we define
1
N
ck (0) = p(xj , 0)e−i[xj +(1−1/N )π](k−1)
N j=1
and

N
p̂(x, 0) = ck (0)ei[x+(1−1/N )π](k−1)
k=1
then p(xj , 0) = p̂(xj , 0) for all j = 1, 2, . . . , N. This is the fundamental representation

theorem for the FFT. Note that in general we will have p(x, 0) = p̂(x) for x = xj . Using the
above expression for ck (t) and (15.9) it follows that

N
p(xj , t) = ck (0)ei[(xj −t)+(1−1/N )π](k−1) = p̂(xj − t) (15.11)
k=1
for each j = 1, 2, . . . , N. Therefore the solution is a travelling wave with the same shape as
the initial representation p̂(x). If xj − t = xm for some m ∈ Z then
p(xj , t) = p̂(xj − t) = p̂(xm ) = p(xm , 0) = p(xj − t, 0)
but if xj − t = xm for all m ∈ Z then we may have
p(xj , t) = p̂(xj − t) = p(xj − t, 0).
This means the values of the probability density function at the sample points may be differ-
ent to the true solution depending on the time step we choose. This is illustrated in Figures
15.1, 15.2, 15.3 and 15.5.
Figure 15.1 shows an initial probability distribution on the interval [0, 6]. We represent this
distribution by a vector of 250 sample points.
Figure 15.2 shows an approximating function after one iteration of the method, where the
time step is such that each initial sample point x j maps onto another sample point x m . The
value of the approximating function is correct at each of the sample points. Note that the
approximating function has p̂(x) = p(x) for x = xj .
Figure 15.3 shows the approximating function after two iterations of the method. Once again,
the value of the approximating function is correct at each of the sample points.
Figure 15.4 shows the initial probability density function (blue) and the approximating so-
lution (green) of the KFE at t = 0. We solve the KFE using 10 sample points (blue). The
approximating solution using FFT goes through all sample points.
Figure 15.5 shows the true solution (blue) and the approximating solution (green) at time t =
x2 − x1 . The true solution has the same shape as the initial condition since the approximating
solution is correct at each of the sample points (green).
The approximating solution (red) is not correct at all sample points when we use t = x m −xj .
Hence the true solution has values (blue) that are quite different from the ones (red) we
expected at the sample points as seen in Figure 15.6
1.4
1.2
0.8
probability
0.6
0.4
0.2
−0.2
0 1 2 3 4 5 6
x
Figure 15.1: Solution to KFE using FFT without diffusion at t = 0.
1.4
1.2
0.8
probability
0.6
0.4
0.2
−0.2
0 1 2 3 4 5 6
x
Figure 15.2: Solution to KFE using FFT without diffusion after one iteration.
1.4
1.2
0.8
probability
0.6
0.4
0.2
−0.2
0 1 2 3 4 5 6
x
Figure 15.3: Solution to KFE using FFT without diffusion after two iterations.
0.6
0.5
0.4
0.3
probability
0.2
0.1
−0.1
0 1 2 3 4 5 6
x
Figure 15.4: Results using FFT without diffusion at t = 0.

0.6
0.5
0.4
0.3
probability
0.2
0.1
−0.1
0 1 2 3 4 5 6
x
Figure 15.5: Results using FFT without diffusion after one evolution such that time t = xj − xm .
0.6
0.5
0.4
0.3
probability
0.2
0.1
−0.1
0 1 2 3 4 5 6
x
Figure 15.6: Results using FFT without diffusion after one evolution such that time t = xj − xm .
15.2.3 Solution using a Fourier Cosine Series
In this section we use a Fourier Cosine Series (FCS) to approximate the evolution of our
example.
We consider the Kolmogorov equation
∂p ∂p
(x, t) = (−1) (x, t) (15.12)
∂t ∂x
in the region x ∈ (0, π) and t ∈ [0, ∞). If we write

∞
p(x, t) = c0 (t) + cm (t) cos mx (15.13)
m=1
then substitution into equation (15.12) gives

∞
dc0 dcm
+ cos mx − cm (t)m sin mx = 0. (15.14)
dt m=1
dt
Solving equation (15.14) is not straightforward. To overcome the obvious difficulty we write

∞
sin mx = am0 + amn cos nx
n=1
where ⎧
⎪
⎪ 2
⎪
⎨ mπ
for m odd
am0 = (15.15)
⎪
⎪
⎪
⎩ 0 otherwise
and ⎧

⎪
⎪ 2 1 1
for m − k odd
⎪
⎨ π m+k
+ m−k
amk = (15.16)
⎪
⎪
⎪
⎩ 0 otherwise
which leads us to the equation

dc0 dcm
∞ ∞ ∞
− an0 ncn (t) + − anm ncn (t) cos mx = 0.
dt n=1 m=1
dt n=1
It follows that
dcm
∞
= anm ncn (t) (15.17)
dt n=1
for each m = 0, 1, 2, . . .. If we write

⎡ ⎤ ⎡ ⎤
c (t) 0 0 0 ···
⎢ 0 ⎥ ⎢ ⎥
⎢ ⎥ ⎢ ⎥
⎢ c1 (t) ⎥ ⎢ 0 1 0 ··· ⎥
c(t) = ⎢ ⎥, D = ⎢ ⎥
⎢ ⎥ ⎢ ⎥
⎢ c2 (t) ⎥ ⎢ 0 0 2 ··· ⎥
⎣ ⎦ ⎣ ⎦
.. .. .. .. ..
. . . . .
and ⎡ ⎤
0 0 0 0 ···
⎢ ⎥
⎢ ⎥
⎢ 1 0 ( 13 − 1) 0 ··· ⎥
⎢ ⎥
⎢ ⎥
2⎢ 0 ( 13 + 1) 0 ( 15 − 1) ··· ⎥
⎢
A= ⎢ ⎥
π⎢ 1 ⎥
0 ( 15 + 1) 0 ··· ⎥
⎢ 3 ⎥
⎢ ⎥
⎢ 0 ( 15 + 13 ) 0 ( 17 + 1) ··· ⎥
⎣ ⎦
.. .. .. .. ..
. . . . .
then equation (15.17) can be written in operator form as
dc
= AT Dc(t).
dt
The solution to this equation can be written as
T Dt
c(t) = eA c0 . (15.18)
In order to compute this solution, we want to show A T DAT = −D. That is,
⎧
⎪
⎪ −n, k = m
⎪
⎨
∞
anm nakn = (15.19)
⎪
⎪
n=1 ⎪
⎩ 0 k = m.
We notice that when m = 0 and k is an odd number then from (15.15) and (15.16) we have
anm akn = 0 for all n = 1, 2, . . . . Since

∞
a2k,n an0 n
n=1

2 1 1 2 2 1 1 2
= + · ·1+ + · ·3
π 2k + 1 2k − 1 π π 2k + 3 2k − 3 3π

2 1 1 2 2 1 1 2
+ + · ·5+ + · ·7
π 2k + 5 2k − 5 5π π 2k + 7 2k − 7 7π
..
.
∞
4 1 1 1 1
= ·1·1+ − · · (2p + 1)
π 2 (2k + 1) p=1
2p + 2k + 1 2p − 2k − 1 (2p + 1)

4 1 1 1 1 1
= + + + + +···
π 2 (2k + 1) (2k + 3) (2k − 1) (2k + 5) (2k − 3)

1 1 1 1
+ +1 + −1 + − +···
(4k + 1) (4k + 3) (4k + 5) 3

1 1 1 1
+ − + −
(6k + 1) (2k − 1) (6k + 3) (2k + 1)

1 1
+ − +···
(6k + 5) (2k + 3)
= 0
it follows that

∞
amn an0 n = 0
n=1
for all m = 1, 2, . . .. Hence the top row of the matrix AT DAT is all zeros. That means
dc0
∞
= an0 ncn (t) (15.20)
dt n=1
as required in (15.17).
Now we note that by (15.16), anm akn = 0 if either m − n is even or k − n is even. If

m = 2r + 1 and k = 2s + 1 then

∞
a2p,2r+1 · 2p · a2s+1,2p
p=1
∞
8 1 1 1 1
= + − ·p
π 2 p=1 2p + 2r + 1 2p − 2r − 1 2p + 2s + 1 2p − 2s − 1

8
∞
1 1
= p +
2
π p=1 (2p + 2r + 1)(2p + 2s + 1) (2p − 2r − 1)(2p + 2s + 1)

1 1
− − .
(2p + 2r + 1)(2p − 2s − 1) (2p − 2r − 1)(2p − 2s − 1)
If r = s we suppose without loss of generality that r < s. By standard partial fraction

analysis we have

∞
a2p,2r+1 · 2p · a2s+1,2p
p=1
∞
2 1 1 1 1
= (2s + 1) + · −
π 2 p=1 (s − r) (s + r + 1) 2p + 2s + 1 2p − 2s − 1

1 1 1 1
+(2r + 1) + · − .
(s − r) (s + r + 1) 2p − 2r − 1 2p + 2r + 1
Since
∞

1 1 −1
− =
p=1
2p + 2s + 1 2p − 2s − 1 2s + 1
and
∞

1 1 1
− =
p=1
2p − 2r − 1 2p + 2r + 1 2r + 1
it follows that

∞
a2p,2r+1 · 2p · a2s+1,2p = 0
p=1
when r = s.
If r = s then

∞
a2p,2s+1 · 2p · a2s+1,2p
p=1
∞
8 1 1 1 1
= + − ·p
π 2 p=1 2p + 2s + 1 2p − 2s − 1 2p + 2s + 1 2p − 2s − 1

8
∞
1 1
= p −
π 2 p=1 (2p + 2s + 1)2 (2p − 2s − 1)2

8 1 1 1 1
= − +2 − +···
π2 (2s + 3)2 (2s − 1)2 (2s + 5)2 (2s − 3)2

1 1 1 1
+s − 1 + (s + 1) − 1 + (s + 2) − +···
(4s + 1)2 (4s + 3)2 (4s + 5)2 9

1 1 1 1
+2s − + (2s + 1) −
(6s + 1)2 (2s − 1)2 (6s + 3)2 (2s + 1)2

1 1
+(2s + 2) − +···
(6s + 5)2 (2s + 3)2
8 ∞
1
= 2
(−1)(2s + 1)
π p=0
(2p + 1)2
= (−1)(2s + 1).
If m = 2r and k = 2s then similar arguments can be applied. In summary we have

⎧
⎪
⎪ −n if k = m
⎪
⎨
∞
anm nakn = (15.21)
⎪
⎪
n=1 ⎪
⎩ 0 otherwise
as required in (15.19). That means
AT DAT = −D =⇒ DAT = −AD =⇒ (AD)T = −AD
because AAT = I. It follows that
(AT D)2 = −D2 .

Therefore we can write the solution (15.18) in a nice form

AT Dt T (AT D)2 t2 (AT D)3 t3
c(t) = e c0 = I + A Dt + + + . . . c0
2! 3!

D2 t2 D 4 t4 T D3 t3
= I− + −... + A D − + ... c0
2! 4! 3!

= cos Dt + AT sin Dt c0 .
From this we have

AT Dc(t) = AT D cos Dt + AT sin Dt c0

= AT D cos Dt + AT DAT sin Dt c0

= AT D cos Dt − D sin Dt c0
dc
= .
dt
Alternatively we can rewrite the expression above in the form
dcm
∞
= anm ncn (t)
dt n=1
as required in (15.17). Hence the FCS method produces the expected solution.
Approximation by truncated FCS
To do numerical calculations we truncate the FCS. Figure 15.7 shows the solutions of the
KFE using Fourier Cosine Series. The initial probability is a step function shown in red; its
approximation calculated using Fourier Cosine Series is shown in green in the top part of
Figure 15.7. The solution of equation 15.3 after one iteration with time step 0.5, shown in
blue in the bottom part of the figure, is clearly the green curve shifted to the right 0.5 units.
0.5
0.5 1 1.5 2 2.5 3

probability
1
0.5
0 0.5 1 1.5 2 2.5 3 3.5

x
Figure 15.7: A simple example using FCS.

15.2.4 Approximation using Discrete Cosine Transform
Now we want to solve

∂p ∂p
(x, t) = − (x, t) (15.22)
∂t ∂x
by approximating p as a Discrete Cosine Series

N
p̂(xj , t) = ck (t) cos (k − 1)xj . (15.23)
k=1
As we proved in the previous section that if (15.13) is a solution to (15.12) then it follows
that
dc T
= AT Dc(t) =⇒ c(t) = eA Dt c0 (t). (15.24)
dt
We can use a similar method using Discrete Cosine Transform (DCT). If (15.23) is a solution
to (15.22) then
dc T
= B T Ec(t) =⇒ c(t) = eB Et c0 (t) (15.25)
dt
where B and E are calculated using a DCT.
The Discrete Cosine Transform formula
Before beginning we review the derivation of the Discrete Cosine Transform (DCT) and
introduce some important notation. Define
(2j − 1)π
xj = , for j = 1, 2, . . . , N
2N
and

N
f (xj ) = fk cos (k − 1)xj .
k=1
So,

N
N
N
f (xj ) cos (h − 1)xj = cos (h − 1)xj fk cos (k − 1)xj
j=1 j=1 k=1

N
fk
N
= [cos (h − k)xj + cos (h + k − 2)xj ] .
k=1
2 j=1
When 1 ≤ s ≤ N we have
N

N
cos sxj = ei(2j−1)πs/2N
j=1 j=1

iπs

N
= e 2n ei(j−1)πs/N
j=1

1 − (−1)s
= iπs iπs
e− 2N − e 2N
= 0
since
1 − (−1)s
iπs iπs
e− 2N − e 2N
is pure imaginary while [·] is the real part of a complex number.
When s = 0 we have

N
cos sxj = N.
j=1
Hence

N
1
N
f (xj ) = Nf1 =⇒ f1 = f (xj )
j=1
N j=1
and

N
fk 2
N
f (xj ) cos(k − 1)xj = N =⇒ fk = f (xj ) cos (k − 1)xj .
j=1
2 N j=1
We now return to equation (15.22). After substituting (15.23) into (15.22), we obtain

N
dck
N
cos (k − 1)xj = ck (t)(k − 1) sin (k − 1)xj .
k=1
dt k=1
But we can write

N
sin (k − 1)xj = skn cos (n − 1)xj
n=1
where
1
N
sk1 = sin (k − 1)xj
N j=1
1 i(2j−1)(k−1)π/2N
N
= e
N j=1

1 1 − (−1)k−1
=
N e−iπ(k−1)/2N − eiπ(k−1)/2N
⎧
⎪
⎪
⎪
⎨
0 if k odd
= (15.26)
⎪
⎪
⎪
⎩ 1/(N sin π(k−1) ) otherwise
2N
and
2
N
skn = sin (k − 1)xj cos (n − 1)xj
N j=1
1
N
= [sin (k − n)xj + sin (k + n − 2)xj ]
N j=1

1 1 − (−1)k−n 1 − (−1)k+n−2
= . + −iπ(k+n−2)/2N
N e−iπ(k−n)/2N − eiπ(k−n)/2N e − eiπ(k+n−2)/2N
⎧
⎪
⎪ 0 if k − n is even
⎪
⎨
1
= (15.27)
N⎪ ⎪
⎪
⎩ 1 1
π(k−n) + π(k+n−2) if k − n is odd
sin 2N
sin 2N
where [·] is the imaginary part of a complex number. From (15.27), in general we have
1 1
skn = π(k−1)
N sin
2N
cos 2N − cos π(k−1)
π(n−1)
2N
sin π(n−1)
2N
1 1
+ π(k−1)
N sin
2N
cos 2N + cos π(k−1)
π(n−1)
2N
sin π(n−1)
2N

1 1 1
= +
N s(k)c(n) − c(k)s(n) s(k)c(n) + c(k)s(n)

2 s(k)c(n)
= (15.28)
N [s(k)c(n)]2 − [c(k)s(n)]2
where s(k) = sin π(k−1)

2N
and c(n) = cos π(n−1)
2N
. Hence we have
2 s(n)c(k)
snk = ·
N [s(n)c(k)]2 − [c(n)s(k)]2
s(n)c(k)
= (−1) · skn
s(k)c(n)
or
c(n) c(k)
· snk = (−1) · skn . (15.29)
s(n) s(k)
That is
t(p)sqp = −t(q)spq .
To satisfy (15.25) we set

⎡ ⎤
0 0 0 0 ···
⎢ ⎥
⎢ ⎥
⎢ 0 t(1) 0 0 ··· ⎥
⎢ ⎥
⎢ ⎥
⎢ 0 0 t(2) 0 ··· ⎥
E = ⎢ ⎥
⎢ ⎥
⎢ 0 0 0 t(3) · · · ⎥
⎢ ⎥
⎢ ⎥
⎢ 0 0 0 0 ··· ⎥
⎣ ⎦
.. .. .. ..
. . . .
and
⎡ ⎤
0 0 0 0 ···
⎢ ⎥
⎢ 1 1 1 ⎥
⎢ s(1) 0 − 0 ··· ⎥
⎢ s(3) s(1) ⎥
⎢ 1 1 1 1 ⎥
⎢ 0 + 0 − ··· ⎥
B = [skn ] = ⎢
⎢ 1
s(3) s(1) s(5) s(1) ⎥.
⎥
⎢ s(3) 1 1
0 + 0 ··· ⎥
⎢ s(5) s(1) ⎥
⎢ ⎥
⎢ 0 1
+ 1
0 1 1
+ s(1) ··· ⎥
⎣ s(5) s(1) s(7) ⎦
.. .. .. ..
. . . .
(15.30)
Similar to the case of FCS, we have the property
(BE)T = −BE. (15.31)

Now we consider a general skew symmetric W that has pure imaginary eigenvalues. The
eigenvalues are
kπ
±i tan
N
for k = 1, . . . , N2 − 1 if N is even and the eigenvalues are
(2k − 1)π
±i tan
2N
for k = 1, . . . , N2 if N is odd.
We know that the eigenvalues of a skew symmetric matrix are pure imaginary and the eigen-
vectors can be chosen to form an orthonormal set [12, 16]. Thus for any skew symmetric
matrix W we have RH W R = iM where R = P + iQ is the matrix of eigenvectors and M
is an eigenvalue matrix. Hence,
(P T − iQT )W (P + iQ) = iM
=⇒ P T W P + QT W Q = 0 and P T W Q − QT W P = M.
Now the properties follow. Since (P T − iQT )(P + iQ) = I it follows that (P T − iQT ) =
(P + iQT )−1 and hence
W = (P + iQ)iM(P T − iQT ).
W 2 = (P + iQ)iM(P T − iQT )(P + iQ)iM(P T − iQT )
= (P + iQ)(−M 2 )(P T − iQT )(P + iQ)(−M 2 )(P T − iQT ).
Therefore
Wt W 2 t2 W 3 t3
c(t) = e = I + Wt + + + ...
2! 3!
M 2 t2 iM 3 t3 M 4 t4
= (P + iQ)[I + iMt − − + + . . . ](P T − iQT )
2! 3! 4!
= (P + iQ)(cos Mt + i sin Mt)(P T − iQT )
= P cos MtP T + Q cos MtQT + P sin MtQT − Q sin MtP T (15.32)

because eW t is real. Hence, from (15.32) and using W P = −QM and W Q = P M, we

obtain
W eW t = W (P cos MtP T + Q cos MtQT + P sin MtQT − Q sin MtP T )
= W P cos MtP T + W Q cos MtQT + W P sin MtQT − W Q sin MtP T
= −QM cos MtP T + W Q cos MtQT − QM sin MtQT − P M sin MtP T

dc
= . (15.33)
dt
Recall that B T E is skew symmetric, and so (15.33) can be applied. Setting W = B T E and
substituting c(t) = eW t from (15.32) into (15.33) gives
dc
B T Ec(t) = .
dt
This is the same as (15.25).
For convenience of numerical calculation we now try to compute eW t without using any
imaginary terms. Since
W = (P + iQ)iM(P T − iQT ) =⇒ W = P MQT − QMP T
because the imaginary part is equal to zero. Then
W 2 = P MQT P MQT − P MQT QMP T − QMP T P MQT + QMP T QMP T
= P MP T QMQT − P MQT QMP T − QMP T P MQT + QMQT P MP T
= −QMQT QMQT − P MQT QMP T − QMP T P MQT − QMQT QMQT
= −QMQT QMQT − P M(I − P T P )MP T
−QM(I − QT Q)MQT − QMQT QMQT
= −P M 2 P T − QM 2 QT
W 3 = (P MQT − QMP T )(−P M 2 P T − QM 2 QT )
= −P MQT P M 2 P T + QMP T P M 2 P T − P MQT QM 2 QT + QMP T QM 2 QT
= −P MP T QM 2 P T + QM(I − QT Q)M 2 P T
−P M(I − P T P )M 2 QT + QMQT P M 2 QT
= QM 3 P T − P M 3 QT .
Similarly we have
W 4 = P M 4 P T + QM 4 QT .
In general we have
⎧
⎪
⎨(−1)n/2 (P M n P T + QM n QT ) if n is even,
Wn = (15.34)
⎪
⎩(−1)n/4+1 (P M n QT − QM n P T ) if n is odd.
So,
W 2t W 3t W 4t
eW t = I + W t + + + + ...
2! 3! 4!
1
= I + (P MtQT − QMtP T ) + (−P M 2 tP T − QM 2 tQT )
2!
1 1
+ (QM tP − P M tQ ) + (P M 4 tP T + QM 4 tQT ) + . . .
3 T 3 T
3! 4!
M 2t M 4t M 2t M 4t
= P (I − + + . . . )P T + Q(I − + + . . . )QT
2! 4! 2! 4!
M 3t T M 3t
+P (Mt − + . . . )Q − Q(Mt − + . . . )P T
3! 3!
= P cos MtP + P cos MtQ + P sin MtQT − Q sin MtP T .
T T
(15.35)
Note that B T E is a skew symmetric matrix. So we can calculate the approximate solution
T Et
p̂(xj , t) by substituting c(t) = e B to give

N
B T Et
p̂(xj , t) = e cos (k − 1)xj
k=1
T Et
where eB is calculated using (15.35).
Figure 15.8 shows an initial probability distribution (blue) and its transformation (green) at
the top; the approximation of probability density function after one iteration is shown in red
at the bottom of the figure. We evaluated this distribution using N = 201 sample points in
the interval [0, π]. The points are transformed using formula (15.35). The time step used in
the example is a multiple of the step size of the sample points, π/200. If the time step is not
a multiple of the sample step size, t = xj − xm where xj and xm are some of the sample
points, then p(x − t) = p(x, 0).
1.2
0.8
0.6
0.4
0.2
0
probability
0.8
0.6
0.4
0.2
−0.2
0 0.5 1 1.5 2 2.5 3 3.5
x
Figure 15.8: Solution to the KFE using (15.35) at t = 0 and t = π/20
We can see the solution of (15.22) satisfies
p̂(xj , t) = p̂(xj − t, 0).
Figure 15.9 shows the solution of the KFE by applying the algorithm of the Nonlinear Projec-
tion Filter and using the DCT built-in function in MATLAB. The initial probability function
T Dt
after being transformed into coefficients c0 using DCT then is evolved by c(t) = eA c0 to
calculate c(t) where A and D are calculated using the formula in [3]. The evolved coeffi-
cients are then transformed back using the inverse DCT function to obtain the solution. We
can see the solution obtained contains unexpected oscillations.
15.3 A simple example with diffusion
In this section we consider the Kolmogorov equation
∂p ∂p σ2 ∂2 p
(x, t) = (−1) (x, t) + (x, t) (15.36)
∂t ∂x 2 ∂x2
with diffusion in the region x ∈ (−π, π) and t ∈ [0, ∞). We use a Fast Fourier Transform
(FFT) and sample points
(2j − 1)π
xj = −π +
N
for each j = 1, 2, . . . , N. Substituting

N
p(xj , t) = ck (t)ei[xj +(1−1/N )π](k−1)
k=1
into the partial differential equation (15.36) gives

N

dck σ 2 (k − 1)2
+ i(k − 1) + ck (t) ei[xj +(1−1/N )π](k−1) = 0
k=1
dt 2
2 (k−1)2 /2]t
and hence we deduce that ck (t) = ck (0)e−[i(k−1)+σ . It follows that

N
2 (k−1)2 t/2
p(xj , t) = ck (0)e−σ · ei[(xj −t)+(1−1/N )π](k−1) → ck (0)
k=1
as t → ∞.
Figure 15.10 shows the initial uniform distribution (red) and its solution (blue) after one time
step. The process covariance is 0.1 and time step is 2.
0.8
0.6
probability
0.4
0.2
−0.2
0 1 2 3 4 5 6
x
0.8
0.6
probability
0.4
0.2
−0.2
0 1 2 3 4 5 6
x
Figure 15.9: Initial condition and the solution to the KFE using DCT at t= 0.5
1.4
1.2
0.8
probability
0.6
0.4
0.2
−0.2
0 1 2 3 4 5 6
x
Figure 15.10: FFT solution to Kolmogorov’s equation with diffusion after one time step
15.4 A simple train example
The equation of a motion of a train when coasting is
dv = −r2 v 2 dt + σdω. (15.37)
We use Kolmogorov’s forward equation to solve equation (15.37) to see how the probability
density function of speed evolves between measurements. In this state equation, for sim-
plicity, we assume the parameter r2 is known. We also assume that the train starts coasting
at v = 20. The diffusion coefficient is σ = 0.1. We will calculate the probability density
function of speed at equal time steps. We set the initial probability density function as the
normal distribution shown in Figure 15.11.
To solve the Kolmogorov equation we use a DCT. We solve the KFE for several times; the
results are shown in Figures 15.11 and 15.12. We can see that the solutions after two steps
appears as normal, but after three steps something unexpected happens as seen at the right
end of Figure 15.12. We are unable to explain this bizarre behaviour.
0.5
0.4
0.3
Probability
0.2
0.1
-0.1
-0.2
0 5 10 15 20 25 30
Velocity
Figure 15.11: The initial PDF.
0.5
0.4
0.3
Probability
0.2
0.1
-0.1
-0.2
0 5 10 15 20 25 30
Velocity
Figure 15.12: The results of the train example after four iterations.
15.5 Conclusions
The NPF has high computational cost and is hard to implement because of difficulties in
solving Kolmogorov’s forward equation with Galerkin’s method even for a simple model as
in our example. We can solve the KFE in our example because we know the true solution
is p(x − t, 0). If we don’t know the true solution then there are too many uncertainties
in the numerical calculation for us to have any confidence in the results. Furthermore, the
computing complexity of the filter increases rapidly with the dimension of the state variable.
The authors [3] claimed that the filter can be implemented efficiently for low order system
using Fast Fourier Transform. However our experimental results did not support this claim.
Chapter 16
Unscented Kalman Filter
16.1 Introduction
The Unscented Kalman Filter (UKF) for non-linear systems was proposed by Julier and
Uhlmann [28] in 1997. The filter has a special feature that distinguishes it from others
in that it uses a new transformation, the Unscented Transformation, to predict the state of
the system. The detailed description of this transformation will be presented later. This
transformation is believed to help the filter outperform other filters and reduce computational
cost.
The UKF is one of many variations of the well-known Kalman filter. Details of the Kalman
filter appear in Chapter 14. In this chapter we first describe the Unscented Transformation
and use the UKF to estimate train resistance parameters. During this experiment we found
that filter performance depends significantly on the initial values of the covariance matrix
for the errors of the state variables. The covariance matrix is guaranteed to be non-negative
semi-definite provided the weights are non-negative. How to initiate this matrix to obtain
best result from the filter is an important issue we want to discuss. Our particular aim is to
254
Chapter 16. Unscented Kalman Filter 255
choose the covariance matrix that is best for our problem in this chapter.
16.1.1 The Unscented Transformation
The Unscented Transformation (UT) is a method for calculating the statistics of a random
variable which undergoes a nonlinear transformation and builds on the principle that it
is easier to approximate a probability distribution than an arbitrary non-linear function
[27]. The Unscented Transform is similar to a Monte Carlo method in that it applies the
process transformation to points representing the state distribution. Unlike the Monte Carlo
method, the Unscented Transform uses a few carefully chosen points called “sigma points” to
represent the state distribution instead of thousands of points. For the n-dimensional random
variable x with mean x̄ and covariance Pxx , we need 2n + 1 sigma points and the same
number of weighted points to approximate the state. Those points are given by
χ0 = x̄ W0 = κ/(n + κ)

χi = x̄ + (n + κ)Pxx , Wi = 1/2(n + κ) (16.1)
i

χi+n = x̄ − (n + κ)Pxx , Wi+n = 1/2(n + κ).
i
These points have sample mean x̄ and sample covariance Pxx . In the above equations, the
parameter κ is an extra degree of freedom to “fine tune” the higher order moments of the
approximation and can be used to reduce the prediction errors. According to the authors [3]
when κ satisfies the equation
n+κ=3

then the UKF gives a better approximation than the EKF [28]. The term (n + κ)Pxx
i
is the ith row or column of (n + κ)P xx and Wi is the weight for the point i. The process
function is applied to each point to yield the set of transformed sigma points
Yi = f [χi ].
Next, their mean is calculated by weighting the transformed points,

2n
ȳ = Wi Yi.
i=0
Finally the covariance is defined as the weighted outer product of the transformed points,

2n
Pyy = Wi {Yi − ȳ}{Yi − ȳ}T .
i=0
16.2 Estimating train parameters with an UKF
16.2.1 Define the problem
The motion of a train along a track can be described by the stochastic differential equation

P
mdv = − R(v) + G(x) dt + dω (16.2)
v
where m is the mass of the train, P is the tractive power applied at the wheels, v is the speed
of the train, R is the resistance force acting on the train, G is the gradient force acting on the
train, t is time, and ω is the process noise. The gradient force G(x) is
G(x) = mg sin(θ(x)) (16.3)
where θ(x) is the elevation angle of the gradient and the resistive force R(v) is
R(v) = r0 + r2 v 2 (16.4)
where r0 and r2 are resistance parameters. The parameter r0 typically depends on the mass
of the train and on the rolling resistance of the wheels; the parameter r 2 depends on the
aerodynamic drag of the train. Resistance is often modelled with an additional term r 1 v, but
in practice this term is small compared to the other two terms. The parameters r 0 and r2 are
generally unknown. We wish to estimate their values from (noisy) measurements of speed
using a mathematical filter. Equation (16.2) is non-linear in v, and so we require a non-linear

filter.
The state equations for our problem are the set of stochastic differential equations

1 P
dv = − R(v) + G(x) dt + dω (16.5)
m v
dr0 = dη0 (16.6)
dr2 = dη2 (16.7)
where ω, η0 and η2 are process noise variables that represent errors in the model.
For simplicity we assume that the train mass m is known, and scale the other variables so
that m = 1. We also assume that the filter is used only when the train is coasting, and so we
set P = 0. This gives us the simplified system
dv = [−R(v) + G(x)] dt + dω (16.8)
dr0 = dη0 (16.9)
dr2 = dη2 (16.10)
16.2.2 Implementation
We implemented the Unscented Kalman Filter in Matlab. To test the filter, we simulated
the motion of the train for known values of r 0 and r2 , then added observation noise to the
calculated speed sequence. The noisy speed sequence was then used as an input to the filter,
which calculated a sequence of estimates {(rˆ0 , rˆ2 )}.
For simplicity, we tested the filter for the train coasting on a flat track and with no process
noise. This means G(x) = 0 and ω = η0 = η2 = 0, and so the state equations are simply
dv = [−r0 − r2 v 2 ]dt (16.11)
dr0 = 0 (16.12)
dr2 = 0 (16.13)
This set of equations was used to generate a sequence of speeds {vk , k = 0 . . . 300} at one
second intervals for nine sets of parameters
{(r0 , r2 ) | r0 ∈ {0.005, 0.010, 0.015}, r2 ∈ {0.0005, 0.0010, 0.0015}} .
We also generated ten observation noise sequences which are drawn from a normal distribu-
tion N(0, 0.2). The ten observation noise sequences were added to each of the nine speed
sequences, giving a total of 90 observation sequences.
The filter requires us to specify an initial guess for (r̂ 0 (0), r̂0 (2) and an initial guess for the
covariance matrix P of the difference between the estimated and true states. In each case
we set the initial guess for (r̂0 (0), r̂0 (2) to (0.001, 0.0001). We start by setting the initial
covariance to ⎡ ⎤
1 0 0
⎢ ⎥
⎢ ⎥
P1 = ⎢ 0 10−5 −10−7 ⎥.
⎣ ⎦
0 −10−7 10−5
Figure 16.1 shows a (r0 , r2 ) phase plot. The nine red circles represent the nine true values
of (r0 , r2 ). The 90 black circles represent the final estimates (rˆ0 , rˆ2 ) from each of the 90
observation sequences. The red paths represent the paths taken from the initial guess to the
final estimate.
Figure 16.2 shows r̂0 (t) and r̂2 (t) for a single observation sequence. The red lines indicate
the true values of the observation and (r0 , r2 ); the blue lines indicate the estimated values.
Notice that rˆ2 converges very quickly to the correct value.
−3
x 10
2.5
1.5
r
2
0.5
0
−0.01 −0.005 0 0.005 0.01 0.015 0.02 0.025 0.03
r0
Figure 16.1: 90 estimations on flat track with no process noise, initial covariance matrix P1 .
Estimate of velocity
30
20
10
0
0 50 100 150 200 250 300
Estimate of r0
0.02
0.015
0.01
0.005
0
0 50 100 150 200 250 300
−3
Estimate of r2
x 10
2
1.5
0.5
0
0 50 100 150 200 250 300
Figure 16.2: Estimations of v, r0 and r2 on flat track with no process noise, initial covariance matrix
P1 .
16.2.3 The role of the covariance matrix
The filter estimates the covariance matrix

P (i|j) = E {x(i) − x̂(i)}{x(i) − x̂(i)}T |Z j
where x(i) is the (unknown) true state at time i, x̂(i) is the estimated state at time i, i > j
and Z j is the set of observations {z(1), z(2), . . . , z(j)}T .
To start the filter process, we must guess the initial state value (v(0), r 0(0), r2 (0)) and the
initial value of the state covariance matrix P0 . Changing the initial state does not have a
large influence on the ability of the filter to converge to the correct values of r0 and r2 . The
state covariance matrix P used to initialise the filter, however, has significant influence on
the performance of the filter. We wish to find a value for P that gives good results for a wide
range of trains and observation noise sequences.
We consider initial covariance matrices with the form

⎡ ⎤
σ 0 0
⎢ v ⎥
⎢ ⎥
P0 = ⎢ 0 σr0 0 ⎥
⎣ ⎦
0 0 σr2
where σv ,σr0 and σr2 are the variances of the state estimates of v, r0 and r2 respectively. The
filter was tested with each combination of the three standard deviations whose values are
σv ∈ {10−5, 10−4 , 10−3, 10−2 , 10−1 },
σr0 ∈ {10−8 , 10−7 , 10−6, 10−5 , 10−4},
σr2 ∈ {10−10 , 10−9, 10−8 , 10−7, 10−6 }.
With each initial covariance matrix the filter was run for the nine combinations of r 0 and r2
described previously, and with 10 observation noise sequences for each (r 0 , r2 ) combination.
As before, the initial guesses of r̂0 and r̂2 were 0.01 and 0.001 respectively.
For each test we let the filter run for 300 time steps. At the end of each run we calculated the
maximum relative error in the resistance function,
r̂(v) − r(v)
= max
v∈[3,30] r(v)
where r(v) = r0 + r2 v 2 and r̂(v) = r̂0 + r̂2 v 2 , and where r̂0 and r̂2 were the final estimates
of r0 and r2 .
For each initial P we calculated 90 relative errors. The mean of these 90 errors is given in
table 16.1. Figure 16.3 shows the same results in a parallel coordinates plot. We can see
from the table and the plot that the lowest relative error occurs with the covariance matrix
⎡ ⎤
10−1 0 0
⎢ ⎥
⎢ ⎥
P =⎢ 0 10 −5
0 ⎥.
⎣ ⎦
−8
0 0 10
The result of the test with 90 journeys using this initial covariance matrix is presented in
figure 16.4. Figure 16.5 shows the sequence of parameter estimates for a single journey with
initial covariance matrix P . We are also interested in how σv , σr0 and σr2 evolve at each time
step; their evolution in time is shown in Figure 16.6.
Table 16.1: Relative errors for different values of the initial covariance matrix P .
σr 2
σv σr 0 10−10 10−9 10−8 10−7 10−6
10−5 10−8 0.159 0.153 0.153 0.153 0.153

10−7 0.086 0.078 0.075 0.069 0.0706
10−6 0.078 0.067 0.059 0.024 0.0219
10−5 0.087 0.025 0.026 0.0264 0.0252
10−4 0.115 0.027 0.026 0.0255 0.0256
10−4 10−8 0.159 0.153 0.153 0.153 0.153

10−7 0.086 0.126 0.111 0.111 0.0707
10−6 0.087 0.094 0.082 0.082 0.0218
10−5 0.089 0.025 0.026 0.068 0.0251
10−4 0.115 0.0253 0.0265 0.060 0.0255
10−3 10−8 0.159 0.153 0.153 0.153 0.153

10−7 0.088 0.126 0.112 0.071 0.0714
10−6 0.092 0.021 0.021 0.021 0.021
10−5 0.092 0.023 0.023 0.023 0.023
10−4 0.105 0.024 0.024 0.024 0.0247
10−2 10−8 0.161 0.153 0.152 0.152 0.152

10−7 0.096 0.114 0.113 0.113 0.0724
10−6 0.107 0.083 0.0807 0.0808 0.0201
10−5 0.107 0.0206 0.018 0.0178 0.0222
10−4 0.128 0.0216 0.019 0.0183 0.0227
10−1 10−8 0.1617 0.153 0.152 0.152 0.152

10−7 0.1003 0.078 0.0759 0.076 0.0736
10−6 0.116 0.0223 0.0169 0.0169 0.0195
10−5 0.115 0.0209 0.0144 0.0146 0.0203
10−4 0.135 0.0215 0.0146 0.0148 0.0217
Figure 16.3: Mean filter error plotted using parallel coordinates. The smallest error occurs when
σv = 10−1 , σr0 = 10−5 and σr2 = 10−8 .
−3
x 10
2.5
1.5
r2
0.5
0
−0.01 −0.005 0 0.005 0.01 0.015 0.02 0.025 0.03
r0
Figure 16.4: Estimations on flat track with no process noise, initial covariance matrix P .
Estimate of velocity
30
20
10
0
0 50 100 150 200 250 300
Estimate of r
0
0.02
0.015
0.01
0.005
0
0 50 100 150 200 250 300
−3
Estimate of r2
x 10
2
1.5
0.5
0
0 50 100 150 200 250 300
Figure 16.5: Estimations on flat track with no process noise, initial covariance matrix P .
0.1
0.08
σ 0.06
v
0.04
0.02
0
0x 10−5 50 100 150 200 250 300
1.5
1
σ
r
0
0.5
0
0 −8 50 100 150 200 250 300
x 10
1
0.8
σr 0.6
2
0.4
0.2
0
0 50 100 150 200 250 300
t
Figure 16.6: Evolutions of covariances of initial covariance matrix P .
16.2.4 Conclusion
The Unscented Kalman Filter is easy to implement and fast enough that it can be run in
real-time with observations made every second. With a suitably chosen initial covariance
matrix, the method gives adequate results for our test problems. We found a suitable initial
covariance matrix by trial-and-error and searching; more work is required to find a more
rigorous method for initialising the filter. In the future we will employ the filter for the case
of a train coasting on an undulating track using real data.
Chapter 17
Conclusions
In attempting to prove the uniqueness of the optimal strategy to a power phase or a coast
phase interrupting a speedhold phase we came up with two new forms of the necessary con-
ditions. They are of course equivalent but were obtained by looking at the problem from
different perspectives. The original necessary condition developed by Howlett was derived
directly from Pontryagin’s Maximum Principle. One can interpret an optimal strategy intu-
itively by realising that if we start full power too soon before a steep section we can drive
over the steep section more quickly but the energy used is increased; by contrast, if we start
full power too late, we use less energy but the time taken will be increased. Hence between
the two extremes it seems there must be a unique point which is an optimal point for starting
full power.
Our revised necessary conditions were based on Howlett’s result. They were obtained by
minimising a special functional which is a trade-off between the energy used and time spent
on a power or coast phase between two speed holding phases. A new numerical method of
finding an optimal strategy was developed from our new necessary conditions. The optimal
solution was found using the bisection method. This iteration was shown to be effective
in a wide range of examples. The calculation also confirmed the uniqueness of an optimal
267
Chapter 17. Conclusions 268
solution in every example considered.
We also showed that for a track with a piecewise constant gradient we could find an explicit
algebraic relationship for the adjoint variable η in terms of the state variable v for every
specific strategy. By applying Pontryagin’s Maximum Principle we obtained a third form for
the necessary conditions. Although we have not been able to prove the uniqueness of the
optimal strategy at this stage we believe that this final form has some potential to provide
a constructive proof of uniqueness. Such a proof would depend on showing that the η(v)
curves for different decision profiles could never intersect. The phase trajectories of v and η
plotted for both non-optimal strategies and optimal strategies provided valuable information
about the evolution of the key state and adjoint variables.
The estimation of resistance in our train model is a non-linear estimation problem and re-
quires a non-linear filter. Of all the non-linear filters we tested for our problem, the Unscented
Kalman Filter has shown the best convergence and the least computational cost. Despite the
promising initial results, we are still looking for a better theoretical basis for the estimation
process.
Appendix A
First variation equation
A.1 Existence and uniqueness of solution for a system of

ordinary differential equation
The main objective of this appendix is to present a standard theory relating to the uniqueness
of solution of a system of differential equations and perturbation equations or variational
equations as described in the text [5] by Birkhoff and Rota.
Let Xi = Xi (x, t) be continuous functions of the independent variables x = [x 1 , x2 , . . . , xn ] ∈

X and t ∈ T for i = 1, . . . , n and consider a system of n ordinary differential equations
x (t) = X(x, t) (A.1)
with the initial condition

x(a) = c. (A.2)
We assume a is fixed but will consider change to c.
Definition A.1.1 A vector function X(x, t) satisfies a Lipschitz condition in the region R ⊆
269
Chapter A. First variation equation 270
X × T if and only if, for some constant L,
|X(x, t) − X(y, t)| ≤ L|x − y| if (x, t), (y, t) ∈ R. (A.3)
We say that L is a Lipschitz constant for a function X in the variable x.
We have the following theorem on existence of a solution to the system (A.1) and (A.2).
Theorem A.1.2 (Peano existence theorem) If the function X(x, t) is continuous for
|x − c| ≤ K, |t − a| ≤ T,
and if |X(x, t)| ≤ M there, then the vector DE (A.1) has at least one solution x(t), defined
for
|t − a| ≤ min(T, K/M),
satisfying the initial condition x(a) = c.
The proof of this theorem can be found in [5].
We also have a corresponding uniqueness theorem for solution of the system (A.1) and (A.2).
Theorem A.1.3 (Uniqueness Theorem) If the function X(x, t) satisfies a Lipschitz condi-
tion in a domain R ⊆ X × T , there is at most one solution x(t) of the differential equation
(A.1) and (A.2) which satisfies a given initial condition x(a) = c in R.
By the theorem A.1.3 if x(t) and y(t) are both solutions of (A.1) and at a point t in the
domain, say t = a, x(a) = y(a) then x(t) ≡ y(t) in the entire domain.
The solutions of a first-order normal system depend continuously on their initial values.
Recall that a normal system is an n equation system with n unknowns.
Theorem A.1.4 (Continuity Theorem) Let x(t) and y(t) be any two solutions of the vec-
tor differential equation (A.1) with initial condition (A.2), where X(x, t) is continuous and
satisfies the Lipschitz condition (A.3). Then
|x(a + h) − y(a + h)| ≤ eL|h| |x(a) − y(a)|. (A.4)

The proofs of these theorems can be found in [5].
A.2 The variational equation for a normal system
Let x = f (t, c) be the solution to the system (A.1) and (A.2) with initial condition x(a) = c.
Assume that f (t, c) is analytic in t, c then f can be expressed as a convergent power series
in c around c = 0. That means we have

∂ ∂f ∂ ∂f ∂
= = [X(f (t, c), t)]
∂t ∂c ∂c ∂t ∂c

∂X ∂f
= (f (t, c), t) · (t, c) . (A.5)
∂x ∂c
Using a Taylor series expansion for f (t, c) around c = 0 gives us
c2 cn
f (t, c) = f0 (t) + cf1 (t) + f2 (t) + . . . , + fn (t) + · · · (A.6)
2! n!
where fi (t) = ∂ i f /∂ci (t, 0). By (A.6) we have

∂ ∂f ∂
= [f1 (t) + cf2 (t) + . . . ] = f1 (t) + cf2 (t) + . . .
∂t ∂c ∂t
and so (A.5) becomes

∂X
f1 (t) + cf2 (t) +··· = (f (t, c), t) (f1 (t) + cf2 (t) + . . . ) .
∂x
With c = 0, f1 (t) satisfies the linear perturbation equation

∂X
f1 (t) = (f0 (t), t) f1 (t), (A.7)
∂x
with the initial condition f 1 (0) = 0. If we change the initial conditions, (A.7) can be used
to approximate the change of the solution. So it is called the perturbation equation or varia-
tional equation of the normal system (A.1).
Theorem A.2.1 Let the vector function X be continuous, and let x(t, c) be the solution of
the normal system (A.1), taking the initial value c at t = a. Then x(t, c) is a continuously
differentiable function of each of the components c j of c.
For the proof of this theorem, please see [5].
From the proof of this theorem we have the following result.
Corollary A.2.2 If x(t, c) is a solution of the normal system (A.1) satisfying the initial con-
dition x(a) = c for each c, and if each component of the function X is of class C 1 , then for
each j the partial derivative ∂x(t, c)/∂cj is a solution of the perturbation equation
dhi ∂Xi (x(t, c), t)

n
= hk + Hi (h, t, c, ηj ) (A.8)
dt k=1
∂x k
where x(t, c) is the solution of the normal system (A.1) for which x(t, a) = c; the functions
Hi are bounded and tends to zero as η j → 0 uniformly for t − a ≤ T and h − δ j ≤ K;
and δ j is the vector whose components are the Kronecker delta δij .
Bibliography
[1] B. Akin, U. Orguner, and A. Ersak. State estimation of induction motor using unscented
kalman filter. submitted to IEEE on january 31, 2003.
[2] I.A Anis, A.V. Dmitruk, and N.P Osmolovskii. Solution of the problem of the ener-
genetically optimal control of a train by the maximum principle. Coput. Math. Math.
Phys., 25(6), 1985.
[3] R. Beard, J. Kenney, J. Gunther, J. Lawton, and W. Stirling. Non-linear projection filter
based on the galerkin approximation. Journal of Guidance, Control and Dynamics.,
22(2), March-April 1999.
[4] B.R. Benjamin, I.P Milroy, and P.J. Pudney. Energy-efficient operation of long haul
trains. Proceedings of the Fourth International Heavy Haul Railway Conference., pages
369–372, 1989.
[5] G. Birkhoff and G-C. Rota. Ordinary Differential equation. John Wiley and Sons, Inc.,
3rd edition edition, 1978.
[6] R. Booue and R. Definlippi. A galerkin multiharmonic procedure for nonlinear multidi-
mentional random vibration. International Journal of Engineering Science., 25(6):723–
733, 1987.
[7] R.G Burden and J.D. Faires. Numerical Analysis. Brooks/Cole Publishing Company.,
6th edition edition, 1997.
273
Bibliography
[8] J. Cheng. Analysis of optimal driving strategies for train control problems. PhD.
Thesis, 1997.
[9] J. Cheng and P. G. Howlett. Application of critical velocities to the minimisation in the
control of trains. Automatica, 28(1):165–169, 1992.
[10] J. Cheng and P. G. Howlett. A note on the calculation of optimal strategies for the min-
imisation of fuel consumption in the control of trains. IEEE transaction on Automatic
Control, 38(11):1730–1734, 1993.
[11] Y. Davydova, J. Cheng, P. G. Howlett, and J. P. Pudney. Optimal driving strategies for
a train journey with non-zero track gradient and speed limits. Technical Report Report
no. 8, 1996.
[12] H. Eves. Elementary Matrix Theory. Dover Publications, New York, 1980.
[13] J. Figuera. Automatic optimal control of trains with frequent stops. Dyna (Spain),
45(7):263–269, 1970.
[14] Howlett P. G. Optimal strategies for the control of train. Automatica, 32(4):519–532,
1996.
[15] P. Gil, J. Henriques, H. Duarte-Ramos, and A. Dourado. State-space neural networks

and the unscented kalman filter in online non-linear system identification. IASTED
International Conference on Intelligent System and Control (ISC 2001)., November.
2001.
[16] R. Hill. Elementary Linear Algebra with Applications. Harcourt Brace College Pub-
lishers, New York, third edition edition, 1996.
[17] H. H. Hoang, M. P. Polis, and A. Haurie. Reucing energy consumption through tra-
jectory optimazation for a metero network. IEEE Transaction on Automatic Control,
20(5):590 – 595, 1975.
Bibliography
[18] P. Howlett and P. Pudney. Energy-efficient driving strategies for long-haul trains. Pro-
ceedings of CORE 2000 Conference on Railway Engineering., May 2000.
[19] P. G. Howlett. An optimal strategy for the control of a train. J. Austral. Math. Soc.
series B, 31:454 – 471, 1990.
[20] P. G. Howlett. The optimal control of a train. Annals of Operations research., 98:65–87,
2000.
[21] P. G. Howlett and J. Cheng. Optimal driving strategies for a train on a track with
continuously varying gradient. J. Austral. Math. Soc. Series B., 38:388–410, 1995.
[22] P. G. Howlett and A. Leizarowitz. Optimal strategies for vehicle control problems
with finite control sets. Dynamic of Continuous, Discrete and Impulsive Systems B,
Application and Algorithms., 8:41–69, 2001.
[23] P. G. Howlett and P. J. Pudney. An optimal driving for a solar powered car on an
undulating road. Dynamics of Continuous, Discrete and Impulsive System., 4:553–567,
1998.
[24] P. G. Howlett, P. J. Pudney, D. Gates, and T. Tarnopolskaya. Optimal driving strategies

for a solar car on a level road. IMA journal of Mathematics Applied in Business and
Industry, 8:59–81, 1997.
[25] P.G. Howlett and P.J. Pudney. Energy-efficient Train Control. Springer–Verlag London
Ltd., 1995.
[26] J. Joseph and Jr. LaViola. A comparison of unscented and extended kalman filtering for
estimating quaternion motion. Proceedings of the 2003 American Control Conference,
IEEE Press, pages 2435–2440, June 2003.
[27] S. Julier and J. Uhlmann. A general method for approximating nonlinear transformation
of probability distributions. Technical report, 1996.
[28] S. Julier and J. Uhlmann. A new extension of the kalman filter to nonlinear systems.
Bibliography
Proceedings of AeroSense: The 11th International Symposium on Aerospace/Defence

Sensing, Simulation and Controls, 1997.
[29] S. Julier, J. Uhlmann, and H. F. Durant-Whyte. Navigation and parameter estimations

of high speed road vehicles. Robotics and Automation Conference, pages 101–105,
1995.
[30] S. Julier, J. Uhlmann, and H. F. Durrant-Whyte. A new method for the nonlinear trans-
formation of means and covariances in filters and estimators. IEEE transactions on
Automatic Control, March 2000.
[31] R. E. Kalman. A new approach to linear filtering and prediction problems. Journal of
Basic Engineering., pages 35–45, 1960.
[32] R. E. Kalman and R. S. Bucy. New results in linear filering and prediction theory.
Journal of Basic Engineering, pages 95–107, March 1961.
[33] E. Khmelnitsky. On an optimal control problem of train operation. IEEE Transaction

on Automatic Control, 45(7):1257–1266, July 2000.
[34] P. Kokotovic and G. Singh. Minimum-energy control of a traction motor. IEEE trans-
action on Automatic Control., 17(1):92–95, 1972.
[35] E. B. Lee and L. Markus. Foundations of Optimal Control Theory. John Wiley and
Sons, New York, 1967.
[36] T. Lefebvre, H. Bruyninckx, and De Schutter J. Comment on “a new method for the
nonlinear transformation of means and covariances in filters and estimators” . IEEE
Transactions on Automatic Control, 47(8):1406–1408, August 2002.
[37] Peihua Li and Tianwen Zhang. Unscented kalman filter for visual curve tracking. Pro-
ceedings of Statistical Methods in Video Processing, June 2002.
[38] F. H. Ling and X. X. Wu. Fast garlekin method and its application to determine peri-
Bibliography
odic solutions of nonlinear oscillators. International Journal of Nonlinear Mechanics.,

22(2):89–98, 1987.
[39] R. Liu and I. M. Golovitcher. Energy-efficient operation of rail vehicles. Transportation

Research, 37:917–932, 2003.
[40] DG Luenberger. Optimization by Vector Space Methods. Jonh Wiley and Sons, New
York, 1969.
[41] H. A. Prime, S. Sujitjorn, C. J. Goodman, and B. Mellitt. Energy reduction by

dynamic train control. Control in Transportaion System, proceedings of the 5th
IFAC/IFIP/IFORS conference:173–177, 1986.
[42] P. J. Pudney. Optimal energy management for solar powered cars. PhD thesis, 2000.
[43] P. J. Pudney and P. G. Howlett. Optimal driving strategy for a train journey with speed
limits. J. Austral. Maths. Soc. Series B, 36:38–49, 1994.
[44] D. Simon. From here to infinity. Embbeded Systems Programming, October 2001.
[45] H. Strobel and P. Horn. Energy optimum on board microcomputer control of train
operation. Bridge between Control and Science and Technology., 16(3):219–230, 1985.
[46] R. van der Merwe and E.A. Wan. Efficient derivative-free kalman filters for online
learning. European Symposium on Artificial Neural Networks(ESANN), ESANN2001
proceedings, April 2001.
[47] E. A. Wan and R. van der Merwe. The unscented kalman filter for nonlinear estimation.
Proc. of IEEE Symposium 2000 (AS-SPCC)., October 2000.
[48] E. A. Wan, R. van der Merwe, and A. T. Nelson. Dual estimation and the unscented
transformation. Neural information Processing Systems., pages 666–672, December
2000.

Train Control

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Train Control

Uploaded by

Copyright:

Available Formats

Analysis of necessary conditions for the

optimal control of a train

Thesis submitted for the degree of

Doctor of Philosophy (Mathematics)

School of Mathematics and Statistics

List of Figures viii

List of Tables xiii

I The train control problem 1

2.2 Necessary conditions for an optimal journey . . . . . . . . . . . . . . . . . 14

3 Steep hills and limiting speeds 29

4 Uniqueness of the optimal strategy on a non-steep track 34

II A new method for finding optimal driving strategies 50

5 Hold-power-hold for a single steep uphill gradient 51

5.5 Numerical solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

6 Hold-power-hold on an uphill section with two steep gradients 75

7 Hold-power-hold on an uphill section with piecewise constant gradient 91

8 Hold-coast-hold for a single steep downhill gradient 105

8.5.2 Algebraic approach . . . . . . . . . . . . . . . . . . . . . . . . . . 114

9 Hold-coast-hold on a downhill section with two steep gradients 124

10 Hold-coast-hold on a steep downhill section with many gradients 139

III Phase trajectories 148

11 Phase trajectories on a steep uphill section 149

11.4 Phase diagram for (v, η) . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

12 Phase trajectories on a steep downhill section 180

13 Coasting and braking at the end of a journey 199

IV Parameter estimation 214

14 Parameter estimation 215

15 Nonlinear Projection Filter 225

16 Unscented Kalman Filter 254

16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254

A First variation equation 269

1.1 FreightMiser screen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2.1 Diagram of possible transitions 1 . . . . . . . . . . . . . . . . . . . . . . 21

3.1 Speed decreases on a steep uphill section . . . . . . . . . . . . . . . . . . 30

4.1 Optimal journeys for different hold speeds . . . . . . . . . . . . . . . . . 36

5.1 Optimal speed profile for a single steep gradient . . . . . . . . . . . . . . 53

6.1 Speed profile of γ12 < γ11 track . . . . . . . . . . . . . . . . . . . . . . . 77

7.1 Track gradient profile . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

8.1 Track profile and speed profile . . . . . . . . . . . . . . . . . . . . . . . 106

9.1 Speed profile of two steep sections 1 . . . . . . . . . . . . . . . . . . . . 126

12.1 Track profile and speed profile . . . . . . . . . . . . . . . . . . . . . . . 182

13.1 Track diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

15.1 Kolmogorov solution using FFT without diffusion at t = 0 . . . . . . . . 231

2.1 Summary of optimal control modes . . . . . . . . . . . . . . . . . . . . . 19

5.1 Experimental results of hold-power-hold on a single steep uphill. . . . . . . 73

6.1 Experimental results of hold-power-hold on a double steep uphill section,

8.1 Experimental results of hold-coast-hold on a single steep downhill section. . 123

9.1 Experimental results of an optimal journey on a double steep downhill sec-

11.1 Eight combinations of long and short steep sections. . . . . . . . . . . . . . 170

12.1 Eight combinations of long and short steep sections. . . . . . . . . . . . . . 189

Previous work formulated the problem of minimising the energy required to

A critical part of constructing an optimal control strategy is determining where

To calculate an optimal driving strategy, we need a realistic model of train per-

I am very grateful to my Associate Supervisor, Dr Peter Pudney, for his constant

Finally I am indebted to my family for their unconditional love and support

The train control problem

1.1 Project aims

1.2 History of optimal train control

In 2000, Khmelnitsky [33] provided a comprehensive analysis of train control on a track

Figure 1.1: FreightMiser screen.

1.3 Uniqueness of the optimal control strategy

The optimal control of a train is discussed in Part I – III of this thesis.

1.4 Parameter estimation