
System Identification, Time Series Analysis and Forecasting

The Captain Toolbox

Handbook v2.0, February 2007


D. J. Pedregal, C. J. Taylor and P. C. Young

Copyright (c) 2007


Centre for Research on Environmental Systems and Statistics (CRES),
Lancaster University, Lancaster, LA1 4YQ, United Kingdom.
Web www.es.lancs.ac.uk/cres/captain

Email c.taylor@lancaster.ac.uk
Captain Toolbox

CAPTAIN is a MATLAB compatible toolbox for non-stationary time series analysis, system
identification, signal processing and forecasting, using unobserved components models,
time variable parameter models, state dependent parameter models and multiple input
transfer function models. CAPTAIN also includes functions for true digital control.

Toolbox Authors by Department

Department of Environmental Science, Faculty of Science and Technology,


Lancaster University, Lancaster, LA1 4YQ, United Kingdom

Prof. Peter C. Young p.young@lancaster.ac.uk


Dr. Wlodek Tych w.tych@lancaster.ac.uk

Engineering Department, Faculty of Science and Technology,


Lancaster University, Lancaster, LA1 4YR, United Kingdom

Dr. C. James Taylor c.taylor@lancaster.ac.uk

Escuela Técnica Superior de Ingenieros Industriales, Edificio Politécnico,


Campus Universitario s/n, 13071 Ciudad Real, Spain

Dr. Diego J. Pedregal diego.pedregal@uclm.es

Additional Contributors

Dr Paul McKenna, Department of Environmental Science, Lancaster University

Dr Renata Romanowicz, Department of Environmental Science, Lancaster University

Toolbox Installation

CAPTAIN is usually distributed as a mixture of pre-parsed MATLAB pseudo-code (P-files)
and conventional M-files. The following installation instructions assume MATLAB itself is
already installed.

1 Copy all the M- and P-files to a directory where you want the toolbox to reside, such as
Program Files\Matlab\Toolbox\Captain or similar.
2 Start MATLAB and add the above location of the toolbox to your path. You can use the
standard addpath function or the graphical user interface to do this. Refer to your
MATLAB documentation for more information.

3 Once installed, typing captdemo in the MATLAB Command Window starts a simple
graphical user interface for access to the on-line demos. If this does not work, then
check that you have correctly added the toolbox location to your MATLAB path.

4 To obtain a full list of CAPTAIN functions, type help captain in the MATLAB
Command Window, replacing captain with the name of the installation directory
chosen in item 1 above.

To uninstall CAPTAIN, simply delete the files and remove the associated path.

Conditions of use

The CAPTAIN software package may be freely used for scientific or educational purposes.
However, if you publish any results using CAPTAIN you should state clearly that you used
the tools. The full reference is:

Young, P.C., Taylor, C.J., Tych, W. and Pedregal, D.J. (2007) The Captain Toolbox.
Centre for Research on Environmental Systems and Statistics, Lancaster University, UK.
Internet: www.es.lancs.ac.uk/cres/captain

For commercial applications, permission is required from the authors.

The Toolbox is provided without formal support, although questions and bug reports can
be emailed to the authors.

Version Tracking

Recent versions of CAPTAIN are developed for MATLAB v7.0 (R14) onwards.

The examples in the handbook were originally developed with CAPTAIN v5.2 and
MATLAB v6.5 (R13) on a Windows PC. They have subsequently been tested for CAPTAIN
v6.0 and MATLAB v7.3 (R2006b). However, updates to the optimisation routines and
default values may yield different numerical results in some cases.

Please check the on-line help for the latest function calling syntax and default values.

Contents

Preface
Chapter 1 Introduction 1
Chapter 2 State Space models 18
Chapter 3 Unobserved Components models 42
Chapter 4 Time Variable Parameter models 70
Chapter 5 State Dependent Parameter models 93
Chapter 6 Discrete-Time Transfer Function models 110
Chapter 7 Continuous-Time Transfer Function models 143
Chapter 8 True Digital Control 156
Bibliography 166
Appendix 1 Reference Guide 173
Appendix 2 Data Sets, Functions and Abbreviations 269

Examples

1.1 Interpolation of advertising data using DLR adv.dat 11
1.2 Transfer function model estimation using RIV vent.dat 13
2.1 Estimation of a trend for the air passenger series air.dat 23
2.2 Hyper-parameter estimation for the air passenger series air.dat 30
2.3 Interpolation and variance intervention for steel consumption steel.dat 33
2.4 Time variable mean estimation for volume in the Nile river nile.dat 38
3.1 Analysis of the Mauna Loa CO2 data using DHR co2.dat 54
3.2 Modelling the US GDP using Trend + AR models usgdp.dat 59
3.3 Steel consumption in the UK revisited steel.dat 63
3.4 Car drivers killed and seriously injured cars.dat 66
4.1 Initial Evaluation of the River Cam data using DLR cam.dat 72
4.2 Analysis of a signal with changing frequency using DAR sdar.dat 78
4.3 River Cam Data analysed using DARX cam.dat 82
4.4 Comparison of DARX and DTFM for simulated data sdtfm1.dat 90
5.1 Analysis of a simulated SDARX model - 98
5.2 Final parameter estimates for the model in Example 5.1 - 102
5.3 Analysis of squid data squid.dat 104
5.4 Hydraulic Actuator - 106
6.1 Simulation showing the bias of the LS parameter estimates - 114
6.2 Simulation experiment using the IV algorithm - 121
6.3 Simulation comparing the SRIV, RIV and ML estimates - 125
6.4 Gas furnace gas.dat 133
6.5 Unemployment rate in the USA usemp.dat 137
6.6 Ventilation data re-visited vent.dat 141
7.1 Model identification for the WG Example swg01.dat 150
7.2 Parameter Estimation for the WG Example swg01.dat 151
7.3 Analysis of winding pilot plant data wind.dat 153
8.1 Non-minimal state space form for the ventilation model - 157
8.2 Pole assignment design for the ventilation model - 160
8.3 Optimal design for the ventilation model - 161

CHAPTER 1
INTRODUCTION

CAPTAIN is a MATLAB compatible toolbox for non-stationary time series analysis and
forecasting. Based around a powerful state space framework, CAPTAIN extends MATLAB
to allow, in the most general case, for the identification of Unobserved Components (UC)
models. Here, the time series is assumed to be composed of an additive or multiplicative
combination of different components that have defined statistical characteristics but which
cannot be observed directly. With Maximum Likelihood estimation of most models and the
inclusion of several popular model forms, such as the Basic Structural Model of Harvey
(1989) and the Dynamic Linear Model of West and Harrison (1989), together with a
standard set of data pre-processing, system identification and model validation tools,
CAPTAIN is a wide-ranging package for signal processing and general time series analysis.

Uniquely, however, CAPTAIN focuses on Time Variable Parameter (TVP) models, where
the stochastic evolution of each parameter is assumed to be described by a generalised
random walk process (Jakeman and Young, 1981). In this regard, the state space
formulation utilised is particularly well suited to estimation based on optimal recursive
estimation, in which the time variable parameters are estimated sequentially whilst
working through the data in temporal order. In the off-line situation, where all the time
series data are available for analysis, this Kalman filtering operation (Kalman, 1960) is
accompanied by optimal recursive smoothing. Here the estimates obtained from the
forward pass filtering algorithm are updated sequentially whilst working through the data
in reverse temporal order using a backwards-recursive Fixed Interval Smoothing (FIS)
algorithm (Bryson and Ho, 1969).

In this manner, CAPTAIN provides novel tools for TVP analysis, allowing for the optimal
estimation of dynamic regression models, including linear regression, auto-regression
(Young, 1998b) and harmonic regression (Young et al., 1999). Furthermore, a closely
related algorithm for state dependent parameter estimation provides for the non-parametric
identification and forecasting of a very wide class of nonlinear systems, including chaotic
systems. The identification stage in this process again exploits the recursive FIS
algorithms, combined with special data re-ordering and back-fitting procedures, to obtain
estimates of any state dependent parameter variations (Young, 2000).

Of course, in many cases, specifying time invariant parameters for the model yields the
equivalent, conventional, stationary model. In this regard, one model that has received
special treatment in the toolbox is the multiple-input, single-output Transfer Function (TF)
model. CAPTAIN includes functions for robust unbiased identification and estimation of
both discrete-time (Young, 1984, 1985) and continuous-time (Young, 2002) TF models.
One advantage of the TF model is its simplicity and ability to characterise the dominant
modal behaviour of a dynamic system. This makes such a model an ideal basis for control
system design.

In the latter regard, the toolbox includes a set of functions for True Digital Control (TDC),
based on the Proportional-Integral-Plus (PIP) control system design methodology (Young
et al., 1987; Taylor et al. 1998, 2000). The underlying philosophy of the approach is that
the entire design procedure, from the identification and estimation of a suitable model
through to the practical implementation of the final control algorithm, is carried out in
discrete time. This differs from many conventional digital controllers, where an inherently
continuous time algorithm is digitised for implementation purposes. Indeed, CAPTAIN has
been successfully utilised for the design of practical PIP control systems for many years
(e.g. Young et al., 1994; Gu et al., 2003; Taylor et al., 2004; Taylor and Shaban, 2006).

As demonstrated by the numerous publications and examples below, the CAPTAIN package
is useful for system identification, signal extraction, interpolation, backcasting, forecasting
and Data-Based Mechanistic (DBM) analysis of a wide range of linear and non-linear
stochastic systems. In the latter case, the resulting DBM model is only considered fully
acceptable if, in addition to explaining the data well, it also provides a description that has
relevance to the physical reality of the system under study (e.g. Young, 1998b).

Some of the estimation algorithms considered here were developed originally in the
1960s, 1970s and 1980s for the CAPTAIN and microCAPTAIN time series analysis and
forecasting packages (MS-DOS based). The associated optimisation algorithms were
developed in the 1980s and 1990s and are used in the latest version of microCAPTAIN (Young
and Benner, 1991). However, the MATLAB implementation is much more flexible than
microCAPTAIN and includes the latest innovations and improvements to the algorithms.
Note that the present text refers exclusively to this CAPTAIN Toolbox for Time Series
Analysis and Forecasting using MATLAB (Taylor et al., 2007).

1.1 Modelling Philosophy

As we look around us, we perceive complexity in all directions: environmental, biological
and ecological systems, socio-economic systems, and some of the more complex
engineering systems. They all appear to be complicated assemblages of interacting
processes, many of which are inherently nonlinear dynamic systems, often with
considerable uncertainty about both their nature and their interconnections. It is not too
surprising, therefore, that the mathematical models of such systems, as constructed by
scientists, social scientists and engineers, are often similarly complex. What is perhaps
surprising, however, is the apparently widespread belief that such systems can be described
very well, if not exactly, by deterministic mathematical equations, with little or no
quantification of the associated uncertainty. Such deterministic reductionism leads
inexorably to large, nonlinear simulation models which reflect the popular view that
complex systems must be described by similarly complex models.

The CAPTAIN toolbox has evolved from a different DBM modelling philosophy, developed
by the present third author and colleagues, which is almost the antithesis of deterministic
reductionism. DBM models are obtained initially from the analysis of observational time-
series but are only considered credible if they can be interpreted in physically meaningful
terms. It is a philosophy that emphasises the importance of parametrically efficient, low
order, dominant mode models, as well as the development of stochastic methods and the
associated statistical analysis required for the identification and estimation of such models.
Furthermore, it stresses the importance of explicitly acknowledging the basic uncertainty
that is essential to any characterisation of physical, chemical, biological and socio-
economic processes.

Previous publications map the evolution of the DBM philosophy and its methodological
underpinning. Such publications utilise the approach for the analysis of numerous natural
and man-made systems. An incomplete list includes: Beck and Young (1975); Jarvis et al.
(1999); Parkinson and Young (1998); Price et al. (1999, 2000, 2001); Shackley et al.
(1998); Tych et al. (1999); Ye et al. (1998); Young (1978, 1981, 1983, 1984, 1985, 1993a,
1993b, 1994, 1998a, 1998b, 1999a, 1999b, 2000a, 2000b, 2000c, 2001a, 2001b, 2002);
Young and Beven (1994); Young and Lees (1993); Young and Minchin (1991); Young and
Pedregal (1996, 1997, 1998, 1999a, 1999b); Young et al. (1996, 1997, 1999, 2000).

Naturally, these publications introduce a wide range of modelling tools, encompassing
various model structures and identification algorithms. However, they can be broadly
categorised into the four closely related and overlapping themes below.

1. Many of the tools that underpin the DBM modelling philosophy can be unified in terms
of the discrete-time UC model. Here, the components may include a trend or low
frequency component, a seasonal component (e.g. annual seasonality), additional
sustained cyclical or quasi-cyclical components, stochastic perturbations, a component
to capture the influence of exogenous input signals and so on. CAPTAIN allows for a
wide range of such components, as discussed throughout the text.


2. Nonstationary and nonlinear signal processing based on the identification and
estimation of stochastic models with time varying parameters. In this case, the term
nonstationarity is assumed to mean that the statistical properties of the signal, as
defined by the parameters in an associated stochastic model, are changing over time at
a rate which is slow in relation to the rates of change of the stochastic state variables
in the system under study. Although such nonstationary systems exhibit nonlinear
behaviour, this can often be approximated well by TVP (or piece-wise linear) models,
the parameters of which are recursively estimated.

3. Further to item 2. above, if the changes in the parameters are functions of the state or
input variables (i.e. they actually constitute stochastic state variables), then the system
is truly nonlinear and likely to exhibit severe nonlinear behaviour. Normally, this
cannot be approximated in a simple TVP manner; in which case, recourse must be
made to alternative and, in this context, more powerful State Dependent Parameter
(SDP) modelling methods.

4. Finally, if the essential small perturbation behaviour of the system can be approximated
by linearised TF models, then robust unbiased, Refined Instrumental Variable (RIV)
and Simplified Refined Instrumental Variable (SRIV) algorithms are employed. Here,
either discrete-time TF models represented in terms of the backward shift operator
(often denoted in the statistical and engineering literature by either z⁻¹, q, B or L,
where the latter is utilised in the present text) or continuous-time TF models based on
the Laplace Transform s-operator are identified and estimated.
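As a concrete illustration of the backward shift notation, the first-order discrete-time TF y(k) = b L u(k) / (1 + a L) expands, via L y(k) = y(k−1), to the difference equation y(k) = −a y(k−1) + b u(k−1). The sketch below simulates such a model; it is shown in Python purely for illustration (the function name simulate_tf is invented here and is not a toolbox function):

```python
def simulate_tf(u, a, b, delay=1):
    """Simulate y(k) = -a*y(k-1) + b*u(k-delay), i.e. the first-order
    discrete-time TF  b*L^delay / (1 + a*L)  written with the backward
    shift operator L, where L^j x(k) = x(k-j)."""
    y = [0.0] * len(u)
    for k in range(len(u)):
        yprev = y[k - 1] if k >= 1 else 0.0        # L y(k)
        uin = u[k - delay] if k >= delay else 0.0  # L^delay u(k)
        y[k] = -a * yprev + b * uin
    return y
```

For a unit step input with a = −0.8 and b = 0.2, the response converges to the steady state gain b/(1 + a) = 1.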

1.2 Toolbox Overview

MATLAB is a high performance language published by The MathWorks, Inc., integrating
computation, visualisation and programming in a single environment (MathWorks, 2001).
CAPTAIN is a collection of MATLAB functions for the estimation of UC, TVP, SDP and TF
models. By also including a number of tools for data pre-processing, system identification
and model validation, CAPTAIN provides a powerful all round package for the analysis of
complex stochastic systems. The following subsections introduce these main areas of
functionality.

Unobserved Components models

CAPTAIN includes a range of UC models, a number of which are unique to this toolbox. In
particular, the Dynamic Harmonic Regression (DHR) model, estimated using the function
dhr, is very useful for signal extraction and forecasting of periodic or quasi-periodic series.
This function provides smoothed estimates of the series, as well as all its components
(trend, fundamental frequency and harmonic components), together with the estimated


changing amplitude and phase of the latter. Typical applications are for the analysis of
periodic environmental and economic time-series; restoration of noisy signals with gaps or
other aberrations; and the evaluation of temporal changes in environmental data etc.
Furthermore, the same function allows for the estimation of the well-known Basic
Structural Models (BSM) of Harvey (1989).

It should be pointed out that, while it is sometimes convenient to categorise the
functionality of the toolbox, there is considerable overlap between the methodological
areas chosen. For example, the DHR model is a particular case of the general stochastic
TVP model discussed in the following subsection. In this regard, the hyper-parameters of
the model, which define the statistical properties of the time variable parameters, need to
be estimated in some manner. CAPTAIN provides three approaches, all through the function
dhropt, namely: Maximum Likelihood (ML) based on prediction error decomposition;
minimisation of the multiple-steps-ahead forecasting errors; and a special frequency
domain optimisation, based on fitting the model pseudo-spectrum to the logarithm of the
Auto-Regression (AR) spectrum.
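The prediction error decomposition referred to above can be illustrated with a minimal sketch: the Kalman filter innovations v_t and their variances f_t yield the log-likelihood directly as logL = -0.5·Σ(log f_t + v_t²/f_t). The Python fragment below shows the principle for a random walk plus noise model only (the function name rw_loglik is invented here, and the measurement noise variance is normalised to unity so that the NVR is the single unknown hyper-parameter); it is not the toolbox implementation:

```python
import math

def rw_loglik(y, nvr):
    """Log-likelihood of a random walk plus noise model via prediction
    error decomposition, with measurement variance normalised to 1."""
    m, p = y[0], 1e6              # diffuse start on the level
    ll = 0.0
    for t in range(1, len(y)):
        p_pred = p + nvr          # one-step-ahead state variance
        f = p_pred + 1.0          # innovation variance
        v = y[t] - m              # one-step-ahead prediction error
        ll += -0.5 * (math.log(f) + v * v / f)
        k = p_pred / f            # Kalman gain
        m += k * v
        p = (1.0 - k) * p_pred
    return ll
```

An optimiser then searches over the hyper-parameters to maximise this quantity; for near-constant data, for instance, a small NVR yields a higher likelihood than a large one.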

An alternative to dhr/dhropt is provided by the pair univ/univopt, which allow for the
estimation of various additional UC model forms. Here, the trend is extracted from the
time series and a perturbational component about the trend is modelled as a pure AR
component. Although they may also be utilised for modelling seasonal series,
univ/univopt are particularly useful in cases where the periodic behaviour of the
perturbation about the trend is not very marked. In this case, the models are estimated
using either standard statistical methods or a sequential spectral decomposition approach
that has been developed for the toolbox in order to avoid identifiability problems.

Time Variable Parameter models

The class of TVP, or dynamic, regression models, includes: Dynamic Linear Regression
(DLR), Dynamic Harmonic Regression (DHR), Dynamic Auto-Regression (DAR),
Dynamic Auto-Regression with eXogenous variables (DARX) and the closely related
Dynamic Transfer Function (DTF) model. It should be noted that the term dynamic,
which is used to differentiate time variable parameter regression models from their
standard constant parameter relatives, is somewhat misleading, since not all of these
models are inherently dynamic in a systems sense. However, it is a common term in certain
areas of statistics (e.g. West and Harrison, 1989 and the references therein) and is retained
for this reason.

CAPTAIN provides functions that allow for the optimal estimation of all these dynamic
regression models. In each case, Fixed Interval Smoothing (FIS) estimates of the TVPs are


obtained, under the assumption that the parameters vary as one of a family of generalised
random walks (see Chapter 2), namely: Random Walk (RW); Integrated Random Walk
(IRW); Smoothed Random Walk (SRW); and Local Linear Trend (LLT). The associated
filtering and FIS algorithms are accessible via shells, namely the functions dlr, dhr, dar,
darx and dtfm. Since the regressors are freely defined by the user, the most flexible
toolbox function for TVP analysis is dlr, which can include all the remaining models as
special cases. The other functions all restrict the model to the most commonly used forms.
For example, dhr automatically constrains the regressors to model harmonic components.
As one of the key tools for estimating UC models, it has already been discussed above.

At this juncture, it is worth pointing out that CAPTAIN includes the functions mar and
arspec for Auto-Regression (AR) model and spectrum estimation. However, in the context
of dynamic regression, dar and darsp are instead useful for evaluating changing signal
spectra and time-frequency analysis based on DAR models, since they provide the AR
spectrum at each point in time based on the locally optimum time variable AR parameters.

Further to this, darx is an extension of the DAR model to include measured eXogenous or
input time series that are thought to affect the output, while dtfm augments the model in
order to allow for coloured noise in the output signal. The latter function employs
instrumental variables in the solution to ensure that the parameter estimates are unbiased
(see below and Chapter 4). In this manner, the functions darx and dtfm are truly dynamic
in a systems sense and form a link between the dynamic regression analysis considered
here and the dedicated TF modelling component of the toolbox discussed below.

In the case of dlr, dar, darx and dtfm, the Noise Variance Ratio (NVR) and other hyper-
parameters, which define the statistical properties of the TVPs, are optimised via ML
based on prediction error decomposition. The relevant toolbox functions are dlropt,
daropt, darxopt and dtfmopt. In comparison with most other algorithms for TVP
estimation, the main innovations in CAPTAIN are this automatic hyper-parameter
optimisation, the provision of FIS rather than the filtered TVP estimates and the various
special uses outlined above.

State Dependent Parameter models

The approach to TVP estimation discussed above works very well in situations where the
parameters are slowly varying when compared to the observed temporal variation in the
measured system inputs and outputs. Although such models are nonlinear systems, since
the same inputs, injected at different times, will elicit quite different output responses, the
resultant nonlinearity is fairly mild. It is only when the parameters are varying at a rate
commensurate with that of the system variables themselves that the model behaves in a


heavily nonlinear or even chaotic manner. For such cases, CAPTAIN includes a novel
algorithm for state dependent parameter estimation, sdp, allowing for the non-parametric
identification and forecasting of a very wide class of nonlinear systems.

Multi-Input Transfer Function models

There are numerous algorithms for estimating TF models. However, the primary technique
employed in CAPTAIN is the least squares-based instrumental variable approach. Here, an
adaptive auxiliary model is introduced into the solution in order to avoid parameter bias
and to optimally filter the data, so making the estimation more statistically efficient.

In particular, CAPTAIN provides the recursive and en bloc RIV and SRIV algorithms, as
well as more conventional least squares based approaches, primarily through the functions
riv (for discrete-time systems) and rivc (continuous-time). Both these functions return the
modelling results in the form of a special matrix from which the various parameters and
standard errors may be extracted using getpar. Such parameters may subsequently be
utilised for simulation and forecasting through conventional MATLAB commands like
filter, or by using SIMULINK (MathWorks, 2001).

For a given physical system, an appropriate structure first needs to be identified, i.e. the
most appropriate values for the time delay and the orders of the numerator and
denominator polynomials in the TF. In this regard, CAPTAIN utilises two functions, namely
rivid (discrete-time) and rivcid (continuous-time), which provide numerous statistical
diagnostics associated with the model. These include the Coefficient of Determination R_T²,
based on the response error, which is a simple measure of model fit; and the more
sophisticated Young Identification Criterion (YIC), which provides a combined measure of
fit and parametric efficiency.
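The response-error based R_T² can be written down in a few lines. The sketch below (generic Python, not toolbox code; the function name coeff_of_determination is invented here) computes it from the measured output and the model's simulated response:

```python
def coeff_of_determination(y, yhat):
    """R_T^2 = 1 - SSE/SST: the fraction of the output variance explained
    by the model response yhat, a simple measure of model fit."""
    n = len(y)
    ybar = sum(y) / n
    sse = sum((a - b) ** 2 for a, b in zip(y, yhat))  # response error
    sst = sum((a - ybar) ** 2 for a in y)             # variance about mean
    return 1.0 - sse / sst
```

A value of 1 indicates a perfect fit, while 0 means the model explains no more than the sample mean of the output.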

True Digital Control

Following the identification of a suitable discrete-time time TF model, PIP control systems
are determined using either the pip or pipopt functions, for pole assignment or Linear
Quadratic (LQ) optimal design respectively. PIP control with command input anticipation
is implemented using pipcom. Finally, gains and pipcl are used to analyse the closed-loop
system. The above functions are for the single input, single output (SISO) case, while
mfdform, mfd2nmss, mpipqr and mpipinit are used for multivariable PIP control.
Finally, dlrqri provides the iterative linear quadratic regulator solution for either the SISO
or multivariable cases; while piplib is the associated Simulink library for various PIP
control structures, including the conventional feedback form and an alternative forward
path approach.


Conventional Models, Identification Tools and Auxiliary functions

As pointed out above, specifying time invariant parameters in CAPTAIN usually yields the
equivalent stationary time series model. In this manner, many of the functions above may
be utilised to estimate either the well known conventional model or the more sophisticated
TVP version, depending on the input arguments chosen.

Similarly, system identification is inherent to the modelling approach utilised by most of
the functions already discussed. However, identification tools not yet mentioned include:
acf to determine the sample and partial autocorrelation function; ccf for the sample
cross-correlation; period to estimate the periodogram; and statist for some sample
descriptive statistics. Additional statistical diagnostics include: boxcox, cusum and histon.
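As an illustration of the quantity computed by acf, the sample autocorrelation at lag k is the lag-k sample autocovariance normalised by the sample variance. A minimal generic sketch (shown in Python for illustration; this is not the toolbox implementation):

```python
def sample_acf(y, maxlag):
    """Sample autocorrelation r_k = c_k / c_0, where
    c_k = (1/n) * sum_{t=k}^{n-1} (y_t - ybar)*(y_{t-k} - ybar)."""
    n = len(y)
    ybar = sum(y) / n
    d = [v - ybar for v in y]                 # mean-deviation series
    c0 = sum(v * v for v in d) / n            # lag-0 autocovariance
    return [sum(d[t] * d[t - k] for t in range(k, n)) / (n * c0)
            for k in range(maxlag + 1)]
```

By construction r_0 = 1, and a strongly alternating series yields r_1 close to −1.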

Finally, del generates a matrix of delayed variables; fcast may be employed to prepare data
for forecasting and interpolation; irwsm for smoothing, decimation or for fitting a simple
trend to a time series; prepz to prepare data for TF modelling (e.g. baseline removal and
input scaling); scaleb to rescale the estimated TF model numerator polynomial following
initial prepz use; stand to standardise or de-standardise a matrix by columns; and reconst
to reconstruct a time series by removing any dramatic jumps in the trend.

1.3 Getting Started

Installation instructions and conditions of use are given in the preface. Since CAPTAIN is
largely a command line toolbox, it is assumed that the reader is already familiar with basic
MATLAB usage, such as loading data, plotting graphs and writing simple M-files.
Introductory guides to the package include Etter (1993) and Biran and Breiner (1995). For
example, to plot the well known airline passenger series (e.g. Box and Jenkins, 1970),
enter the following text at the MATLAB Command Window prompt,

>> load air.dat
>> plot(air)
>> title('thousands of passengers per month (1949-1960)')

These data are included with CAPTAIN for demonstration purposes, in a standard text file.
If an error occurs, then check that you have correctly added the toolbox location to your
MATLAB path. Note that the Courier New font is used to indicate such worked examples
throughout the text. Another convention employed here, is that function and variable
names referred to in the body of the text, such as plot, are highlighted in bold notation.


Getting Help

On-line help information follows MATLAB conventions. For example, to obtain a full list
of functions, type help captain in the Command Window, where captain is the name of
the installation directory. Similarly, the brief calling syntax for each function is obtained
by entering its name without any input arguments, while more information is provided
using the standard help command, as illustrated below.

>> irwsm

IRWSM Integrated Random Walk smoothing and decimation

[t,deriv,err,filt,h,w,y0]=irwsm(y,TVP,nvr,Int,dt)

>> help irwsm

IRWSM Integrated Random Walk smoothing and decimation

[t,deriv,err,filt,h,w,y0]=irwsm(y,TVP,nvr,Int,dt)

y: Time series (*)
TVP: Model type (RW=0, IRW=1, DIRW=2) (1)
nvr: NVR hyper-parameter (1605*(1/(2*dt))^4)
Int: Vector of variance intervention points (0)
dt: Sampling (1)
dt: Sampling (1)

t: Decimated (or simply smoothed if dt=1) series
deriv: Derivatives
err: Standard error
filt: Filter frequency response
h: Frequency response
w: Frequency axis for plots
y0: Interpolated data

See also IRWSMOPT, FCAST, STAND, DHR, DHROPT, SDP

In the latter case, each input argument is described in turn, followed by the output
arguments and any other information. Note that the default values for any optional inputs
are given in brackets, whilst any necessary inputs, such as the data vector y above, are
listed with an asterisk. In this case, the default TVP = 1 implies the following model based
on an IRW plus noise,

y_t = T_t + e_t (1.1)

T_t = 2T_{t-1} - T_{t-2} + η_t (1.2)

where y_t is the time series, T_t is the smoothed signal at sample t, returned by irwsm as the
first output argument, and T_{t-1} and T_{t-2} are its values at the two previous samples,
respectively. Here, T_t is effectively a time variable parameter, whose stochastic evolution
in the form of an IRW is described by equation (1.2). Finally, η_t and e_t are independent


zero mean white noise sequences with variances σ_η² and σ², representing the system
disturbances and measurement noise respectively.

It should be pointed out that the on-line help messages in CAPTAIN are kept deliberately
concise, so that the experienced user can find information quickly (some of the more
advanced functions can have ten or more input arguments). For new users, the Reference
Guide in Appendix 1 provides more descriptive information about each of the options, whilst
the various models implemented in the toolbox are defined in Chapters 2 to 7.

Empty variables [] may be used to indicate default values when a mixture of defaults and
user specified arguments are required. For example, a smoothed trend may be fitted to the
airline passengers series as follows,

>> load air.dat
>> t = irwsm(air, [], 0.0001); % equivalent to t = irwsm(air, 1, 0.0001)
>> plot([air t])

In Chapter 2, the IRW plus noise model is developed within a state space framework,
based on the definition of suitable observation (1.1) and state (1.2) equations. Note that the
3rd input argument to irwsm specifies the associated Noise Variance Ratio (NVR) hyper-parameter.
Defined here as NVR = σ_η²/σ² = 0.0001, this variable is closely related to the
bandwidth of the filter, as discussed in Example 2.1 (Chapter 2).
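The local linear character of the IRW trend in equation (1.2) is easy to verify directly: with the disturbance η_t set to zero, the recursion extrapolates a straight line, which is why an IRW trend produces smooth, locally linear estimates. A small generic sketch (shown in Python for illustration; the function name irw is invented here):

```python
def irw(n, eta):
    """Generate an Integrated Random Walk: T_t = 2*T_{t-1} - T_{t-2} + eta_t.
    Two initial values fix the starting level and slope; with eta_t = 0 the
    recursion reduces to straight-line extrapolation."""
    T = [0.0, 1.0]                      # level 0, slope 1
    for t in range(2, n):
        T.append(2 * T[-1] - T[-2] + eta[t])
    return T
```

With all disturbances zero, the output is exactly the line T_t = t; non-zero η_t perturbs the slope, giving a smoothly wandering trend.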

Demonstrations

The following command initialises the standard MATLAB Demo Window for access to the
on-line demonstrations,

>> captdemo

This simple graphical user interface provides basic background information about
CAPTAIN, slideshows and numerous Command Line demos. The latter demos utilise the
MATLAB Command Window for input and output, as well as generating graphs in a
separate figure window, so make sure the command window is visible while you run these.

Experience suggests that one of the most effective ways to get started with CAPTAIN is to
examine each Command Line demo in turn and then to adapt it for each new data set. The
following examples are based on two of these demos. The intention is to provide a brief
illustration of toolbox functionality for users already familiar with the methodology, or at
least to introduce some of the ideas to the open-minded reader who is not. In this regard, it
should be pointed out that formal stochastic descriptions of the models are withheld until
later chapters. For brevity, the straightforward MATLAB code to label the plots, set the
axis limits and so on is not necessarily shown.

[Figure: two panels, Response (top) and Expenditure (bottom) against sample number]

Figure 1.1 Scaled advertising data plotted against an arbitrary fixed sampling rate.
Top: response to advertising. Bottom: expenditure on advertising.

Example 1.1 Interpolation of advertising data using DLR

Dynamic Linear Regression (DLR) provides an excellent vehicle for the analysis of
economic, business and social data, where regression analysis is a popular method of
modelling relationships between variables and where these relationships may change over
time. In this regard, consider the following straightforward demonstration from the
toolbox, which examines the relationship between a particular company's expenditure on
advertising and its measure of the public's response to this expenditure, as illustrated in
Figure 1.1.

For the purposes of the example, these confidential data have been scaled in an arbitrary
manner, so no units are given in the plots. The output data are in the range 0-1, where a
larger number implies a more successful response to the advertising. It is clear that the
response data contain missing values, represented in MATLAB by special Not-a-Number
or nan values and forming gaps in the top plot of Figure 1.1. The filtering and smoothing
algorithms implemented in CAPTAIN automatically account for these.

For a preliminary analysis of these data, we will utilise the following model,

y_t = T_t + b_t·u_t + e_t    t = 1, 2, ..., 90    (1.3)

where y_t is the response and u_t is the expenditure, while T_t and b_t are the time variable
parameters. Finally, e_t is a serially uncorrelated Gaussian sequence with zero mean and
variance σ². Note that a full description of the general DLR methodology is given in
Chapter 4.

For constant parameters T_t = T and b_t = b, equation (1.3) takes the form of a conventional
regression model based on the equation of a straight line. Here, however, we utilise dlropt
to determine whether the optimal values of the parameters, in a Maximum Likelihood
sense, in fact vary over time. In this regard, assuming a default random walk model for
each of the parameters, the associated NVR hyper-parameters are estimated as follows,

>> load adv.dat
>> u = adv(:, 1); % expenditure
>> y = adv(:, 2); % response
>> z = [ones(size(u)) u]; % regressors
>> nvr = dlropt(y, z)
nvr =
0.0078
0.0000

While dlropt is running, a window will briefly appear on screen indicating the
optimisation algorithm being utilised, together with an update of the Log-Likelihood. If the
solution fails to converge, the optimisation may be terminated by pressing the STOP
button, although this should not prove necessary in the present case.

It should be pointed out that default values for the Toolbox have been carefully chosen, in
order to be as widely applicable as possible. In the present case, the default initial
conditions and optimisation settings converge to a solution without any problem, hence
only the first two input arguments are required.

From this analysis, it appears that the T_t level or trend parameter varies significantly over
time (NVR = 0.0078), while the b_t slope parameter is relatively time invariant, with an
NVR value close to zero. To determine the model fit and parameter estimates,

>> [fit, fitse, par] = dlr(y, z, [], nvr);

By default, dlr assumes NVRs of zero, so the 4th input argument above is necessary to
specify the previously optimised values. The 3rd input argument selects the model type: in
this case, empty brackets imply the default random walk model again. Examination of the
parameters, returned as the first and second columns of par, shows how they evolve
gradually over time.
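To see what the constant parameter special case of equation (1.3) amounts to, the following pure Python sketch (illustrative only, with hypothetical data; it is not a CAPTAIN routine) fits y_t = T + b·u_t by ordinary least squares, simply skipping missing observations. DLR generalises exactly this regression by allowing T_t and b_t to evolve as random walks.

```python
def ols_line(u, y):
    """Ordinary least squares for the constant parameter version of (1.3),
    y = T + b*u, skipping missing observations (None)."""
    pairs = [(ui, yi) for ui, yi in zip(u, y) if yi is not None]
    n = len(pairs)
    su = sum(ui for ui, _ in pairs)
    sy = sum(yi for _, yi in pairs)
    suu = sum(ui * ui for ui, _ in pairs)
    suy = sum(ui * yi for ui, yi in pairs)
    b = (n * suy - su * sy) / (n * suu - su * su)
    T = (sy - b * su) / n
    return T, b

# Hypothetical data generated from T = 0.1, b = 0.002, with one gap
# playing the role of the missing responses in the advertising series:
u = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [0.1 + 0.002 * ui for ui in u]
y[2] = None
T, b = ols_line(u, y)
```

With noise-free data the fitted values recover T and b exactly; with real data the residuals of such a fixed parameter fit are what motivate the time variable extension.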


[Figure: Response against sample number, showing data, fit and error bounds]

Figure 1.2 Scaled response to advertising plotted against an arbitrary fixed sampling rate.
Data (circles), DLR fit (solid) and standard errors (dashed).

The model fit and associated standard errors are shown in Figure 1.2, which is obtained
using the code below,

>> plot(y, 'o')
>> hold on
>> plot(fit)
>> plot(fit+2*fitse, ':')
>> plot(fit-2*fitse, ':')

It is clear from Figure 1.2 that the data all lie within the standard error bounds. Note also
that no user intervention was required to interpolate over the missing response data: both
fit and par apply over the entire time series. Refer to Chapter 8 for a full list of
optimisation settings and output arguments. For example, dlr can return the interpolated
output y0, consisting of the original series with any missing data replaced by the model fit.

Example 1.2 Transfer function model estimation using RIV

Many control systems, both classical and modern, are analysed by means of TF models.
Indeed, CAPTAIN has been successfully utilised for the design of control systems for many
years, particularly with regards to the development of Proportional-Integral-Plus (PIP)
control methods (Young et al., 1987; Taylor et al., 2000). One recent practical application
is concerned with forced ventilation in animal houses (Taylor et al., 2003). Here,
uncontrolled data are first collected in order to identify the dominant dynamics of the fan.

For a particular test installation at the Katholieke Universiteit Leuven, the SRIV algorithm,
combined with the RT² and YIC identification criteria (Chapter 6), reveals that a first order
model with a 6 second time delay provides the best estimated model and closest fit to the
data across a wide range of operating conditions. In a typical experiment, based on a
2 second sampling rate, the SRIV algorithm yields the following difference equation,


y_t = 0.438·y_{t-1} + 79.8·u_{t-3}    (1.4)

where y_t is the airflow rate (m³/h) and u_t is the applied voltage to the fan expressed as a
percentage. Equation (1.4) shows that the output variable y_t is a simple linear function of
its value at the previous sample and the delayed input variable. Equation (1.4) may
alternatively be represented in terms of the backward shift operator L, i.e. L^j·y_t = y_{t-j},
by the following discrete-time TF model,

    y_t = [79.8·L³ / (1 - 0.438·L)]·u_t    (1.5)

The response of the model (1.5) closely follows the noisy measured data, as illustrated by
Figure 1.3. These data are included with CAPTAIN for demonstration purposes and the
associated MATLAB commands for estimating the model are shown below.

>> load vent.dat
>> [z, m] = prepz(vent, [], 25);
>> [th, stats, e] = riv(z, [1 1 3 0]);
>> [a, b] = getpar(th)
a =
1.0000 -0.4381
b =
0 0 0 79.7835
>> rt2 = stats(3)
rt2 =
0.9873
>> subplot(211); plot([z(:, 1) z(:, 1)-e]+m(1))
>> subplot(212); plot(z(:, 2)+m(2))

Here, the experimental data are organised into matrix form, with the first column of vent
consisting of the output variable y_t and the second the input variable u_t. The function
prepz is utilised to prepare the data for modelling. In particular, the 3rd input argument
subtracts the mean of the first 25 samples from the data in order to remove the baseline
from the series. Such data pre-processing often yields better results in the context of TF
model estimation, as discussed in Chapter 6.

The TF is estimated using riv, where the second input argument defines the model
structure: in this case, 1 denominator parameter, 1 numerator parameter, 3 samples time
delay and no model required for the noise. Refer to Chapters 6 and 7 for a full description
of the TF modelling tools and the syntax required. In particular, note that MISO and
continuous-time models are also possible, while additional functions allow for the
identification of the most appropriate model structure.


[Figure: two panels, ventilation rate in m³/h (top) and applied voltage in % (bottom)
against time in 2 second samples]

Figure 1.3 Top: ventilation rate (m³/h) and response of the identified TF model (thick trace).
Bottom: applied voltage to the control fan expressed as a percentage.

The first riv output argument, th, is a matrix containing information about the TF model
structure, the estimated parameters and their estimated accuracy. In this case, getpar is
utilised to extract the required parameter estimates for later control system design. Note
that these parameter vectors include the leading unity of the TF denominator, and that the
time delays are represented as zero valued elements in the numerator. The second riv
output argument, stats, lists nine statistical diagnostics associated with the model,
including RT² = 0.9873, implying that the model describes nearly 99% of the variation in
the data. Finally, the modelling errors are returned as the variable e and are used in the
code above to compare the TF response with the original data, as shown in Figure 1.3,
where the baseline has been added back to the series.

Note that the built-in MATLAB function filter may also be employed to simulate the TF
response using these parameter vectors. As discussed in Chapter 6, filter can be useful for
simulation and (if estimates of the future input variable are available) forecasting purposes.
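As a language neutral illustration of the same idea, the short Python sketch below (an assumption-free restatement of equation (1.4), not a CAPTAIN or MATLAB routine) simulates the fan model directly. For a unit step in the input, the output settles at the steady state gain 79.8/(1 - 0.438) ≈ 142.

```python
def simulate_tf(u, a1=0.438, b3=79.8):
    """Simulate the difference equation (1.4): y_t = a1*y_{t-1} + b3*u_{t-3}.
    Samples before the start of the record are taken as zero."""
    y = []
    for t in range(len(u)):
        y_prev = y[t - 1] if t >= 1 else 0.0   # y_{t-1}
        u_del = u[t - 3] if t >= 3 else 0.0    # u_{t-3}, the 3 sample delay
        y.append(a1 * y_prev + b3 * u_del)
    return y

# Unit step response: the first non-zero output appears after the time
# delay, and the final value approaches 79.8/(1 - 0.438).
y = simulate_tf([1.0] * 200)
```

This is exactly the computation that filter performs for the numerator and denominator polynomials returned by getpar.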

1.4 How to use this book

This publication is primarily intended as a tutorial guide to the data-based mechanistic
modelling philosophy developed by Peter Young and colleagues over many years. In this
regard, the chapter headings follow a logically structured progression through the relevant
methodology, using worked examples throughout the text.

Time variable parameter modelling is introduced in Chapter 2. Here, the filtering
algorithm, smoothing algorithm, generalised random walk model and hyper-parameter
optimisation routines are formally described. This chapter presents the models in their
most general state space form, while the following three chapters introduce the various
special cases, namely: unobserved component models (Chapter 3); dynamic regression
models (Chapter 4); and state dependent parameter models (Chapter 5). Next, discrete-time
(Chapter 6) and continuous-time (Chapter 7) transfer function models are considered,
followed by a chapter on control system design (Chapter 8).

Finally, Appendix 1 lists each CAPTAIN function in alphabetical order, showing the calling
syntax, together with a brief description of the associated input and output arguments.
Appendix 1 is designed to augment the concise on-line help messages. To learn about a
particular model, turn to Table A2 for the appropriate function name, then to Appendix 1
for its description. The See Also section for each entry in Appendix 1 lists the relevant
worked examples from the text.

Some of the algorithms discussed here have been in constant use for over 20 years. The
present authors hope that the CAPTAIN toolbox for MATLAB will allow interested
researchers to add to the ever expanding list of successful applications, which already
includes time series analysis, forecasting and control of numerous biological, engineering,
environmental and socio-economic processes (see references above).



CHAPTER 2
STATE SPACE MODELS

While Chapter 1 provided a general overview of CAPTAIN, the present chapter turns to one
of its main methodological tools, the State Space (SS) model. Indeed, most of the models
in the toolbox, though not all, are implemented in such a SS form. Originating from the
state-variable method of describing differential equations, the SS approach has
subsequently been developed for modelling by researchers in many different scientific
disciplines and is, perhaps, the most natural and convenient approach for use with
computers. It is a general and flexible tool that encompasses numerous time series models.

In fact, a number of models uniquely available in CAPTAIN are inherently based on such
state space methods and, therefore, always require a SS formulation. This includes the
entire category of Time Variable Parameter (TVP) models considered in Chapter 4. There
are other practical situations in which it is particularly convenient, though not essential, to
also use a SS model, e.g. when there are missing values in the time series or when
interpolation, forecasting and backcasting operations are required. Finally, in certain cases,
the SS form simply offers a solution that is identical to other, sometimes better known,
conventional approaches.

For these reasons, whenever a SS form is necessary or convenient, CAPTAIN uses it, while
in a few exceptional cases, it is avoided because other superior approaches are available.
For example, the instrumental variable method for the estimation of fixed parameter
Transfer Function (TF) models replaces the state space-based Maximum Likelihood (ML)
approach by default, unless the latter is specifically chosen (Chapter 6). However, such
issues are usually transparent to the user, since CAPTAIN chooses the optimal modelling
strategy internally.

The present chapter provides an introduction to SS methods and may, therefore, be
regarded as the broad theoretical basis for the special cases discussed subsequently. In this
regard, a number of standard topics are briefly covered, typical of many publications and
textbooks in this field. However, several aspects are novel and exclusive to CAPTAIN and
these are highlighted in the text where appropriate.

Almost all of the modelling functions in CAPTAIN might be quoted in this chapter because
they are particular cases of the general SS framework! However, to help the reader digest
the general principles, we will restrict the present discussion to the simplest possible
models, leaving the more advanced cases for subsequent chapters. In this regard, the key
modelling functions covered below are irwsm and irwsmopt, while fcast, reconst, acf and
histon are also introduced for data preparation and diagnostics. Together, these tools are
useful for exploratory analysis or in situations where the a priori knowledge about the
system is minimal, as will be seen in the later worked examples.

2.1 The State Space framework

A SS system is composed of two sets of equations, namely: (i) the so called State
Equations (represented as a single equation in vector-matrix form below), that reflect all
the dynamic behaviour of the system by relating the current value of the states to their past
values, together with any deterministic and stochastic inputs; and (ii) the Observation
Equation that defines how these state variables are related to the observed data. Although
there are a number of different SS formulations possible, the one favoured in CAPTAIN is:

State Equations:        x_t = F·x_{t-1} + G·η_{t-1}
                                                        (2.1)
Observation Equation:   y_t = H_t·x_t + e_t

where y_t is the stochastic observed variable; x_t is an n dimensional stochastic state
vector; η_t is a k dimensional vector of system disturbances; and e_t is a zero mean white
noise variable (measurement noise). F, G and H_t are, respectively, the n×n, n×k and 1×n
non-stochastic system matrices.

The following assumptions apply to this system:

    η_t ~ iid N(0, Q);  e_t ~ iid N(0, σ²);  cov(η_t, e_t) = 0.

    The initial stochastic state x_0 is independent of η_t and e_t for every t.

    The system matrices F, G, H_t, Q and σ² are known (or have been previously
    estimated in some way), whereas the initial conditions for the states and their
    covariance matrix (x_0 and P_0 respectively) are unknown.

The main reason for this formulation is that, under these straightforward conditions, the
associated recursive algorithms employed to estimate the state vector from measured data
provide the optimal solution, in the sense that such estimators minimise the Mean Square
Error (see e.g. Young, 1984; or Harvey, 1989). In CAPTAIN, these recursive algorithms are
the Kalman Filter (KF; Kalman, 1960) and Fixed Interval Smoothing (Bryson and Ho,
1969), as discussed in Section 2.2 below.


Numerous results may be found in the literature for various relaxations of the above
conditions (see e.g. Durbin and Koopman, 2001). One particularly straightforward and
useful case is when the Gaussianity assumptions on the perturbations are dropped. In this
case, the state estimates provided by the KF/FIS are still optimal in the sense that they
minimise the mean square error within the class of all linear estimators.

The Generalised Random Walk model

The following Generalised Random Walk (GRW) model, which is one of the simplest SS
models based on (2.1), is used extensively in CAPTAIN,

    ( x_{1t} )   ( α  β ) ( x_{1,t-1} )   ( η_{1t} )
    ( x_{2t} ) = ( 0  γ ) ( x_{2,t-1} ) + ( η_{2t} )      T_t = (1  0) ( x_{1t}  x_{2t} )'    (2.2)

Here, α, β and γ are constant parameters; T_t is a smoothed signal component consisting
of the first state x_{1t}; and x_{2t} is a second state variable (generally known as the slope);
while η_{1t} and η_{2t} are zero mean, serially uncorrelated white noise variables with
constant block diagonal covariance matrix Q, as stated in the general formulation above.

This model subsumes a number of special cases that have received specific names in the
literature. The main ones are the Random Walk (RW: α = 1; β = γ = 0; η_{2t} = 0); the
Smoothed Random Walk (SRW: 0 < α < 1; β = γ = 1; η_{1t} = 0); the Integrated Random
Walk (IRW: α = β = γ = 1; η_{1t} = 0); the Local Linear Trend (LLT: α = β = γ = 1); and
the Damped Trend (α = β = 1; 0 < γ < 1).
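The parameterisation of (2.2) is easy to mechanise. The Python sketch below (an illustrative helper, not part of CAPTAIN) builds the transition matrix F for given α, β and γ; the noise settings that complete each special case are handled separately through the covariance matrix Q.

```python
def grw_system(alpha, beta, gamma):
    """Transition matrix F of the GRW model (2.2); the observation
    vector H = (1, 0) picks out the smoothed component T_t = x_{1t}."""
    F = [[alpha, beta],
         [0.0, gamma]]
    H = [1.0, 0.0]
    return F, H

# Special cases named in the text:
F_rw, _ = grw_system(1.0, 0.0, 0.0)   # Random Walk
F_irw, _ = grw_system(1.0, 1.0, 1.0)  # Integrated Random Walk / Local Linear Trend
```

Note that the IRW and LLT share the same F; they differ only in whether η_{1t} is suppressed.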

When any one of these options is combined with additive noise in an observation equation,
the resulting time series model might be called a GRW plus noise model. One such case
is the RW plus noise model, ideal for the estimation of a time varying mean. Here, the SS
representation takes the following form (cf. equations (1.1) and (1.2) in Chapter 1):

State Equation:         x_t = x_{t-1} + η_{t-1}
                                                        (2.3)
Observation Equation:   y_t = x_t + e_t

The state equation may instead be written as (1 - L)·x_t = η_{t-1} where, as defined in
Chapter 1 (Example 1.2), L is the backward shift operator. Substituting into the
observation equation, we obtain the so called reduced form of the model, i.e. a random
walk model with added observational noise as shown below,

    y_t = η_{t-1}/(1 - L) + e_t    (2.4)
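The behaviour implied by (2.3) and (2.4) is easy to see by simulation. The Python sketch below (illustrative only; the noise variances are arbitrary choices) generates a realisation of the RW plus noise model: a slowly wandering mean observed through white measurement noise.

```python
import random

def simulate_rw_plus_noise(n, sigma_eta, sigma_e, seed=0):
    """Simulate the RW plus noise model (2.3):
    x_t = x_{t-1} + eta_{t-1},  y_t = x_t + e_t."""
    rng = random.Random(seed)
    x, xs, ys = 0.0, [], []
    for _ in range(n):
        xs.append(x)
        ys.append(x + rng.gauss(0.0, sigma_e))  # observation equation
        x += rng.gauss(0.0, sigma_eta)          # disturbance feeds the next state
    return xs, ys

# An NVR of sigma_eta^2/sigma_e^2 = 0.01 gives a slowly varying mean:
xs, ys = simulate_rw_plus_noise(200, sigma_eta=0.1, sigma_e=1.0)
```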


Similarly, the IRW plus noise model, useful for extracting smooth trends from
non-stationary time series, may be developed as follows,

                        ( x_{1t} )   ( 1  1 ) ( x_{1,t-1} )   ( 0 )
State Equations:        ( x_{2t} ) = ( 0  1 ) ( x_{2,t-1} ) + ( 1 ) η_{t-1}
                                                                            (2.5)
Observation Equation:   y_t = (1  0) ( x_{1t}  x_{2t} )' + e_t

Here, the two state equations may instead be written as,

    (1 - L)·x_{1t} = x_{2,t-1}
                                    (2.6)
    (1 - L)·x_{2t} = η_{t-1}

In this case, substituting the second state equation into the first, and then into the
observation equation, we obtain the reduced form,

    y_t = η_{t-2}/(1 - L)² + e_t    (2.7)
This IRW model with observational noise will be utilised in Example 2.1 below. First,
however, an algorithm for the estimation of the states is required.

2.2 State Estimation

Given the model (2.1), in which all the system matrices are known, the estimation problem
becomes one of finding the optimal distribution of the state vector, conditional on all the
data in a sample. In the case of Gaussian disturbances, the distribution is completely
characterised by its first and second order moments, i.e. the mean and variance, and most
algorithms that perform this operation concentrate on the estimation of these two moments.
Such tools include the Kalman Filter (KF; Kalman, 1960; Kalman and Bucy, 1961) and
Fixed Interval Smoothing (FIS; e.g. Bryson and Ho, 1969) algorithms stated below.

1. Forward Pass Filtering Equations

Prediction:
    x_{t|t-1} = F·x_{t-1}
    P_{t|t-1} = F·P_{t-1}·F^T + G·Q_r·G^T

Correction:                                                             (2.8)
    x_t = x_{t|t-1} + P_{t|t-1}·H_t^T·[1 + H_t·P_{t|t-1}·H_t^T]^-1·{y_t - H_t·x_{t|t-1}}
    P_t = P_{t|t-1} - P_{t|t-1}·H_t^T·[1 + H_t·P_{t|t-1}·H_t^T]^-1·H_t·P_{t|t-1}


2. Backward Pass Smoothing Equations

    x_{t|N} = F^-1·[x_{t+1|N} + G·Q_r·G^T·L_t]

    L_t = [I - P_{t+1}·H_{t+1}^T·H_{t+1}]·[F^T·L_{t+1} - H_{t+1}^T·{y_{t+1} - H_{t+1}·x_{t+1}}]    (2.9)

    P_{t|N} = P_t + P_t·F^T·P_{t+1|t}^-1·[P_{t+1|N} - P_{t+1|t}]·P_{t+1|t}^-1·F·P_t

with L_N = 0. Note that the FIS algorithm is in the form of a backward recursion, operating
from the end of the sample set to the beginning. The main difference between the KF and
FIS algorithms (2.8)-(2.9) utilised in CAPTAIN and the more conventional
filtering/smoothing algorithms found in some other toolboxes is the n×n Noise Variance
Ratio (NVR) matrix Q_r and the n×n matrix P_t defined below,

    Q_r = Q/σ²;    P_t = P*_t/σ²    (2.10)

Here, P*_t is the error covariance matrix associated with the state estimates x_t. In most of
the models implemented in CAPTAIN, the NVR matrix Q_r is diagonal.

Forward Pass Filtering

For a data set of T samples, the KF algorithm (2.8) runs forward and returns a filtered
estimate of the state vector and its covariance matrix (x_t and P_t respectively) at every
sample t, based on the time series data up to sample t. These estimates are computed in two
steps. In the first instance, the one step ahead forecasts of the state vector and its
covariance matrix (x_{t|t-1} and P_{t|t-1} respectively) are obtained from the prediction
equations, using the model alone. Secondly, these estimates are updated by means of the
correction equations, as each new data sample becomes available.

One interesting feature of the KF worth mentioning at this juncture is that the state
estimates (obtained using the first equation of each set of prediction and correction
equations) depend on the previous state estimates, their covariance matrix and the data. By
contrast, the current estimate of the covariance matrix itself does not depend on either the
state estimates or the data, but only on the model and the previous estimate of the
covariance matrix. Another noteworthy point is that significant savings in computational
effort are obtained in CAPTAIN by the realisation that repeated operations are involved in
the calculation of the required estimates, especially in the correction equations.

There are two intermediate variables calculated by the KF that are of particular importance
(as will become clear in the following section 2.3), namely:


    v_t = y_t - H_t·x_{t|t-1}
                                        (2.11)
    f_t = 1 + H_t·P_{t|t-1}·H_t^T

The first term above is the innovations sequence, i.e. the one-step-ahead forecasting errors
of the model, while the second is the variance of these innovations, equal to the variance of
the one-step-ahead forecasts. Both variables are scalar, so that the matrix inversions
required in the KF correction equations are actually very simple in computational terms.
The normalised innovations are used in many diagnostic tests since, recalling the original
formulation of the SS model (2.1), they should be iid N(0, σ²) in a correctly specified
model.

When missing data are encountered within the data set, the correction equations are
redundant and missing samples are effectively interpolated using the filtered estimates. In
the same manner, forecasts may be produced by artificially adding missing data at the end
of the series. In this case, the values of v_t and f_t outside the sample span are the true
multiple-steps-ahead forecasting errors and their associated variances respectively, and
may be used to compute confidence intervals.
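For the scalar RW plus noise model (2.3), the whole of (2.8) collapses to a few lines. The Python sketch below (a didactic reimplementation under stated simplifications — F = G = H = 1 and Q_r equal to the scalar NVR — not the CAPTAIN code itself) also shows the missing data convention: a None observation skips the correction step, so the filtered level is carried through the gap, exactly as described above.

```python
def kf_rw(y, nvr, x0=0.0, p0=1e6):
    """Scalar Kalman filter (2.8) for the RW plus noise model (2.3).
    Missing samples (None) skip the correction step, which interpolates
    (or, at the end of the series, forecasts) from the model alone."""
    x, p = x0, p0
    xf, innov = [], []
    for obs in y:
        # Prediction: x_{t|t-1} = x_{t-1}, P_{t|t-1} = P_{t-1} + NVR
        p = p + nvr
        if obs is None:
            innov.append(None)
        else:
            v = obs - x           # innovation v_t (2.11)
            f = 1.0 + p           # innovation variance f_t (2.11)
            x = x + (p / f) * v   # correction
            p = p - p * p / f
            innov.append(v)
        xf.append(x)
    return xf, innov

# A constant series with one gap: the filtered level locks onto the data
# and the interpolated sample is carried through the gap.
y = [5.0] * 30
y[10] = None
xf, innov = kf_rw(y, nvr=0.1)
```

With a diffuse initial variance p0, the first correction step jumps almost exactly onto the first observation, after which the innovations shrink towards zero.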

Fixed interval smoothing

The FIS algorithm in (2.9) normally runs backwards after the filtering step and yields a
smoothed estimate of the state vector and its covariance matrix based, at every sample t,
on all T samples of the data. Since the FIS algorithm uses more information than the KF,
the Mean Square Error of its estimates cannot be greater than that of the filtered ones.
When missing data are encountered, an interpolation is generated based on the data at both
sides of the gap. Finally, if the missing observations are at the beginning of the sample, the
FIS algorithm generates backcasts of the time series.

The FIS algorithm (2.9) is utilised by many software packages and is the default, so called
Q-algorithm, in CAPTAIN. However, an additional FIS algorithm is also available as an
option in the toolbox, for use in special cases when numerical problems arise with certain
models. Essentially, if a solution fails to converge, the appropriate input argument can be
changed to select this alternative P-algorithm, obtained by replacing the first equation in
(2.9) by,

    x_{t|N} = x_{t+1|t} - P_t·F^T·L_{t+1}    (2.12)

Here, the new estimate of the state vector is based on the filtered ones, while in (2.9) it is
calculated recursively from the previous smoothed estimate of the state vector.
Furthermore, (2.12) does not require inversion of the F matrix. Both algorithms are
discussed in detail in previous publications (see e.g. Young, 1984; Harvey, 1989; Ng and
Young, 1990).


Example 2.1 Estimation of a trend for the air passenger series

The simple IRW plus noise model (2.7) has long been used for smoothing time series in
the economics literature. It is equivalent to the well-known Hodrick-Prescott filter (HP;
Hodrick and Prescott, 1997). The HP filter has been used by many researchers in the area
of empirical business cycles, and the similarities between the two filters were highlighted
long ago (recently reviewed by Young and Pedregal, 1996).

Many researchers use fixed values of the smoothing constant depending on the sampling
frequency of the data. In CAPTAIN, however, the relationship between the smoothing
constant (the NVR in this context) and the cut-off frequency properties of the low-pass
filter is made explicit, and is easily generalisable to any type of model in SS form. The
advantage is that the user may choose the NVR according to their particular needs and the
properties of the data; it is not fixed by any prior conceptions. For example, given the RW
family of models defined by (cf. (2.6) and (2.7)),

    y_t = η_{t-j}/(1 - L)^j + e_t    (2.13)

the cut-off frequency for 50% of spectral power is given by,

    ω = arccos[1 - 0.5·NVR^(1/j)]    (2.14)

with NVR = σ_η²/σ² (Young and Pedregal, 1996).

In this case, Q in equation (2.10) is simply the scalar state noise variance σ_η², so that Q_r
is the scalar NVR. For an IRW filter (i.e. j = 2), Table 2.1 shows the relationship between
the NVR and the associated cut-off period, first in samples and then with its equivalent in
years, depending on the sampling frequency of the signal. Table 2.1 shows, for example,
that for an NVR value of 0.001, the smoothed signal contains all the information in the
original signal from a period of 35.28 samples up to infinity; all periods below that value
are filtered out.

    NVR        Period (time samples)   Years (quarterly series)   Years (monthly series)
    10                 2.86                    0.71                      0.24
    1                  6.01                    1.51                      0.51
    0.1               11.02                    2.76                      0.92
    0.01              19.78                    4.95                      1.65
    0.001             35.28                    8.82                      2.94
    1/1600            39.69                    9.92                      3.31
    0.0001            62.81                   15.71                      5.23

Table 2.1 Relationship between the NVR parameter and the associated bandwidth of the IRW filter,
i.e. the minimum period of cycles included in the filtered series.
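Equation (2.14) makes Table 2.1 easy to reproduce. The Python sketch below (a direct transcription of the formula, not a CAPTAIN function) converts an NVR into the 50% cut-off period, in samples, for the RW family of order j.

```python
import math

def cutoff_period(nvr, j=2):
    """50% spectral-power cut-off period of the RW-family smoother of
    order j, from equation (2.14): w = arccos(1 - 0.5*NVR^(1/j))."""
    w = math.acos(1.0 - 0.5 * nvr ** (1.0 / j))
    return 2.0 * math.pi / w  # period in samples

# Two rows of Table 2.1 for the IRW filter (j = 2):
p_hp = cutoff_period(1.0 / 1600.0)  # Hodrick-Prescott setting, ~39.69 samples
p_001 = cutoff_period(0.001)        # ~35.28 samples
```

Running the function over the NVR column reproduces the period column of Table 2.1 to two decimal places.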


Put another way, if a signal has to be estimated such that it contains all the information in
the original series for periods of approximately 5 years and above, an NVR of 0.01 would
be the choice for a quarterly series, while 0.0001 should be chosen for a monthly series.

To illustrate these points, consider again the well-known air passenger series introduced in
Chapter 1 (Box and Jenkins, 1970, 1976). Figure 2.1 illustrates these data, together with
two possible trends obtained from different NVR values (0.1 and 0.0001). This graph may
be obtained in CAPTAIN by entering the following MATLAB code,

>> load air.dat
>> t1 = irwsm(air, 1, 0.1);
>> t2 = irwsm(air, 1, 0.0001);
>> t = (1949 : 1/12 : 1961-1/12)';
>> plot(t, [air t1 t2]);

[Figure: Thousand Passengers against time in months, 1949-1961]

Figure 2.1 Air passenger data (thin trace) and IRW trends with
NVR = 0.1 (thick solid) and NVR = 0.0001 (dashed).

Which of these trends is best? Although this is clearly a subjective matter, we may still say
something about Figure 2.1, based on a general idea of what we mean by a trend. In
particular, with the higher NVR of 0.1, the smoothed signal follows the data too closely to
be regarded as a trend in the normal sense, since it includes cyclic behaviour with a period
of less than one year; it is really a combination of a trend and a seasonal component,
something that in principle is undesirable. By contrast, with NVR = 0.0001, the smoothed
signal does not combine the trend and seasonal component in this manner, and so is more
correct in a signal decomposition sense.

In statistical terms, a more suitable approach would be to model the whole series with an
Unobserved Components (UC) model, rather than attempt to extract the trend alone. Of
course, this is also the main approach utilised in CAPTAIN and is described in Chapter 3.
Nonetheless, CAPTAIN does offer the opportunity to just estimate a trend in this simple,
exploratory context, using objective criteria for NVR optimisation, as discussed below.


2.3 (Hyper-) Parameter Estimation

The recursive KF and FIS algorithms above both require knowledge of all the system
matrices F, G, H_t and Q_r. In this regard, depending on the particular structure of the
model chosen, a number of elements will either be known prior to the analysis or fixed by
the user. For example, if a RW model is specified, then clearly α = 1 and β = γ = 0 in the
SS model (2.2). However, in most cases, some unknown elements or hyper-parameters
will remain unspecified and must be estimated separately. Typically, these include the
NVR matrix Q_r. In the following discussion, we will summarise all these
hyper-parameters, whatever variables they may represent, in the vector θ.

The hyper-parameter estimation problem is completely different to the state estimation


problem and, in a certain sense, entirely independent of it. This fact is clearly seen in the
range of methods available in the literature, some of which do not use the SS form at all!
The most common methods include: Maximum Likelihood (ML) in the time domain
(Schweppe 1965; Harvey, 1989); ML in the frequency domain based on a Fourier
transform (Harvey, 1989; pages 191-204); alternative approaches in the frequency domain
(Ng and Young, 1990; Young et al., 1999); combinations of all the previous methods
(Young and Pedregal, 1999); Bayesian approaches (West and Harrison, 1989); and
estimation methods based on the reduced ARIMA form (Hillmer and Tiao, 1982; Hillmer
et al., 1983; Maravall and Gómez, 1998).

A selection of the most useful methods is implemented in CAPTAIN and listed below.
Each method is associated with one or more different model types in the toolbox. In some
cases, several methods are available for the same type of model, with the one considered
optimal as the default option (refer to the on-line help for each particular function).

Time domain ML estimation for UC, TVP and State Dependent Parameter (SDP)
models (Chapters 3, 4 and 5). This is the most widespread estimation method in the
SS context, mainly because of its strong theoretical basis and because it is the most
well known approach in many other areas of statistics. For this reason, it is
discussed in detail below. See also Examples 2.2 and 2.4 below.

Minimisation of the multiple-steps-ahead forecasting errors, also discussed below.
This heuristic method is very useful when other methods do not provide a
satisfactory answer to the problem, or when the objective of the research is strongly
based on the forecasting performance of the model. See also Examples 2.2 and 2.3
below.


Frequency domain estimation, based on the spectral properties of the model (Young
et al., 1999). The parameters are estimated so that the logarithm of the model
spectrum fits the logarithm of the empirical pseudo-spectrum (either an AR-
spectrum or periodogram) in a least squares sense. A full description of this
algorithm can be found in Chapter 3.

Sequential Spectral Decomposition, reserved for a certain class of unobserved
components models, i.e. Trend plus Auto-Regression (AR), also discussed in
Chapter 3. This approach consists of decomposing the original series into quasi-
orthogonal components, taking advantage of the exceptional spectral properties of
the smoothing algorithms mentioned above. The overall non-linear problem is
decomposed into several linear or quasi-linear steps, each solved in fully recursive
terms. This yields a simple solution, with some loss of optimality from the ML
viewpoint, but has proven very successful in practice. As a final step, filtering
and smoothing are repeated using the whole SS formulation, based on the analysis
completed in the previous steps.

Instrumental variable estimation of transfer function models in discrete and
continuous time, as discussed in Chapters 6 and 7.

One interesting issue is that the complexity (or richness) of reality ensures that no single
estimation method outperforms the rest in all possible situations - all of them have their
own advantages and disadvantages. Therefore, the researcher's experience and knowledge
is essential in selecting the best option for each application.

Because some of the methods mentioned above have been developed for specific models,
their description is conveniently postponed to future chapters. In the present chapter,
however, two general methods that are not intrinsically linked to particular models are
reviewed, namely ML and the minimisation of the multiple-steps-ahead forecasting errors.

Maximum Likelihood

Assuming that all the disturbances in the SS form are normally distributed, the required
Log-likelihood function can be computed using Kalman Filtering via prediction error
decomposition (Schweppe 1965; Harvey, 1989). The appropriate function for the general
SS model in equation (2.1) is, therefore,

log L(θ) = −(T/2) log 2π − (1/2) Σ_{t=1}^{T} log f_t − (1/2) Σ_{t=1}^{T} v_t²/f_t        (2.15)


where T is the number of observations, while v_t and f_t are the innovations and their
variances respectively (see equation (2.11)), computed directly from the KF algorithm.
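Since (2.15) is just a sum over the innovations, it is straightforward to evaluate numerically once a KF pass has been completed. The following sketch (written in Python for illustration only; `pe_log_likelihood` is our own name, not a toolbox function) assumes the innovations v_t and variances f_t are already available:

```python
import numpy as np

def pe_log_likelihood(v, f):
    """Log-likelihood via prediction error decomposition, cf. equation (2.15).

    v : innovations v_t from a Kalman Filter pass
    f : their variances f_t
    """
    v = np.asarray(v, dtype=float)
    f = np.asarray(f, dtype=float)
    T = len(v)
    return (-0.5 * T * np.log(2 * np.pi)
            - 0.5 * np.sum(np.log(f))
            - 0.5 * np.sum(v**2 / f))

# Illustration: white innovations with unit variance
rng = np.random.default_rng(0)
v = rng.standard_normal(500)
ll = pe_log_likelihood(v, np.ones(500))
```

In practice this value would be computed inside the numerical optimiser at each iteration of the hyper-parameter search.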

A number of issues must be taken into account when maximising (2.15); see e.g. Harvey
(1989) and Koopman et al. (2000). In the first place, the gradient and Hessian necessary to
find the optimum by numerical procedures can be evaluated either analytically or
numerically, while the standard errors of the estimates may be found by means of the
Hessian, as is usual in the literature. Secondly, in the present context, it is usual to
maximise the so-called concentrated likelihood. This is because it is always possible to
concentrate out one of the variances in σ² or Q, reducing by one the number of
parameters to estimate. In CAPTAIN, σ² is always the concentrated-out variance, as shown
by the definition of the diagonal NVR matrix Q_r introduced in the previous subsection.

Thirdly, in dynamic models, there always exists the problem of defining the initial
conditions, i.e. the initial values of the state vector and its covariance matrix, which are
assumed unknown (x_0 and P_0, respectively). Once more, there are a number of solutions
available. One is to define diffuse priors, e.g. zero values for the initial state vector and
large values for the diagonal elements of its covariance matrix, implying that there is little
confidence in an arbitrary initialisation. In general, such an initialisation may affect the
computation of the likelihood function, since the sum operators run over the whole sample.

An alternative solution is to start the summation in (2.15) from one observation past the
length of the state vector (i.e. n+1), or from some other point where the KF algorithm has
effectively converged. The effect of the initial conditions decreases as the length of the
series increases, except for some very specific non-stationary models for which such
effects never disappear (Casals and Sotoca, 2000). Another, more theoretical, solution to
the initial condition problem, is to incorporate the distribution of the initial conditions into
the likelihood function itself, which yields the exact likelihood function, independent of
such conditions and useful for short length time series (see e.g. Terceiro, 1990; De Jong,
1988, 1991; Casals and Sotoca, 2000). In CAPTAIN, the simplest diffuse priors option is
chosen by default, although the user may always intervene and specify x 0 and P0 directly.

Finally, a consideration common to all the estimation procedures in CAPTAIN is that it is
common to estimate scores (certain transformations of the hyper-parameters) rather
than the hyper-parameters themselves. The parameters are then constrained to a certain
domain, in order to avoid nonsensical results. Two typical examples are that the NVR
values should always be positive, while the α parameters in SRW models should always
lie between zero and one inclusive. In this regard, the NVR and α scores used in CAPTAIN
are listed in Table 2.2.


Score   Parameter             Score Range   Parameter Range
s       NVR = 10^s            (−∞, ∞)       [0, ∞)
s       α = e^s / (1 + e^s)   (−∞, ∞)       [0, 1]

Table 2.2 Hyper-parameters and scores in CAPTAIN.

In this way, the ML search algorithm looks for an unconstrained value of the score, from
minus infinity to infinity, which is then transformed into a valid value of the corresponding
hyper-parameter. The disadvantage of estimating the scores rather than the
parameters themselves is that, according to theory, the distribution of the scores is normal,
while the distribution of the parameters is, in general, not known. However, confidence
intervals may always be reconstructed by applying the same transformation to the
confidence interval of the scores.
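The two transformations of Table 2.2, and the way a confidence interval for a score maps into one for a hyper-parameter, can be sketched as follows (Python, for illustration only; the function names and the example score value are our own, not toolbox code):

```python
import numpy as np

def nvr_from_score(score):
    # NVR = 10^s maps an unconstrained score onto [0, inf)
    return 10.0 ** np.asarray(score, dtype=float)

def alpha_from_score(score):
    # alpha = e^s / (1 + e^s) maps an unconstrained score onto [0, 1]
    s = np.asarray(score, dtype=float)
    return np.exp(s) / (1.0 + np.exp(s))

# A confidence interval for the score maps directly into one for the parameter
# by applying the same transformation to its end points (hypothetical values):
score, se = -3.25, 0.12
ci_nvr = nvr_from_score([score - 2 * se, score + 2 * se])
```

Because both transformations are monotonic, the transformed interval end points remain valid interval end points for the hyper-parameter.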

Limitations of ML

There is no doubt that ML has theoretical and practical advantages. Its optimal properties
are well-known and it is a widely applicable method for SS models. Indeed, the objective
function (2.15) is applicable to any model written in SS form. The only requirement is that
the user has to determine the appropriate form of the system matrices necessary to specify
the model in SS terms. However, since CAPTAIN uses SS models as standard, this is not a
problem in the present context.

Despite the advantages of ML on theoretical grounds, it does have some disadvantages,
hence other methods are also implemented in CAPTAIN for specific types of model. In
particular, in its normal form, ML is heavily dependent on the length of the series and the
dimension of the model, since the recursive algorithms must be used to compute the Log-
likelihood function at each iteration in the numerical optimisation. Furthermore, problems
are sometimes encountered when the theoretical hypothesis on which the model is based
does not hold in practice. An instructive example is the case of transfer function models
when the input signal is a deterministic variable (such as steps, impulses, ramps, etc.). For
this particular problem, however, CAPTAIN offers instrumental variable methods instead,
which outperform ML and are never worse in other standard situations (Young, 1984), as
discussed in Chapters 6 and 7.

Also, it is well-known (e.g. Young et al., 1999) that in certain UC models, the likelihood
surface can be quite flat around its optimum, making the practical optimisation problem
very inefficient at best and impossible in some cases. This is one important reason why
these models usually have to be constrained when estimated by ML (we will come back to
this issue in Chapter 3: see the examples therein). By contrast, for example, estimation
methods in the frequency domain are free from these difficulties. In this regard, the
approach implemented in CAPTAIN for UC models optimises the hyper-parameters so that
the logarithm of the model spectrum fits the logarithm of the empirical spectrum in a least
squares sense. This method has proven to allow much higher dimensional models
than ML, and computation times are greatly reduced, yielding solutions that are even better
in likelihood terms than the constrained versions of the same model estimated by ML.

It should be clear from the discussion above that the solutions achieved by the time and
frequency domain methods are not necessarily equal, because each estimation method
gives more emphasis to different aspects of the time series. Since the ML criterion is
defined as an explicit, time domain function of the normalised one-step-ahead prediction
errors (2.15), it would be expected that the hyper-parameters estimated in this way would
provide good one-step-ahead forecasts. Indeed, provided all the theoretical assumptions
that support ML methods are correct for a particular data set, it can be expected that the
forecasts will be good for more than one-step-ahead as well. By contrast, estimation in the
frequency domain is primarily concerned with ensuring that the spectral properties of the
estimated components match the empirical spectrum of the data. In general, this tends to
generate a solution which, in terms of the variance of the innovations, lies somewhere
between that obtained by the ideal of unconstrained ML optimisation, which is difficult to
consistently achieve in practice, and the constrained ML optimisation mentioned above.

Another situation where ML does not always provide a sensible solution is when a
smooth signal (or trend) is removed from a time series. In this case, ML tends to produce a
signal that is too close to the original data, making the whole procedure invalid, as shown
by Example 2.2 below. For such situations, CAPTAIN offers other estimation methods, like
the minimisation of the multiple-steps-ahead forecasting errors discussed in the next
subsection.

Given the pros and cons of each method, it is possible to improve overall model estimation
by combining methods for different components of the model, i.e. to use the most
appropriate method for each component. This is the approach suggested for SDP models
for which, when necessary, frequency objective functions and ML are used together. SDP
modelling is a very general procedure for the identification of non-linear relationships of
many kinds. In this approach, an iterative procedure known as back-fitting is used. Here,
parts of the model are estimated conditional on fixed values of the rest of the parameters.
In each iteration, the most appropriate estimation method can be used, either in the
frequency domain or ML in the time domain.


Minimisation of the multiple-steps-ahead forecasting errors

The log-likelihood function (2.15) is dependent on the one-step-ahead forecasting errors,
i.e. v_t = y_t − H_t x_{t|t−1}, a natural outcome of the KF solution. However, the multiple-
steps-ahead forecasting errors can also be computed by simply repeating the KF prediction
equations, say h times, without applying the correction equations. The associated
forecasting errors are then as follows,

v_t(h) = y_t − H_t x_{t|t−h}        (2.16)

This equation computes the forecasting errors at each t using only the information available
up to t−h. Given these errors, one interesting option for hyper-parameter estimation is to
minimise the sum of squares of the h-step-ahead errors, i.e.

J = Σ_{t=n+h+1}^{T} v_t(h)² = Σ_{t=n+h+1}^{T} (y_t − H_t x_{t|t−h})²        (2.17)

This approach is a natural heuristic extension to ML and may be applied when the latter
does not provide a sensible solution to certain problems.
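To make the idea concrete, the following sketch (Python, illustrative only, not toolbox code) evaluates the criterion (2.17) for the simplest scalar RW-plus-noise model. With F = 1, repeating the prediction equations h times leaves the state estimate unchanged, so x_{t|t−h} is simply the filtered level at time t−h:

```python
import numpy as np

def rw_hstep_sse(y, nvr, h):
    """Sum of squared h-step-ahead forecast errors, cf. equation (2.17),
    for a scalar RW-plus-noise model (F = H = 1); a sketch only."""
    y = np.asarray(y, dtype=float)
    x, p = y[0], 1e6          # diffuse-ish initialisation
    level = np.empty(len(y))  # filtered level x_{t|t}
    for t, yt in enumerate(y):
        p = p + nvr           # prediction: P_{t|t-1} = P_{t-1|t-1} + Q_r
        f = p + 1.0           # innovation variance (noise variance normalised)
        k = p / f
        x = x + k * (yt - x)  # correction
        p = p * (1.0 - k)
        level[t] = x
    errs = y[h:] - level[:-h] # v_t(h) = y_t - x_{t|t-h}, since F = 1
    return np.sum(errs[1:]**2)  # skip the initial transient

rng = np.random.default_rng(1)
y = np.cumsum(rng.standard_normal(200)) + rng.standard_normal(200)
J = rw_hstep_sse(y, nvr=1.0, h=4)
```

Minimising J over the NVR (e.g. with a one-dimensional search) is then a direct, if simplified, analogue of the 'f12'-type options used in the examples below.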

2.4 Worked examples

The following examples illustrate the basic functionality of hyper-parameter estimation in
CAPTAIN, and show how the toolbox may be utilised for the preliminary analysis,
smoothing and simple forecasting of non-stationary time series.

Example 2.2 Hyper-parameter estimation for the air passenger series

Following directly from Example 2.1 above, consider again IRW smoothing of the air
passenger series. Unfortunately, this is a typical case when ML estimation does not yield
useful results! This is because the theoretical properties assumed about the IRW model are
not fulfilled by these particular data. In effect, the model (2.5) assumes that the
perturbations about the trend are uncorrelated white noise, while it is clear here that the
innovations are not at all white noise, indeed they are strongly periodic (see Figure 2.1).

As a consequence, if the model is defined as an IRW model alone, then ML estimation will
yield a solution that attempts to fit the entire series and all its components using this simple
trend. In practice, this means that the NVR will be very high, in order that the model
output includes all the necessary frequencies in the original series, as shown below.

>> load air.dat
>> [nvr, opts, separ] = irwsmopt(air, 1, 'ml');


METHOD: MAXIMUM LIKELIHOOD
OPTIMISER: FMINU
0.55 seconds.
PER. RW NVR Score S.E. Alpha Score S.E.
0.00 1.0 1.7377e+012 12.2400 0.0000 1.0000 - -
Likelihood: -723.9385
>> tr = irwsm(air, 1, nvr);

For information, Table 2.3 summarises the displayed tabular output from irwsmopt, while
the header and footer above also confirm the optimisation method, the core MATLAB
function being utilised, the time taken to complete the optimisation and the likelihood.
Other hyper-parameter optimisation routines in CAPTAIN, including dhropt, dlropt,
daropt, darxopt, dtfopt and univopt, follow a similar convention. Note that the MATLAB
optimisation function (fminu above) is automatically selected by CAPTAIN, depending on
the optimisation method chosen and the presence of other MATLAB toolboxes installed on
the user's system, although this default can be overridden (refer to Chapter 8 and the on-
line help for each function).

Column Header   Meaning
PER             Period for periodic components (0 for trends).
RW              Type of Time Varying Parameter within the GRW model family
                (0-RW; 1-IRW). See Chapter 3 for details.
NVR             Estimated NVR.
Score           Estimated score value from which the hyper-parameter is computed
                (only one NVR in the case above).
S.E.            Approximate standard error of the score.
Alpha           Estimated α parameter in SRW models.

Table 2.3 Tabular outputs from CAPTAIN hyper-parameter estimation functions.

Returning to the present example, the smoothed signal using an NVR of 1.7e12 is almost
identical to the original series. It is plainly not a trend at all and so is not illustrated here.
An alternative is to minimise the 12-steps-ahead forecasting errors, as shown below. In
general, the number of samples in a year, or the number of samples in any sort of cycle in
the data, should be utilised in this optimisation. The NVR obtained in this case is 5.59e-4,
with an associated cut-off period of 3.4 years (Table 2.1), returning a trend that is much
more useful, as illustrated by Figure 2.2.

>> [nvr, opts, separ] = irwsmopt(air, 1, 'f12');
METHOD: SUM OF SQUARES OF 12-STEPS-AHEAD FORECAST ERRORS
OPTIMISER: LEASTSQ
5.11 seconds.
PER. RW NVR Score S.E. Alpha Score S.E.
0.00 1.0 5.5923e-004 -3.2535 0.120 1.0000 - -
Sum of Squares of 12-Steps-Ahead Forecast Errors: 268133.0759
>> tr = irwsm(air, 1, nvr);
>> plot(t, [air tr])


[Figure: air passenger series and estimated trend; y-axis Thousand Passengers, x-axis Months, 1950-1960]

Figure 2.2 Air passenger data and trend chosen by minimising the 12-steps-ahead forecasting errors.

Finally, it is straightforward to generate forecasts of the trend by adding nan values
(MATLAB Not-A-Number variables) to the end of the data, either by using standard
commands or the CAPTAIN function fcast,

>> nvr = irwsmopt(air(1 : 132), 1, 'f12');
>> tr = irwsm(fcast(air, [133 144]), 1, nvr);
>> plot(t, [air tr]);
>> hold on
>> plot([t(132) t(132)], [0 650])

For illustrative purposes, the analysis above does not use the final year of the air passenger
series, just the first 132 samples. Instead, the trend is forecasted for this year and compared
with the original series, as illustrated in Figure 2.3, where the vertical line shows the
forecasting horizon. It should be stressed that data to the right of the forecasting horizon
are not used in the analysis, neither for estimating the NVR nor for smoothing the time
series. Nonetheless, even with this simple IRW model, the forecasted trend is sensible,
showing the long term behaviour of the series continuing in the correct direction.
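The mechanism behind this NaN padding can be sketched with a minimal IRW-plus-noise Kalman Filter (written in Python for illustration; this is not the irwsm code itself, and the diffuse initialisation and unit observation noise variance are simplifying assumptions). Missing observations simply skip the correction step, so the filter keeps predicting over the padded region:

```python
import numpy as np

def irw_filter(y, nvr):
    # Minimal IRW-plus-noise Kalman Filter; NaN observations are treated as
    # missing, so the correction step is skipped and the filter predicts on,
    # which is how NaN padding at the end of a series yields trend forecasts.
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state: [level; slope]
    Q = np.diag([0.0, nvr])                  # NVR enters the slope equation
    H = np.array([[1.0, 0.0]])
    x = np.zeros((2, 1))
    P = np.eye(2) * 1e6                      # diffuse-ish prior
    trend = np.empty(len(y))
    for t, yt in enumerate(y):
        x = F @ x                            # prediction equations
        P = F @ P @ F.T + Q
        if not np.isnan(yt):                 # correction only when data exist
            f = (H @ P @ H.T).item() + 1.0   # innovation variance
            k = (P @ H.T) / f
            x = x + k * (yt - (H @ x).item())
            P = P - k @ H @ P
        trend[t] = x[0, 0]
    return trend

# pad the last 12 samples with NaN so the filter forecasts the trend over them
y = np.concatenate([np.linspace(100.0, 200.0, 120), np.full(12, np.nan)])
tr = irw_filter(y, nvr=1e-3)
```

Over the padded samples the level is simply extrapolated along the current slope, which is exactly the behaviour seen in Figure 2.3.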

[Figure: air passenger series, trend and trend forecasts; y-axis Thousand Passengers, x-axis Months, 1950-1960, with a vertical line at the forecasting horizon]

Figure 2.3 Air passenger data, trend and trend forecasts for the last year of data.


Example 2.3 Interpolation and variance intervention for steel consumption in the UK

In order to illustrate the variance intervention and missing data handling capabilities of
CAPTAIN, the quarterly steel consumption in the UK, measured in thousands of metric tons,
is analysed from the last quarter of 1953 until the end of 1992. Variance intervention is
useful when there are rapid or violent changes, or even discontinuities, in a series. Here,
the estimate of P_{t|t−1} in the prediction equation (2.8) is reset to a large value, implying a
lack of confidence in the estimates at that sample.
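The reset of P_{t|t−1} can be sketched with a minimal scalar RW-plus-noise filter (Python, illustrative only, not toolbox code; the reset value 1e6 is an arbitrary 'large' choice):

```python
import numpy as np

def rw_filter_vi(y, nvr, interventions=()):
    # Scalar RW-plus-noise filter with variance intervention: at each listed
    # sample, P_{t|t-1} is reset to a large value, so the filter largely
    # forgets the past and the estimated level is free to jump.
    x, p = y[0], 1e6
    level = np.empty(len(y))
    for t, yt in enumerate(y):
        p = p + nvr                  # prediction
        if t in interventions:
            p = 1e6                  # lose confidence in the prior estimate
        f = p + 1.0
        k = p / f
        x, p = x + k * (yt - x), p * (1.0 - k)  # correction
        level[t] = x
    return level

# a series with a sharp level drop at sample 60
y = np.concatenate([np.full(60, 10.0), np.full(60, 2.0)])
tr = rw_filter_vi(y, nvr=1e-4, interventions={60})
```

Without the intervention, the small NVR would force the level to drift only slowly towards the new mean; with it, the level adapts almost immediately at the specified sample.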

In order to plot the steel consumption with a correct time axis, as in Figure 2.4 below, enter
the following MATLAB commands:

>> load steel.dat
>> t = (1953.75 : 0.25 : 1992.75)';
>> plot(t, steel)

[Figure: quarterly series; y-axis Thousand Metric Tons, x-axis Quarters, 1955-1990]

Figure 2.4 UK steel consumption from 1953Q4 to 1992Q4.

The anomalously large consumption figures in the first and second quarters of 1980 (referred
to subsequently as 1980Q1 and 1980Q2) were due to strikes in the sector. Furthermore,
two apparent falls in the mean level of steel consumption can be observed in 1975Q2 and
1980Q1. One way of handling these problems at the simple exploratory level is to smooth
the series, assuming missing data (again, MATLAB Not-A-Number values) for the strikes
and setting variance intervention points at 1975Q2 and 1980Q1, as shown below.

Note that the standard error of the estimated score is not shown in the tabulated results
below since, in order to reduce computation time slightly, it is not explicitly requested in
this particular call to irwsmopt (compare with Example 2.2). The graphical output of this
analysis is illustrated in Figure 2.5, where the jumps in the trend at the variance
intervention points can be clearly seen and the perturbation about the trend (lower plot)
appropriately shows that there are no remaining jumps in the series.


>> y = fcast(steel, [106 107]);
>> nvr = irwsmopt(y, 1, 'f12', [87 106]);
METHOD: SUM OF SQUARES OF 12-STEPS-AHEAD FORECAST ERRORS
OPTIMISER: LEASTSQ
10.76 seconds.
2 missing values
PER. RW NVR Score S.E. Alpha Score S.E.
0.00 1.0 6.455e-005 -4.1901 - 1.0000 - -
Sum of Squares of 12-Steps-Ahead Forecast Errors: 1756523.0328
>> tr = irwsm(y, 1, nvr, [87 106]);
>> subplot(211), plot(t, [y tr])
>> subplot(212), plot(t, y-tr)

[Figure: two panels; the upper shows the steel consumption series and trend, the lower the perturbation about the trend; y-axes Thousand Metric Tons, 1955-1990]

Figure 2.5 UK steel consumption, trend and perturbation about the trend.

Finally, it is sometimes interesting to reconstruct the data by assuming that the jumps in the
trend did not occur, as shown below. Here, the CAPTAIN function reconst serves as a tool
to remove the jumps, with the new estimate of the trend added to the perturbation signal.
The value of such a calculation will be shown later in Chapter 3.

>> yr = y - tr + reconst(tr, [87 106]);
>> plot(t, yr)


[Figure: reconstructed series; y-axis Thousand Metric Tons, 1955-1990]

Figure 2.6 UK steel consumption with the jumps in the trend removed.

Before moving to the final example, it is useful to first introduce the Autocorrelation
(ACF) and Partial Autocorrelation (PACF) Functions, together with the Ljung-Box test,
since these will be utilised below and occasionally in later chapters.

Autocorrelation (ACF) and Partial Autocorrelation (PACF) functions

The ACF and PACF are two key identification tools, very useful for detecting temporal
dependence in any time series. Popularised by Box and Jenkins (1970), they have
been used extensively by time series analysts ever since. The aim is to determine the
linear correlation coefficients between a time series and lagged values of the same
series. The representation of these coefficients against the lag, usually in the form of a bar
diagram, is the ACF or correlogram. More formally, the theoretical ACF of a stationary
process y_t (i.e. constant mean and variance) is defined as,

ρ_k = γ_k/σ_y² = γ_k/γ_0        k = 0, 1, 2, …        (2.18)

where,

γ_k = E[(y_t − μ_y)(y_{t+k} − μ_y)]        k = 0, 1, 2, …        (2.19)

Here, γ_k is the autocovariance function that measures the covariance between a time series
and its past, while μ_y and σ_y² are the (constant) mean and variance of the process,
respectively. The ACF is symmetrical around lag k = 0, so only positive values of the lag
are considered.

Several estimators have been proposed in the literature, including the sample equivalents
of the population counterparts, i.e.,

r_k = c_k/s_y² = c_k/c_0        k = 0, 1, 2, …        (2.20)


with,

c_k = (1/T) Σ_{t=1}^{T−k} (y_t − ȳ)(y_{t+k} − ȳ)        k = 0, 1, 2, …        (2.21)

where T is the number of samples and ȳ is the sample mean. Note that, if the time series is
white noise, an approximation of the variance of the autocorrelation estimators is,

Var[r_k] = 1/T        k = 0, 1, 2, …        (2.22)

Equation (2.22) may be used to test the significance of any individual coefficient, i.e. any
coefficients bigger than twice the standard error will be considered significantly different
from zero.
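The sample estimators (2.20)-(2.22) translate directly into code. The following sketch (Python, for illustration only; in the toolbox these quantities are computed in MATLAB by the acf function) returns the first m autocorrelations and the white-noise standard error:

```python
import numpy as np

def sample_acf(y, m):
    """Sample autocorrelations r_1..r_m via equations (2.20)-(2.21)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    c0 = np.sum(d * d) / T                      # sample variance c_0
    r = np.array([np.sum(d[:T - k] * d[k:]) / (T * c0)
                  for k in range(1, m + 1)])
    se = 1.0 / np.sqrt(T)                       # sqrt of Var[r_k] = 1/T
    return r, se

rng = np.random.default_rng(2)
r, se = sample_acf(rng.standard_normal(200), 10)
# coefficients larger in magnitude than about 2*se would be flagged significant
```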

The previous estimators are equivalent to the following set of linear regressions fitted to
the data,

y_t = δ_1 + r_1 y_{t−1} + e_{1t}
y_t = δ_2 + r_2 y_{t−2} + e_{2t}
⋮
y_t = δ_k + r_k y_{t−k} + e_{kt}        (2.23)

where δ_i and e_{it} (i = 1, 2, …, k) are a set of constants and gaussian white noise terms,
respectively. These equations are simply AR models of increasing orders, with all the
intermediate parameters constrained to zero.

Apart from the individual test for each ACF parameter quoted above, a summary test of
autocorrelation up to order m is the Ljung-Box test (Ljung and Box, 1978), given by the
statistic,

Q = T(T + 2) Σ_{k=1}^{m} r_k²/(T − k)        (2.24)

This is distributed as a χ² with m − 1 degrees of freedom under the null hypothesis of no
autocorrelation.
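Equation (2.24) follows directly from the sample ACF. A self-contained sketch (Python, illustrative only, not the toolbox implementation):

```python
import numpy as np

def ljung_box_q(y, m):
    """Ljung-Box statistic of equation (2.24), built from the sample ACF."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    d = y - y.mean()
    c0 = np.sum(d * d) / T
    r = np.array([np.sum(d[:T - k] * d[k:]) / (T * c0)
                  for k in range(1, m + 1)])
    return T * (T + 2) * np.sum(r**2 / (T - np.arange(1, m + 1)))

rng = np.random.default_rng(3)
q = ljung_box_q(rng.standard_normal(300), 20)
# for white noise, q should be unexceptional for a chi-squared of the stated dof
```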

The Partial Autocorrelation Function (PACF) extends the previous idea of the ACF using
the standard statistical concept of partial correlation. This function measures the linear
dependence between a time series and some lag of itself discounting the effect of all the
intermediate lags. To understand this concept, and the difference with respect to the ACF,
compare the following set of regressions with equation (2.23),


y_t = δ_1 + φ_{11} y_{t−1} + e_{1t}
y_t = δ_2 + φ_{21} y_{t−1} + φ_{22} y_{t−2} + e_{2t}
⋮
y_t = δ_k + φ_{k1} y_{t−1} + … + φ_{kk} y_{t−k} + e_{kt}        (2.25)

The PACF is a representation of the coefficients φ_{ii} (i = 1, 2, …, k) against the lag. The
variance of these coefficients is given by equation (2.22), or they may be computed from the
standard errors of the estimators in the previous set of regressions.
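The regression definition (2.25) also translates directly into code: fit AR models of increasing order (with a constant) and keep the last coefficient of each. A sketch (Python, illustrative only; ordinary least squares is assumed here, not necessarily the estimator used by the toolbox):

```python
import numpy as np

def sample_pacf(y, m):
    # PACF via the successive regressions of equation (2.25): fit an AR(k)
    # with a constant by least squares and keep the last coefficient, phi_kk.
    y = np.asarray(y, dtype=float)
    T = len(y)
    pacf = []
    for k in range(1, m + 1):
        # design matrix: a column of ones, then lags 1..k of the series
        X = np.column_stack([np.ones(T - k)] +
                            [y[k - j:T - j] for j in range(1, k + 1)])
        coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
        pacf.append(coef[-1])
    return np.array(pacf)

pacf = sample_pacf(np.random.default_rng(5).standard_normal(200), 5)
```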

The ACF, PACF, their standard errors and the Q test are all computed by the CAPTAIN
function acf. For example, Figure 2.7 is the output of the following code, where the ACF
and PACF are estimated for a sequence of gaussian random numbers.

>> acf(randn(200, 1), 10);
m acf desv Q Prob m pacf desv
1 -0.0199 0.0707 0.0808 - 1 -0.0199 0.0707
2 -0.0306 0.0707 0.2713 0.6025 2 -0.0310 0.0707
3 -0.0798 0.0708 1.5781 0.4543 3 -0.0812 0.0707
4 -0.0251 0.0713 1.7076 0.6353 4 -0.0298 0.0707
5 -0.0688 0.0713 2.6888 0.6112 5 -0.0759 0.0707
6 0.0505 0.0716 3.2207 0.6660 6 0.0390 0.0707
7 0.0481 0.0718 3.7060 0.7164 7 0.0415 0.0707
8 0.0721 0.0720 4.7987 0.6845 8 0.0663 0.0707
9 -0.0143 0.0723 4.8420 0.7743 9 -0.0041 0.0707
10 0.0245 0.0723 4.9700 0.8369 10 0.0343 0.0707

The 2nd input argument to acf is the number of autocorrelation coefficients required by the
user. As would be expected for white noise, Figure 2.7 shows that the series has no
autocorrelation: the bars are all well within the standard errors (dotted trace). The same
conclusion is strongly supported by the Q statistic and its probability value for any value
of m, shown by the 4th and 5th columns above.

[Figure: two bar plots of Correlation against lag (0 to 8), one for the ACF and one for the PACF, with standard error bounds]

Figure 2.7 Autocorrelation and Partial Autocorrelation functions of a gaussian white noise signal.


Example 2.4 Time variable mean estimation for volume in the Nile river

Sometimes a debate arises among researchers about whether the mean level of a certain
variable is changing over time or not. Numerous tests and procedures have been developed
in order to investigate such a hypothesis. In this regard, if the stochastic behaviour of the
signal about the time varying mean is approximately uncorrelated white noise, then there
are some particularly simple and well formalised options. This is the case for the RW plus
noise model introduced above. Here, the smoothed signal or trend is effectively a time
varying mean variable that is assumed to evolve as a RW, while the rest of the signal is
assumed white noise. Clearly, more complex options may be pursued by adding specific
models to describe the perturbation about the trend if necessary, as discussed in Chapter 3.

Consider, for example, the Nile river annual volume measurements from 1871 to 1970,
measured in 10^8 cubic meters, illustrated in Figure 2.8.

>> load nile.dat
>> t = (1871 : 1970)';
>> plot(t, nile)

[Figure: annual series; y-axis 10^8 m3, x-axis Years, 1870-1970]

Figure 2.8 Annual volume of the Nile River in 10^8 cubic meters.

These data were analysed by Cobb (1978) and Balke (1993), among others. The key issue
here is to determine whether there is a systematic decline in the level from 1899 onwards
(sample 29), a feature that seems visually apparent from the figure. To investigate, the RW
plus noise model is estimated as shown below, with the output illustrated in Figure 2.9.

>> nvr = irwsmopt(nile, 0, 'ml')
nvr =
0.0924
>> [tr, deriv, err] = irwsm(nile, 0, nvr);
>> plot(t, [nile tr tr+err tr-err])

Note that the second input argument to both irwsmopt and irwsm is zero, in order to
specify a RW trend, and that the NVR parameter is estimated by ML. In fact, this analysis


takes advantage of the fact that, as was seen in Example 2.2, ML will only provide a
sensible solution if the theoretical assumptions about the model are fulfilled by the data. In
this case, we can later check that the perturbations about the trend are indeed white noise.

[Figure: annual series with time variable mean and confidence band; y-axis 10^8 m3, x-axis Years, 1870-1970]

Figure 2.9 Annual volume of the Nile River, time variable mean and approximate 90% confidence intervals.

It is clear from Figure 2.9 that the mean level has gone down since the beginning of the
century. To test the adequacy of the model in a statistical sense, we examine the
perturbations by means of the sample and partial autocorrelation functions (acf), together
with a plot of the histogram superimposed over a Normal distribution (histon). The latter
CAPTAIN function also returns a normality test, in the form of the Bera-Jarque statistic and
associated probability value (Jarque and Bera, 1980).

>> acf(nile-tr, 20);
>> histon(nile-tr);

These graphs are illustrated in Figure 2.10.

The Ljung-Box Q-test of autocorrelation for 20 lags is 17.68, indicating that there are no
overall autocorrelation problems (Ljung and Box, 1978). Furthermore, the Bera-Jarque test
indicates that the normality hypothesis cannot be rejected, by a very wide margin. To sum
up, both tests show that the theoretical assumptions about the model are fulfilled.
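For reference, the Bera-Jarque statistic combines the sample skewness and excess kurtosis of the residuals, T/6 · (S² + (K − 3)²/4). A possible sketch (Python, illustrative only; in the toolbox this is computed by histon):

```python
import numpy as np

def jarque_bera(e):
    # Bera-Jarque normality statistic: T/6 * (S^2 + (K - 3)^2 / 4),
    # where S and K are the sample skewness and kurtosis of the residuals.
    e = np.asarray(e, dtype=float)
    T = len(e)
    d = e - e.mean()
    s2 = np.mean(d**2)
    S = np.mean(d**3) / s2**1.5
    K = np.mean(d**4) / s2**2
    return T / 6.0 * (S**2 + (K - 3.0)**2 / 4.0)

rng = np.random.default_rng(4)
jb = jarque_bera(rng.standard_normal(1000))
# under normality, jb is approximately chi-squared with 2 degrees of freedom;
# large values reject the normality hypothesis
```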

A second approach to the problem, which sheds clearer light on the sharp decline in the
level, is to use variance intervention again, directly specifying the 29th sample in the
analysis, as shown below.

>> nvr = irwsmopt(nile, 0, 'ml', 29)
nvr =
9.0209e-018
>> [tr, deriv, err] = irwsm(nile, 0, nvr, 29);
>> plot(t, [nile tr tr+err tr-err])


Figure 2.10 Analysis of the perturbations: Autocorrelation (ACF), Partial autocorrelation (PACF)
and histogram of the residuals.

The estimated NVR is approximately zero, implying that the mean is constant apart from
the break at the selected sample, as illustrated in Figure 2.11. Finally, although not shown
here, acf and histon again indicate that the perturbations about the trend are white noise
(the Q test for 20 lags is 14.35).

[Figure: annual series with constant mean levels before and after 1899 and confidence band; y-axis 10^8 m3, x-axis Years, 1870-1970]

Figure 2.11 Annual volume of the Nile River, time variable mean and approximate 90% confidence intervals
with variance intervention in 1899.


2.5 Conclusion

The present chapter has introduced the state space framework that is the basis for most of
the models implemented in CAPTAIN and has formally described the associated filtering
and smoothing algorithms at the heart of the toolbox. The chapter has also discussed a
number of approaches for the estimation of any unknown elements or hyper-parameters in
these models, concentrating on Maximum Likelihood and the minimisation of the multiple-
steps-ahead forecasting errors.

However, to date, the models illustrated have been limited to the simplest Random Walk
and Integrated Random Walk (IRW), plus measurement noise, cases. Whilst useful for
basic smoothing operations, these models are largely concerned with the estimation of a
simple trend. For additional components, such as seasonality and any other perturbations
about the trend, we must turn to the more general, unobserved components model in the
next chapter.



CHAPTER 3
UNOBSERVED
COMPONENTS MODELS

Unobserved Components (UC) modelling is a general strategy for time series analysis and
signal extraction, based on the assumption that the series is composed of an additive or
multiplicative combination of different components that have defined statistical
characteristics but which cannot be observed directly. In CAPTAIN, such components may
include a trend, cyclical components, stochastic perturbations and so on. In the statistical
literature, typical approaches to UC modelling include:

- Ad-hoc methods of seasonal adjustment, in which smoothing procedures are used
  to extract trend and seasonal components from the time series. In this regard,
  one of the oldest and best known techniques for signal extraction is the Census
  X-11 method and its later extensions, X-11 ARIMA and X-12 ARIMA. See e.g.
  Findley et al. (1996) and the references therein.

- The ARIMA or Reduced Form approach to UC model identification and
  estimation, based on the assumption that the series can be modelled as an Auto-
  Regressive-Integrated-Moving-Average (ARIMA) model. See e.g. Box et al.
  (1978); Gómez and Maravall (1998). Starting from this reduced form (ARIMA)
  model, the UC model (considered as a structural form, following the
  Econometrics parallel) is obtained by the imposition of a number of (arbitrary)
  restrictions to ensure the existence and uniqueness of the decomposition.

- The Optimal Regularisation approach, based on direct optimal estimation of the
  components within a regularisation context. See e.g. Akaike (1980); Young and
  Pedregal (1996); Hodrick and Prescott (1997). In this case, constraints are
  imposed on the state estimates via a Lagrange Multiplier term within the cost
  function, in order to ensure that they possess the required characteristics.

- The State Space (SS) approach, which provides a rather more obvious formulation
  of UC concepts and, since this is the method implemented in CAPTAIN, is discussed
  in detail below. See also e.g. Ng and Young (1990); Young (1994); Young et al.
  (1999). Alternative SS approaches that have some points in common with
  CAPTAIN, as well as a few radically different aspects, are discussed by Harrison
  and Stevens (1976); Harvey (1989); and West and Harrison (1989).

It is clear from these examples that UC modelling may be regarded as a broad philosophy,
an alternative to other more traditional ways of time series modelling, rather than as a
particular model form and estimation method. However, the present authors believe that
the SS approach is one of the most powerful and flexible frameworks for developing UC
models. Indeed, the state estimation algorithms and associated methods for hyper-
parameter optimisation, introduced in Chapter 2, provide a complete solution for the
identification of UC models. All that remains is to characterise each component of the
model in an appropriate SS form.

The previous chapter has already discussed one of the simplest cases, namely a trend
component represented by an Integrated Random Walk (IRW) plus white noise model.
Here, the CAPTAIN function irwsmopt is utilised to optimise the hyper-parameters, while
irwsm provides the filtering, smoothing, forecasting and interpolation operations (see e.g.
Example 2.4). Following a similar syntax, the dhr/dhropt and univ/univopt combinations
in CAPTAIN provide for a more diverse range of UC models, as discussed below. Finally,
the toolbox includes a number of functions to assist in the identification of these models, in
both the time and frequency domains, namely: aic, acf, arspec and period.

3.1 General Form of the Unobserved Components Model

UC models in CAPTAIN can be synthesized by the following discrete-time equation,

$$y_t = T_t + C_t + S_t + f(u_t) + N_t + e_t \qquad e_t \sim N(0, \sigma^2) \qquad (3.1)$$

where y_t is the observed time series; T_t is a trend or low frequency component; C_t is a
sustained cyclical or quasi-cyclical component (e.g. an economic cycle) with period
different from that of any seasonality in the data; S_t is a seasonal component (e.g. annual
seasonality); f(u_t) captures the influence of a vector of exogenous variables u_t, if
necessary including stochastic, nonlinear static or dynamic relationships; N_t is a stochastic
perturbation model, i.e. coloured noise modelled as an Auto-Regression (AR) process; and,
as shown, e_t is an irregular component, usually defined for analytical convenience as a
normally distributed Gaussian sequence with zero mean value and variance σ² (i.e.
discrete-time white noise).
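The additive structure of equation (3.1) is easy to make concrete by simulation. The following Python sketch (illustrative only; all component models and parameter values here are hypothetical choices, not CAPTAIN defaults) composes a series from a simple trend, a sinusoidal seasonal of period s, an AR(1) perturbation and a small irregular term:

```python
import math

def simulate_uc(n, s=12):
    """Compose y_t = T_t + S_t + N_t + e_t from simple hypothetical components:
    a drifting trend, a sinusoidal seasonal, an AR(1) perturbation and a small
    deterministic stand-in for white noise (all choices purely illustrative)."""
    comps = {'T': [], 'S': [], 'N': [], 'e': [], 'y': []}
    N = 0.0
    for t in range(n):
        T = 100.0 + 0.05 * t                      # trend (here a simple drift)
        S = 2.0 * math.sin(2 * math.pi * t / s)   # seasonal, period s
        shock = math.sin(1000.0 * (t + 1) ** 2)   # crude deterministic "noise"
        N = 0.7 * N + 0.3 * shock                 # AR(1) coloured perturbation
        e = 0.1 * math.sin(777.0 * (t + 1))       # irregular component
        for key, val in zip('TSNey', (T, S, N, e, T + S + N + e)):
            comps[key].append(val)
    return comps
```

Signal extraction then runs this logic in reverse: given only y_t, recover estimates of the individual components.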

In the present context, equation (3.1) is regarded as the observation equation of a discrete
time SS system, in which the dynamic behaviour of each of the UCs has to be defined via
the state equations. In order to allow for nonstationarity in the time series yt , the various
components in (3.1), including the trend Tt , can all be characterised by stochastic, Time
Variable Parameters (TVPs), with each TVP defined as a nonstationary stochastic
variable, as discussed below.

CAPTAIN handbook D. J. Pedregal, C. J. Taylor and P. C. Young page 43


Chapter 3 Unobserved Components Models

One assumption maintained in every UC methodology is that each component is
orthogonal to all the others. In this context, it is noteworthy that the SS model representing
equation (3.1) can be built by block-concatenation of the matrices of the SS subsystems
related to each of the components.

Despite the generality of equation (3.1), it should be stressed that in the majority of
applications not all these components will be necessary simultaneously. Indeed, important
identifiability problems may arise among the components if they are not defined
appropriately. For example, an N_t AR component including seasonal roots will conflict
severely with the seasonal component S_t if both are included in a single model. In a similar
manner, unit roots in the N_t component would conflict with the trend component
T_t. For these reasons, CAPTAIN normally restricts the user to formulations of the problem
that are practically useful, so that such identification problems do not arise when using the
toolbox (see the examples below).

3.2 State Space form for UC Models

The SS model for each component in equation (3.1) is introduced below.

Trend models ( T_t )

The trend models available in CAPTAIN are all particular cases of the General Random
Walk (GRW) family of models represented by equation (2.2). These include, most
commonly, the Random Walk (RW) and Integrated Random Walk (IRW) introduced in
Chapter 2. A third option, the Local Linear Trend (LLT) model may be obtained by using
RW and IRW models simultaneously, as shown below. The SS form of such a LLT model
with observational noise is defined as follows,

$$\begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_{1,t-1} \\ x_{2,t-1} \end{bmatrix} + \begin{bmatrix} \eta_{1,t-1} \\ \eta_{2,t-1} \end{bmatrix} \qquad T_t = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{bmatrix} x_{1t} \\ x_{2t} \end{bmatrix} + e_t \qquad (3.2)$$

The state equations may be written using the backward-shift operator as,

$$(1 - L)x_{1t} = x_{2,t-1} + \eta_{1,t-1} \qquad (1 - L)x_{2t} = \eta_{2,t-1} \qquad (3.3)$$

Then, substituting the second equation into the first, we have,

$$(1 - L)x_{1t} = \frac{\eta_{2,t-2}}{(1 - L)} + \eta_{1,t-1} \qquad (3.4)$$

Finally, substitution of this equation into the observation equation gives the reduced form,

$$y_t = \frac{\eta_{2,t-2}}{(1 - L)^2} + \frac{\eta_{1,t-1}}{(1 - L)} + e_t \qquad (3.5)$$

i.e. the addition of a RW and IRW model if the state noises are independent of each other
(an assumption that is usual in this context).
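The equivalence between the state equations (3.2) and the reduced form (3.5) can be verified numerically. The following Python sketch (illustrative; the noise sequences are arbitrary fixed values, not estimates) simulates the LLT state equations and checks that the second difference of y_t equals the implied combination of the noises:

```python
def simulate_llt(eta1, eta2, e):
    """Simulate the LLT state equations and return observations y_t = x1_t + e_t.
    Updates: x1_t = x1_{t-1} + x2_{t-1} + eta1_{t-1};  x2_t = x2_{t-1} + eta2_{t-1}."""
    x1, x2 = 0.0, 0.0
    y = []
    for t in range(len(e)):
        y.append(x1 + e[t])
        # noises dated t drive the states at t+1 (tuple assignment: RHS first)
        x1, x2 = x1 + x2 + eta1[t], x2 + eta2[t]
    return y

# Arbitrary noise sequences for the check:
eta1 = [0.3, -0.1, 0.2, 0.5, 0.0, -0.4, 0.1, 0.2]
eta2 = [0.1, 0.2, -0.3, 0.0, 0.4, 0.1, -0.2, 0.3]
e = [0.05, -0.02, 0.01, 0.03, -0.04, 0.02, 0.0, 0.01]
y = simulate_llt(eta1, eta2, e)
```

Taking second differences of (3.5) gives (1 − L)²y_t = η_{2,t−2} + (1 − L)η_{1,t−1} + (1 − L)²e_t, which the simulated series reproduces exactly, term by term.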

Two further trend models, available only in the CAPTAIN functions univ and univopt for
AR + Trend analysis, are the so-called Integrated AR (IAR) and Double Integrated AR
(DIAR) models. The motivation for these options arises from the observation that, for the
simplest trend models, it is quite common to find correlation in the residuals. For example,
when utilising an IRW model for the trend, it is assumed that its second difference will be
white noise, which is not always the case in practice. IAR and DIAR models take
advantage of the correlation, by building an additional model for these residuals, in order to
improve the overall forecasting performance.

One caveat, however, is that it is possible for the FIS algorithm itself to induce this kind of
correlation. Indeed, it can be shown (Young and Pedregal, 1996) that the correlation
structure depends on the autocorrelation of the original time series itself. Nonetheless, if
the second difference does show a predictable behaviour, especially if there is some
physical meaning (e.g. related to the business cycle), then it would be worthwhile to try to
forecast it. In this case, the DIAR model, which is defined below in a manner similar to the
earlier examples, provides one particular approach,

$$T_t = T_{t-1} + D_{t-1}$$
$$D_t = D_{t-1} + x_{1t} \qquad (3.6)$$
$$x_{1t} = \frac{\eta_{3t}}{1 + \alpha_1 L + \alpha_2 L^2 + \cdots + \alpha_p L^p}$$

In SS form, this model may be described by the following equations,

$$\begin{bmatrix} T \\ D \\ x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_p \end{bmatrix}_t = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & -\alpha_1 & -\alpha_2 & -\alpha_3 & \cdots & -\alpha_{p-1} & -\alpha_p \\ 0 & 0 & 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} T \\ D \\ x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_p \end{bmatrix}_{t-1} + \begin{bmatrix} 0 \\ 0 \\ \eta_{3t} \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (3.7)$$

$$T_t = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} \mathbf{x}_t \quad \text{with} \quad \mathbf{x}_t = \begin{bmatrix} T & D & x_{1t} & x_{2t} & \cdots & x_{pt} \end{bmatrix}^T$$


The model is then fully defined by the variance of η_{3t} and the coefficients of the AR
polynomial. Although this is a rather more complex model than the GRW, it has the
capability of providing nonlinear-like forecasts of the trend, which is very useful in
situations near turning points.
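The companion-matrix structure of (3.7) is mechanical to assemble. The following Python sketch (illustrative; it assumes the sign convention implied by the denominator polynomial 1 + α₁L + … + α_pL^p, under which the AR row carries −α coefficients) builds the DIAR transition matrix from a coefficient list:

```python
def diar_transition(alpha):
    """Assemble the (p+2)x(p+2) DIAR transition matrix of equation (3.7)
    from the AR polynomial coefficients [alpha_1, ..., alpha_p]."""
    p = len(alpha)
    n = p + 2
    F = [[0.0] * n for _ in range(n)]
    F[0][0] = F[0][1] = 1.0            # T_t = T_{t-1} + D_{t-1}
    F[1][1] = F[1][2] = 1.0            # D_t = D_{t-1} + x_{1,t-1}
    for j, a in enumerate(alpha):      # x_{1t} = -a_1 x_{1,t-1} - ... + eta_{3t}
        F[2][2 + j] = -a
    for i in range(3, n):              # shift register carrying the AR lags
        F[i][i - 1] = 1.0
    return F
```

The noise vector has η_{3t} in the third position and zeros elsewhere, and the observation vector picks out the first state, exactly as in (3.7).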

Cyclical and seasonal models ( C_t and S_t )

Although these two types of components are given different names in equation (3.1), both
can be treated in the same way from a modelling standpoint, since both reflect a periodic
kind of behaviour. The difference between them lies only in the period considered, with
'seasonal' usually reserved for an annual cycle. CAPTAIN provides two approaches for
modelling any such periodic behaviour, as discussed below.

Dynamic Harmonic Regression (DHR; Ng and Young, 1990; Young et al., 1999)

The DHR model is similar to a Fourier analysis, but with coefficients that evolve smoothly
in time. The model is,

$$y_t = S_t + e_t = \sum_{j=0}^{[s/2]} \{ a_{jt} \cos(\omega_j t) + b_{jt} \sin(\omega_j t) \} + e_t \qquad (3.8)$$

with,

$$\omega_j = \frac{2\pi j}{s} \qquad j = 1, 2, \ldots, [s/2] \qquad (3.9)$$

and,

$$\begin{bmatrix} a_{jt} \\ a'_{jt} \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ 0 & \gamma \end{bmatrix} \begin{bmatrix} a_{j,t-1} \\ a'_{j,t-1} \end{bmatrix} + \begin{bmatrix} \eta_{jt} \\ \eta'_{jt} \end{bmatrix} \qquad NVR(\eta_{jt}) = NVR(\eta'_{jt}) \qquad (3.10)$$

Here, [s/2] = s/2 for even s and [s/2] = (s - 1)/2 for odd s. The parameters a_{jt} and b_{jt} are
represented as GRW models, with the associated NVR values both equal for the same
harmonic period. Note that setting ω_0 = 0 reduces the corresponding regressors for j = 0 to
ones and zeros, implying that GRW trends are naturally accommodated within the
DHR context.

As an illustration of the SS form of this model, consider the following IRW trend with a
single harmonic modulated by RW parameters for a period of 12 samples, typical of
monthly time series,


$$\begin{bmatrix} T_t \\ D_t \\ a_{1t} \\ b_{1t} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} T_{t-1} \\ D_{t-1} \\ a_{1,t-1} \\ b_{1,t-1} \end{bmatrix} + \begin{bmatrix} 0 \\ \eta_{T,t-1} \\ \eta_{a_1 t} \\ \eta_{b_1 t} \end{bmatrix}$$

$$y_t = \begin{bmatrix} 1 & 0 & \cos(\omega_1 t) & \sin(\omega_1 t) \end{bmatrix} \mathbf{x}_t + e_t \qquad \omega_1 = 2\pi/12$$

Compared with other methodologies for modelling periodic components, the framework
above is one of the most flexible. In some other approaches (e.g. Harvey, 1989; West and
Harrison, 1989) all the variances in the harmonics must be the same. Furthermore, in many
standard approaches to the problem, all the harmonics of the seasonal component are
introduced into every model. By contrast, when using CAPTAIN, the modeller is strongly
recommended to look at the spectral properties of the series in order to identify the most
appropriate model for the time series in question, i.e. to check whether all the harmonics
are actually necessary. For this latter purpose, CAPTAIN includes the functions period and
arspec to estimate the power spectrum and an AR-spectrum respectively. The examples
considered in section 3.4 will demonstrate the importance of this identification stage.

The unknowns in the model (3.8)-(3.10), including the noise variances, are the hyper-
parameters. Given estimates of these hyper-parameters, the KF and FIS algorithms
yield estimates of each TVP and hence of the trend and each harmonic component, together
with the total cyclical component, i.e. the sum of all the individual harmonics.

Finally, note that the DHR model may be regarded as a particular example of Dynamic
Linear Regression (see Chapter 4), in which the inputs are deterministic sinusoidal
functions of time. Note that all the inputs in this model are orthogonal, a property that is
highly desirable in regression methods and makes the TVP estimation problem particularly
straightforward, even when a large number of parameters is involved.
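The orthogonality of the sinusoidal regressors can be checked directly. The following Python sketch (illustrative; not CAPTAIN code) builds the DHR regressor columns for an even season length s and verifies that, over an integer number of complete cycles, all pairwise inner products vanish:

```python
import math

def harmonic_regressors(s, n):
    """Build the DHR regressors cos(w_j t), sin(w_j t) for j = 1..s/2 (s even)."""
    cols = []
    for j in range(1, s // 2 + 1):
        w = 2 * math.pi * j / s
        cols.append([math.cos(w * t) for t in range(n)])
        if j < s // 2:   # the Nyquist sine term is identically zero, so drop it
            cols.append([math.sin(w * t) for t in range(n)])
    return cols

def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))
```

For s = 12 this yields six cosine and five sine columns; their mutual orthogonality over full cycles is what keeps the TVP estimates of the individual harmonics well separated.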

Trigonometric Cycle or Seasonal (Harvey 1989; West and Harrison, 1989):

Here, the periodic components are introduced via the state equations of the SS model,
rather than in the observation equation as for the DHR model. The full model is,

$$y_t = S_t + e_t = \sum_{j=1}^{[s/2]} S_{jt} + e_t \qquad (3.11)$$

with each S_{jt} defined as follows,

$$\begin{bmatrix} S_{jt} \\ S'_{jt} \end{bmatrix} = \rho_j \begin{bmatrix} \cos \omega_j & \sin \omega_j \\ -\sin \omega_j & \cos \omega_j \end{bmatrix} \begin{bmatrix} S_{j,t-1} \\ S'_{j,t-1} \end{bmatrix} + \begin{bmatrix} \eta_{jt} \\ \eta'_{jt} \end{bmatrix} \qquad NVR(\eta_{jt}) = NVR(\eta'_{jt}) \qquad (3.12)$$

The same noise definitions utilised in DHR models apply here. The ρ_j parameter is
introduced to allow convergence of the cyclical components when it is constrained between
0 and 1. In seasonal models it is usually fixed at 1.
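The rotation structure of (3.12) is easy to visualise by iterating the noise-free recursion. The following Python sketch (illustrative only) shows that with ρ = 1 the component traces an undamped cosine of the chosen period, while ρ < 1 makes the amplitude decay geometrically:

```python
import math

def trig_cycle(rho, period, n, s0=1.0):
    """Noise-free recursion of (3.12): rotate [S, S'] by w each step, scaled by rho.
    Starting from [s0, 0], this generates S_t = rho^t * s0 * cos(w t)."""
    w = 2 * math.pi / period
    S, Sp = s0, 0.0
    out = []
    for _ in range(n):
        out.append(S)
        S, Sp = (rho * (math.cos(w) * S + math.sin(w) * Sp),
                 rho * (-math.sin(w) * S + math.cos(w) * Sp))
    return out
```

This is why ρ_j between 0 and 1 gives convergent (damped) cycles, whereas ρ_j = 1 sustains a seasonal pattern indefinitely.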

The SS form of this model is straightforward. For example, with a trend and a single
harmonic component, the overall SS form would be,

$$\begin{bmatrix} T_t \\ D_t \\ S_{1t} \\ S'_{1t} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cos(2\pi/12) & \sin(2\pi/12) \\ 0 & 0 & -\sin(2\pi/12) & \cos(2\pi/12) \end{bmatrix} \begin{bmatrix} T_{t-1} \\ D_{t-1} \\ S_{1,t-1} \\ S'_{1,t-1} \end{bmatrix} + \begin{bmatrix} 0 \\ \eta_{T,t-1} \\ \eta_{1t} \\ \eta'_{1t} \end{bmatrix} \qquad (3.13)$$

$$y_t = \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix} \mathbf{x}_t + e_t$$

Such a seasonal component, together with a LLT model for the trend, is called a Basic
Structural Model (BSM) by Harvey (1989), although recent publications by the same
author refer instead to unobserved components rather than structural models.

Exogenous variables ( f(u_t) )

CAPTAIN provides numerous approaches for modelling the input-output relationship
between variables. Such methods are discussed at length in later chapters, so are only
briefly listed here:

- Dynamic Linear Regression (DLR; Chapter 4), in which one output is related to
  several inputs in a linear regression form, but with TVPs.

- Dynamic Auto-Regression with eXogenous inputs (DARX; Chapter 4). This is an
  extension of the DLR model, in which past values of the output are also
  regressors of the system. Again, all the parameters affecting the exogenous
  variables and the past values of the output (endogenous variables) may be TVPs.

- Dynamic Transfer Function (DTF; Chapter 4), effectively an extension of
  the DARX model but with a more widely applicable assumption for the noise.

- Standard Transfer Function (TF) models in discrete and continuous time with
  constant parameters (see Chapters 6 and 7, respectively). These models are
  estimated by an improved Recursive Instrumental Variable (RIV) method.

- State Dependent Parameter analysis (SDP; Chapter 5), a general approach for
  the identification and estimation of non-linear, non-stationary dynamic
  relationships between variables.

Coloured noise components ( N_t )

Several models implemented in CAPTAIN allow coloured noise components to be handled as
pure AR processes. In order to consider the SS form, it is convenient to assume that the
sum of the coloured noise and the white noise component in equation (3.1) constitutes an
AR process with the same white noise input (Young, 1984; Ng and Young, 1990), i.e.,

$$N_t + e_t = \frac{1}{\phi(L)} e_t \qquad (3.14)$$

where $\phi(L) = 1 + \phi_1 L + \phi_2 L^2 + \cdots + \phi_p L^p$. One SS form of such a model is, therefore,

$$\begin{bmatrix} x_{1t} \\ x_{2t} \\ \vdots \\ x_{p-1,t} \\ x_{pt} \end{bmatrix} = \begin{bmatrix} -\phi_1 & 1 & 0 & \cdots & 0 \\ -\phi_2 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -\phi_{p-1} & 0 & 0 & \cdots & 1 \\ -\phi_p & 0 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} x_{1,t-1} \\ x_{2,t-1} \\ \vdots \\ x_{p-1,t-1} \\ x_{p,t-1} \end{bmatrix} + \begin{bmatrix} e_t \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix} \qquad (3.15)$$

$$y_t = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \end{pmatrix} \mathbf{x}_t$$

Models with polynomials of different orders can be implemented by constraining the
corresponding parameters to zero. In the same way, the multiplicative polynomials typical of
seasonal ARMA models, in the manner of Box-Jenkins, can be converted into this
equivalent form by convolving the polynomials and transforming the model into a higher
order system.

A trend component may be straightforwardly attached to this model by simple block-concatenation
of the appropriate SS matrices, in a similar manner to the earlier examples. This new
model may be used as an alternative to the DHR and BSM if the AR process order is high
enough to allow for seasonal roots. It is also of interest in more general situations in which
the signal has some non-seasonal correlation about the trend.
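The polynomial convolution mentioned above is a simple coefficient-level multiplication. The following Python sketch (illustrative; the example coefficients are arbitrary) multiplies a non-seasonal AR(1) polynomial by a seasonal period-4 AR(1) polynomial to obtain the single higher-order polynomial required by the SS form:

```python
def conv(p, q):
    """Multiply two lag polynomials given as coefficient lists [1, c1, c2, ...]."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# e.g. (1 - 0.5 L) * (1 - 0.8 L^4) -> 1 - 0.5 L - 0.8 L^4 + 0.4 L^5
full = conv([1.0, -0.5], [1.0, 0.0, 0.0, 0.0, -0.8])
```

The result is an order-5 AR polynomial whose coefficients populate the first column of the companion matrix in (3.15). (In MATLAB the built-in conv performs the same operation.)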

3.3 (Hyper-) parameter estimation for UC models

The recursive state estimation algorithms require prior knowledge of all the system
matrices in the state space models above. However, depending on the particular structure
of the model chosen, some hyper-parameters will remain unspecified and must be
estimated separately before state estimation can proceed. Typically, these include the NVR
matrix. Section 2.3 of Chapter 2 introduced the hyper-parameter estimation problem for
general SS models. Since most of the models considered in the present chapter are set up in
such a SS form, all the issues previously raised apply here. In particular, Maximum
Likelihood (ML) and the minimization of the multiple-steps-ahead forecasting errors are
available options for all the UC models in CAPTAIN. Two additional approaches, developed
specially for UC models, are considered below, namely Frequency Domain estimation and
Sequential Spectral Decomposition.

Frequency domain estimation

This method has been developed for the DHR and BSM models, implemented in the CAPTAIN
functional pair dhr/dhropt. Frequency domain estimation methods are generally
concerned with matching the theoretical pseudo-spectrum of the model (a function of
the hyper-parameters in question) to the empirical pseudo-spectrum obtained directly from
the time series.

Given the DHR model (3.8) or the BSM (3.11), there are two logical steps when building
the model spectrum: (a) derivation of the spectrum of the TVP models taken from a GRW
process and (b) derivation of the spectrum of the sinusoidal components modulated by
those TVPs. These are considered in turn below, followed by (c) the full algorithm.

(A) Pseudo-Spectra of GRW models

In order to derive the spectrum of GRW models, it is necessary first to obtain the reduced
form (or transfer function) of the SS description. For example, in the case of a SRW model
(i.e. equation (2.2) with 0 < α < 1; β = γ = 1; η_{1t} = 0) the reduced form is given by,

$$y_t = \frac{\eta_{2,t-1}}{(1 - \alpha L)(1 - L)} + e_t \qquad (3.16)$$

where ∇ = (1 - L) is the difference operator. The stationary version of the model is then,

$$\nabla y_t = \frac{\eta_{2,t-1}}{(1 - \alpha L)} + \nabla e_t \qquad (3.17)$$

For this process, the spectrum can be calculated by recalling that the frequency response
(e.g. Priestley, 1989) of a signal z_t,

$$z_t = \frac{\eta_t}{(1 - \alpha L)} \qquad (3.18)$$

is,

$$f_z(\omega) = \frac{\sigma_\eta^2}{2\pi} \left| \frac{1}{1 - \alpha e^{-i\omega}} \right|^2 = \frac{\sigma_\eta^2}{2\pi} \frac{1}{(1 - \alpha e^{-i\omega})(1 - \alpha e^{i\omega})} = \frac{\sigma_\eta^2}{2\pi} \frac{1}{1 + \alpha^2 - 2\alpha\cos(\omega)} \qquad (3.19)$$

In this case, the spectrum for ∇y_t takes the following form,

$$f_{\nabla y}(\omega) = \frac{1}{2\pi} \left[ \frac{\sigma_2^2}{1 + \alpha^2 - 2\alpha\cos(\omega)} + \sigma_e^2 \{2 - 2\cos(\omega)\} \right]; \qquad \omega \in [0, \pi] \qquad (3.20)$$

whilst for the non-stationary SRW process y_t, the pseudo-spectrum is,

$$f_y(\omega) = \frac{1}{2\pi} \left[ \frac{\sigma_2^2}{\{1 + \alpha^2 - 2\alpha\cos(\omega)\}\{2 - 2\cos(\omega)\}} + \sigma_e^2 \right]; \qquad \omega \in [0, \pi] \qquad (3.21)$$

This is a function of ω with a peak at the zero frequency. Spectra for particular cases can be
found by simply constraining the α parameter (i.e. to 0 or 1 for the RW or IRW models,
respectively). However, for the other models considered so far, additional manipulations
are necessary. For example, in the case of a Trigonometric Cycle, the reduced form is,

$$y_t = \frac{1 + aL}{1 + bL + cL^2} \eta_{jt} + e_t \qquad (3.22)$$

with a = ρ{sin(ω_j) - cos(ω_j)}, b = -2ρcos(ω_j) and c = ρ². Therefore, its pseudo-spectrum
is,

$$f_y(\omega) = \frac{1}{2\pi} \left[ \sigma_j^2 \frac{1 + a^2 + 2a\cos(\omega)}{1 + b^2 + c^2 + 2b(1 + c)\cos(\omega) + 2c\cos(2\omega)} + \sigma_e^2 \right] \qquad (3.23)$$

This is also a function of ω, with a peak at ω_j.
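Closed-form denominators like the one in (3.19) are easy to cross-check numerically against the underlying frequency response. The following Python sketch (illustrative only; unit noise variance assumed) evaluates the spectrum both ways and confirms they agree:

```python
import cmath
import math

def f_z_closed(alpha, w, var=1.0):
    """Closed-form spectrum of (3.19): var / (2*pi*(1 + alpha^2 - 2*alpha*cos(w)))."""
    return var / (2 * math.pi * (1 + alpha ** 2 - 2 * alpha * math.cos(w)))

def f_z_response(alpha, w, var=1.0):
    """Same quantity from the frequency response: var * |1/(1 - alpha*e^{-iw})|^2 / (2*pi)."""
    h = 1.0 / (1.0 - alpha * cmath.exp(-1j * w))
    return var * abs(h) ** 2 / (2 * math.pi)
```

The same substitution L = e^{-iω} underlies the pseudo-spectra (3.20)-(3.23), so each of those expressions can be checked the same way.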

(B) Pseudo-Spectra of DHR terms modulated by GRW parameters

From the basic Fourier transform properties, the frequency response of amplitude
modulated signals of the form S_t = a_t cos(ω_j t) is known to be,

$$f_s(\omega) = \frac{1}{2} \left[ f_A(\omega - \omega_j) + f_A(\omega + \omega_j) \right] \qquad (3.24)$$

where f_A(ω) is the frequency response of a_t. Consider now the case of a single DHR
component of the form S_t = a_{jt}cos(ω_j t) + b_{jt}sin(ω_j t), in which the parameter variations
are governed by SRW models. The pseudo-spectrum is given by,

$$f_S(\omega, \omega_j) = \frac{\sigma_j^2}{2\pi} \left[ \frac{1}{\{1 + \alpha^2 - 2\alpha\cos(\omega - \omega_j)\}\{2 - 2\cos(\omega - \omega_j)\}} + \frac{1}{\{1 + \alpha^2 - 2\alpha\cos(\omega + \omega_j)\}\{2 - 2\cos(\omega + \omega_j)\}} \right]$$

This is a function of ω with a maximum at ω_j, in such a way that the height and width of the
peak depend on the variances of the noises and the value of α. Once more, RW or IRW models
may be recovered by constraining α to 0 or 1 respectively, while the LLT may be obtained by
adding RW and IRW models together. Defining,

$$S(\omega, \omega_j) = \frac{1}{2\pi} \left[ \frac{1}{\{1 + \alpha^2 - 2\alpha\cos(\omega - \omega_j)\}\{2 - 2\cos(\omega - \omega_j)\}} + \frac{1}{\{1 + \alpha^2 - 2\alpha\cos(\omega + \omega_j)\}\{2 - 2\cos(\omega + \omega_j)\}} \right]$$

then the pseudo-spectrum of the full DHR model (3.8) becomes,

$$f_y^*(\omega, \boldsymbol{\sigma}^2) = \sum_{j=0}^{[s/2]} \sigma_j^2 S(\omega, \omega_j) + \frac{\sigma_e^2}{2\pi} \qquad \boldsymbol{\sigma}^2 = \begin{bmatrix} \sigma_0^2 & \sigma_1^2 & \cdots & \sigma_{[s/2]}^2 & \sigma_e^2 \end{bmatrix}$$

This latter expression can also be described in terms of the hyper-parameters
NVR_j = σ_j²/σ_e², i.e.,

$$f_y^*(\omega, \mathbf{NVR}) = \sigma_e^2 \left[ \sum_{j=0}^{[s/2]} NVR_j S(\omega, \omega_j) + \frac{1}{2\pi} \right] \qquad (3.25)$$

Here, the full set of NVR parameters is represented by the vector NVR. Note also that
(3.25) is linear in the NVRs and, as we shall see, this facilitates the initial estimation of
these hyper-parameters. The extension of S(ω, ω_j) to accommodate more complex
combinations of RW, IRW and LLT defined trends and parameters is obvious.
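As a minimal illustration of why this linearity helps, the following Python sketch (not CAPTAIN code; the basis vectors in the test below are arbitrary stand-ins for S(ω, ω_j) evaluated on a frequency grid, and the constant offset term of (3.25) is ignored) recovers the NVR weights by ordinary linear least squares:

```python
def fit_nvr(Fy, S_cols):
    """Linear least-squares fit Fy(w_k) ~ sum_j NVR_j * S_j(w_k): form the normal
    equations and solve them by Gaussian elimination (illustrative only)."""
    m = len(S_cols)
    K = len(Fy)
    A = [[sum(S_cols[i][k] * S_cols[j][k] for k in range(K)) for j in range(m)]
         for i in range(m)]
    b = [sum(S_cols[i][k] * Fy[k] for k in range(K)) for i in range(m)]
    # forward elimination with partial pivoting
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # back substitution
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        x[r] = (b[r] - sum(A[r][c] * x[c] for c in range(r + 1, m))) / A[r][r]
    return x
```

Because the fit is linear, it is cheap and has a unique solution, which is what makes it suitable as an initialisation for the nonlinear log-spectrum optimisation described next.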

(C) Full estimation algorithm in the frequency domain

The estimation problem is to find the set of parameters NVR (and any other hyper-
parameters present in the model, such as the α parameter in the SRW model) which yield
the optimal least squares fit to the empirically estimated spectrum. Although a linear least-
squares fit is the most obvious, Young et al. (1999) show that substantial advantages are
obtained when the following non-linear objective function is used instead,

$$J(f_y, \hat{f}_y^*) = \sum_{k=0}^{T-1} \left[ \log\{f_y(\omega_k)\} - \log\{\hat{f}_y^*(\omega_k, \mathbf{NVR})\} \right]^2 \qquad (3.26)$$

Here, f_y(ω_k) is either the sample periodogram or the AR spectrum of the time series,
with the AR order in the latter case identified by the Akaike Information Criterion (AIC;
Akaike, 1974). Using the log transformed spectra yields much better defined NVR
estimates, since it concentrates attention on the shape of the shoulders
associated with the harmonic peaks in the AR spectrum. The linear solution is used as an
initialisation for this non-linear optimisation, which is computationally very fast.

The complete estimation algorithm in the frequency domain thus consists of the following
four steps:

1. Estimate an AR(n) spectrum f_y(ω) of the observation process y_t, t = 1, 2, ..., N
   and its associated residual variance σ̂², with the AR order n normally identified
   by reference to the AIC. Note the [s/2] significant peaks that characterise the
   spectrum (these will normally include a fundamental frequency and several or
   all of its associated harmonics).

2. Find the Linear Least Squares estimate of the NVR parameter vector which
   minimises the following linear least squares objective function,

   $$J^*(f_y, \hat{f}_y^*) = \sum_{k=0}^{T-1} \left[ f_y(\omega_k) - \hat{f}_y^*(\omega_k, \mathbf{NVR}) \right]^2 \qquad (3.27)$$

3. Find the Nonlinear Least Squares estimate of the NVR parameter vector which
   minimises the following nonlinear least squares objective function,

   $$J(f_y, \hat{f}_y^*) = \sum_{k=0}^{T-1} \left[ \log\{f_y(\omega_k)\} - \log\{\hat{f}_y^*(\omega_k, \mathbf{NVR})\} \right]^2 \qquad (3.28)$$

   using the result from the second step to define the initial conditions.

4. Use the NVR estimates from the third step to obtain the recursive forward pass
   (KF) and backward pass (FIS algorithm) smoothed estimates of the components
   in the DHR model: i.e. the trend; the total cyclical and seasonal components; the
   fundamental/harmonic components; and the residuals. This last step should
   allow for any interventions and outliers, interpolate over gaps and forecast as
   necessary in the normal manner.

Example 3.1 demonstrates the straightforward implementation of these four steps using the
AR spectrum and DHR model estimation functions in CAPTAIN.


Example 3.1 Analysis of the Mauna Loa CO2 data using DHR and related models

The measured CO2 concentration at Mauna Loa, illustrated in Figure 3.1, clearly has a
seasonal component, related to the global net uptake and release of CO2 by the
biosphere in the summer and winter.

>> load co2.dat


>> plot(co2)


Figure 3.1 CO2 concentration at Mauna Loa (parts-per-million).

The data are sampled monthly (i.e. s = 12), hence the expected periodic components are
12, 6, 4, 3, 2.4 and 2 samples per cycle (i.e. 12/j, j = 1, 2, ..., 6). However, rather than rely
on these theoretical harmonics, the first step in the analysis is normally to identify the most
significant harmonics in the series by means of some spectral measure. CAPTAIN
offers two possibilities: the AR pseudo-spectrum (arspec) and the periodogram (period).
For example, the following straightforward command yields Figure 3.2.

>> arspec(co2);

Figure 3.2 AR(27) pseudo-spectrum of the CO2 data.

The AR pseudo-spectrum in Figure 3.2 is determined on the basis of an AR(27) model,
identified automatically by the default arspec call (see Example 1.2). Additional
arguments are possible, allowing for the direct specification of the AR model order from
any prior analysis: see Chapter 8. It is clear from the left hand side of Figure 3.2 that a
trend is present while, most interestingly, the seasonal component is dominated by just the
two first harmonics, i.e. the 12 and 6 samples per cycle (s/c) period components. In fact,
since the harmonic corresponding to the period 2.4 s/c is very small and the 2 s/c harmonic
is not present at all, these are ignored in the DHR analysis below.

The NVR hyper-parameters are first estimated in the (default) frequency domain, using an
IRW model for the trend and RW models for the four dominant harmonics of the seasonal
component (12, 6, 4, and 3 samples per cycle). Note that the leading zero in the P variable
below represents the trend, while TVP specifies the model types. In this analysis, the final
three years of data are removed from the series in order to later illustrate the forecasting
performance of the model. The resulting fit in the frequency domain, a standard output of
the dhropt function, is shown in Figure 3.3.

>> P = [0 12 6 4 3];
>> TVP = [1 0];
>> nvr = dhropt(co2(1:288), P, TVP);
METHOD: FREQUENCY DOMAIN. AR-SPECTRUM(24)
OPTIMISER: LEASTSQ
0.711 seconds.
3 missing values
PER. RW NVR Score S.E. Alpha Score S.E.
0.00 1.0 3.4771e-003 -2.4588 0.042 1.0000 - -
12.00 0.0 7.1466e-002 -1.1459 0.030 1.0000 - -
6.00 0.0 2.0435e-002 -1.6896 0.040 1.0000 - -
4.00 0.0 8.0806e-004 -3.0926 0.088 1.0000 - -
3.00 0.0 1.7861e-004 -3.7481 0.131 1.0000 - -


Figure 3.3 AR pseudo-spectrum of the CO2 data (solid) and fit of the model (dotted).

Finally, the DHR model is estimated for the same data and settings using the NVR values
listed above. Here, fcast is utilised to add nans to the series, representing the three years
of artificially induced missing data; as discussed in Chapter 2, this is the approach taken in
CAPTAIN to generate forecasts. The trend estimate, forecasts of the series with 95%
confidence intervals, seasonal component and its forecasts, together with the remaining
irregular components are all illustrated in Figure 3.4, using the commands below. Here, the
variable tf is introduced for convenience, simply to define the sample numbers over which
the model is forecasted.

>> [fit, fitse, tr, trse, comp, e] = ...
       dhr(fcast(co2(1:288), [0 36]), P, TVP, nvr);
>> t = [1 : length(co2)]';
>> tf = (289 : 324)';
>> bands = [fit(tf)+2*fitse(tf) fit(tf)-2*fitse(tf)]; % Confidence bands
>> subplot(311); plot(t, [co2 fit tr], tf, bands, ':');
>> S = sum(comp')'; % Total seasonal component
>> subplot(312); plot(S)
>> subplot(313); plot([co2(1 : 288)-fit(1 : 288)]) % Irregular component


Figure 3.4 Trend, forecasts, seasonal and irregular estimates of the CO2 series.

Many other model types can be implemented using the same dhr/dhropt pairing in
CAPTAIN. For example, by choosing TVP = [1 1], IRW models are selected for each of
the TVPs modulating the harmonics while, if instead, TVP = [1 2], then a trigonometric
cycle is used. Alternatively, SRW models or damped seasonal/cyclical components may be
specified as follows,

>> [nvr, alpha] = dhropt(co2(1 : 288), P, TVP, [], -2, -2);


Here, the -2 terms indicate free (unconstrained) estimation of both the NVR hyper-
parameter and the smoothing (α) parameter in equation (2.2). Finally, LLT or damped
trends may be estimated by specifying two zeros in the vector of periodic components, as
shown below.

>> P = [0 0 12 6 4 3];
>> TVP = [1 0];
>> nvr = dhropt(co2(1:288), P, TVP, 24); % LLT
>> [nvr, alpha] = dhropt(co2(1:288), P, TVP, 24, -2, [1 -2 1]); % Damped

Refer to Chapter 8 or the on-line help information for further details. All the examples
above utilise the specially developed frequency domain optimisation routine for DHR
model hyper-parameter estimation. However, as discussed in Chapter 2, ML is usually
available in CAPTAIN as an alternative, even though it is sometimes unable to provide an
appropriate solution unless it is constrained in some way. For example, the command,

>> nvr = dhropt(co2(1 : 288), P, TVP, -24);

utilises ML but takes a long time or is unable to find a solution before reaching the
maximum number of iterations, because it is well known that the log-likelihood surface is
very flat around the optimum in this case (see Young et al., 1999). Note that the 4th input
argument above specifies an initial condition for ML obtained from a frequency domain optimisation with an AR(24) spectrum.

This problem disappears when an appropriate constraint is introduced into the model. In
this regard, one common solution is to impose the same NVR value on all the seasonal
harmonics using the 5th input argument to dhropt, i.e.,

>> nvr = dhropt(co2(1:288), P, TVP, -24, [-2 -1]);


METHOD: MAXIMUM LIKELIHOOD
OPTIMISER: LEASTSQ
25.596 seconds.
3 missing values
PER. RW NVR Score S.E. Alpha Score S.E.
0.00 1.0 3.003e-011 -4.9529 - 1.0000 - -
12.00 0.0 9.386e-006 -8.8612 - 1.0000 - -
6.00 0.0 9.386e-006 -8.8612 - 1.0000 - -
4.00 0.0 9.386e-006 -8.8612 - 1.0000 - -
3.00 0.0 9.386e-006 -8.8612 - 1.0000 - -

Of course, this solution is very different to the earlier frequency domain results, because of
the entirely different method and the artificial constraints imposed in the present case.


Sequential Spectral Decomposition

Sequential spectral decomposition is designed for the Trend + AR type of model implemented in the CAPTAIN functional pair univ/univopt, although in principle it may be applied to any UC model (see later examples). The approach was developed to avoid certain identification problems that arise when using AR models. Such problems occur because the AR model, which is ideally reserved for the perturbational component about the trend, may in fact describe the whole series (i.e. the trend and the perturbation) if a joint estimation is attempted without constraints. In other words, there is nothing in the Trend + AR model that guarantees that the joint estimation will yield a perturbation and a trend orthogonal to each other, with the frequency properties assumed in principle, i.e. the trend as a smooth line across the data and the perturbation wandering about the zero line in such a way that the sum of both optimally fits the time series.

This does not mean that such a model would not be useful for forecasting purposes, simply
that each component by itself is not meaningful. Typically, one may find that the trend
does not follow the data and that the perturbational component does not oscillate around
zero as required (i.e. it is not stationary). This identification problem does not appear in the BSM or DHR models because, in these cases, each seasonal harmonic is by definition independent of the rest and of the trend, so that each one concentrates on a particular narrow frequency band. In the case of the Trend + AR models, the problem may be conveniently solved in four or five steps, as follows.

Estimate an initial trend component on the basis of the IRW model (see
examples in Chapter 2). The NVR may be chosen using any a priori knowledge;
on the basis of the frequency domain properties of this low-pass filter; or may be
objectively estimated in some manner, such as by ML or the minimum of the
multiple-steps-ahead forecasting errors. This initial selection of the NVR
resembles Bayesian methods in an UC context, in the manner of West and
Harrison (1989).

If an IAR or a DIAR trend proves necessary, then the identification and estimation of the AR polynomial for the trend must be based on the first or second difference of the initial trend estimate, respectively.

Obtain an initial estimate of the perturbational component as the difference between the data and the estimated trend in the previous step. Estimate an AR model or a subset AR model for this component.


Re-estimate the NVR parameter for the trend. The initial NVR selected for the
trend is, by definition, the ratio of the variance of the state noise to the variance
of the observational noise in the initial model. However, this initial model does
not account for the new perturbational AR component and the variance of the
observational noise is generally much bigger than in the complete model, hence
the NVR should be modified accordingly.

Re-estimate the components based on the full model including the Trend and the
AR model with the new NVR for the trend and the estimated AR model for the
perturbational component.

All these steps are automatically handled by the univopt and univ functions. In this way,
the overall non-linear problem is decomposed into several linear or quasi-linear steps, each
solved in fully recursive terms. This simple solution, which has some loss of optimality
from the ML viewpoint, has proven to be very successful in practice.
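The univ/univopt pair handles all of these steps internally. Purely to illustrate the logic of the first three steps, the following Python sketch uses a simple moving-average filter as a stand-in for the NVR-optimised IRW trend, and a least-squares AR fit for the perturbation (the function names, filter and numerical settings below are our own illustrative choices, not CAPTAIN code):

```python
import numpy as np

def fit_ar(x, order):
    # Step 3: least-squares AR fit to the detrended perturbation, i.e.
    # x[t] regressed on x[t-1], ..., x[t-order].
    n = len(x)
    X = np.column_stack([x[order - i - 1 : n - i - 1] for i in range(order)])
    coef, *_ = np.linalg.lstsq(X, x[order:], rcond=None)
    return coef

# Synthetic Trend + AR(1) series
rng = np.random.default_rng(0)
n = 400
trend = np.linspace(0.0, 4.0, n)
pert = np.zeros(n)
for t in range(1, n):
    pert[t] = 0.8 * pert[t - 1] + rng.normal(scale=0.1)
y = trend + pert

# Step 1 stand-in: a moving-average low-pass filter plays the role of the
# IRW trend; 'valid' mode avoids end effects in this rough sketch.
w = 81
tr = np.convolve(y, np.ones(w) / w, mode='valid')
x = y[w // 2 : w // 2 + len(tr)] - tr   # steps 2-3 input: detrended perturbation
ar = fit_ar(x, 1)
print(ar)
```

Steps 4 and 5 would then rescale the trend NVR and re-estimate both components jointly; in CAPTAIN all of this is triggered by a single univopt/univ call.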

Example 3.2 Modelling the US GDP using Trend + AR models

Consider the quarterly US Gross Domestic Product (GDP) data from the first quarter of
1947 until the last quarter of 2002. This series, which has already been seasonally adjusted
by the authorities, is illustrated in Figure 3.5 using the command below.

>> load usgdp.dat
>> t = (1947 : 1/4 : 2002.9)';
>> plot(t, usgdp);
>> y = log(usgdp(1 : 204));


Figure 3.5 The seasonally adjusted US Gross Domestic Product between 1947Q1 and 2002Q4.

The last line of code above transforms the data to a logarithmic scale and reserves the final
20 observations for forecasting comparisons, in a similar manner to the earlier CO2
example. The present data are particularly interesting since the perturbations about the
trend cannot be modelled as a seasonal component because it is a seasonally adjusted time
series. However, it is not believed that the remaining perturbations are white noise, because
there is evidence of a business cycle in the series. In such cases, a DHR type model with
some periodic cycle of appropriate length may be fitted (e.g. Koopman et al., 2000), but
the existence of such a cycle is rather dubious. Consider, for example, the inconclusive
spectra estimates obtained from the following commands.

>> arspec(y);
>> period(y);

For these reasons, CAPTAIN provides the Trend + AR model. This approach may be
regarded as a powerful extension to its simpler predecessors, i.e. the IRW or HP filter
which are traditionally applied to these data (e.g. Hodrick and Prescott, 1997). Following
the sequential spectral decomposition method outlined above, the first step is to select an
NVR for the trend alone in an IRW + white noise model. For example, if the minimisation
of the 4-steps-ahead forecasting errors is chosen (since 4 is the number of samples per
year),

>> nvr0 = irwsmopt(y, 1, 'f4')
nvr0 =
    0.0013
>> tr = irwsm(y, 1, nvr0);

The NVR fitted in this way corresponds to a cut-off period of 8 years and one quarter, i.e.
all the periods above that value are included in the trend (Table 2.1). The second step is to
select the order of the AR model for the perturbations. It can be identified using either the
acf or aic functions as follows.

>> aic(y-tr, [], 1);
>> acf(y-tr);
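For readers who want to see what such an order-selection step is doing, the criterion is easy to reproduce outside the toolbox. A hedged Python sketch (the helper below is our own, not the CAPTAIN aic function, and uses the common form AIC(p) = N·log(σ̂²) + 2p):

```python
import numpy as np

def ar_aic(x, max_order):
    # Least-squares AR(p) fits for p = 1..max_order, returning
    # AIC(p) = N*log(sigma2_hat) + 2*p for each order.
    aics = []
    for p in range(1, max_order + 1):
        X = np.column_stack([x[p - i - 1 : len(x) - i - 1] for i in range(p)])
        resid = x[p:] - X @ np.linalg.lstsq(X, x[p:], rcond=None)[0]
        aics.append(len(resid) * np.log(np.mean(resid ** 2)) + 2 * p)
    return np.array(aics)

# Synthetic AR(2) perturbation series
rng = np.random.default_rng(1)
x = np.zeros(600)
for t in range(2, 600):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()

aic_vals = ar_aic(x[100:], 8)        # discard the start-up transient
best = int(np.argmin(aic_vals)) + 1  # order minimising the criterion
print(best)
```

Note that this sketch evaluates each order on a slightly different sample length; CAPTAIN's own aic function should be preferred for real analysis.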

The optimal model order chosen by the AIC criterion is the AR(13) model. The estimation of the whole model, conditional on the NVR of the trend in the first step, then follows,

>> [nvr, ARp] = univopt(y, [1:13], 1, nvr0);
ESTIMATION OF TREND+AR MODEL
AR Model for Perturbations:
==========================
AR ARp S.E. T
1 -0.9687 0.0715 -13.5477
2 0.1515 0.0947 1.5996
3 0.1514 0.0935 1.6187
4 0.0296 0.0940 0.3143
5 0.0857 0.0937 0.9147
6 -0.0885 0.0936 -0.9459
7 0.1261 0.0937 1.3464
8 0.0271 0.0939 0.2887
9 -0.0369 0.0937 -0.3935
10 -0.0207 0.0936 -0.2206
11 0.0118 0.0929 0.1270
12 0.2062 0.0926 2.2270
13 -0.0596 0.0677 -0.8811
Final trend NVR estimate: 6.0619e-003
Integration order of trend: 2

The table shows the AR lag in the first column; the point estimate for each parameter in the
second column; their standard error in the third column; and the typical T statistic (i.e. each
parameter divided by its standard error) in the fourth column. At the bottom of the table the
new estimate of the trend NVR is calculated, which is greater than the initial nvr0. The reason is clear: nvr0 was the ratio between the trend variance and the perturbation variance, in a model in which the perturbation about the trend was assumed to be white noise but which actually included the whole perturbational component; while the new nvr is the same ratio based on a much smaller observational noise variance.
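The T statistic reported in these tables is simply each estimate divided by its standard error, with the standard errors taken from the usual least-squares parameter covariance. As an illustrative Python check on a synthetic AR(1) series (variable names are ours, not toolbox output):

```python
import numpy as np

# Synthetic AR(1) perturbation series
rng = np.random.default_rng(2)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.7 * x[t - 1] + rng.normal()

# Least-squares AR(1) fit with standard error and T statistic
X = x[:-1].reshape(-1, 1)
yv = x[1:]
coef = np.linalg.lstsq(X, yv, rcond=None)[0]
resid = yv - X @ coef
sigma2 = resid @ resid / (len(yv) - X.shape[1])   # residual variance
cov = sigma2 * np.linalg.inv(X.T @ X)             # parameter covariance
se = np.sqrt(np.diag(cov))                        # standard errors
tstat = coef / se                                 # estimate / standard error
print(float(coef[0]), float(se[0]), float(tstat[0]))
```

A |T| value well above 2 indicates a lag worth keeping, which is the basis of the subset selection considered next.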

It is clear that not all the lags are significant and necessary in the AR model based on the
T-test, and a subset model would perform as well. This is the reason why a final estimation
iteration is usually worthwhile, i.e.,

>> [nvr, ARp] = univopt(y, [1:3 12 13], 1, nvr0);
ESTIMATION OF TREND+AR MODEL
AR Model for Perturbations:
==========================
AR ARp S.E. T
1 -1.0231 0.0686 -14.9201
2 0.1262 0.0958 1.3173
3 0.2423 0.0660 3.6725
12 0.1850 0.0594 3.1146
13 -0.0890 0.0593 -1.4997
Final trend NVR estimate: 5.7853e-003
Integration order of trend: 2

The associated smoothing, forecasting, signal extraction, etc. of the series, using the final
NVR estimate above, is achieved by means of the univ function,

>> [fit, fitse, tr, trse, comp] = univ(fcast(y, [0 20]), ARp, 1, nvr);
>> subplot(311), plot(t(1 : 204), tr(1 : 204))
>> subplot(312), plot(t(1 : 204), comp(1 : 204))
>> subplot(313), plot(t(1 : 204), y-fit(1 : 204))

The components are shown in Figure 3.6, where a significant decrease in variance of both
the cycle and the irregular component can be observed after the middle of 1982.


Figure 3.6 The components of the trend + AR model for the US GDP.


Figure 3.7 Example five year ahead forecasts for the US GDP.

Finally, Figure 3.7 shows a detail of a forecasting exercise five full years ahead, starting in December 1997, where it can be seen that almost all the data lie within the standard error bounds of the forecast. This figure may be plotted as follows,

>> tfor = (205 : 224)';
>> confband = [fit(tfor) + 2*fitse(tfor) fit(tfor) - 2*fitse(tfor)];
>> plot(t, [log(usgdp) fit], '-', t(tfor), confband, ':')

3.4 Advanced Examples using Variance Intervention

The following two examples introduce an iterative approach for the simultaneous
estimation of the trend NVR and perturbations when variance intervention is required.


Example 3.3 Steel consumption in the UK revisited

In Chapter 2, Example 2.3 considered the quarterly steel consumption data (Figure 2.4) in
the context of a simple trend only model. However, as pointed out then, a better option
would be to estimate a UC model in which all the components were fitted jointly, an
approach pursued below.

Spectral identification is a very important step in this analysis, required in order to check
for the existence of all or part of the seasonal harmonics. In this case, it is often preferable
to calculate the AR-spectrum of the series when any dramatic jumps in the trend have
already been removed using the reconst function, as discussed in Example 2.2 (see Figure
2.6). Such jumps in the trend could distort the spectrum estimate of the original series,
especially the frequency band corresponding to the trend, as illustrated in Figure 3.8. In
fact, it is clear from Figure 3.8, that while the distortion is minimal with respect to the
seasonal spectral peaks (4 and 2 samples/cycle), it is quite noticeable in the low frequency
band of the spectrum. Figure 3.8 also reveals the existence of an approximately four-year cycle in the data and most of the associated harmonics (note that the seasonal peak is one of these harmonics).


Figure 3.8 AR(17)-spectrum of raw data (dotted) and the reconstructed series from Figure 2.6 (solid).

Because of such potential distortions in the spectrum, frequency domain methods should
always be applied with care, especially when estimating the trend (and cyclical) NVR
hyper-parameters. Indeed, this is a typical case where ML in the time domain with
simultaneous variance intervention estimation may yield a superior answer. In this regard,
a DHR model may be estimated by entering the following commands.

>> load steel.dat
>> P = [0 16 8 16/3 4 2];
>> TVP = [1 0];
>> nvr = dhropt(steel, P, TVP, -17, [],[],[],[],[],[],[],[],[87 106]);
>> [fit, fitse, tr, trse, comp] = ...
>> dhr(steel, P, TVP, nvr, [], [], [], [], [], [87 106]);


The 4th input argument to dhropt specifies ML optimization using a frequency domain
estimation based on the AR(17) model as the initial condition. The final input argument
provides the variance intervention points, using a similar syntax to the irwsm/irwsmopt
pair introduced in Chapter 2.

A more satisfactory solution may be obtained by an iterative procedure which combines both frequency and time domain methods. Although more complex to implement, this approach arguably provides a more objective and complete UC analysis. Indeed, while the
simple commands above may be regarded as the basic default option, the code below is a
good illustration of the open architecture of CAPTAIN in the MATLAB environment, by
which each user may find different ways to exploit potential solutions to a particular time
series problem.

The idea here is that NVR frequency estimation based on the raw data is contaminated by
jumps in the trend. If these jumps were known, they could be removed and the
contamination problem would disappear. Obviously, such information is not immediately
available and has to be estimated simultaneously with the NVR hyper-parameters. In other
words, trend jumps may be estimated conditional on given NVR parameters by variance
intervention, while the NVR parameters themselves may be estimated in the frequency
domain conditional on given estimates of the trend jumps. In this case, a simultaneous,
unconditional estimate of the trend jumps and NVR parameters may be obtained by the
following iterative algorithm (cf. Example 2.3):

>> Int = [87 106];
>> nvr0 = irwsmopt(steel, 1, 'f12', Int);
>> tnew = irwsm(steel, 1, nvr0, Int);
>> ynew = steel - tnew + reconst(tnew, Int);
>> P = [0 16 8 16/3 4 2]; TVP = [1 0];
>> tol = 1e-4; iter = 0; obj = 1000*tol;
>> while obj > tol
>> nvr = dhropt(ynew, P, TVP, 17);
>> [fit, fitse, tnew, trse, comp] = ...
>> dhr(steel, P, TVP, nvr, [], [], [], [], [], Int);
>> ynew = steel - tnew + reconst(tnew, Int);
>> if iter>0, obj = max(abs(nvrold-nvr)); end
>> nvrold = nvr; iter = iter+1;
>> [iter obj]
>> end
nvr =
0.0001
0.0434
0.0000
0.0009
0.0233
0.0017


The first four lines yield an initial estimate of the reconstructed series without the trend jumps, based on the IRW + noise model (see Figure 2.6). Subsequently, the iterations are built in such a way that the NVR parameters are estimated using continuously updated versions of the jump-free series (first line after the while statement). However, in each case the smoothing is based on the DHR model for the raw data, the current estimate of the NVR parameters and the variance intervention points. The iterations end when the difference between two successive estimates of the NVR values is less than the tol control value.

Finally, the trend estimates, together with the cyclical and seasonal components are all
illustrated in Figure 3.9, obtained as follows.

>> subplot(311), plot([steel tnew]) % Trend and series
>> subplot(312), plot(sum(comp(:, 1:3)')') % Cycle
>> subplot(313), plot(sum(comp(:, 4:5)')') % Seasonal


Figure 3.9 Estimated components of the steel consumption series, using the frequency domain method.


Example 3.4 Car drivers killed and seriously injured

In order to show that the iterative procedure developed in the previous example is
applicable to other data sets, a similar analysis is performed for the monthly car drivers
killed and seriously injured in Great Britain from January 1969 to December 1984. Figure
3.10 displays this series using the following commands.

>> load cars.dat
>> t = (1969 : 1/12 : 1984.99)';
>> plot(t, cars)


Figure 3.10 Car drivers killed and seriously injured in Great Britain from January 1969 to December 1984.

A mild trend and seasonal components are clearly visible in these data. More interesting, however, are the apparent jumps in the series related to seat belt legislation in January 1974 and February 1983 (samples 61 and 179). As before, these may be accounted for with
the trend signal using variance interventions. In this case, it is also clear that the seasonal
pattern changes in 1974 and 1975 after the first intervention point, so for the purposes of
the present example, a third intervention is set at sample 74 (February 1975).

An initial estimate of the trend is first obtained using irwsm/irwsmopt with variance
interventions. The AR-spectrum of the raw data and the reconstructed intervened data are
shown in Figure 3.11, where it is clear that the final harmonic is not present and is,
therefore, not necessary in the analysis. Furthermore, the distortion due to the trend jumps is less significant here than in the steel consumption series. Nonetheless, for illustrative
purposes, the full iterative procedure outlined in the previous subsection is still
implemented and yields the components illustrated in Figure 3.12, together with the
reconstructed series in Figure 3.13.



Figure 3.11 AR-spectrum of the driver casualties data (dotted) and constructed series (solid).


Figure 3.12 Estimated components of the driver casualties data.


Figure 3.13 Driver casualties data with the variance interventions removed.


3.5 Conclusion

The present chapter has introduced the Unobserved Components (UC) modelling tools in
CAPTAIN, and shown how the toolbox may be utilised for forecasting and signal extraction
of periodic time series with widely varying characteristics.

One point worth stressing again is that, in every example included in the chapter, the seasonal components are such that not all of the theoretical harmonics are observed in the data. In this regard, the present authors believe that the identification stage utilised above,
based on both spectral and time domain methods, is a particularly important part of the
analysis, although it is often neglected in conventional UC modelling. In addition to
providing evidence on whether all or only some of the harmonics are really necessary
(especially important for frequency domain estimation methods), this also identifies the
relative importance of each component (measured by the relative magnitude of their NVR).
This latter information is useful if the model has to be constrained in some way at the
estimation stage.

To illustrate the methodology, the models above have been limited to a trend and a cyclical
or seasonal component. However, in fact, CAPTAIN allows for a much wider range of
models than considered so far. The following chapters, therefore, introduce various
additional components such as exogenous variables.



CHAPTER 4
TIME VARIABLE
PARAMETER MODELS

Chapters 2 and 3 have developed an approach to nonstationary signal processing based on the identification and estimation of time variable parameter (TVP) stochastic models. The
methodological tools that underpin this modelling philosophy are unified in terms of the
discrete-time Unobserved Components (UC) model (3.1). Here, in order to allow for nonstationarity in the time series y_t, it is assumed that the various components in the model, including the trend T_t, can be characterised by TVPs.

Most often, the nature of such parametric time variability will not be known prior to the
analysis and so each TVP is defined as a non stationary stochastic variable. This adds a
statistical degree of freedom to the estimation problem, so allowing for the estimation of
any slow parameter variations. By slow, we mean here variations that are slow in relation
to the variations in the time series itself. Such variations may result from slow physical
changes in the process or from some form of nonlinearity in the data. In this manner, the
models obtained are all inherently self-adaptive: namely, they change their parameters
automatically in an optimal manner to reflect changes in the nature of the time series. For
this reason, they can be exploited in applications such as self-adaptive forecasting,
operational control and management.

In practice, as mentioned in Chapter 3, not all the possible components in the UC model
are necessary: indeed, the simultaneous presence of all these components can induce
identifiability problems in which it is not possible to unambiguously estimate the model.
For this reason, the models considered in Chapter 3 are limited to univariate time series
characterised by a trend, together with a sustained cyclical and/or seasonal component. To
complete the discussion, therefore, the present Chapter considers the other optimal
recursive TVP models that may be estimated using CAPTAIN.

In particular, the Chapter describes the entire class of TVP, or dynamic, regression
models, including Dynamic Linear Regression (DLR), Dynamic Harmonic Regression
(DHR) and Dynamic Auto-Regression (DAR), as well as the closely related, TVP version
of the Auto-Regressive eXogenous variables model (DARX). Finally, the Chapter
considers an alternative Dynamic Transfer Function (DTF) model, estimated using an
instrumental variable method of fixed interval smoothing, and shows how this is superior
to the DARX model when measurement noise is present. The practical utility and self-
adaptive functionality of the dynamic regression model in these various forms is illustrated
by both simulated and practical examples.

In CAPTAIN, the required forward pass filtering and fixed interval smoothing algorithms are
accessible via shells, namely the functions dlr, dhr, dar/darsp, darx and dtfm, while
associated hyper-parameters are estimated using dlropt, dhropt, daropt, darxopt and
dtfmopt respectively. These shells provide for ready estimation of the various special
cases discussed below.

4.1 Dynamic Linear Regression (DLR)

As discussed in Chapter 2, the SS model (2.1) is particularly well suited to estimation


based on optimal time variable parameter recursive estimation, in which the time variable
parameters (acting as surrogate states) are estimated sequentially by the Kalman Filter
(KF) whilst working through the data in temporal order. In the off-line situation, where all
the time series data are available for analysis, this filtering operation may be accompanied
by optimal Fixed Interval Smoothing (FIS).

In this regard, one of the simplest yet widely applicable SS models using time variable
parameters, is a DLR model based on the exogenous input component f(u_t) of equation (3.1), interpreted in its most basic linear regression form, i.e.,

y_t = T_t + Σ_{i=1}^{m} b_{it} u_{it} + e_t ;   e_t ~ N(0, σ²) ;   t = 1, 2, ..., N      (4.1)

where T_t is a trend or low frequency component; b_{it}, i = 1, 2, ..., m are either constant parameters (the normal regression model) or they may vary over the observation interval to reflect possible changes in the regression relationship; and u_{it}, i = 1, 2, ..., m are the regression (input or exogenous) variables that are assumed to affect the dependent variable y_t. The presence of significant time variation can be due to various causes, dependent on the nature of the application, as discussed in the examples below. Finally, as shown, e_t is an irregular component, normally defined for analytical convenience as a serially uncorrelated and normally distributed Gaussian sequence with zero mean value and variance σ² (i.e. discrete-time white noise).
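As a concrete reading of (4.1), the short Python sketch below generates data from a one-input DLR in which the regression parameter drifts as a random walk (all numerical values here are arbitrary illustrations, not taken from any CAPTAIN example):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
T = 7.0 + 0.002 * np.arange(n)                    # slowly varying trend T_t
u = rng.normal(size=n)                            # regression variable u_t (m = 1)
b = 0.14 + np.cumsum(0.01 * rng.normal(size=n))   # TVP b_t evolving as a random walk
e = 0.1 * rng.normal(size=n)                      # white observation noise e_t
y = T + b * u + e                                 # equation (4.1) with a single input
print(len(y))
```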

Equation (4.1) is a generalisation of the two parameter DLR model introduced in Chapter 1: see equation (1.3). In this regard, note that T_t is effectively another TVP with an associated regression variable of unity and, if required, must be explicitly specified as such when using CAPTAIN.


Reflecting the statistical setting of the analysis, the stochastic evolution of each parameter
is assumed to be described by the Generalised Random Walk (GRW) process introduced in
Chapter 2, including RW, AR(1), IRW, SRW, LLT and damped trends as particular cases.
As discussed previously, the AR(1), SRW and damped trend models all require the
specification or optimisation of an additional hyper-parameter, α. An overall state space
model (2.1) can then be constructed straightforwardly by the aggregation of the GRW
subsystem matrices, in a similar manner to the examples given in Chapter 3 for dynamic
harmonic regression.

Take, for example, a DLR model with an IRW trend and two inputs, where the latter are
governed by SRW and RW parameters, respectively. The overall SS form of such a model
is given by,

| T_t    |   | 1  1  0  0  0 | | T_{t-1}    |   | 0         |
| D_t    |   | 0  1  0  0  0 | | D_{t-1}    |   | η_{T,t-1} |
| b_{1t} | = | 0  0  α  1  0 | | b_{1,t-1}  | + | 0         |
| b'_{1t}|   | 0  0  0  1  0 | | b'_{1,t-1} |   | η_{1,t-1} |
| b_{2t} |   | 0  0  0  0  1 | | b_{2,t-1}  |   | η_{2,t-1} |

                                                            (4.2)
y_t = (1  0  u_{1t}  0  u_{2t}) x_t + e_t
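The aggregation is just a block-diagonal stacking of the individual GRW transition matrices. A minimal Python sketch of this construction (helper names are our own, not toolbox code; the SRW block follows the GRW convention of Chapter 2, with the smoothing parameter α on the level state):

```python
import numpy as np

def grw_block(kind, alpha=1.0):
    # Transition matrix of one Generalised Random Walk subsystem.
    if kind == 'RW':
        return np.array([[1.0]])
    if kind == 'IRW':
        return np.array([[1.0, 1.0], [0.0, 1.0]])
    if kind == 'SRW':
        return np.array([[alpha, 1.0], [0.0, 1.0]])
    raise ValueError(kind)

def block_diag(*blocks):
    # Stack square blocks along the diagonal of one transition matrix F.
    n = sum(b.shape[0] for b in blocks)
    F = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        F[i:i + k, i:i + k] = b
        i += k
    return F

alpha = 0.9                                  # illustrative SRW smoothing value
F = block_diag(grw_block('IRW'), grw_block('SRW', alpha), grw_block('RW'))
u1t, u2t = 0.5, -1.2                         # regressors at time t (arbitrary)
H = np.array([1.0, 0.0, u1t, 0.0, u2t])      # observation vector of (4.2)
print(F.shape, H.shape)
```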

Recall from equation (2.1) that the system disturbances noise vector, denoted earlier by η_t, contains the white noise inputs to each of the TVP models, i.e. η_{T,t}, η_{1,t} and η_{2,t}
here. These white noise inputs are assumed to be independent of the observation noise et
and have a covariance matrix Q formed from the combination of the individual covariance
matrices for each parameter. The associated NVR matrix Qr is defined as follows,

Q_r = Q / σ²

The NVR parameters that characterise Qr are unknown prior to the analysis and clearly
need to be estimated on the basis of the time series data yt before the filtering and
smoothing algorithms can be utilised. The optimization of both the NVR and hyper-
parameters in this DLR context, is accomplished either by Maximum Likelihood (ML)
optimisation or by the minimisation of the multiple-steps-ahead forecasting errors, as
discussed earlier.

Note that, in the case of the simplest random walk model for all the parameters involved,
each parameter can be assumed to be time-invariant if the variance of the white noise input
in the state equation is zero. Then the stochastic TVP setting reverts to the more normal,
constant parameter regression situation. In other words, the recursive estimation algorithms
described below for the general stochastic TVP case will provide constant parameter
estimates identical to the normal en-bloc regression if RW models with zero variance white
noise inputs are specified. Of course, there is some added value to the recursive solution
even in this situation, since the user is provided with the recursive estimates over the whole
interval.
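This equivalence is easy to check numerically: running a Kalman filter over RW regression states with zero state noise (all NVRs zero) is just recursive least squares, and its final estimate matches the en-bloc solution. A sketch under a near-diffuse prior and unit observation variance (illustrative code, not the CAPTAIN implementation):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 200, 2
U = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + regressor
y = U @ np.array([6.4, 0.14]) + 0.3 * rng.normal(size=n)

# Kalman filter with RW states and zero state noise (NVR = 0)
beta = np.zeros(m)
P = 1e6 * np.eye(m)                      # near-diffuse prior covariance
for t in range(n):
    h = U[t]
    k = P @ h / (h @ P @ h + 1.0)        # gain, unit observation variance
    beta = beta + k * (y[t] - h @ beta)  # state (parameter) update
    P = P - np.outer(k, h) @ P           # covariance update

beta_ols, *_ = np.linalg.lstsq(U, y, rcond=None)  # en-bloc regression
print(beta, beta_ols)                    # effectively identical
```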

Furthermore, forecasting, interpolation and backcasting are an inherent part of these filtering and smoothing algorithms. For example, if missing samples are encountered
anywhere within the output series, then the KF and FIS algorithms provide an optimal
interpolation (Chapter 2). If, on the contrary, missing observations are found immediately
after the last sample or prior to the first one, optimal forecasts and backcasts are similarly
produced. Of course, all these cases require knowledge or forecasts of the exogenous
regression variables over the missing data period.

As illustrated in the example below, dlr and dlropt are the CAPTAIN functions for general
DLR analysis and hyper-parameter optimisation, respectively.

Example 4.1 Initial Evaluation of the Relationship Between Sunlight and Dissolved
Oxygen in the River Cam using DLR (Young, 1998b)

Although regression analysis is a particularly popular method of modelling economic, business and social data (see e.g. Example 1.1), DLR analysis can also prove useful in the
initial data evaluation and the processing of environmental and other scientific data. For
example, one interesting and successful practical example of the latter type is discussed by
Young and Pedregal (1996) where this approach is utilised in the analysis of LIDAR
(laser-radar) data. In the present example, however, we consider how DLR analysis can be
applied to the data shown in Figure 4.1 (Beck and Young, 1975): namely 81 daily
measurements of Dissolved Oxygen (DO) in the river Cam, near Cambridge; together with
the associated measurements of sunlight hours.

As we shall see, DLR analysis is not an entirely appropriate method of modelling these
data: indeed they have been selected here in order to stress the need for careful appraisal of
the TVP estimation results before making any scientific inferences. In effect, the analysis
does no more than provide an initial evaluation of the simplest possible relationship
between the two time series and how it appears to change over time. However, the
simplicity of the DLR analysis helps to expose more clearly how DLR modelling can
function as a useful and easy to use exploratory tool in these initial stages of time series
analysis.



Figure 4.1 River Cam data: dissolved oxygen (top) and sunlight hours (bottom).

It is well known that sunlight can influence DO levels because of physical and biological
factors and it is not surprising, therefore, that there is a visible relationship between the two
variables in these plots. The maximum cross correlation coefficient between the series is
0.5542 when the sunlight series is lagged (delayed) by one sample.

Using CAPTAIN, this can be seen by entering the following commands,

>> load cam.dat
>> u = cam(:, 1); % sunlight (hours/day)
>> y = cam(:, 2); % DO (mg/l)
>> ccf(y, u);
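The lagged cross correlation that ccf reports is straightforward to reproduce. Since the cam.dat values are not listed here, the Python sketch below uses a synthetic pair in which the output lags the input by one sample (the helper name and data are our own, not toolbox code):

```python
import numpy as np

def lagged_ccf(y, u, max_lag):
    # Correlation of y[t + lag] with u[t] for lag = 0..max_lag.
    return np.array([np.corrcoef(y[lag:], u[:len(u) - lag])[0, 1]
                     for lag in range(max_lag + 1)])

rng = np.random.default_rng(4)
u = rng.normal(size=500)
y = 0.8 * np.concatenate([[0.0], u[:-1]]) + 0.6 * rng.normal(size=500)

r = lagged_ccf(y, u, 5)
best_lag = int(np.argmax(r))
print(best_lag)   # the correlation peaks where y lags u by one sample
```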

Therefore, perhaps the most obvious constant parameter regression model takes the form,

y_t = T + b_1 u_{t-1} + e_t      t = 1, 2, ..., 81      (4.3)

where y_t represents the DO measurements, while T and b_1 are constant parameters (again showing how the trend in the DLR model is a time variable equivalent of the intercept parameter in the constant parameter regression model). This model is determined as follows,


>> z = [ones(size(u)) del(u, 1)]; % define regressors


>> [fit, fitse, par, parse] = dlr(y, z);
>> par(end, :) % final parameter estimates
ans =
6.4284 0.1423
>> parse(end, :) % final standard errors
ans =
0.1870 0.0256
>> rt2 = 1-(cov(y)-cov(fit))/cov(y) % coefficient of determination
rt2 =
0.3134

Here, the CAPTAIN function del provides the necessary lagged sunlight values. Only two
input arguments to dlr are required, since constant parameters are assumed by default (i.e.
RW model with NVR = 0 for both TVPs: see Chapter 8 for details). As shown above, this
yields estimates of T = 6.43 ± 0.187 and b_1 = 0.142 ± 0.026, together with a coefficient of determination R² = 0.313: i.e. the regression model with these constant parameter estimates explains only 31.3% of the variance of the DO series. The output of this model (dashed line) is compared with the DO data (circles) in Figure 4.2 and the poverty of the fit is obvious: it explains the intermediate values of DO, between 6.5 and 8.5 mg/l, to some extent, but fails completely to explain the larger deviations from the mean DO level (7.28 mg/l).
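The constant parameter estimation that dlr performs here (RW models with NVR = 0) reduces to ordinary least squares. The following Python sketch reproduces the computation on synthetic data; the data and variable names are hypothetical stand-ins, not the Cam series itself.

```python
import numpy as np

# Synthetic stand-in for the River Cam series (the real data ship with
# CAPTAIN as cam.dat); the "true" parameters below are illustrative.
rng = np.random.default_rng(0)
n = 81
u = rng.uniform(0, 15, size=n)               # "sunlight", hours/day
y = np.empty(n)
y[0] = 6.4
y[1:] = 6.4 + 0.14 * u[:-1] + rng.normal(0, 0.5, size=n - 1)

# Regressors as in equation (4.3): an intercept and sunlight lagged one day
Z = np.column_stack([np.ones(n - 1), u[:-1]])
theta, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)
fit = Z @ theta

# Coefficient of determination from the model residuals
rt2 = 1 - np.var(y[1:] - fit) / np.var(y[1:])
print(theta, rt2)
```

With noise-free parameters chosen close to the estimates above, the recovered intercept and slope land near 6.4 and 0.14, illustrating what the NVR = 0 limit of dlr is computing.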

[Two panel plot: model outputs and DO data (mg/l, top) and the b_1 estimate (bottom) against time in days.]

Figure 4.2 DLR analysis of River Cam data. Top: comparison of DLR model output (full trace) with DO data (circles), while the output of the constant parameter model is shown dashed. Bottom: time varying b_{1,t} and standard errors, together with the constant parameter equivalent b_1 = 0.142.


Therefore, it is useful to turn to the time variable form of the model. In fact, it makes some sense to constrain the trend T_t to be constant in this case, in order to force all the estimated variation into the b_{1,t} parameter, which controls the direct relationship between DO and sunlight. The first step is to optimise the NVR hyper-parameters as shown below,

>> nvr = dlropt(y, z, [0 1], [], [0 -2])


nvr =
1.0e-003 *
0
0.4258

As before, the stochastic model for the variations in T_t is specified as a RW process with the associated NVR constrained to zero. Here, however, the b_{1,t} parameter is defined as an IRW with a freely optimised NVR (see Chapter 8 for details). The default ML optimisation yields an NVR of 0.00043 for this model.

Some comment is required regarding the 5th input argument above, i.e. [0 -2], which
specifies the constraints for each NVR, listed in the same order as the regressors. Here, any
value greater than or equal to zero yields a fixed NVR of that value, while the -2 employed
above implies free optimisation. Constrained optimisation is also possible: all NVRs
associated with -1 will be optimised to the same value. For example, with 5 TVPs,
specifying [-1 -1 -1 -2 0.1] implies that the first three NVRs will be optimised together
(returning the same value), the 4th will be optimised independently, and the final NVR will
take the defined fixed value (0.1).
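The constraint convention can be pictured as a small expansion rule. The helper below is a hypothetical Python sketch of that rule, not CAPTAIN code: the optimised values are supplied by hand in place of the ML search, simply to show how the codes map onto the final NVR vector.

```python
# Hypothetical helper illustrating the NVR constraint convention used by
# dlropt: a code >= 0 fixes the NVR at that value, -2 optimises it freely,
# and all entries marked -1 share a single optimised value.
def expand_nvrs(constraints, optimised):
    """constraints: one code per TVP; optimised: the optimised values, one
    for the shared -1 group (if present) plus one per -2 entry, in order."""
    values = iter(optimised)
    shared = None
    out = []
    for c in constraints:
        if c >= 0:                  # fixed at the given value
            out.append(float(c))
        elif c == -1:               # shared optimised value
            if shared is None:
                shared = next(values)
            out.append(shared)
        else:                       # c == -2: independently optimised
            out.append(next(values))
    return out

# Five TVPs: first three share one NVR, the fourth is free, the fifth fixed
print(expand_nvrs([-1, -1, -1, -2, 0.1], [0.002, 0.03]))
```

For the five TVP example in the text, the two supplied values expand to [0.002, 0.002, 0.002, 0.03, 0.1].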

Returning to the present example, the model fit is obtained in the usual manner,

>> [fit, fitse, par, parse] = dlr(y, z, [0 1], nvr);


>> par(end, :)
ans =
6.6213 0.2081
>> parse(end, :)
ans =
0.1501 0.0513
>> rt2 = 1-(cov(y)-cov(fit))/cov(y)
rt2 =
0.6684

It is clear that the trend is now estimated as a constant value of T = 6.62 ± 0.15, while b_{1,t} is shown as the time varying solid line in the lower panel of Figure 4.2 (its final value is b_{1,N} = 0.2081). As expected, this DLR model now has an improved value of R² = 0.668. Note that, by allowing both T_t and b_{1,t} to vary over time as IRW processes, the fit may be improved further to R² = 0.849.


This DLR model seems reasonably satisfactory, but does it provide a meaningful
representation of the data? In this regard, it is necessary first to consider whether the DLR
normalised recursive residuals are satisfactory. Here, however, the Autocorrelation (ACF), Partial Autocorrelation (PACF) and Cross Correlation Functions (CCF) show marginal evidence of misspecification (see Chapters 2 and 6 for CAPTAIN usage), with minor correlation in all cases. These statistical deficiencies of the
model suggest simply that it is not an entirely appropriate representation of the relationship
between sunlight and DO. This conclusion is not surprising since the real relationship is
probably more complex and a static regression model, even with dynamically changing
parameters, cannot hope to explain the data in an entirely satisfactory manner. For these
reasons, we will return to this example later in the Chapter, when discussing the more
complicated DARX model.

4.2 Dynamic Harmonic Regression (DHR)

The DHR model contains the trend, cyclical, seasonal and white noise components of
equation (3.1), i.e.,

y_t = T_t + S_t + C_t + e_t,   t = 1, 2, ..., N    (4.4)

Although it is sometimes convenient to define the seasonal term S_t and the cyclical term C_t separately (e.g. Young, 1998), they are both modelled in the same manner and, in fact, no distinction is made in CAPTAIN. Both are defined by equation (3.2) and the CAPTAIN user simply specifies the periodic components required, i.e. the fundamental and harmonic frequencies associated with the seasonality, together with the frequencies associated with the (normally longer period) cyclical component.

In both cases, these frequency values are chosen by reference to the spectral properties of
the time series, as discussed in Chapter 3. As for the DLR model above, the trend component T_t is also considered as a stochastic, time variable intercept parameter and so is incorporated, if so desired, into the cyclical or seasonal components as a zero frequency term. This DHR model can be considered as a straightforward extension of the
classical, constant parameter, Harmonic Regression (or Fourier series) model, in which the
gain and phase of the harmonic components can vary as a result of estimated temporal
changes in the parameters.

In general, each of these TVPs, as well as the trend T_t, are modelled as GRW processes and the subsequent recursive estimation procedures are exactly the same as for the DLR model, except that the NVR values (and any other hyper-parameters) in the GRW models associated with the parameters of each ith component are usually constrained to be equal.


However, as discussed in Chapter 3, the ML method used for hyper-parameter optimization in the DLR case does not work so well in this DHR context and so a novel frequency domain optimization algorithm has been developed for CAPTAIN.

It is worth noting that adaptive forecasting, interpolation and backcasting are much more
straightforward than in the DLR case, because the regression variables in the DHR model
are all known functions of time and can be specified easily outside the data sample when
using the model for forecasting and backcasting purposes.

The DHR model may be estimated directly using the CAPTAIN function dlr, by manually
specifying the regressors as appropriate harmonic components. However, special shells are
included in the toolbox for this purpose, namely dhr and dhropt. These functions are
useful for signal extraction and forecasting of periodic or quasi-periodic series, as shown
by the examples in Chapter 3.
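For readers who wish to build DHR-type regressors by hand, e.g. to pass to dlr as the text suggests, the construction can be sketched as follows in Python (function name and periods are illustrative, not CAPTAIN code):

```python
import numpy as np

def harmonic_regressors(n, periods):
    """Return an n-by-(2*len(periods)) matrix of cosine/sine pairs, one
    pair per requested period (in samples), evaluated at t = 1..n."""
    t = np.arange(1, n + 1)
    cols = []
    for p in periods:
        w = 2 * np.pi / p                  # angular frequency of this period
        cols += [np.cos(w * t), np.sin(w * t)]
    return np.column_stack(cols)

# Monthly data with an annual cycle and its first harmonic
Z = harmonic_regressors(120, [12, 6])
print(Z.shape)
```

Each cosine/sine pair carries the gain and phase of one periodic component; letting the associated regression coefficients evolve as GRW processes is exactly what turns this into a DHR model.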

4.3 Dynamic Auto-Regression (DAR) and Time-Frequency Analysis

The basic DAR model is similar to the DLR model, except that the input variables are defined as past values of the output series. More formally, a DAR(p) model may be formulated as:

y_t = [1 / A(L, t)] e_t    (4.5)

in which A(L, t) = 1 + a_{1,t} L + a_{2,t} L² + ... + a_{p,t} L^p is a time variable parameter polynomial in the backward shift operator L. On multiplying throughout by A(L, t), so that it operates on y_t, we obtain the DAR(p) model in the discrete-time equation form:

y_t = -a_{1,t} y_{t-1} - a_{2,t} y_{t-2} - ... - a_{p,t} y_{t-p} + e_t    (4.6)

In other words, y_t is dependent on past values of itself plus a random component in the form of the white noise e_t.

Noting that its constant parameter relative, the AR model, is used for spectral analysis in the form of the AR spectrum (see Chapter 3), the most obvious application of the DAR model is, therefore, in time-frequency analysis. Here, at the t-th time instant, the FIS estimated parameters â_{i,t|N}, i = 1, 2, ..., p, of the DAR model (4.5) and (4.6) are used to compute the instantaneous AR spectrum at that time from the well known relationship (see e.g. Priestley, 1981),


h(ω)_t = (σ̂² / 2π) · 1 / |1 + â_{1,t|N} exp(-jω) + ... + â_{p,t|N} exp(-jpω)|²,   t = 1, 2, ..., N    (4.7)

where σ̂² is the estimated variance of the model residuals. The order p is selected either by the user on the basis of prior knowledge, or by use of the AIC (see Chapter 3). Then, for each user-selected value of ω over the range 0 (zero frequency) to 0.5 (the Nyquist frequency), h(ω)_t, or its logarithm, is evaluated with exp(-jω) = cos(ω) - j sin(ω). The set of all these instantaneous but smoothly changing spectra over the interval t = 1, 2, ..., N then provide an indication of the changing spectral properties of the series y_t over this time interval. As we shall see in the example below, these time-frequency spectra can be presented in various ways.
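Equation (4.7) is straightforward to evaluate once the parameter estimates are available. The Python sketch below (hypothetical function; CAPTAIN computes this internally via darsp) writes the frequency as f in cycles per sample, so that the exponential terms become exp(-j2πfk):

```python
import numpy as np

def ar_spectrum(a, sigma2, freqs):
    """Evaluate the AR spectrum of (4.7) for parameters a = [a_1 ... a_p]
    and residual variance sigma2, at frequencies f in cycles per sample."""
    a = np.asarray(a, dtype=float)
    k = np.arange(1, len(a) + 1)
    h = []
    for f in freqs:
        denom = 1 + np.sum(a * np.exp(-1j * 2 * np.pi * f * k))
        h.append(sigma2 / (2 * np.pi) / abs(denom) ** 2)
    return np.array(h)

freqs = np.linspace(0, 0.5, 6)
flat = ar_spectrum([], 1.0, freqs)        # p = 0: white noise, flat spectrum
peaked = ar_spectrum([-0.9], 1.0, freqs)  # AR(1): low frequency peak
print(flat[0], peaked[0] > peaked[-1])
```

Applying this at each t, with the FIS estimates â_{i,t|N} in place of a, generates the time-frequency surface of the kind displayed by darsp.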

The DAR model may be useful even when constant parameters are specified (by selecting
zero NVR parameters), because unlike conventional AR-spectrum analysis, it still provides
estimates when missing data are encountered. Indeed, when the CAPTAIN function arspec
detects missing data, it automatically estimates the AR model recursively in this manner.
Another important feature is that a constant parameter AR model estimated using the DAR function in CAPTAIN provides recursive estimates of all the parameters and their standard errors, so that the assumed time invariance of these parameters may, in fact, be tested.

The DAR model may be estimated directly using the CAPTAIN function dlr, by manually
specifying the regressors as appropriate past values of the output. However, this is not an
optimal approach when the time series involves missing data, since such missing data in
the output also generate missing values in the regressors (i.e. the lagged output variable).
However, a solution is provided in CAPTAIN by replacing the missing values in the output
and inputs by their expected values, according to the estimated model, as soon as they are
detected. In other words, when a missing value is encountered at the p-th lagged output, the
KF forecast replaces all subsequent occurrences of this datum in the analysis. Special shells are included in CAPTAIN for this purpose, namely dar and daropt, while the auxiliary function darsp allows for automatic graphing of the time-frequency spectra.

Example 4.2 Analysis of a signal with sawtooth changing frequency using DAR

This example considers a simulated signal with sawtooth changing frequency. The data
are loaded into the workspace, standardised and a time invariant AR model identified,

>> load sdar.dat


>> y = stand(sdar);
>> p = aic(y)
p =
1.0000 0.0771 -0.1322


[Two panel plot: the standardised signal (top) and the DAR parameter estimates (bottom) against time in samples.]

Figure 4.3 Simulated signal with sawtooth changing frequency (top); DAR parameters and their standard errors (bottom).

The standardised data are illustrated in Figure 4.3, while the AR(2) model is shown below,

y_t = -0.0771 y_{t-1} + 0.1322 y_{t-2} + e_t    (4.8)

Next, the NVR hyperparameters and DAR model are estimated, and the time-frequency
spectra graphed, as shown below,

>> nvr = daropt(y, [1:2], [1 0])


nvr =
0.0010
0.0000
>> [fit, fitse, par, parse]= dar(y, [1:2], [1 0], nvr);
>> par(end, :)
ans =
1.6209 0.8046
>> parse(end, :)
ans =
0.2140 0.0457
>> darsp(par, 6, 1);

Note that the second input argument [1:2] in the calls to daropt and dar specifies the structure of the DAR model, based on (4.8), and takes the same syntax as that used for mar


and univ (see Chapter 3). For the purposes of this example, IRW and RW models are chosen for a_{1,t} and a_{2,t}, respectively, with the DAR model taking the following form,

y_t = -a_{1,t} y_{t-1} - a_{2,t} y_{t-2} + e_t    (4.9)

Here, it is interesting to note that the second TVP takes an almost constant value of a_{2,t} ≈ 0.80 ± 0.046 (the associated ML optimised NVR = 5.2218e-014), leaving the first parameter a_{1,t} (NVR = 0.001) to account for the time varying frequency of the original signal, as shown by the lower plot of Figure 4.3.

In this regard, of more interest is the time-frequency spectra of the series shown in Figure 4.4. Here, the 2nd and 3rd input arguments to darsp specify that a 3D spectrum plot with a resolution of 2^6 is required. The visual appearance of this plot depends on the computer platform and it is sometimes necessary to experiment to find the best resolution. Finally, note that contoured surfaces and 2D stacked plots may also be obtained by changing the 3rd input argument.

It is clear from Figure 4.4 that this series has a single peak, the frequency of which
gradually changes over time, not surprising since these simulated data were deliberately
generated in such a form to illustrate the methodology.

Figure 4.4 Time-frequency spectra.
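A signal of this general type is easy to simulate. The Python sketch below is an illustrative recipe only (it is not the one used to produce sdar.dat): the instantaneous frequency ramps linearly and then resets, giving the sawtooth pattern, and the result is standardised in the spirit of the stand function.

```python
import numpy as np

n = 200
t = np.arange(n)

# Instantaneous frequency ramps from 0.05 to 0.25 cycles/sample over each
# 100-sample period and then resets: the sawtooth pattern
period = 100
f_inst = 0.05 + 0.2 * ((t % period) / period)

# Integrate the instantaneous frequency to obtain the phase
phase = 2 * np.pi * np.cumsum(f_inst)
y = np.sin(phase) + 0.1 * np.random.default_rng(1).normal(size=n)

# Standardise to zero mean and unit variance
y = (y - y.mean()) / y.std()
print(y[:5])
```

Feeding such a series through a DAR(2) analysis reproduces the qualitative behaviour seen in Figures 4.3 and 4.4: one parameter tracks the moving frequency while the other stays nearly constant.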


Young (1998b) applies DAR analysis to the well known SPECtral MAPping series
(SPECMAP: see Imbrie et al, 1992), which has been obtained from the analysis of oxygen
isotope variations in deep ocean cores. Here, the spectral properties of the series are
examined, revealing some localised variations in the estimated peak frequencies over time
and a significant change at around 650-700 ka. Such analysis is useful in exposing time
series data to greater scrutiny and focussing-in on interesting aspects of the data which
would not be nearly so apparent from the results of more conventional time series analysis.

4.4 Dynamic AutoRegressive eXogenous Variables (DARX)

The DARX model is simply the extension of the DAR model (4.6) to include measured exogenous or input time series that are thought to affect y_t in a truly dynamic, systems sense. In the case of a single input variable u_t, it takes the form,

y_t = -a_{1,t} y_{t-1} - a_{2,t} y_{t-2} - ... - a_{n,t} y_{t-n} + b_{0,t} u_{t-δ} + b_{1,t} u_{t-δ-1} + ... + b_{m,t} u_{t-δ-m} + e_t    (4.10)

where δ is a pure time delay, measured in sampling intervals, which is introduced to allow for any temporal delay that may occur between the incidence of a change in u_t and its first effect on y_t. Such transport delays are, of course, a common feature of many environmental and engineering systems. This DARX model is, in fact, a special example of the discrete-time Transfer Function (TF) model, which is considered in more detail in Chapter 6. This becomes apparent if it is written in the following L operator form,

y_t = [B(L, t) / A(L, t)] u_{t-δ} + [1 / A(L, t)] e_t    (4.11)

where B(L, t) = b_{0,t} + b_{1,t} L + b_{2,t} L² + ... + b_{m,t} L^m. It is a special model because it assumes that the white noise input e_t enters through the TF (or filter) 1 / A(L, t), so avoiding certain difficult statistical problems that beset more general TF models (see Chapter 6).

The close relationship between the DAR and DARX models means that the hyper-parameter optimisation and recursive estimation of the TVPs is identical to the earlier examples in this chapter. Indeed, the DARX model may be estimated directly using the CAPTAIN function dlr, by manually specifying the regressors as appropriate past values of the input and output variables. This approach provides the greatest freedom for the user to also specify a trend and/or other components, as shown in Example 4.3 below. However, this solution may not be optimal; for example, if there are any missing data the same problems discussed in Section 4.3 above apply. Therefore, special shells are included for the estimation of a purely DARX model (4.10), namely darx and darxopt. Use of these functions is demonstrated later in Example 4.4.
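The recursion (4.10) is easily simulated for chosen parameter trajectories. The Python sketch below uses a hypothetical helper and illustrative parameter paths (these are not the Cam estimates), for a first order model with a single input:

```python
import numpy as np

def simulate_darx(u, a1, b0, delta=0, e=None):
    """Simulate y_t = -a1_t*y_{t-1} + b0_t*u_{t-delta} + e_t, a first order
    instance of the DARX model (4.10), from given parameter trajectories."""
    n = len(u)
    e = np.zeros(n) if e is None else e
    y = np.zeros(n)
    for t in range(1, n):
        ut = u[t - delta] if t - delta >= 0 else 0.0
        y[t] = -a1[t] * y[t - 1] + b0[t] * ut + e[t]
    return y

n = 50
u = np.ones(n)                       # step input
a1 = np.full(n, -0.7)                # constant denominator parameter
b0 = np.linspace(0.1, 0.2, n)        # smoothly drifting numerator parameter
y = simulate_darx(u, a1, b0)
print(y[-1])
```

Fitting such simulated data back with dlr or darx is a useful way of checking that a chosen NVR specification can actually recover a known parameter drift.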


Example 4.3 River Cam Data Revisited (Young, 1998b)

Given the DLR results for the Cam data (Example 4.1) and initial evaluation of different
DARX model structures, the best identified DARX model is the following (Young,
1998b),

y_t = a_{1,t} y_{t-1} + b_{0,t} u_t + T_t + e_t,   t = 1, 2, ..., 81    (4.12)

where u_{t-1} has been replaced by u_t because the analysis suggests strongly that the dynamic lag effect introduced by the lagged term in y_{t-1} effectively removes the need for the pure time delay in this case. Under the assumption that the trend T_t and a_{1,t} evolve as RW processes, while b_{0,t} varies as an IRW process, the associated NVR coefficients are optimised by ML to values of NVR(a_{1,t}) = 6.5485e-011, NVR(b_{0,t}) = 5.0905e-007 and NVR(T_t) = 2.0181e-010, as shown below,

>> load cam.dat


>> u = cam(:, 1); % sunlight (hours/day)
>> y = cam(:, 2); % DO (mg/l)
>> z = [del(y, 1) ones(size(y)) u]; % define regressors
>> nvr = dlropt(y, z,[0 0 1]);
nvr =
1.0e-006 *
0.0001
0.0002
0.5090
>> [fit, fitse, par, parse] = dlr(y, z, [0 0 1], nvr);
>> par(end, :)
ans =
0.7186 1.6556 0.0679
>> parse(end, :)
ans =
0.0757 0.5275 0.0309
>> rt2 = 1 - cov(y-fit)./cov(y)
rt2 =
0.7433

Initially, all three parameters were assumed to vary as IRW processes but the FIS estimation results then suggested strongly that T_t and a_{1,t} were not varying significantly, so that the more appropriate stochastic model in both cases was the RW process. In fact, as we see from the above optimised NVR values, which are insignificantly different from zero in the case of T_t and a_{1,t}, the ML optimisation is indicating that these parameters are stationary. Indeed, the resulting estimates, using these optimised NVRs, are virtually constant in both cases, with a_1 = 0.719 ± 0.076 and T = 1.656 ± 0.528.


The estimate of b_{0,t}, shown in Figure 4.5, does not change very much either and it is difficult, at this point in the analysis, to say whether the variation is significant in comparison with the fully constant parameter ARX alternative, whose constant parameters are estimated as a_1 = 0.746 ± 0.062, b_0 = 0.064 ± 0.062 and T = 1.478 ± 0.436. These latter estimates are obtained by setting the 3rd and 4th input arguments to dlr to the default zero, as shown below,

>> [fit, fitse, parc, parse] = dlr(y, z);


>> parc(end, :)
ans =
0.7455 1.4784 0.0641

The output of the DLR model with time varying parameters (the output argument fit), as computed from the equation,

ŷ_{t|N} = â_{1,t|N} y_{t-1} + b̂_{0,t|N} u_t + T̂_{t|N},   t = 1, 2, ..., 81    (4.13)

is compared with the DO data (circles) in the top graph of Figure 4.5.

[Two panel plot: model outputs and DO data (mg/l, top) and the b_0 estimate (bottom) against time in days.]

Figure 4.5 DLR analysis of River Cam data. Top: 1-step ahead predictions (full trace) with DO data (circles). The simulated model output (dashed) and constant parameter (dotted) model outputs are also shown for comparison. Bottom: time varying b_{0,t} and standard errors, together with the constant parameter equivalent b_0 = 0.0641 (dashed).


The associated R² = 0.743 can be compared with the R² = 0.668 of the initial DLR model (Example 4.1) and the R² = 0.727 of the constant parameter ARX model, which is only a little smaller. The statistical diagnostics are more satisfactory than for the initial DLR model: the ACF and PACF of the normalised recursive residuals show no significant lag correlation and are consistent with the white noise assumption. However, the CCF between the residuals and the sunlight series shows some minor instantaneous correlation.

Since the coefficients of determination for the TVP and constant parameter models are so similar, it would appear at first sight that little is being gained by allowing for time variable parameters in this case. However, there is an important complicating factor that needs to be considered: the output ŷ_{t|N} of the model in equation (4.13) represents only the one-step-ahead predictions of the DLR model, since y_{t-1} on the right hand side of the equation is the last measured value of the DO. Consequently, the R² values relate to the one-step-ahead prediction errors, which is the normal definition of the coefficient of determination for regression-type models. Unfortunately, in the case of transfer function models of the DARX and ARX type, this measure can often provide a somewhat overly optimistic indication of the model's explanatory ability, which may be the reason why it is so often quoted in the modelling literature!

A much more discerning and critical measure of the model's ability to characterise the data is the coefficient of determination based on the simulation model residuals, R_T², in which the simulation model output ŷ^s_{t|N} is generated from the equation,

ŷ^s_{t|N} = â_{1,t|N} ŷ^s_{t-1|N} + b̂_{0,t|N} u_t + T̂_{t|N},   t = 1, 2, ..., 81    (4.14)

where now the lagged output term on the right hand side of the equation is the last value of the simulated output ŷ^s_{t-1|N}, rather than the measured y_{t-1}. In other words, ŷ^s_{t|N} is generated solely from the sunlight series u_t and the trend term T_{t|N} (here estimated as a constant), without any reference at all to the measured DO, y_t, as shown below.

>> tf=y(1); % initial condition


>> for ff=2:length(y)
>> tf(ff, 1) = par(ff, 1)*tf(ff-1) + par(ff, 2) + par(ff, 3)*u(ff);
>> end
>> rt2 = 1 - cov(y-tf)./cov(y)
rt2 =
0.6189

As expected, the R_T² = 0.619 value based on these simulation model residuals, ε_t = y_t - ŷ^s_{t|N}, is quite a lot less than the equivalent R² = 0.743 obtained from the one-step-ahead prediction errors e_t = y_t - ŷ_{t|N}, but it is much better than the R_T² = 0.478 based on the simulation model with all constant parameters. The plot of ŷ^s_{t|N} for the DARX model is


shown as the dashed line in Figure 4.5 and this is clearly superior to the equivalent graph of the constant parameter ARX model output, shown as the dotted line, despite the fact that the b̂_{0,t|N} TVP estimate is changing very smoothly and by quite a small amount.

So what can we conclude from the results here and in the previous DLR modelling exercise? In terms of the R² values, the DLR and DARX models are very similar. Both have a single time variable parameter, the coefficient associated with the sunlight series, although the estimated variations of the DARX parameter are much smaller and much smoother than in the DLR case. But the DARX model has an additional constant parameter, the coefficient associated with its additional regression variable, the lagged dependent variable y_{t-1}. And, finally, the DARX model has superior, albeit not perfect, statistical diagnostics.

Taking all these factors into consideration, the DARX model seems to be marginally superior. First, its single TVP varies less and much more smoothly than the equivalent TVP in the DLR model, which is clearly advantageous (if a constant parameter model can be identified and estimated it is always preferable to a TVP model). Second, it is a dynamic model, in the systems sense, which seems more acceptable from a physico-biological standpoint and better satisfies DBM modelling requirements. The analysis provides a very straightforward, quick and objective method of analysis, which reveals that the potentially changing nature of the relationship between lagged sunlight and DO can be represented, quite well, by very simple models with only one smoothly changing parameter.

Furthermore, having completed this initial analysis, the estimated variations in the parameters can be investigated further to see if they are associated with other measured variables (states or inputs) of the system. For example, we might wish to investigate whether b_{0,t} is a function of temperature, as it might well be from physico-biological considerations. The CCF between the b_{0,t} parameters from both the DLR and DARX models and water temperature reveals a quite marked correlation, with a maximum of about 0.7 at lag zero. Moreover, if the water temperature data is smoothed using irwsm with NVR = 0.00002, then there is a remarkable maximum instantaneous correlation of 0.997, as shown below,
as shown below,

>> w = cam(:, 3); % river water temperature (degrees Centigrade)


>> w = irwsm(w, 1, 0.00002);
>> ccf(w, par(:, 3));


[Two panel plot: the two series (top) and the CCF over lags -24 to 24 (bottom).]

Figure 4.6 CCF between the DARX estimate b_{0,t} (dashed) and smoothed water temperature (full).

While this certainly does not mean that there is a physical relationship between the TVPs and water temperature, it does show how the analysis can expose potential relationships and provide food for thought. In this case, it suggests that the b_{0,t} parameter in both models may be a State Dependent Parameter (SDP), of the kind discussed in Chapter 5, and such state dependency is suggestive of a multiplicative nonlinearity of the bilinear kind.

Finally, by demonstrating the need for such a lagged, TVP or SDP relationship between the variables, it provides a useful prelude to further, more mechanistically oriented DBM modelling, which considers the possibility of simple but nonlinear stochastic, dynamic relationships involving other relevant variables, such as: upstream measurements of DO; Biochemical Oxygen Demand (BOD) arising from pollution in the river; nutrient inputs; and algal dynamics (see e.g. Beck and Young, 1975).

4.5 Dynamic Transfer Function (DTF)

Unfortunately, the DARX model (4.10) is limited in practical terms, since it depends on the assumption of a rather specific signal topology, with the noise entering the model through a restricted AR process whose polynomial A(L, t) is equal to the denominator polynomial. A more general Dynamic Transfer Function (DTF) model, without the restrictions of the DARX, is the following,


y_t = [B(L, t) / A(L, t)] u_{t-δ} + ξ_t    (4.15)

where A(L, t) and B(L, t) are time variable coefficient polynomials in L of the following form:

A(L, t) = 1 + a_{1,t} L + a_{2,t} L² + ... + a_{n,t} L^n
B(L, t) = b_{0,t} + b_{1,t} L + b_{2,t} L² + ... + b_{m,t} L^m    (4.16)

Here, ξ_t represents uncertainty in the relationship arising from a combination of measurement noise, the effects of other unmeasured inputs and modelling error. Normally, ξ_t is assumed to be independent of u_t and is modelled as an AutoRegressive (AR) or AutoRegressive-Moving Average (ARMA) stochastic process (see e.g. Box and Jenkins, 1970; Young, 1984), although even this restriction can be avoided by the use of instrumental variable methods, as discussed below.

Equation (4.15) can be written in the following vector equation form,

y_t = z_t^T p_t + e_t    (4.17)

where,

z_t^T = [-y_{t-1}  -y_{t-2}  ...  -y_{t-n}  u_{t-δ}  ...  u_{t-δ-m}]
p_t = [a_{1,t}  a_{2,t}  ...  a_{n,t}  b_{0,t}  ...  b_{m,t}]^T    (4.18)

and e_t = A(L, t) ξ_t. For convenience of notation, let p_t be defined as follows,

p_t = [p_{1,t}  p_{2,t}  ...  p_{n+m+1,t}]^T    (4.19)

with p_{i,t}, i = 1, 2, ..., n+m+1, relating to the TF model parameters a_{i,t} and b_{j,t} through (4.18). In order to estimate the assumed time variable model parameters in p_t, it is necessary to make some assumptions about the nature of their temporal variability. As for the earlier examples in this Chapter, the ith parameter, p_{i,t}, in p_t is defined by a two dimensional stochastic state vector x_{i,t} = [l_{i,t}  d_{i,t}]^T, where l_{i,t} and d_{i,t} are, respectively, the changing level and slope of the associated TVP. The stochastic evolution of each x_{i,t} (and, therefore, each of the n+m+1 parameters in p_t) is assumed to be described by the GRW process defined in Chapter 2.
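The aggregation of these GRW subsystems can be pictured as a block diagonal construction. The Python helper below is illustrative only (CAPTAIN performs this internally): a scalar block per RW parameter and a 2x2 level/slope block per IRW parameter, stacked into the overall F and G matrices.

```python
import numpy as np

def grw_blocks(kinds):
    """Build block diagonal F and G for a list of TVPs, each modelled as
    'rw' (scalar block) or 'irw' (2x2 block with state [level, slope])."""
    F_blocks, G_cols = [], []
    for kind in kinds:
        if kind == 'rw':
            F_blocks.append(np.array([[1.0]]))
            G_cols.append(np.array([[1.0]]))
        else:  # 'irw': the white noise enters through the slope state
            F_blocks.append(np.array([[1.0, 1.0], [0.0, 1.0]]))
            G_cols.append(np.array([[0.0], [1.0]]))
    dim = sum(b.shape[0] for b in F_blocks)
    F = np.zeros((dim, dim))
    G = np.zeros((dim, len(kinds)))
    r = 0
    for i, (Fb, Gb) in enumerate(zip(F_blocks, G_cols)):
        k = Fb.shape[0]
        F[r:r + k, r:r + k] = Fb        # place the subsystem on the diagonal
        G[r:r + k, i] = Gb[:, 0]        # one noise input column per TVP
        r += k
    return F, G

F, G = grw_blocks(['rw', 'irw'])        # e.g. one RW and one IRW parameter
print(F)
```

The state dimension is then the total number of levels and slopes, matching the alternate-zero structure of the H_t vector described in the text.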

Having introduced the GRW models for the parameter variations, an overall SS model (2.1) can then be constructed straightforwardly by the aggregation of the subsystem matrices. As before, the white noise inputs η_{i,t}, which provide the stochastic stimulus for parametric change in the model, are assumed to be independent of the observation noise e_t

and have a covariance matrix Q formed from the combination of the individual covariance matrices Q_{η,i}. Finally, H_t is a 1 x p vector of the following form,

H_t = [-y_{t-1}  0  -y_{t-2}  0  ...  -y_{t-n}  0  u_{t-δ}  0  ...  u_{t-δ-m}  0]

that relates the scalar observation y_t to the state variables, so that it represents the DTF model (4.15) with each parameter defined as a GRW process. In the case of the scalar RW and AR(1) models, the alternate zeros are simply omitted.

In the previous sections of this Chapter, a standard algorithmic approach to the problem has been utilized, based on a forward-pass filtering algorithm, followed by fixed interval smoothing. In this regard, it should be noted that the recursive filtering algorithm is closely related to the Kalman Filter (KF; Kalman, 1960) and is often referred to as such. The difference is that the H_t matrix in the present, recursive TVP estimation context for the TF model (4.15), is based on measured variables. In particular, the output variables y_{t-i}, i = 1, 2, ..., n, in H_t are affected by the noise ξ_t (the errors-in-variables problem); whereas, strictly, in the KF, H_t has to be composed of exactly known (but, if necessary, time variable) deterministic coefficients.

This difference is important in the present TF context since it can be shown that the TVP
estimates obtained from the standard recursive filtering/smoothing algorithm will be
asymptotically biased away from their true values. This bias may be unimportant if the
model is to be used within a forecasting context since the forecasts produced by the model
are not biased (although they may not be statistically efficient). However, the level of the
bias is dependent on the magnitude of the measurement noise and it can be problematic in
high noise situations, particularly if the parameters are physically meaningful (see e.g.
Young, 1984 for a discussion of this problem in the constant parameter situation).

For this reason, it is necessary to modify the standard algorithm (2.5, 2.6) to avoid these
biasing problems. The approach taken is similar to that discussed in Chapter 6 for the
identification of general discrete-time transfer function models and requires the
introduction of instrumental variables. In relation to the time series yt , t = 1, 2,..., N , the
time variable parameter recursive Instrumental Variable (IV) filtering/smoothing algorithm
has the following form:

1. Forward Pass Symmetric IV Equations (iterative)

Prediction:

x̂_{t|t-1} = F x̂_{t-1}
P̂_{t|t-1} = F P̂_{t-1} F^T + G Q_r G^T    (4.21)


Correction:

x̂_t = x̂_{t|t-1} + P̂_{t|t-1} Ĥ_t^T [1 + Ĥ_t P̂_{t|t-1} Ĥ_t^T]^{-1} {y_t - Ĥ_t x̂_{t|t-1}}

P̂_t = P̂_{t|t-1} - P̂_{t|t-1} Ĥ_t^T [1 + Ĥ_t P̂_{t|t-1} Ĥ_t^T]^{-1} Ĥ_t P̂_{t|t-1}      (4.22)

where,

Ĥ_t = [-x̂_{t-1}  -x̂_{t-2}  ...  -x̂_{t-n}   u_{t-δ}  ...  u_{t-δ-m}]      (4.23)

x̂_t = [B̂_{j-1}(L,t) / Â_{j-1}(L,t)] u_{t-δ}      (4.24)

As before, the FIS algorithm is in the form of a backward recursion operating from the end
of the sample set to the beginning.

2. Backward Pass Fixed Interval Smoothing IV Equations (FISIV: single pass)

x̂_{t|N} = F^{-1} [x̂_{t+1|N} + G Q_r G^T L_t]

L_t = [I - P̂_{t+1} Ĥ_{t+1}^T Ĥ_{t+1}]^T [F^T L_{t+1} - Ĥ_{t+1}^T {y_{t+1} - Ĥ_{t+1} x̂_{t+1}}]      (4.25)

P̂_{t|N} = P̂_t + P̂_t F^T P̂_{t+1|t}^{-1} [P̂_{t+1|N} - P̂_{t+1|t}] P̂_{t+1|t}^{-1} F P̂_t

with L_N = 0.

The main difference between the above algorithm (4.21)-(4.25) and the standard
filtering/smoothing algorithms is the introduction of 'hats' on the H vector and the P
matrix. Ĥ_t in (4.23) is the IV vector, which is used by the algorithm in the generation of
all the P̂_t terms and is the main vehicle for removing the bias from the TVP estimates. The
subscript j-1 on Â_{j-1}(L,t) and B̂_{j-1}(L,t) indicates that the estimated DTF polynomials in
the auxiliary model (4.24), which generates the instrumental variables x̂_t that appear in
the definition of Ĥ_t, are updated in an iterative manner, starting with the least squares
estimates of these polynomials. Iteration is continued until the forward pass (filtered) IV
estimates of the TVPs are no longer changing significantly: normally only 3 iterations are
required.

This iterative approach is based on the IV algorithm for constant parameter TF models
(e.g. Young, 1984), except that the symmetric gain version of the IV algorithm (Young,
1970; 1984, p.183) is used, rather than the more usual asymmetric version: see Chapter 6
for details. This is necessary in order that the standard recursive FIS algorithm can be used
to generate the smoothed estimates of the TVPs. Note also that, in these algorithms, the


NVR matrix Q_r is defined by equation (2.10) as usual. The optimization of these hyper-parameters
is achieved through either Maximum Likelihood estimation or the minimisation
of the n-step-ahead forecasting errors, as discussed before.

These modified filtering and smoothing algorithms mean that the standard dlr function
cannot be utilised to estimate the TF model (4.15). Instead, special shells are included in
CAPTAIN, namely dtfm and dtfmopt, which require the user to specify only the structure
of the model, as shown below.

Example 4.4 Comparison of DARX and DTFM for simulated data

As a straightforward example of DTF analysis, consider the estimation of the parameters in
the following first order TVP model with a gradually changing denominator parameter, a
fixed numerator parameter and a time delay of two samples,

y_t = [0.5 / (1 + a_{1t} L)] u_{t-2} + e_t      (4.26)

where yt represents the output, ut the input and et a zero mean white noise signal. For
this example, we allow the denominator parameter to change slowly over time as a sine
wave, as illustrated in Figure 4.7. The MATLAB code for this example is shown below,
where we assume IRW models for both parameters.

>> load sdtfm1.dat


>> y=sdtfm1(:, 1); % output
>> u=sdtfm1(:, 2); % input
>> nvr=dtfmopt(y, u, [1 1 2], 1);
nvr =
1.0e-007 *
0.3068
0.0000
>> [tfs1, fit1, fitse1, par1, parse1]=dtfm(y, u, [1 1 2], 1, nvr);
>> [tfs2, fit2, fitse2, par2, parse2]=darx(y, u, [1 1 2], 1, nvr);

The final line above estimates the equivalent model using the DARX noise assumption, i.e.
instrumental variables are not utilised in the filtering and smoothing algorithm. The 3rd
input argument in the calls to dtfm, darx and dtfmopt is the model structure (4.26),
represented by the triad [n, m, δ]. Of course, this example assumes that the model
structure is known prior to the analysis. In practice, the identification procedure could
involve testing a range of potential model structures until the most appropriate one is
found. Alternatively, the identification tools described later in Chapter 6 may be utilised.


ML optimization yields NVR(a_{1t}) = 3.0679×10^{-8} and NVR(b_{0t}) = 6.8547×10^{-18}, where it will
be noted that the NVR for the b_{0t} parameter is insignificantly different from zero,
indicating that the parameter is identified as being time invariant. This shows how, quite
objectively, the ML optimization is able to identify the relative temporal variability of the
model parameters from the input-output data, without any other a priori information.

[Figure 4.7 here: upper panel plots the signal against time (sample numbers 0-1500); lower panel plots the parameter a_{1t} and its estimates over the same range.]

Figure 4.7 Top: output y_t; bottom: parameter (thin), DTF estimate (solid) and standard errors;
for comparison, the DARX estimate is also shown (dotted).

The dotted trace in the lower plot of Figure 4.7 shows the equivalent DARX estimates of
a_{1t} using the same NVRs. The superiority of the DTF estimates is clear. The DTF model
with these estimated parameters explains the data well: the coefficient of determination
based on the (rather noisy) simulated model output, compared with the noise free output, is
R_T^2 = 0.6075; whilst for the DARX model, this is reduced to R_T^2 = 0.4705. The model
residuals (innovations) for the DTF model are also superior: they have an approximately
normal amplitude distribution; and, as required, both the ACF of the residuals and the CCF
between the residuals and the input u_t are insignificant at all lags. In contrast, the CCF for
the DARX model residuals shows significant correlation with u_t at some lags.
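The ACF and CCF checks used here are simple sample correlations compared against approximate significance bounds. A Python sketch of the diagnostic (illustrative only; the white 'residuals' are synthetic stand-ins, not output from the example above):

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of x at lags 1..max_lag."""
    x = x - x.mean()
    d = np.dot(x, x)
    return np.array([np.dot(x[l:], x[:-l]) / d for l in range(1, max_lag + 1)])

def ccf(x, u, max_lag):
    """Sample cross-correlation between x_k and u_{k-l} for l = 0..max_lag."""
    x = x - x.mean()
    u = u - u.mean()
    d = np.sqrt(np.dot(x, x) * np.dot(u, u))
    vals = [np.dot(x, u) / d]
    vals += [np.dot(x[l:], u[:-l]) / d for l in range(1, max_lag + 1)]
    return np.array(vals)

rng = np.random.default_rng(1)
N = 1000
u = rng.standard_normal(N)       # input series
res = rng.standard_normal(N)     # stand-in for white model residuals
bound = 2.0 / np.sqrt(N)         # approximate 95% significance bound

n_acf = np.sum(np.abs(acf(res, 20)) > bound)
n_ccf = np.sum(np.abs(ccf(res, u, 20)) > bound)
print(n_acf, n_ccf)  # for white residuals, only the odd chance exceedance
```

For an adequate model, the residual ACF and the residual-input CCF should both lie inside the ±2/√N bounds at (almost) all lags, as in the DTF case above.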


4.6 Conclusion

The present chapter has briefly summarised the entire class of TVP or 'dynamic' regression
models implemented in CAPTAIN, including Dynamic Linear Regression (DLR),
Dynamic Harmonic Regression (DHR) and Dynamic Auto-Regression (DAR), as well as
the closely related, TVP version of the Auto-Regressive eXogenous variables model
(DARX) and an alternative Dynamic Transfer Function (DTF) model, estimated using an
instrumental variable method of fixed interval smoothing.

As pointed out in the introduction to this chapter, such TVP models allow for the
estimation of any slow parameter variations that may result from slow physical changes in
the process or from some form of nonlinearity in the data. However, when the parameters
vary at a rate commensurate with that of the system variables themselves then the model
may behave in a heavily nonlinear or even chaotic manner. Nonetheless, if these TVPs are
found to be functions of the state or input variables (i.e. they actually constitute stochastic
state variables), then CAPTAIN provides for the estimation of State Dependent Parameter
(SDP) models, as discussed in the next Chapter.



CHAPTER 5
STATE DEPENDENT
PARAMETER MODELS

The idea of using State-Dependent Parameter (SDP) models to represent nonlinear


dynamic systems goes back to Young (1978), who showed how the forced logistic growth
equation could be represented, identified and estimated in SDP form. However, the
practical development of these ideas is of a more recent origin (Young, 1993b, 1998a,b,
2000, 2001a,b; Young et al., 2001); and the sdp tool in CAPTAIN has only been available
since 2001.

5.1 The State-Dependent ARX (SDARX) Model

In order to introduce the ideas that underlie SDP models, consider first the Dynamic ARX
(DARX) model introduced in Chapter 4 which, for the case of a single input variable, is
written in the following form,

y_t = -a_{1t} y_{t-1} - a_{2t} y_{t-2} - ... - a_{nt} y_{t-n} + b_{0t} u_t + b_{1t} u_{t-1} + ... + b_{mt} u_{t-m} + e_t      (5.1a)

or, in transfer function terms,

y_t = [B(L,t)/A(L,t)] u_t + [1/A(L,t)] e_t      (5.1b)

Equation (5.1) is based on a nomenclature favoured by econometricians and used


throughout most of the present book. However, earlier publications concerned with SDP
models have utilised the following nomenclature for equation (5.1), as used by systems and
control analysts,
y_k = [B_k(z^{-1})/A_k(z^{-1})] u_{k-δ} + [1/A_k(z^{-1})] e_k      (5.2a)

For consistency with these numerous earlier publications, the present chapter will utilise
this alternative nomenclature. Here z^{-i}, rather than L^i, is used as the backward shift
operator and the subscript k denotes that the associated variable is sampled at the kth
sampling instant: i.e. z^{-i} y_k = y_{k-i}; δ is still used to denote a pure time delay; and e_k is a

zero mean, white noise signal. In this form, A_k(z^{-1}) and B_k(z^{-1}) are the following TVP
polynomials in the backward shift operator z^{-1},

A_k(z^{-1}) = 1 + a_{1,k} z^{-1} + ... + a_{n,k} z^{-n}
B_k(z^{-1}) = b_{1,k} z^{-1} + ... + b_{m,k} z^{-m}      (5.2b)

The polynomial coefficients a_{i,k}, i = 1,2,...,n and b_{j,k}, j = 1,2,...,m may vary between
samples k to k+1, and the stochastic evolution of each parameter is assumed to be
described by the Generalised Random Walk (GRW) process introduced in Chapter 2,
including RW, AR(1), IRW, SRW, LLT and damped trends as particular cases. As
discussed previously, the AR(1), SRW and damped trend models all require the
specification or optimisation of an additional hyper-parameter, α, although this is rarely
necessary in practice. Also, as before in Chapter 4, the model structure is defined by the
triad [n, m, δ] (see Example 4.4).

The SDP version of the model (5.2a) is made explicit by writing the definitions of (5.2b) in
the following SDP form:

A_k(z^{-1}) = A(χ_k, z^{-1}) = 1 + a_1(χ_k) z^{-1} + ... + a_n(χ_k) z^{-n}
B_k(z^{-1}) = B(χ_k, z^{-1}) = b_0(χ_k) + b_1(χ_k) z^{-1} + ... + b_m(χ_k) z^{-m}      (5.2c)

where the notation a_i(χ_k), i = 1,2,...,n; b_j(χ_k), j = 0,1,...,m indicates that the parameters are
nonlinear functions of the vector χ_k. In general, χ_k is defined in terms of any variables on
which the parameters are identified to be dependent. In the present context, however, each
SDP is assumed to be a nonlinear function of a single variable which might, for instance,
be its associated past input or output variable, i.e.,

A(y_{k-i}, z^{-1}) = 1 + a_1(y_{k-1}) z^{-1} + ... + a_n(y_{k-n}) z^{-n}
B(u_{k-j}, z^{-1}) = b_1(u_{k-1}) z^{-1} + ... + b_m(u_{k-m}) z^{-m}      (5.2d)

However, in general, the SDARX model can be written in the equation form of (5.1a) but
with the time variable parameters defined explicitly as functions of user-specified
variables x_{i,k}, i = 1,2,...,n+m+1, i.e.,

y_k = -a_1(x_{1,k}) y_{k-1} - a_2(x_{2,k}) y_{k-2} - ... - a_n(x_{n,k}) y_{k-n}
      + b_0(x_{n+1,k}) u_k + b_1(x_{n+2,k}) u_{k-1} + ... + b_m(x_{n+m+1,k}) u_{k-m} + e_k      (5.2e)

Young (2000, 2001a) and Young et al. (2001) describe an identification and estimation
strategy for SDP models of this general type. The details of this strategy are given in these
references and it will suffice here to outline the main features of the approach.


Initial non-parametric estimation of the SDARX model

The identification and estimation of a model such as (5.2) follows a similar procedure to
that discussed in previous chapters for TVP models: after all, a SDP is also a TVP. Indeed,
if the variable xi,k is only slowly changing in relation to the changes in the input and
output variables y k and uk , then it can be treated as a TVP model and estimated in the
same manner as the DARX model considered in Chapter 4.
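The 'standard TVP algorithm' invoked throughout this chapter is, in essence, a Kalman filter with a GRW model for each parameter, followed by fixed interval smoothing. A minimal scalar Python sketch (RW parameter model with known, hand-picked noise variances; an illustration, not the CAPTAIN implementation) for the model y_k = a_k x_k + e_k:

```python
import numpy as np

def rw_kf_fis(y, x, q, r):
    """Kalman filter + RTS fixed interval smoother for the scalar model
    y_k = a_k * x_k + e_k, with random walk parameter a_k = a_{k-1} + w_k,
    var(w) = q, var(e) = r.  Returns the smoothed estimate of a_k."""
    N = len(y)
    a_f = np.zeros(N)            # filtered estimates
    p_f = np.zeros(N)            # filtered variances
    a, p = 0.0, 1e6              # diffuse prior
    for k in range(N):
        p += q                                  # prediction (F = 1)
        g = p * x[k] / (x[k] ** 2 * p + r)      # gain
        a += g * (y[k] - x[k] * a)              # correction
        p *= (1.0 - g * x[k])
        a_f[k], p_f[k] = a, p
    a_s = a_f.copy()                            # backward smoothing recursion
    for k in range(N - 2, -1, -1):
        gain = p_f[k] / (p_f[k] + q)
        a_s[k] = a_f[k] + gain * (a_s[k + 1] - a_f[k])
    return a_s

rng = np.random.default_rng(5)
N = 1000
k = np.arange(N)
a_true = 1.0 + 0.5 * np.sin(2 * np.pi * k / N)   # slowly varying parameter
x = rng.standard_normal(N)
y = a_true * x + 0.1 * rng.standard_normal(N)

a_est = rw_kf_fis(y, x, q=1e-5, r=0.01)          # NVR = q/r = 1e-3
print(np.sqrt(np.mean((a_est - a_true) ** 2)))   # small tracking error
```

Because the parameter drifts slowly relative to x_k and y_k, the RW model tracks it easily; the re-ordering device described next is what makes the same machinery usable when the parameter variation is rapid.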

However, consider the situation where the input uk and output y k of the dynamic system
are changing rapidly and the variable xi,k is changing at a rate that is commensurate with
the changes in these variables (e.g. it could be a function of the input and output variables,
as in the definition (5.2d) above). Under these conditions, normal TVP estimation will fail
because the GRW models for the parameters will be unable to effectively track the very
rapid variations in the SDPs. For instance, as we shall see, the SDP model can describe a
chaotic process, in which case the SDPs could also be chaotic! In order to obviate these
difficulties, it is necessary to perform the TVP estimation in a different manner, in which
the data are re-ordered prior to estimation and the recursive FIS algorithm is applied
using a special back-fitting procedure.

The data re-ordering is a simple but very effective device for transforming the rapid TVP
estimation into a much simpler and solvable, slow TVP estimation problem. It works on
the basis that if, at any sample time k in an off-line (non-real-time) situation, all the
variables in an equation such as (5.2) are available for the purposes of estimation, then it
is not necessary to consider each equation in the normal temporal order, k = 1,2,...N . For
instance, each equation and the variables appearing in this equation, can be re-ordered in
some manner and the model parameters in the equation can then be recursively updated in
this new, transformed data space. And if the re-ordering is chosen such that, in this
transformed data space, the variables and associated parameters are changing quite slowly,
then recursive FIS estimation, based on the GRW class of models for the parameter
variations, will provide sensible estimates of the parameter variations in the transformed
data space. Transformation of these estimated SDPs back into the original data space then
reveals their true rapid variation in natural temporal terms.

In order to illustrate the nature of this re-ordering procedure, consider first a simple
example in the form of the following State Dependent AR (SDAR) model:

y_k = a(y_{k-1})·y_{k-1} + e_k;    a(y_{k-1}) = 4 - 4 y_{k-1}      (5.3)

This is the chaotic version of the logistic growth equation and so the state dependency of
the parameter induces rapid, chaotic changes in a ( y k 1 ) that are clearly not identifiable
using standard TVP estimation. However, if the data are re-ordered in the ascending order


(the MATLAB sort operation) of the dependent state y_{k-1}, then the variations of a(y_{k-1}) in
this re-ordered data space have the same degree of smoothness as the re-ordered y_{k-1}. This
is shown in Figure 5.1, where the upper plot shows 200 samples of y_{k-1} in natural
temporal order; while the lower plot shows y_{k-1} sorted in ascending order of magnitude.
Table 5.1 compares the first ten samples of y_{k-1} (second column) with the first ten samples
of y^o_{k-1}, the re-ordered y_{k-1} series, in the fourth column. The sampling index of y^o_{k-1} in
the normal temporal order is shown in the third column.

[Figure 5.1 here: upper panel plots the data in temporal order over 200 samples; lower panel plots the same data sorted into ascending order.]

Figure 5.1 Chaotic example: model output in normal time (upper panel); model output
re-ordered in ascending order of magnitude (lower panel).

 k    y_{k-1}      k    y^o_{k-1}
 1    0.7         47    0.0009
 2    0.7         26    0.0019
 3    0.8398     116    0.0027
 4    0.5381      48    0.0036
 5    0.9940     147    0.0038
 6    0.0241      27    0.0077
 7    0.0943     117    0.0106
 8    0.3415     155    0.0117
 9    0.8994     167    0.0120
10    0.3616      49    0.0144

Table 5.1 Example of sorted data.
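The re-ordering device is easy to reproduce. The following Python sketch (noise omitted for clarity; not CAPTAIN code) simulates the chaotic model (5.3) and shows that, in the sorted data space, the SDP varies exactly as smoothly as the sorted state:

```python
import numpy as np

# Simulate the chaotic logistic SDAR model (5.3), noise-free:
#   y_k = (4 - 4*y_{k-1}) * y_{k-1}
N = 200
y = np.empty(N)
y[0] = 0.7
for k in range(1, N):
    y[k] = 4.0 * y[k - 1] * (1.0 - y[k - 1])

ylag = y[:-1]                 # dependent state y_{k-1}
order = np.argsort(ylag)      # MATLAB's sort: ascending re-ordering
ylag_sorted = ylag[order]

# In the re-ordered space the SDP a(y_{k-1}) = 4 - 4*y_{k-1} varies as
# smoothly as the sorted state itself, so slow-TVP (GRW) estimation applies.
a_sorted = 4.0 - 4.0 * ylag_sorted
print(np.all(np.diff(ylag_sorted) >= 0))   # True: sorted state is monotone
print(np.all(np.diff(a_sorted) <= 0))      # True: so the SDP is monotone too
```

Transforming the estimates back with the inverse of `order` recovers the rapid variation in natural temporal terms.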


In the case of this simple example, TVP estimation based on the re-ordered data provides
information on the nature of the parametric state dependency. But what if there is more
than one term on the right hand side of the model equation? How do we sort the data in this
case if the associated parameters are dependent on different variables? This is where the
back-fitting procedure comes into the analysis. To clarify this back-fitting procedure,
consider the following, first order example of the SDARX model (5.2e),

y_k = -a_1(x_{1,k}) y_{k-1} + b_0(x_{2,k}) u_k + e_k,    k = 1,2,...,N      (5.4)

where the time delay has been removed for simplicity.

Backfitting Algorithm for the Model (5.4)

Assume that, without any sorting, FIS estimation has yielded prior TVP estimates
â^1_{1,k|N} and b̂^1_{0,k|N} of a_1(x_{1,k}) and b_0(x_{2,k}), respectively¹. An SDP estimation
equation for a_{1,k} = a_1(x_{1,k}) can then be formulated as,

[y_k - b̂^1_{0,k|N} u_k]^{sy} = -a^{sy}_{1,k} · y^{sy}_{k-1}      (5.5)

where the term on the left hand side can be considered as a 'modified dependent
variable' and the superscript sy denotes that all the variables are sorted in the
ascending order of y_{k-1}. Application of the standard TVP algorithm to this single
SDP sub-model then yields the FIS estimate â^{sy}_{1,k|N} of a_{1,k}.

â^{sy}_{1,k|N} is then 'unsorted' so that an SDP estimation equation for b_{0,k} = b_0(x_{2,k}) can
be formulated as,

[y_k + â_{1,k|N} y_{k-1}]^{su} = b^{su}_{0,k} · u^{su}_k      (5.6)

with the superscript su denoting that all the variables are sorted in the ascending
order of u_k. Application of the standard TVP algorithm to this single SDP sub-model
then yields the FIS estimate b̂^{su}_{0,k|N} and the first iteration of the backfitting
algorithm is complete.

This process is continued in an iterative manner (each time unsorting, forming the
modified dependent variable, and sorting according to the current right hand side
variable, prior to TVP estimation using the FIS algorithm), until the FIS estimates
of the SDPs â_{1,k|N} = â_1(x_{1,k}){k|N} and b̂_{0,k|N} = b̂_0(x_{2,k}){k|N} (which are each
time-series of length N) do not change significantly between iterations. Here, the

¹ The sdp tool in CAPTAIN uses the constant least squares parameter estimates, since the convergence of the
backfitting procedure is not too sensitive to the prior estimates, provided they are reasonable.


nomenclature {k | N} indicates the FIS estimate at sample k given the whole data
set of N samples.
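The backfitting loop can be illustrated in miniature. The Python sketch below uses noise-free data from the Example 5.1 model, (5.7)-(5.8), and a simple running-mean smoother as a stand-in for the full GRW/FIS step (both simplifications are assumptions made for brevity; CAPTAIN's sdp tool implements the proper recursive algorithms):

```python
import numpy as np

def sorted_smooth(xdep, target, window=11):
    """Stand-in for FIS: running-mean smooth of `target` after sorting the
    observations into ascending order of the dependent state `xdep`.
    Returns the smoothed estimate mapped back to the original order."""
    order = np.argsort(xdep)
    pad = window // 2
    s = np.pad(target[order], pad, mode='edge')
    smoothed = np.convolve(s, np.ones(window) / window, mode='valid')
    out = np.empty_like(target)
    out[order] = smoothed
    return out

# Noise-free data from (5.7)-(5.8): y_k = 2(1 - y_{k-1}) y_{k-1} + 10 u_k^3
rng = np.random.default_rng(3)
N = 2000
u = 0.08 * rng.standard_normal(N)
y = np.empty(N)
y[0] = 0.5
for k in range(1, N):
    y[k] = 2.0 * y[k - 1] * (1.0 - y[k - 1]) + 10.0 * u[k] ** 3

ylag = np.concatenate(([y[0]], y[:-1]))

# Backfitting: alternate between the two SDP sub-models, each estimated
# in its own sorted data space (cf. equations (5.5) and (5.6)).
a1 = np.full(N, 1.0)            # prior estimate of a1(y_{k-1})
b0 = np.zeros(N)                # prior estimate of b0(u_k)
for _ in range(5):
    a1 = sorted_smooth(ylag, (y - b0 * u) / ylag)   # sorted by y_{k-1}
    b0 = sorted_smooth(u, (y - a1 * ylag) / u)      # sorted by u_k

i = np.argmin(np.abs(ylag - 0.45))
print(a1[i], 2.0 * (1.0 - ylag[i]))   # estimate vs true a1 near y = 0.45
```

After a few iterations the smoothed estimates settle close to the true state dependencies a_1(y) = 2(1-y) and b_0(u) = 10u², mirroring the behaviour of the full algorithm in Example 5.1.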

The smoothing hyper-parameters required for FIS estimation at each iteration are
optimized by Maximum Likelihood (ML), as discussed in Chapter 2. Such ML
optimization can be carried out in various ways: after every complete iteration until
convergence; only at the initial iteration, with the hyper-parameters maintained at
these values for the rest of the backfitting; or just on the first two iterations. The
latter seems most satisfactory in general practice, since very little change in the
optimised NVR values or improvement in convergence occurs if optimization is
continued after this stage. Normally, convergence is completed after only a few
iterations. However, care must be taken to ensure that the convergence is
completely satisfactory in each example, since a large number of iterations are
sometimes required (see Young, 2001a).

Example 5.1 Analysis of a simulated SDARX model

This example utilizes data generated from the following 1st order SDARX relationship that
is similar to equation (5.4) but with each parameter a function of its associated variable, i.e.
with x_{1,k} = y_{k-1} and x_{2,k} = u_k,

y_k = a_1(y_{k-1})·y_{k-1} + b_0(u_k)·u_k + e_k,    e_k = N(0, 0.000025),    u_k = N(0, 0.0064)      (5.7)

Here, the functions a_1(y_{k-1}) and b_0(u_k) are defined as follows,

a_1(y_{k-1}) = 2.0(1 - y_{k-1});    b_0(u_k) = 10 u_k^2      (5.8)

When written in the nonlinear functional form,

y_k = f_1(y_{k-1}) + f_2(u_k) + e_k      (5.9)

or,

y_k = 2.0 y_{k-1} - 2.0 y^2_{k-1} + 10 u^3_k + e_k      (5.10)

this is revealed as the SDP formulation of the non-chaotic logistic growth equation, with an
input signal in the form of a normally distributed white noise sequence passed through a
cubic law nonlinearity. A typical response of this system is shown in Figure 5.2, where the
output y_k is in the top panel and the input u_k in the lower panel. Here, the percentage
noise/signal ratio, based on the standard deviations (i.e. 100·{sd(e_k)/sd(10u_k^3)}), is 28%. The
MATLAB code for this example, including both the simulation and SDP estimation is
given below,


>> nn = 2000; % number of samples


>> e = 0.0053*randn(1, nn); % measurement noise
>> u = 0.08*randn(nn, 1); % input signal
>> y = zeros(nn, 1);
>> y(1) = 0.5; % initial condition
>> for i = 2:nn % simulation response with noise
>> y(i) = 2.0*y(i-1)-2.0*y(i-1)*y(i-1)+10*u(i)*u(i)*u(i)+e(i);
>> end
>> yd = del(y, 1); % output delayed by one sampling interval
>> z = [yd u]; % states
>> x = [yd u]; % regressors
>> nvr = -2;
>> [fit, fitse, par, parse, zs, pars, parses, rsq, nvre] ...
= sdp(y, z, x, [], nvr);

The results of the SDARX analysis for a total sample size of N = 2000 are shown in the
upper panels of Fig. 5.3. The hyper-parameter optimization is carried out at the first and
second iterations (nvr specified as -2): thereafter, the NVR hyper-parameters are
maintained constant, for the four iterations required to obtain good convergence in this
case, at the following optimized values,

NVR{a_1(y_{k-1})} = 0.82;    NVR{b_0(u_k)} = 6.76      (5.11)

Clearly for a stochastic simulation (with noise) as here, the exact values obtained depend
on the seed utilised in MATLAB for the generation of the random signals. This caveat
applies to all the numerical values given in the present section.

The upper panels of Figure 5.3 show the estimated SDPs obtained using these optimized
NVR values, with â_1(y_{k-1}){k|N} in the left graph and b̂_0(u_k){k|N} in the right. The
thin traces are the actual SDP relationships, while the thick traces are the estimates. It is
clear that the state dependency has been estimated well in both cases. The standard error
(se) bounds are not plotted on these graphs but they are available as returned variables in
the parse or parses matrices, corresponding to the unsorted (in normal temporal order) and
sorted parameter estimates par and pars respectively. The other returned variables are fit
and fitse, the output ŷ_{k|k-1} of the SDARX model,

ŷ_{k|k-1} = â_1(y_{k-1}){k|N}·y_{k-1} + b̂_0(u_k){k|N}·u_k      (5.12)

and its se bound, respectively; zs are the sorted variables on which the SDPs are dependent
(in this case, the sorted y_{k-1} and u_k, respectively); rsq, the Coefficient Of Determination
(COD), normally denoted by R^2, based on ŷ_{k|k-1} (see below); and nvre, the optimised
NVR values.


[Figure 5.2 here: upper panel plots the output over 500 samples; lower panel plots the input.]

Figure 5.2 Input and output data for the SDARX simulation example.

[Figure 5.3 here: four panels of estimated SDPs plotted against the associated state variable.]

Figure 5.3 SDP estimation results: upper panels show the results based on 2000 samples; lower panels 200
samples. In both cases, the left hand plots show â_1(y_{k-1}) and the right hand plots b̂_0(u_k), returned as the
first and second columns in pars respectively, plotted against the associated state variable.


The response of the estimated SDARX model can be generated in two ways. First, directly
from equation (5.12), in the usual SDARX regression-like manner, where it will be
noted that the y_{k-1} on the right hand side of the equation is based on the actual
measurements and not the modelled value of this variable. This is returned as rsq
(see above) and suggests that the model explains 95.8% of the output y_k; i.e. the
regression-based COD is R^2 = 0.958. However, since the SDARX model is a truly
dynamic nonlinear system, this is a little misleading. It is more sensible to base the COD
on the simulated model output, as generated from the equation,

ŷ_k = â_1(ŷ_{k-1}){k|N}·ŷ_{k-1} + b̂_0(u_k){k|N}·u_k      (5.13)

This is most easily generated by a SIMULINK model using look-up tables based on the
SDP estimation results, as illustrated in Figure 5.4 below.

Figure 5.4 SIMULINK block diagram to determine the model response (5.13).

The COD obtained in relation to the actual output y_k, including the effects of the noise e_k,
is R_T^2 = 0.927, where the subscript T is introduced to differentiate this simulation-based
COD from the more normal, regression-based R^2. However, if this simulation
model output is compared with the noise free output (i.e. e_k = 0 for all k), then R_T^2 = 0.987
and it is clear that the SDARX model (5.13) provides an excellent representation of the
nonlinear system (5.7). These results are obtained as follows,

>> x = zeros(nn, 1);


>> x(1) = 0.5; % initial condition
>> for i = 2:nn; % noise-free simulation response
>> x(i) = 2.0*x(i-1)-2.0*x(i-1)*x(i-1)+10*u(i).*u(i)*u(i);
>> end
>> t_in=[0:length(x)-1]'; % time vector for block diagram
>> sim('chapt5sim'); % simulate block diagram (Figure 5.4)
>> RT2f = 1-cov(x-ym) / cov(x) % RT2 based on noise-free system
RT2f = 0.9812
>> RT2n = 1-cov(y-ym) / cov(y) % RT2 based on original noisy system
RT2n = 0.9161


Final parametric estimation of the SDARX model

Each FIS estimated SDP in the SDARX model can be considered as a nonparametric
estimate because it has a different value at each sample in time and can only be viewed in
complete form as a graph. However, as we see in the continued example below, it is
possible to proceed to a final parametric identification and estimation stage, where the non-
parametrically defined nonlinearities obtained initially by FIS estimation are parameterised
in some manner in terms of their associated dependent variable. For example, this can be
achieved by defining an appropriate parametric model in some convenient form, such as a
polynomial or trigonometric function; a radial basis function; a more general neuro-fuzzy
relationship; or a neural network. The parameters of this parameterised model can then be
estimated directly from the input-output data using some method of dynamic model
optimization: e.g. deterministic Nonlinear Least Squares (NLS) or a more statistically
efficient stochastic method, such as maximum likelihood.

Example 5.2 Final parameter estimates for the model in Example 5.1

Even without our prior knowledge in this simulation example, it is fairly obvious from
Figure 5.3 that the two SDPs are linear and quadratic functions of the associated variables
respectively (i.e. the associated nonlinearities are quadratic and cubic functions,
respectively). Thus, it is straightforward to obtain these parametric estimates, either by
Least Squares (LS) or Weighted Least Squares (WLS) estimation based on the SDP
estimation results (Young, 1993a; Young and Beven, 1994); or, preferably, by direct
estimation from the data using NLS based on the identified model structure,

y_k = a y_{k-1} - b y^2_{k-1} + c u^3_k + e_k      (5.14)

In this case, the two sets of estimation results are given as follows:

(i) LS from SDP estimates: a = 1.980 (0.002); b = 1.961 (0.004); c = 9.999 (0.055) .

(ii) Direct NLS from data: a = 1.995 (0.006); b = 1.990 (0.012); c = 10.00 (0.056) .

Here, the estimates (i) are obtained as follows,

>> z = [ones(size(zs(:, 1))) zs(:, 1)];


>> ab = inv(z'*z)*z'*pars(:, 1); % parameters a and b
>> ee = pars(:, 1)-z*ab;
>> P = cov(ee)*inv(z'*z);
>> sd1 = sqrt(diag(P)); % standard errors
>> z = [zs(:, 2).*zs(:, 2)];
>> c = inv(z'*z)*z'*pars(:, 2); % parameter c
>> ee = pars(:, 2)-z*c;


>> P = cov(ee)*inv(z'*z);
>> sd2 = sqrt(diag(P));
>> disp([ab' c'; sd1' sd2']);
1.9803 -1.9613 9.9992
0.0023 0.0042 0.0551

The estimates (ii) are obtained directly from the data using conventional MATLAB
optimisation tools, or functions such as leastsq available in specialist toolboxes. Although
the standard errors on the initial SDP-based estimates (i) tend to be too optimistic, the
parametric estimates themselves are close to the true values, showing the efficacy of the
SDP estimation stage in identifying the nature of the nonlinearities. Indeed, the SDP
estimates obtained for only N = 200 samples, as shown by the two graphs in the lower
panel of Fig. 5.3, are quite good enough to identify the form of the nonlinear functions
themselves.
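The 'direct NLS' route (ii) can be sketched in Python (illustrative only; the handbook's own results used MATLAB optimisation tools). Here a basic Gauss-Newton iteration on the simulation (output) error, with a finite-difference Jacobian and step-halving for stability, recovers the parameters of (5.14) from noise-free data:

```python
import numpy as np

def simulate(theta, u, y0):
    """Deterministic response of model (5.14): y_k = a*y_{k-1} - b*y_{k-1}^2 + c*u_k^3."""
    a, b, c = theta
    y = np.empty(len(u))
    y[0] = y0
    for k in range(1, len(u)):
        y[k] = a * y[k - 1] - b * y[k - 1] ** 2 + c * u[k] ** 3
    return y

def nls(theta0, u, ydata, n_iter=30, h=1e-6):
    """Gauss-Newton NLS on the simulation error (finite-difference Jacobian)."""
    theta = np.asarray(theta0, dtype=float)
    cost = np.sum((simulate(theta, u, ydata[0]) - ydata) ** 2)
    for _ in range(n_iter):
        r = simulate(theta, u, ydata[0]) - ydata
        J = np.empty((len(r), len(theta)))
        for j in range(len(theta)):
            tp = theta.copy()
            tp[j] += h
            J[:, j] = (simulate(tp, u, ydata[0]) - ydata - r) / h
        step = np.linalg.solve(J.T @ J + 1e-9 * np.eye(len(theta)), J.T @ r)
        for _ in range(40):                     # step halving keeps iterations stable
            trial = theta - step
            c_trial = np.sum((simulate(trial, u, ydata[0]) - ydata) ** 2)
            if c_trial < cost:
                theta, cost = trial, c_trial
                break
            step = step / 2.0
    return theta

# Noise-free data generated from (5.14) with a = 2, b = 2, c = 10
rng = np.random.default_rng(7)
u = 0.08 * rng.standard_normal(500)
y = simulate([2.0, 2.0, 10.0], u, 0.5)

theta = nls([1.5, 1.5, 8.0], u, y)
print(theta)     # converges towards [2, 2, 10]
```

With measurement noise present, a weighted or maximum likelihood criterion would be preferable, as noted in the text above.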

5.2 Other SDP Models

Of course, SDP models are not restricted to the dynamic SDARX form: they can be of any
chosen type as long as the model is in the form of a SDP regression model. The simplest
and most obvious of these models is the SDP equivalent of the DLR model (4.1). In the
case where the trend component is zero, this model has the following form,
y_k = Σ_{i=1}^{m} b_i(z_{i,k}) u_{i,k} + e_k,    e_k ~ N{0, σ^2},    k = 1,2,...,N      (5.15)

where the regression variables u_{i,k}, i = 1,2,...,m are defined by the user. For instance, they
could be simply other independent variables that are assumed to be related nonlinearly to
the dependent variable y_k; they could be delayed versions of a single input variable, so
that the model is a nonlinear SDP version of the linear Finite Impulse Response (FIR)
model; or they could be basis functions defined in various ways: e.g. as orthogonal
functions or principal components. Clearly, the possibilities are multi-various.

However, one model that is worthy of special mention is the State Dependent Transfer
Function (SDTF) model, which takes the form:

y_k = [B(x_{i,k}, z^{-1}) / A(x_{i,k}, z^{-1})] u_k + ξ_k      (5.16)

where ξ_k is, in general, coloured noise; and

A(x_{i,k}, z^{-1}) = 1 + a_1(x_{1,k}) z^{-1} + ... + a_n(x_{n,k}) z^{-n}
B(x_{j,k}, z^{-1}) = b_0(x_{n+1,k}) + b_1(x_{n+2,k}) z^{-1} + ... + b_m(x_{n+m+1,k}) z^{-m}      (5.17)


Once again, the x_{i,k}, i = 1,2,...,n+m+1, are the variables on which the parameters are
dependent. However, even in the case where ξ_k is a zero mean, white noise signal, the SDP
estimates obtained from the sdp tool in CAPTAIN will show signs of bias caused by the
noise. An example of these biasing effects is given in example 3 of Young (2000) and the
user of CAPTAIN must take this into account when using the function.

5.3 Further Examples

To illustrate the wide ranging utility of the sdp tool, two further examples are considered.

Example 5.3 Analysis of Squid Data (Young, 2001a)

A typical example of univariate SDP modelling is an analysis of the squid data shown in
Figure 5.5 (Young, 2001a). These squid data were obtained by Kazu Aihara and Gen
Matsumoto from experiments on the giant axon of a squid (see Mees et al., 1992). A first
order SDAR model of the following kind is identified from the data y k ,

y_k = a(y_{k-1})·y_{k-1} + e_k      (5.18)

The MATLAB code to identify a SDP model is as follows,

>> load squid.dat


>> yd = del(squid, 1);
>> [fit, fitse, par, parse, zs, pars, parses, rsq, nvre] ...
= sdp(squid, yd, yd, [], -2);

Figure 5.6 is a plot of the SDP estimate against the delayed output y_{k-1}, with the estimated
se bounds shown dashed. It is interesting to note that the parameter is roughly constant
over the range y_{k-1} < -120 but that it has wide variations after this, leading to the observed
chaotic behaviour. When the actual data y_k are compared with a simulated random
realization of the SDP model, it is clear that the latter captures the major nonlinear
dynamic characteristics of the electrical signal very well, as discussed by Young (2001a).


[Figure 5.5 here: the electrical signal (approximate range -160 to -90) plotted against time over 400 samples.]

Figure 5.5 Electrical signal obtained from experiments on the giant axon of a squid.

[Figure 5.6 here: the SDP estimate (approximate range 0.7 to 1.6) plotted against the lagged output.]

Figure 5.6 SDP estimation results for the squid data: plot of the SDP estimate of the parameter in a first
order SDAR model. The standard error bounds are shown as dashed lines.


Example 5.4 Hydraulic Actuator (Young, 2001b)

To exemplify the above SDP modelling process further for a higher dimensional process,
let us consider a practical engineering example2. This relates to experiments on a hydraulic
actuator controlling a robot arm. Hu et al. (2001) analyse these data from a special
neuro-fuzzy perspective that can be related to SDP modelling (see Young, 2001b). In
particular, Hu et al. (HKK hereafter) report that prior linear modelling suggests a five
dimensional non-minimal state vector; four fuzzy sets are applied to each of the past
input and output variables, with one set each for the other delayed variables. This results in
a complete model with 102 constant parameters. Simulation of the model on the validation
data (the last 512 samples) results in a root mean square error (RMSE) of 0.5445 and an
associated R_T² = 0.882.

As HKK point out, discrete-time, linear transfer function identification quite often provides
a useful prelude to nonlinear dynamic modelling by helping to define the appropriate
dynamic order of the model. Simplified Refined Instrumental Variable (SRIV) estimation
using the riv tool in CAPTAIN (see Chapter 6) is particularly useful in this situation,
since it does not require concurrent estimation of a noise model and so is robust to
assumptions about the noise, here affected by the nonlinearity in the data3. Not surprisingly
in this nonlinear situation, SRIV identification reveals that the linear model order is not
particularly well identified, with a number of different second and third order models
possessing similar explanatory power. However, again in order to compare the SDP results
with those of HKK, a [3 2 1] TF model (3rd order denominator, second order numerator
and a pure time delay of 1 sample) will be used in the SDP analysis.

In the initial non-parametric analysis using sdp, the three parameters associated with the
past output variables in the model are clearly identified as being either constant or nearly
constant (see Young, 2001b). These results suggest strongly that the most significant
aspects of the nonlinearity occur at the input to the system and so yield an effective input
to a linear TF model in series with this nonlinearity (the so-called Hammerstein-type
model). The non-parametric estimate of the input nonlinearity obtained from the
backfitting algorithm is plotted in Figure 5.7 as a dash-dot line, with its standard error
bounds shown as dotted lines. When applied in simulation mode to the validation data set,
this SDP model has R_T² = 0.854 (RMSE = 0.606), although this can be improved to
R_T² = 0.873 (RMSE = 0.564) if a [3 2 2] model is used. These results are comparable with
the HKK results cited above and a substantial improvement on the linear model
(R_T² = 0.674; RMSE = 0.91). The full line in Figure 5.7 is the parametric estimate obtained
in the second stage of the sdp analysis. Here, the nonlinearity is parameterised by a 10
element Radial Basis Function (RBF) model. Of course, it could be parameterised in many
other ways. The parameters of the RBF model and the linear TF are optimized
simultaneously, with the TF parameters estimated within the function by the riv tool in
CAPTAIN (see Chapter 6). When this model is applied to the validation data set it yields
R_T² = 0.911 (RMSE = 0.474), which is now superior to the 102 parameter HKK model, even
though it is characterized by only 15 parameters! It also compares reasonably with the
R_T² = 0.946 obtained at the first, non-parametric estimation stage of the analysis.

2 Note that the data for this example are not supplied with CAPTAIN.
3 This is in contrast to methods in other specialist toolboxes (such as PEM, ARMAX and BJ) that require
simultaneous noise model estimation. Note, in this same regard, that the riv algorithm is not the same as
the iv4 algorithm, which has inferior performance.

The final nonlinear stochastic model takes the following form:

y_k = (0.1 z^-1 - 0.091 z^-2) / (1 - 2.738 z^-1 + 2.572 z^-2 - 0.823 z^-3) f(u_k) + 1/C(z^-1) e_k      (5.19)

where f(u_k) is the input nonlinearity in Figure 5.7; C(z^-1) is an AR(9) polynomial and e_k
is a zero mean, white noise process with variance 0.0061 that is uncorrelated with u_k but
rather heteroscedastic (changing variance).
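To illustrate how a model of the form (5.19) generates output, the sketch below passes an input through a Hammerstein structure in Python. Two loud caveats: the function f below is a hypothetical dead zone/saturation stand-in for the estimated RBF nonlinearity (which is only available numerically), and the AR(9) colouring of e_k is omitted, so white noise is added directly to the output.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for the estimated input nonlinearity f(u):
# a small dead zone at the origin with asymmetric saturation.
def f(u):
    v = np.where(np.abs(u) < 0.1, 0.0, u - 0.1 * np.sign(u))
    return np.clip(v, -1.0, 0.5)

N = 500
u = np.sin(0.1 * np.arange(N))            # illustrative test input
fu = f(u)
e = rng.normal(0.0, np.sqrt(0.0061), N)   # white noise, variance 0.0061

# Pass f(u_k) through the estimated TF of (5.19), written as a recursion:
# x_k = 2.738 x_{k-1} - 2.572 x_{k-2} + 0.823 x_{k-3}
#       + 0.1 f(u_{k-1}) - 0.091 f(u_{k-2})
x = np.zeros(N)
for k in range(3, N):
    x[k] = (2.738 * x[k - 1] - 2.572 * x[k - 2] + 0.823 * x[k - 3]
            + 0.1 * fu[k - 1] - 0.091 * fu[k - 2])

y = x + e   # the AR(9) colouring of e_k is omitted in this sketch
```

The recursion makes explicit how the lightly damped denominator of (5.19) carries the resonant dynamics, while all of the nonlinearity acts on the input before it enters the linear TF.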

[Figure 5.7 here: "Comparison of Initial (dashed) and Final Optimised (full) SDP Estimates"; x-axis: Input Variable, -1 to 1; y-axis: Nonlinearly Modified "Effective" Input Variable, approximately -4 to 2.]
Figure 5.7 SDP estimation of the hydraulic actuator data: initial non-parametric estimate of input
nonlinearity shown as a dash-dot line, with standard error bounds shown as dotted lines; final parametric
estimate (10 element radial basis function) shown as a full line.


Also, as we discuss later, the stochastic part of the model has fairly complex dynamics and
these may be difficult to explain in physical terms, as required by DBM modelling. When
applied to the validation data, however, the AR(9) process noise part of (5.19) explains the
residual of the deterministic (simulation) part of the model very well, with a low variance,
reasonably white, innovations sequence and associated coefficient of determination based
on the one-step-ahead prediction errors of R² = 0.996 (RMSE = 0.1).

The SDP estimation procedure used in the above example is an important part of the
general Data-Based Mechanistic (DBM) modelling strategy developed at Lancaster over
the past decade (see Young et al., 2001a and the references on DBM modelling given
there). In addition to providing an efficiently parameterized model that explains the data
well, this strategy requires that, if at all possible, the model should be interpretable in
physically meaningful terms. This is difficult to accomplish in the present example since it
normally requires a good knowledge of the physical system under study, as well as the
nature of the experiment that produced the input-output data. Here, all that we have easy
access to are the data themselves and a few facts: the input uk is a valve position; the
output y k is the resultant oil pressure in a hydraulic actuator; and the oscillations are
caused by mechanical resonances in the robot arm. Nevertheless, the above SDP model
has two features that make it believable in this physical context.

First, the input nonlinearity reveals asymmetric limiting behaviour and what appears to be
a small, but significant dead zone at the origin. Both of these phenomena can be associated
with hydraulic actuation systems, although their characteristics here might suggest that this
actuator has some design deficiencies. Second, the linear TF part of the model could well
describe the major dynamics of a hydraulic actuation system. In particular, the dominant
mode has a natural frequency of 0.285 cycles/sample and a low damping ratio of 0.085.
And, as in previous DBM studies (a recent example is Price et al., 1999), the linear TF
model part of the model can be decomposed into feedback or parallel connections of lower
dimensional sub-systems that may well have physical significance. If we had more
knowledge of the system, these various aspects of the model could be checked easily
against the (presumably available) design and performance characteristics.

So the SDP model appears to make reasonable physical sense, providing improved
understanding of the system and/or, in this engineering example, a basis for automatic
control system design. This evaluation of the model in such physical terms, albeit rather
crude and necessarily incomplete in this particular example, is a valuable aspect of the
SDP-type model developed within a DBM modelling ethos. And it can be contrasted with
the sterility, in these same terms, of the HKK model that, in common with most network-
type models, is a complete black box with no clear physical interpretation. In this case, for


instance, the purely black box HKK model has an excessive parameterisation simply
because nonlinear elements are abstractly associated with every term in the model. This is
in complete contrast to the SDP model, where sdp estimation locates all the nonlinearity at
the input in a much more physically meaningful manner.

5.4 Conclusions

This chapter has summarised the main features of the State Dependent Parameter (SDP)
class of nonlinear, stochastic models that can be identified and estimated using the sdp tool
in CAPTAIN. This class of model has quite wide applicability, ranging from static SDP
regression models to SDARX and SDTF stochastic, dynamic models. The simulation and
real examples emphasise the practical utility of the sdp tool and relative ease of use.
Indeed, it may have even wider application potential, as revealed by recent research that
shows how it can not only result in a drastic reduction in the cost of the sensitivity analysis,
but also allow for the estimation of the first order sensitivity terms of high dimensional
model representations, at no additional cost (see Ratto et al., 2004).



CHAPTER 6
DISCRETE-TIME TRANSFER
FUNCTION MODELS

Chapters 4 and 5 have considered a novel approach to time variable (TVP) and state
dependent (SDP) parameter model identification and estimation, using optimal methods of
recursive filtering and fixed interval smoothing. In both these chapters, one particular
model structure has emerged on a number of occasions, namely the discrete-time Transfer
Function (TF) model. The TF model receives special treatment in CAPTAIN because of its
particular importance in the fields of data-based mechanistic modelling and model-based
control system design (refer to the numerous publications listed in the introduction).

Such a model was introduced within the context of Dynamic Auto-Regression with
Exogenous variables (DARX) in Chapter 4. A more general Dynamic Transfer Function
(DTF) model, with fewer restrictions on the form of the noise entering the system, was
considered in the same chapter. Finally, the TF has been utilised as a basic structure for a
class of State Dependent Parameter (SDP) models discussed in Chapter 5.

However, the discussion in earlier chapters has concentrated on the time varying case,
since this is particularly useful for the analysis of nonlinear and chaotic systems. By
contrast, the present chapter considers the situation when the essential small perturbation
behaviour can be approximated by linear, time invariant, TF models. Here, we develop a
rather different, although complementary, modelling approach to that taken before. This
difference proves important since, as illustrated in the examples below, the estimates for
TF models obtained from the standard least squares based recursive filtering algorithm,
will be asymptotically biased away from their true values when there is noise on the
variables.

In this regard, the present chapter considers in more detail a solution already mentioned in
Section 4.5, namely instrumental variables. However, the present chapter takes this
approach further, by developing robust unbiased Refined Instrumental Variable (RIV) and
Simplified Refined Instrumental Variable (SRIV) algorithms for the identification and
estimation of general discrete-time, multiple-input, single output (MISO) transfer function
models.

CAPTAIN has two specialised functions for such analysis: (i) riv for parameter estimation
when the model structure is specified by the user; and (ii) rivid for system identification,
the latter allowing the user to automatically search over a whole range of different model
orders and time delays. Both functions provide numerous statistical diagnostics and both
return the modelling results in the form of a special theta matrix, from which the various
parameters and their standard errors may be extracted using getpar. Finally, the function
ccf provides an alternative approach for the identification of the model structure.

The model parameters may subsequently be utilised for simulation and forecasting through
conventional MATLAB commands or by using SIMULINK (MathWorks, 2001). Two
additional functions often prove useful in this analysis: prepz for the preparation of data
before analysis, and scaleb for the later re-adjustment of the parameter estimates to
account for this pre-processing.

6.1 The Discrete-Time Transfer Function (TF) model

The MISO discrete-time TF model implemented in CAPTAIN takes the form,

y_t = B_1(L)/A(L) u_{1,t-δ1} + ... + B_k(L)/A(L) u_{k,t-δk} + 1/C(L) e_t      (6.1)

where y_t is the output; u_{it} (i = 1, 2, ..., k) are a set of k inputs that are assumed to affect the
output in a uni-directional relationship; δ_i (i = 1, 2, ..., k) are the delays associated with
each individual input; e_t is a random variable assumed to be a normally distributed
Gaussian sequence with zero mean value and constant variance; A(L), C(L) and B_i(L)
(i = 1, 2, ..., k) are polynomials defined by the orders n, p and m_i, respectively. Finally, L is
the lag or backward shift operator first introduced in Chapter 1, i.e. L^j y_t = y_{t-j}.

However, in order to make the exposition more straightforward, the present chapter will
largely focus on the single input, single output (SISO) system defined below1,

y_t = B(L)/A(L) u_t + 1/C(L) e_t      (6.2)

where,

A(L) = 1 + a_1 L + ... + a_n L^n
B(L) = b_0 + b_1 L + ... + b_n L^n      (6.3)
C(L) = 1 + c_1 L + ... + c_n L^n

1 It should be stressed that multiple-inputs are possible using CAPTAIN, as Example 6.5 demonstrates.
Furthermore, riv/rivid allow for the specification of time delays and different orders for each polynomial.

CAPTAIN handbook D. J. Pedregal, C. J. Taylor and P. C. Young page 111


Chapter 6 Discrete Time Transfer Function Models

For simplicity, the orders of the polynomials are the same and there is no time delay
between the input and output variables. These assumptions do not constrain the model in
any sense, since different orders may be considered by simply setting the relevant
parameters to zero. Similarly, any time delay of δ > 0 samples can be accounted for by
setting the leading parameters of the B(L) polynomial to zero, i.e. b_0 = ... = b_{δ-1} = 0.

Equation (6.2) may be conveniently written as,

y_t = x_t + ξ_t      (6.4)

where,

x_t = B(L)/A(L) u_t   and   ξ_t = 1/C(L) e_t      (6.5)

are known as the system model and noise model, respectively. The variable x_t, which may
be obtained from the following difference equation,

x_t = -a_1 x_{t-1} - ... - a_n x_{t-n} + b_0 u_t + b_1 u_{t-1} + ... + b_n u_{t-n}      (6.6)

plays a central role in TF modelling and is also known as the deterministic or noise free
output of the system.
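A minimal numerical sketch of this difference equation, written in Python for convenience, uses the first order system of Example 6.1 below (A(L) = 1 - 0.8L, B(L) = 2L) driven by a step input:

```python
import numpy as np

# Noise free output of (6.6) for the first order system of Example 6.1:
# A(L) = 1 - 0.8L, B(L) = 2L, i.e. x_t = 0.8 x_{t-1} + 2 u_{t-1}
N = 400
u = np.concatenate([np.zeros(N // 2), np.ones(N // 2)])   # step input
x = np.zeros(N)
for t in range(1, N):
    x[t] = 0.8 * x[t - 1] + 2.0 * u[t - 1]

# The steady state gain is B(1)/A(1) = 2 / (1 - 0.8) = 10,
# so x_t settles at 10 after the step.
```

This is exactly the deterministic output used to build the simulation data in Example 6.1, where MATLAB's filter function performs the same recursion.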

6.2 Estimation of Discrete-Time TF models


Multiplying both sides of the TF model in equation (6.2) by A(L), we obtain,

A(L) y_t = B(L) u_t + n_t      (6.7)

with,

n_t = A(L)/C(L) e_t = A(L) ξ_t      (6.8)

Further straightforward manipulation yields the following matrix form of the model,

y_t = z_t^T p + n_t      (6.9)

Equation (6.9) is the constant parameter version of the DTF model (4.17). Here z_t^T is a
vector composed of the past values of the output and input variables, while p is a vector of
unknown parameters, i.e.,

z_t^T = [-y_{t-1} ... -y_{t-n} ; u_t ... u_{t-n}]      (6.10)

p = [a_1 ... a_n ; b_0 ... b_n]^T      (6.11)

Assuming for now a given model order n, the following subsections describe five
approaches for the estimation of the parameters in p, each with various advantages and
limitations in practice.

Estimation algorithm 1: Least Squares (LS)

Equation (6.9) superficially resembles a regression relationship, for which one of the
simplest estimation methods available is, of course, Least Squares (LS). In this regard, the
standard LS estimator is,

p̂_LS = [ Σ_{t=1}^{T} z_t z_t^T ]^{-1} Σ_{t=1}^{T} z_t y_t      (6.12)

A recursive solution to this LS problem is found by reference to earlier chapters. For
example, (6.9) can be considered as a DLR model with constant parameters (Section 4.1).
Here, the equivalent system matrices for the SS model (2.1) are then,

F = I_{2n+1} ;  Q_r = 0_{2n+1}  and  H_t = z_t^T      (6.13)

where I_{2n+1} and 0_{2n+1} are identity and zero matrices of dimension 2n+1 (the number of
parameters in the vector p). In this SS system, there is no noise added to the transition
equation and the states are simply the parameters in the vector p. Note that, in order to
obtain constant parameter estimates, Q_r = 0_{2n+1}. By contrast, allowing positive values of
Q_r forms the basis of TVP estimation, as discussed in earlier chapters.

With all these simplifications, the forward pass Kalman Filter (2.8) reduces to a
straightforward recursive LS algorithm, i.e.,

p̂_t = p̂_{t-1} + P_{t-1} z_t [1 + z_t^T P_{t-1} z_t]^{-1} { y_t - z_t^T p̂_{t-1} }
                                                                              (6.14)
P_t = P_{t-1} - P_{t-1} z_t [1 + z_t^T P_{t-1} z_t]^{-1} z_t^T P_{t-1}

At this juncture, it is important to note that the resemblance with a regression equation is
misleading. In fact, (6.9) is clearly not a regression relationship, since it is derived in a
manner which demonstrates its error-in-variables form. Here, it is clear that each of the
lagged output variables in the vector z_t is contaminated by the noise variables n_t and e_t;
and that both the output and noise signals are serially correlated in time.

The immediate effect of this correlation is that the simple LS method outlined above
yields biased estimates of the parameters (see e.g. Young, 1984; Box et al., 1994). Several


methods tackle this well known problem including, most commonly, Maximum Likelihood
(ML) estimation. However, CAPTAIN offers a rather less well known approach, albeit one
that has been in successful use for over 20 years, namely instrumental variables.

Before continuing, however, one exception to the above bias problem should be noted.
When the model is of the ARX type (i.e. C(L) = A(L), so that n_t is not serially correlated)
then the parameters do converge to their true values and LS may be utilised directly, as
demonstrated in Example 6.1 below.

Example 6.1 Simulation example showing the bias of the LS parameter estimates

The present example utilises Monte Carlo simulation to demonstrate the bias mentioned
above. In this regard, consider the following two TF models,

y_{1t} = 2L/(1 - 0.8L) u_t + 1/(1 - 0.8L) e_t
                                                      (6.15)
y_{2t} = 2L/(1 - 0.8L) u_t + 1/(1 + 0.8L) e_t

Here, the first example is deliberately defined as an ARX model2, appropriate for LS
estimation. The relationship between the input and output variables is identical in both
cases, so that the only difference lies in the noise structure. Multiplying both equations by
A(L) = 1 - 0.8L, the models may be written as,

(1 - 0.8L) y_{1t} = 2L u_t + e_t
                                                      (6.16)
(1 - 0.8L) y_{2t} = 2L u_t + (1 - 0.8L)/(1 + 0.8L) e_t

Alternatively, written in a regression-like form, these equations become,

y_{1t} = 0.8 y_{1,t-1} + 2 u_{t-1} + e_t
                                                      (6.17)
y_{2t} = 0.8 y_{2,t-1} + 2 u_{t-1} + n_t

Both models look similar, except that the noise on the first is serially uncorrelated, while in
the second case it is indeed serially correlated. The equivalence with the matrix form (6.9)
is obvious,

z_{1t}^T = [-y_{1,t-1}  u_{t-1}] ;  z_{2t}^T = [-y_{2,t-1}  u_{t-1}] ;  p = [-0.8  2]^T      (6.18)

2 Compare this time invariant ARX model with the DARX equivalent (4.11).


For the purposes of the present exercise, these models are regarded as simulation
examples, so that the LS parameter estimates can be compared with their true values. The
Monte Carlo simulation loop is implemented as follows, where LS estimation is based on
the conventional MATLAB backslash operator.

>> u = [zeros(200,1) ; ones(200,1)]; % step input
>> x = filter([0 2], [1 -0.8], u); % noise free output
>> Ns = 1000; % number of simulations
>> b1 = zeros(2, Ns); b2= b1;
>> for i = 1 : Ns
>> e = randn(400, 1)/5; % white noise
>> y1 = x + filter(1, [1 -0.8], e); % simulation response
>> y2 = x + filter(1, [1 0.8], e);
>> z1t = [-y1(1 : 399) u(1 : 399)]; % exogenous variables
>> z2t = [-y2(1 : 399) u(1 : 399)];
>> b1(:, i) = z1t \ y1(2 : 400); % LS estimation
>> b2(:, i) = z2t \ y2(2 : 400);
>> end
>> [median(b1'); mean(b1'); std(b1')]
>> [median(b2'); mean(b2'); std(b2')]

A typical outcome of this analysis is given in Table 6.1 below. Note that, since this
demonstration is based on multiple stochastic simulations, the values shown in Table 6.1
will not be exactly repeatable.

Parameters                           a1 = -0.8    b1 = 2
First Model (ARX)  Median             -0.7997    2.0036
                   Mean               -0.7993    2.0067
                   Standard Deviation  0.0111    0.1093
Second Model       Median             -0.5757    4.1873
                   Mean               -0.5737    4.2060
                   Standard Deviation  0.0329    0.3211

Table 6.1 LS parameter estimation.

It is clear that the LS estimates are very close to the real values for the case of an ARX
model (theoretically unbiased), while they are far from the simulated values in the second
simulation utilising a more general noise model.

Estimation algorithm 2: Maximum Likelihood (ML)

Although ML is not utilised for TF estimation in CAPTAIN, it is nonetheless useful to briefly
review the approach, in order to put the later discussion in context. In this regard, if a set of
starting values of the output, input and noise signals prior to the beginning of the sample
are known, then the residuals of the model may be computed recursively from the
beginning of the sample to the end, i.e.,


e_t = C(L) y_t - B(L)C(L)/A(L) u_t ,   t = 1, 2, ..., N      (6.19)

In practice, starting values of the variables involved are not known before the beginning of
the sample. The most obvious solution to this limitation, is to assume zero for the initial
values of the noise sequence, as well as for the initial 2n observations of the input and
output variables, i.e. starting the recursions in equation (6.19) from 2n + 1 , instead of 1.

In this way, assuming that the sequence e_t is white noise, i.e. a Gaussian sequence with
zero mean value and constant variance σ², the log-likelihood function may be written as

L(θ; y, u) = -(T/2) log(2π) - (T/2) log σ² - (1/(2σ²)) Σ_{t=2n+1}^{T} [ C(L) y_t - B(L)C(L)/A(L) u_t ]²      (6.20)

where θ is a vector comprising all the unknown parameters involved in the model,
namely the parameters in the A(L), B(L) and C(L) polynomials and σ². This is known as
the conditional likelihood function, because the summation runs from 2n+1, while initial
values of the output, input and noise are the observed values for each variable at t ≤ 2n.

Maximisation of this function is equivalent to the minimisation of the final term, which is
the conditional sum of squares function, i.e.,

S(θ; y, u) = Σ_{t=2n+1}^{T} [ C(L) y_t - B(L)C(L)/A(L) u_t ]²      (6.21)

The first order conditions for the optimum are obtained in the usual manner by partially
differentiating the likelihood function with respect to each of the parameters in turn. This
yields the following set of equations,

∂L/∂a_i = -(1/σ²) Σ_{t=2n+1}^{T} [ C(L) y_t - B(L)C(L)/A(L) u_t ] B(L)C(L)/A²(L) u_{t-i} = 0

∂L/∂b_i = (1/σ²) Σ_{t=2n+1}^{T} [ C(L) y_t - B(L)C(L)/A(L) u_t ] C(L)/A(L) u_{t-i} = 0
                                                                                       (6.22)
∂L/∂c_i = -(1/σ²) Σ_{t=2n+1}^{T} [ C(L) y_t - B(L)C(L)/A(L) u_t ] [ y_{t-i} - B(L)/A(L) u_{t-i} ] = 0

∂L/∂σ² = -T/(2σ²) + (1/(2σ⁴)) Σ_{t=2n+1}^{T} [ C(L) y_t - B(L)C(L)/A(L) u_t ]² = 0


Further improvements have been made to this approach, in order to generate solutions that
do not depend on the initial conditions (e.g. Box et al., 1994). These include: unconditional
sums of squares, unconditional ML, Exact ML and so on. Of these, it has been argued
(e.g. Ansley, 1979) that Exact ML is the most convenient approach, yielding results that
are very similar to the conditional solution in general, but giving much better solutions for
short time series, or in systems where some of the roots are close to one (especially with
regards to the estimation of the moving average terms).

Some authors (e.g. Sims, 1980; Young, 1984) avoid the estimation of the moving average
terms in the noise model altogether, because of the complexity involved. The potential
advantages of complex algorithms, like Exact ML, are much less important in this context.
Clearly, much more can be said about ML, as the huge amount of literature on the subject
suggests. However, the present chapter will instead concentrate on the equally powerful,
but rather less well known, approaches that are implemented in CAPTAIN.

Estimation algorithm 3: Instrumental Variables (IV)

The bias in the LS parameter estimates referred to above is a direct result of the
correlation between the regressors and the system noise. Therefore, the IV approach
consists of substituting the problematic variables3 by new variables that fulfil two key
properties:

- they are, obviously, as closely correlated to the original variables as possible;

- in order to remove the bias, they should be independent of the noise.

It is well-known that, if a set of such variables can be found, then an IV modification of
the LS estimator (6.12) is asymptotically unbiased (e.g. Young, 1984). Such an estimator is
defined as follows,

p̂_IV = [ Σ_{t=1}^{N} x̂_t z_t^T ]^{-1} Σ_{t=1}^{N} x̂_t y_t      (6.23)

where x̂_t is the vector of instrumental variables. However, as we shall see, the statistical
efficiency of the solution is dependent upon the degree of correlation between the
instruments x̂_t and the instrumented variables z_t.
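The estimator is easy to state in code. The Python sketch below generates data from the second (correlated noise) system of Example 6.1 and compares the LS estimate (6.12) with the IV estimate (6.23), using the true noise free output as an idealised instrument; in practice, of course, the instruments must come from an auxiliary model, as described next.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 400

# Second system of Example 6.1: y_t = 0.8 y_{t-1} + 2 u_{t-1} + n_t,
# where the additive output noise is xi_t = e_t / (1 + 0.8L).
u = np.concatenate([np.zeros(N // 2), np.ones(N // 2)])   # step input
x = np.zeros(N)
xi = np.zeros(N)
e = rng.standard_normal(N) / 5
for t in range(1, N):
    x[t] = 0.8 * x[t - 1] + 2.0 * u[t - 1]                # noise free output
    xi[t] = -0.8 * xi[t - 1] + e[t]                       # correlated noise
y = x + xi

Z = np.column_stack([-y[:-1], u[:-1]])    # regressors z_t = [-y_{t-1}, u_{t-1}]
X = np.column_stack([-x[:-1], u[:-1]])    # idealised instruments
yt = y[1:]

p_ls = np.linalg.solve(Z.T @ Z, Z.T @ yt)   # (6.12): biased in this case
p_iv = np.linalg.solve(X.T @ Z, X.T @ yt)   # (6.23): asymptotically unbiased
# True parameter vector is p = [-0.8, 2]
```

Running this reproduces the pattern of Table 6.1: the LS estimate of a1 is pulled well away from -0.8, while the IV estimate stays close to the true values.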

It is straightforward to determine a recursive solution to (6.23), in a similar manner to the
LS case seen above. This approach can also be regarded as a special case of the IV

3 In the TF context, these are the lagged output variables in the right hand side of equation (6.9).


algorithm introduced in Chapter 4. In particular, since the parameters are now assumed
constant (i.e. Q_r = 0_{2n+1} and F = I_{2n+1}), equation (4.22) simplifies as follows,

p̂_t = p̂_{t-1} + P_{t-1} x̂_t [1 + z_t^T P_{t-1} x̂_t]^{-1} { y_t - z_t^T p̂_{t-1} }
                                                                              (6.24)
P_t = P_{t-1} - P_{t-1} x̂_t [1 + z_t^T P_{t-1} x̂_t]^{-1} z_t^T P_{t-1}

The covariance matrix P_t estimated in this way is not symmetric, so it is usually replaced in
the literature (e.g. Young, 1984) by a symmetric gain algorithm with superior statistical
properties. Such an algorithm takes the form,

p̂_t = p̂_{t-1} + P_{t-1} x̂_t [1 + x̂_t^T P_{t-1} x̂_t]^{-1} { y_t - z_t^T p̂_{t-1} }
                                                                              (6.25)
P_t = P_{t-1} - P_{t-1} x̂_t [1 + x̂_t^T P_{t-1} x̂_t]^{-1} x̂_t^T P_{t-1}

Of course, this approach first requires the generation of suitable instrumental variables. In
general, the difficulty in obtaining such variables has acted as a strong deterrent to the use
of the method, as evidenced by the paucity of literature on its application to the analysis of
real data. Fortunately, this difficulty can be overcome fairly easily when dealing with the
estimation of TF models activated by a deterministic input, as here.

Referring again to the TF model (6.2), it is clear that the noise-free output x_t is correlated
with the input sequence, since it derives directly from the input, i.e.,

x_t = B(L)/A(L) u_t      (6.26)

As a result, the degree of correlation between u_t and x_t is a function of the model and its
parameter values. Also, we see that the input sequence is directly available for
measurement and is, by definition, uncorrelated with the noise. Thus u_t itself satisfies the
requirements of an IV and can be used to define an IV vector. However, bearing in mind
that the TF will inject dynamic lag between u_t and x_t, we might expect x_t to be more
highly correlated with u_{t-δ}, where δ is a pure time delay, and that there will be some
optimum choice of such delay which maximizes this correlation. This suggests that the IV
vector could, in general, take the form,

x̂_t^T = [-u_{t-δ-1} ... -u_{t-δ-n} ; u_t ... u_{t-n}]      (6.27)

Pursuing this line of reasoning further, it seems possible that IVs which are even more
highly correlated with x_t can be obtained by using an auxiliary model of the process,
whose output is used to define an IV vector of the form,

x̂_t^T = [-x̂_{t-1} ... -x̂_{t-n} ; u_t ... u_{t-n}]      (6.28)

The auxiliary model is simply,

Â(L) x̂_t = B̂(L) u_t      (6.29)

where Â(L) and B̂(L) are polynomials similar to those of the initial model, with parameters
determined in an appropriate manner (see below). The closer these polynomials are to the
actual, unknown, process polynomials, the more highly correlated are x̂_t and x_t and the
lower the variance of the eventual IV estimates.

An off-line method of using this auxiliary model concept is straightforward to develop: an
initial IV estimation is performed with x̂_t defined by (6.27) above, and the resulting
asymptotically unbiased, but probably high variance, estimates are used to define the
auxiliary model for a second pass through the data. Alternatively, the auxiliary model is
defined on the basis of the biased linear regression estimates; indeed we shall see that this
provides a very useful approach in practice. A second IV run is then made, utilising an
auxiliary model based on the estimates obtained in the first run. This kind of iterative
procedure will yield asymptotically unbiased estimates at each iteration, provided the IVs
generated during the iteration possess the required properties. It can be continued until
there is no significant change in the resulting estimates. Of course, this iterative procedure
can utilise either the en-bloc solution or the recursive solution of the IV equations shown
above.

This argument can be taken one step further: an on-line IV procedure utilizes the recursive
solution to the IV equations, updating the auxiliary model continuously on the basis of
these recursive estimates. In this latter case, the adaptation of the model should be done
with care and checks on stability of the model need to be performed. For further details of
this approach, refer to Young (1984, pages 131-134).

A step by step review of the recursive off-line IV algorithm is presented below:

1. The number of iterations I is specified.

2. The initial states and their covariance matrix for the recursive algorithm are set up
   based on diffuse priors, i.e. zeros for the states and a diagonal matrix with
   arbitrarily large values on the main diagonal for the covariance matrix.

3. The data vector is initialized at t = 2n + 1 by reading in the first 2n samples of
   y_t and u_t, i.e. z_{2n}^T = [-y_{2n} ... -y_1 ; u_{2n} ... u_1].


4. The recursive least squares estimates are obtained from a standard run of the
Kalman Filter in a regression context, i.e. the algorithm (6.14). Final estimates
of the states and their covariance matrix at sample N / I are stored.

5. The parameters of the auxiliary model are set to the least squares solution
   obtained from step 4, and the initial IVs are generated by,

   x̂_t^T = [-x̂_{t-1} ... -x̂_{t-n} ; u_t ... u_{t-n}]  with  Â(L) x̂_t = B̂(L) u_t.

6. Recursive IV estimates are now computed with the auxiliary model supplying
the IVs required to define the IV vector at each recursive step, using the
algorithm (6.25). The auxiliary model parameters are maintained constant at the
values obtained at the end of the least squares pass 4, and the initial covariance
matrix of states is similarly set to the covariance in step 4.

7. The parameters of the auxiliary model are updated to the IV estimates in step 6.
Step 6 is repeated but with this new auxiliary model defining the IV vector.

8. Step 7 is repeated a further I-2 times.

9. A final recursive pass through the data is completed with the auxiliary model set
constant to the final estimates obtained at the end of the I-th pass, but using a
diffuse prior to initialize the covariance matrix of states. The final states and
covariance matrix are the global IV estimates.
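As an aside, the iterative procedure above is easy to prototype outside CAPTAIN. The following Python/NumPy sketch (an illustration only, under assumed settings — first order model, step input, three IV passes — and not the toolbox implementation) uses the en-bloc IV solution at each pass:

```python
import numpy as np

def tf_filter(b, a, u):
    """x[t] = sum_i b[i]*u[t-i] - sum_{j>=1} a[j]*x[t-j], with a[0] = 1."""
    x = np.zeros(len(u))
    for t in range(len(u)):
        x[t] = (sum(b[i] * u[t - i] for i in range(len(b)) if t >= i)
                - sum(a[j] * x[t - j] for j in range(1, len(a)) if t >= j))
    return x

def enbloc_iv(y, inst, u):
    """En-bloc IV solution for y_t = -a1*y_{t-1} + b1*u_{t-1} + noise."""
    Z = np.column_stack([-y[:-1], u[:-1]])        # regressors z_t
    X = np.column_stack([-inst[:-1], u[:-1]])     # instruments x_t
    return np.linalg.solve(X.T @ Z, X.T @ y[1:])  # returns [a1, b1]

rng = np.random.default_rng(0)
u = np.concatenate([np.zeros(200), np.ones(200)])           # step input
x = tf_filter([0, 2], [1, -0.8], u)                          # noise-free output
y = x + tf_filter([1], [1, -0.8], rng.standard_normal(400) / 5)

a1, b1 = enbloc_iv(y, y, u)      # pass 1: least squares (instrument = output)
for _ in range(3):               # IV passes: auxiliary model supplies instrument
    x_aux = tf_filter([0, b1], [1, a1], u)   # A(L) x_t = B(L) u_t
    a1, b1 = enbloc_iv(y, x_aux, u)
# a1 and b1 now lie close to the true values -0.8 and 2
```

The auxiliary model parameters are held constant within each pass and only updated between passes, exactly as in the step-by-step description above.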

Some comments on the above algorithm are necessary. In the first place, note that the
auxiliary model parameters are always kept constant for any iteration at the values
obtained at the end of the previous iteration. The recursive estimates are not used to
continually update the auxiliary model parameters at each recursive step.

Secondly, subsequent to the first iteration and except for the final iteration, the initial
covariance matrix of the states is set to the value saved at recursive step N / I of the
previous iteration. This is a heuristic way of informing the algorithm of the increasing
confidence in the estimates as the iterations proceed.

Thirdly, the final iteration in step 9 is required to generate the value of the covariance
matrix of states (i.e. parameters). This may be used to assess the statistical properties of the
estimates, as usual. It is important that it is based on a single run through the data from a
diffuse prior initialization; otherwise the final covariance matrix will be artificially
depressed and can lead to overly confident assessment of the error bounds on the estimates.


Although various modifications are possible, the above approach has been exploited for
many years and has been found to be very robust for day-to-day use. It is implemented in
CAPTAIN when the recursive flag in riv is set (see below). Finally, it should be noted that
the approach suggested above makes no reference at all to the noise model. The good
properties of the estimates are completely independent of the properties of the noise and
the latter's serial correlation features.

Example 6.2 Simulation experiment using the IV algorithm

To demonstrate the effectiveness of the algorithm described above, the simulation models
in Example 6.1 are re-examined, this time utilising the riv function in CAPTAIN to
determine the IV estimates.

>> u = [zeros(200,1) ; ones(200,1)]; % Step input
>> x = filter([0 2], [1 -0.8], u); % Noise free output
>> Ns = 1000; % No. of simulations
>> riv1 = zeros(2, Ns); riv2= riv1;
>> for i = 1 : Ns
>> e = randn(400, 1)/5; % White noise
>> y1 = x + filter(1, [1 -0.8], e); % Simulations
>> y2 = x + filter(1, [1 0.8], e);
>> [A, B, C] = getpar(riv([y1 u], [1 1 1 0], [3 0 0 0 0]));
>> riv1(:, i) = [A(2); B(2)];
>> [A, B, C] = getpar(riv([y2 u], [1 1 1 0], [3 0 0 0 0]));
>> riv2(:, i) = [A(2); B(2)];
>> end
>> [median(riv1'); mean(riv1'); std(riv1')]
>> [median(riv2'); mean(riv2'); std(riv2')]

A typical outcome of this analysis is given in Table 6.2 below. Compare the parameter
estimates with those in Table 6.1 and the true values defined by equation (6.15). It is clear
that for both models, the estimation is now concentrated around the true values for the
parameters.

Parameters                       a1 = -0.8    b1 = 2
First Model (ARX)   Median        -0.8004     1.9943
                    Mean          -0.7993     2.0065
                    Std. Dev.      0.0150     0.1476
Second Model        Median        -0.7999     2.0007
                    Mean          -0.7999     2.0007
                    Std. Dev.      0.0019     0.0190

Table 6.2 IV parameter estimation.

Before continuing, some explanation of the riv arguments may prove helpful. The model
structure is defined by the [1 1 1 0] input argument. In this case, 1 denominator parameter,


1 numerator parameter, 1 sample time delay and no model required for the noise.
Further examples in Section 6.5 will consider the more general case with higher order
models and multiple inputs. The 2nd input argument to riv ensures that the IV algorithm is
utilised rather than the default SRIV algorithm considered later. Here, it is sufficient to note
that the first element of the argument specifies the number of IV iterations (3), while the
final element (0) selects the en-bloc solution4.

The riv output argument is a matrix containing information about the TF model structure,
the estimated parameters and their estimated accuracy. In this case, getpar is utilised to
extract the required parameter estimates A, B and C, which represent A(L), B(L) and
C(L) respectively (where C(L) = 1 in the case of IV estimates).

Estimation algorithm 4: Refined Instrumental Variable (RIV)

The IV algorithm concentrates on the estimation of the system model parameters with no
reference to the noise model. However, in many applications the analyst wishes to go
further in the characterization of the system and obtain some evaluation of the estimated
noise sequence. Bearing in mind that IV estimation provides asymptotically unbiased and
consistent estimates of the TF parameters, a good estimation of such noise may be obtained
from,

n_t = y_t - x_t = y_t - [ B(L) / A(L) ] u_t                                      (6.30)

which may be computed recursively as,

n_t = -a1 n_{t-1} - ... - a_n n_{t-n} + y_t + a1 y_{t-1} + ... + a_n y_{t-n} - b0 u_t - ... - b_n u_{t-n}

or,

n_t = y_t + a1 x_{t-1} + ... + a_n x_{t-n} - b0 u_t - ... - b_n u_{t-n}          (6.31)

Here n initial values for the noise, input and output are required. The obvious approach is
to set the noise terms to zero, and use the observed values for the input and output
variables from 1 to n.
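In outline, equation (6.30) is just a subtraction of the auxiliary model response from the measured output. A Python/NumPy sketch (the parameter estimates a1 = -0.8, b1 = 2, the input and the noise level are all assumed purely for illustration; this is not CAPTAIN code):

```python
import numpy as np

def tf_filter(b, a, u):
    """x[t] = sum_i b[i]*u[t-i] - sum_{j>=1} a[j]*x[t-j], with a[0] = 1."""
    x = np.zeros(len(u))
    for t in range(len(u)):
        x[t] = (sum(b[i] * u[t - i] for i in range(len(b)) if t >= i)
                - sum(a[j] * x[t - j] for j in range(1, len(a)) if t >= j))
    return x

A_hat, B_hat = [1.0, -0.8], [0.0, 2.0]     # assumed estimates of A(L) and B(L)
rng = np.random.default_rng(1)
u = np.concatenate([np.zeros(50), np.ones(50)])
e = 0.1 * rng.standard_normal(100)         # the (unknown) additive noise
y = tf_filter(B_hat, A_hat, u) + e         # measured output

x_hat = tf_filter(B_hat, A_hat, u)         # auxiliary model output B(L)/A(L) u_t
n_hat = y - x_hat                          # noise estimate, equation (6.30)
# the system model here is exact, so n_hat recovers the additive noise e
```

The initial conditions are handled exactly as in the text: the filter states start at zero and the observed input/output values are used from the first sample onwards.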

This noise estimate may be employed for statistical assessment of the model. Furthermore,
Young (1984) shows that it can also be used to estimate a model for the noise, by utilising
Auto-Regressive Moving-Average (ARMA) models estimated recursively by similar

4 Here, 1 would instead specify the (slower) recursive version of the algorithm, which is invaluable when
there are missing data; indeed, CAPTAIN automatically switches to the recursive mode in such cases.


algorithms to those discussed above. In fact, a simplified version is implemented in
CAPTAIN, i.e. recursive AR models may be obtained using, for example, dar.

The final model estimated in this way is obtained in two steps that are independent of each
other. In other words, the system model is estimated with no reference to the noise model,
though this noise model is conditional on the IV estimates. The question now is whether
there is a way to improve the estimates found in these two independent steps. The positive
answer to this question is discussed below. In particular, consistency and efficiency may be
improved by allowing the algorithms to be connected to each other.

Such a connection derives from reference to the ML estimates introduced earlier. In the
first instance, consider the following pre-filtered variables,

y*_t = [ C(L) / A(L) ] y_t ;   u*_t = [ C(L) / A(L) ] u_t ;   x*_t = [ B(L) / A(L) ] u*_t        (6.32)

The first two sets of the normal equations determining the ML estimate (6.22) may then be
written as,

Σ_{t=2n+1}^{N} [ A(L) y*_t - B(L) u*_t ] x*_{t-i} = 0        i = 1, 2, ..., n
                                                                                 (6.33)
Σ_{t=2n+1}^{N} [ A(L) y*_t - B(L) u*_t ] u*_{t-i} = 0        i = 0, 1, ..., n

It is interesting to note that these equations are similar to the normal equations in the IV
context. In effect, such equations are given by,

[ Σ_{t=2n+1}^{N} x_t z_t^T ] p - Σ_{t=2n+1}^{N} x_t y_t = 0                      (6.34)

with,

x_t^T = [ -x_{t-1} ... -x_{t-n} ; u_t ... u_{t-n} ]

z_t^T = [ -y_{t-1} ... -y_{t-n} ; u_t ... u_{t-n} ]                              (6.35)

p = [ a1 ... a_n ; b0 ... b_n ]^T

Expanding these IV normal equations, they may alternatively be written as,



Σ_{t=2n+1}^{N} [ y_t + a1 y_{t-1} + ... + a_n y_{t-n} - b0 u_t - b1 u_{t-1} - ... - b_n u_{t-n} ] x_{t-i} = 0        i = 1, 2, ..., n

Σ_{t=2n+1}^{N} [ y_t + a1 y_{t-1} + ... + a_n y_{t-n} - b0 u_t - b1 u_{t-1} - ... - b_n u_{t-n} ] u_{t-i} = 0        i = 0, 1, ..., n

(6.36)

Finally, with the introduction of the backward-shift operator notation, these become,


Σ_{t=2n+1}^{N} [ A(L) y_t - B(L) u_t ] x_{t-i} = 0
                                                                                 (6.37)
Σ_{t=2n+1}^{N} [ A(L) y_t - B(L) u_t ] u_{t-i} = 0

Comparison of equations (6.33) and (6.37) immediately suggests that the former can be
interpreted as the IV normal equations for the system, but here with the input ut , the output
y_t and the auxiliary model output x_t replaced by their pre-filtered equivalents u*_t, y*_t and
x*_t, respectively. As a result, it is clear that the iterative and recursive methods of solution
used in the basic IV case, can also be utilized in the present context: it is necessary merely
to introduce the additional pre-filtering operations and to update the parameters of these
pre-filters either iteratively or recursively. The latter operation utilises similar adaptive
mechanisms to those used to update the auxiliary model parameters in the basic IV case.

In this manner, by alternately making assumptions about prior knowledge of the system
and the noise model parameters, it is possible to decompose the maximum likelihood
solution of the time series problem into simpler sub-problems. We have also seen how
recursive solutions to both of these sub-problems are possible; solutions which are quite
similar to the LS and IV algorithms (6.14) and (6.25) respectively, except that they include
additional data pre-filtering operations.

A simultaneous recursive estimation algorithm of the system and noise model parameters
can, therefore, be achieved by allowing direct communication (that is coordination)
between each algorithm as the solution proceeds. In other words, initial consistent
estimates of the system and noise model parameters can be used to define the initial
auxiliary model and pre-filter parameters and, as the subsequent recursive estimates are
obtained, they can be used to update or adapt these parameters. Such a procedure is
directly analogous to the basic recursive algorithms (6.14) and (6.25), differing only by the
presence and added complication of the adaptive pre-filters.

The implementation of this Refined IV algorithm in iterative or recursive form is based on
the following steps:


1. Set J, the number of RIV iterations.

2. Find initial estimates of the polynomials A0(L) and B0(L) using the IV procedure
described in the previous sub-section.

3. Find estimates of the noise signal from n_t = y_t - x_t and estimate C0(L) by least squares.

4. Find values of the pre-filtered variables u*_t, y*_t and x*_t based on the estimates in
steps 2 and 3.

5. Repeat steps 2, 3, and 4 J-1 times using the pre-filtered variables found in step 4.

Here, the estimation of a general TF model, i.e. a non-linear optimization problem, is
decomposed into a relatively simple iterative procedure, in which the estimation at each
step is linear-like. The linearity of each step allows for computation of recursive
estimates in a simple way, if required. However, the statistical properties of this approach
are as powerful as those of other methods more firmly rooted in theoretical grounds.

Many other topics related to IV estimation are discussed by Young (1984), including the
statistical properties of the IV and AML estimates; optimal initialisation; convergence;
identifiability; optimality of the estimates; the relationship with ML estimation; and other
theoretical questions, together with numerous practical examples.

Estimation algorithm 5: Simplified Refined Instrumental Variable (SRIV)

The final estimation algorithm considered in this chapter is, as the name suggests, a
simplification of the RIV approach. Briefly, the SRIV algorithm utilises the filtering
concept from above, but applies it to the case when a noise model is not required. Here, the
output, input and instrumental variables are replaced with the following filtered versions,

y*_t = [ 1 / A(L) ] y_t ;   u*_t = [ 1 / A(L) ] u_t ;   x*_t = [ 1 / A(L) ] x_t        (6.38)

The adaptation of both the auxiliary model and prefilters is performed within a three-step
iterative procedure (Young, 1984, 1985) in a similar manner to the IV approach above.
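The SRIV iteration can likewise be sketched in Python/NumPy. This is only an illustration under assumed settings (crude initial estimates, three iterations, a first order simulated system), not the CAPTAIN implementation: each pass filters the data by 1/A(L) using the latest parameter estimates and then applies the en-bloc IV solution:

```python
import numpy as np

def tf_filter(b, a, u):
    """x[t] = sum_i b[i]*u[t-i] - sum_{j>=1} a[j]*x[t-j], with a[0] = 1."""
    x = np.zeros(len(u))
    for t in range(len(u)):
        x[t] = (sum(b[i] * u[t - i] for i in range(len(b)) if t >= i)
                - sum(a[j] * x[t - j] for j in range(1, len(a)) if t >= j))
    return x

rng = np.random.default_rng(2)
u = np.concatenate([np.zeros(200), np.ones(200)])
y = (tf_filter([0, 2], [1, -0.8], u)                       # true system 2L/(1-0.8L)
     + tf_filter([1], [1, -0.8], rng.standard_normal(400) / 5))

a1, b1 = -0.5, 1.0                            # crude initial estimates
for _ in range(3):                            # SRIV iterations
    yf = tf_filter([1.0], [1.0, a1], y)       # y*_t = y_t / A(L)
    uf = tf_filter([1.0], [1.0, a1], u)       # u*_t = u_t / A(L)
    xf = tf_filter([0.0, b1], [1.0, a1], uf)  # filtered instrument from the model
    Z = np.column_stack([-yf[:-1], uf[:-1]])
    X = np.column_stack([-xf[:-1], uf[:-1]])
    a1, b1 = np.linalg.solve(X.T @ Z, X.T @ yf[1:])
# a1 and b1 converge towards the true values -0.8 and 2
```

Because the instrument is generated from the input alone, even the first pass with poor prefilter parameters yields consistent estimates, and the subsequent passes sharpen the prefilters.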

Example 6.3 Simulation experiment comparing the SRIV, RIV and ML estimates

This example provides a simple statistical assessment of the various estimators, based on
simulated data obtained from,

y_t = [ 2L / (1 - 0.8L) ] u_t + [ 1 / (1 - 1.2L + 0.8L^2) ] e_t                  (6.39)


Here, ut is a step input and, as usual, et is a serially uncorrelated white noise signal. The
simulation and estimation steps are performed as follows:

>> u = [zeros(200,1) ; ones(200,1)]; % step input
>> x = filter([0 2], [1 -0.8], u); % noise free output
>> Ns = 1000; % number of simulations
>> riv1 = zeros(4, Ns); riv2= riv1; ml= riv1;
>> for i = 1 : Ns
>> e = randn(400, 1); % white noise
>> y = x + filter(1, [1 -1.2 0.8], e); % simulation
>> [th, stats, e] = riv([y u], [1 1 1 0]); % SRIV
>> [A1, B1] = getpar(th);
>> C1 = getpar(mar(e, 2)); % AR noise model
>> riv1(:, i) = [A1(2); B1(2); C1(2:3)'];
>> [A2, B2, C2] = getpar(riv([y u], [1 1 1 2])); % Full RIV
>> riv2(:, i) = [A2(2); B2(2); C2(2:3)'];
>> [AA, Bml, CC, Cml, Aml] = th2poly(bj([y u], [1 0 2 1 1])); % ML
>> ml(:, i) = [Aml(2); Bml(2); Cml(2:3)'];
>> end
>> [median(riv1'); mean(riv1'); std(riv1')]
>> [median(riv2'); mean(riv2'); std(riv2')]
>> [median(ml'); mean(ml'); std(ml')]

In this case, the first call to riv utilises the SRIV algorithm by default, while specification
of a 2nd order noise model in the second call ensures that the function automatically
switches to RIV mode. Finally, bj, which is not part of the CAPTAIN toolbox, returns the
ML estimates using the prediction error method. The call to bj is included here for
comparative purposes and can be omitted if the MATLAB Identification Toolbox is not
available. A typical outcome of this analysis is given in Table 6.3 below.

Parameters            a1 = -0.8    b1 = 2     c1 = -1.2    c2 = 0.8
SRIV     Median        -0.7957     2.0391     -1.1966      0.8001
         Mean          -0.7894     2.1033     -1.1958      0.7971
         Std. Dev.      0.0385     0.3782      0.0299      0.0294
RIV      Median        -0.7982     2.0161     -1.1968      0.8003
         Mean          -0.7955     2.0428     -1.1959      0.7973
         Std. Dev.      0.0307     0.3023      0.0299      0.0293
ML       Median        -0.7987     2.0147     -1.1963      0.7997
         Mean          -0.7951     2.0472     -1.1951      0.7972
         Std. Dev.      0.0330     0.3266      0.0304      0.0301

Table 6.3 SRIV, RIV and ML parameter estimation.

It seems clear that all the estimates are unbiased and the most efficient (though marginally
so in this example) are the full RIV estimates. Also, the performance of the SRIV
algorithm is remarkably good when compared with the other two. Obviously, these results
should be considered with care and cannot necessarily be generalised to any system.


6.3 Identification of Discrete-Time TF models

All the analysis above has assumed that the orders of the polynomials in (6.2) are already
known. In fact, identification of the most appropriate TF model structure is a very broad
subject that is only briefly addressed below. Although numerous approaches are described
in the literature, the most important options in CAPTAIN are the cross-correlation function
ccf and various numerical identification criteria returned by riv/rivid.

Cross-Correlation function

The Cross-Correlation Function (CCF) has long been a standard tool for the identification
of TF models and has been particularly widely used since the publication of the book by
Box and Jenkins (1970). It is a representation of the linear correlation between an assumed
input variable and the output at different lags, plotted against this lag.

The cross covariance coefficients between an input ut and an output yt at lag k are
defined as,

γ_uy(k) = E[ (u_t - ū)(y_{t+k} - ȳ) ]        k = 0, 1, 2, ...                    (6.40)

where ū and ȳ are the means of the input and output variables respectively. In a similar
manner, the cross covariance coefficients between y_t and u_t at lag k are defined as,

γ_yu(k) = E[ (y_t - ȳ)(u_{t+k} - ū) ]        k = 0, 1, 2, ...                    (6.41)

It is clear that the cross covariance function is not symmetrical for positive and negative
values of k. Finally, the CCF is defined as,

ρ_uy(k) = γ_uy(k) / (σ_u σ_y)        k = 0, ±1, ±2, ...                          (6.42)

where the cross covariance coefficients are normalised by the standard deviation of both
the input and the output. The usual estimates of the CCF are simply the sample equivalents
of the population counterparts, i.e.,

r_uy(k) = c_uy(k) / (s_u s_y)        k = 0, ±1, ±2, ...                          (6.43)

where r_uy(k) is the sample cross-correlation coefficient for lag k; s_u and s_y are the
sample standard deviations for the input and output variables respectively, while,


c_uy(k)  = (1/T) Σ_{t=1}^{T-k} (u_t - ū)(y_{t+k} - ȳ)        k = 0, 1, 2, ...
                                                                                 (6.44)
c_uy(-k) = (1/T) Σ_{t=1}^{T-k} (y_t - ȳ)(u_{t+k} - ū)        k = 0, 1, 2, ...

is the sample cross-covariance. Here, ū and ȳ are the sample means of the input and
output variables.

Confidence bands for the CCF estimates can be constructed by reference to the
approximate variance of the coefficients, i.e.,

var[ r_uy(k) ] ≈ (T - k)^{-1}                                                    (6.45)
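The sample CCF and its approximate confidence band are straightforward to compute directly. A Python/NumPy sketch (the delayed-input series below is invented simply to exercise the formulae; this is not the CAPTAIN ccf function):

```python
import numpy as np

def sample_ccf(u, y, max_lag):
    """r_uy(k) = c_uy(k) / (s_u s_y) for k = -max_lag, ..., max_lag."""
    u = u - u.mean()
    y = y - y.mean()
    T = len(u)
    s = np.sqrt((u @ u / T) * (y @ y / T))
    r = {}
    for k in range(max_lag + 1):
        r[k] = (u[:T - k] @ y[k:]) / (T * s)    # c_uy(k),  k >= 0
        r[-k] = (y[:T - k] @ u[k:]) / (T * s)   # c_uy(-k), k >= 0
    return r

rng = np.random.default_rng(3)
T = 500
u = rng.standard_normal(T)
y = np.zeros(T)
y[3:] = 2.0 * u[:-3]                     # y_t = 2 u_{t-3}
y += 0.1 * rng.standard_normal(T)

r = sample_ccf(u, y, 6)
band = 2 / np.sqrt(T)                    # approximate 95% bounds, cf. (6.45)
# r[3] stands far outside the band, while lags k <= 0 stay inside it
```

With a white input, the single large positive-lag coefficient immediately reveals both the delay and the gain of the relationship.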

Ideally, if the relationship between the input and output variables is truly unidirectional,
then the CCF would show all its significant values at positive lags, while the
coefficients at negative lags would all be zero. However, if the relationship is
simultaneous, significant coefficients will be found on both sides of the zero lag.

In practice, it is well-known that the CCF is contaminated by noise in the series and by the
autocorrelation of the input (if it is stochastic), meaning that the CCF does not necessarily
show the impulse response as required. Furthermore, identification may be rather complex
when several inputs are involved in the model. Several solutions may be found in the
literature, including pre-whitening the input and output variables or by turning to other
identification criteria (see next subsection).

With regard to the former approach, Box and Jenkins (1970) demonstrate that estimating
the CCF between the output and input, both pre-whitened by the model of the input,
provides a correct estimate of the impulse response function of the TF, making
identification of the model orders more straightforward, as discussed below.

It is often argued (e.g. Box et al., 1994) that the model (6.2) is, in fact, an approximation to
the real linear relationship between the input and output variables (which, in principle, is of
infinite order), i.e.,

y_t = v0 u_t + v1 u_{t-1} + v2 u_{t-2} + ... + N_t = ( v0 + v1 L + v2 L^2 + ... ) u_t + N_t = V(L) u_t + N_t
                                                                                 (6.46)

where N t is a coloured noise: in the present case, this is the noise AR model in equation
(6.2). In this regard, the identification process often relies on a rough estimation of weights


v_k (k = 0, 1, 2, ...) and then approximating these by a ratio of two finite polynomials (6.2).
Multiplying both sides of the equation by u_{t-k} and taking expectations, we have,

E(u_{t-k} y_t) = v0 E(u_{t-k} u_t) + v1 E(u_{t-k} u_{t-1}) + v2 E(u_{t-k} u_{t-2}) + ... + E(u_{t-k} N_t)        (6.47)

or,

γ_uy(k) = v0 γ_u(k) + v1 γ_u(k-1) + v2 γ_u(k-2) + ...                            (6.48)

where γ_uy(k) is the cross covariance at lag k between the input and output variables, and
γ_u(k-i) (i = 0, 1, 2, ...) is the auto-covariance function of the input (when it is stochastic).
Here, it is clear that the cross-covariance function (and cross-correlation) is influenced, or
contaminated, by the auto-covariance function of the input. Therefore, extracting the linear
relationship between the output and input from the crude cross-correlation function is
almost impossible.

However, if the input and output are pre-whitened as mentioned above, a more useful
solution is found. This approach consists of filtering the input and output by the model of
the input, as shown below. Note that,

u_t = [ B_u(L) / A_u(L) ] ε_t                                                    (6.49)

where, by definition, ε_t is white noise. Filtering the output and input by the inverse of this
model yields the pre-whitened output and input, β_t and α_t respectively, i.e.,

β_t = [ A_u(L) / B_u(L) ] y_t   and   α_t = [ A_u(L) / B_u(L) ] u_t              (6.50)

where, in general, β_t will not be white noise. One important point is that the linear TF
relationship between these pre-whitened variables is exactly the same as that between the
original ones, i.e.,

[ A_u(L) / B_u(L) ] y_t = V(L) [ A_u(L) / B_u(L) ] u_t + [ A_u(L) / B_u(L) ] N_t   or   β_t = V(L) α_t + N*_t        (6.51)

However, now the CCF between the pre-whitened input and output variables yields a true
estimate of the impulse response coefficients v_k (k = 0, 1, 2, ...),

E(α_{t-k} β_t) = v0 E(α_{t-k} α_t) + v1 E(α_{t-k} α_{t-1}) + v2 E(α_{t-k} α_{t-2}) + ... + E(α_{t-k} N*_t)        (6.52)

or,


γ_αβ(k) = v0 γ_α(k) + v1 γ_α(k-1) + v2 γ_α(k-2) + ...                            (6.53)

Note that, since the sequence α_t is white noise, only the kth term on the right hand side is
different from zero. The purpose of all this analysis is to determine an initial estimate of the
impulse response function, on which the identification of the numerator and denominator
TF model orders may be based. In this regard, the following rules apply:

- Significant values on both sides of the zero lag are an indication of a simultaneous
  relationship. In that case a TF model is inappropriate.

- The first non-zero coefficient indicates the pure delay of the TF model.

- An exponential decay of the CCF is an indication of a denominator polynomial
  of at least first order. If such decay is of a sinusoidal type, then the denominator
  order is two or higher.

- The number of free, wandering coefficients is the order of the numerator
  polynomial.

Consider, for example, Figure 6.3 below (Example 6.4). Here, there is no simultaneous
relationship, the pure time delay is 3 samples, the order of the numerator is 2 and the decay
is exponential-like, implying a 1st order model (1 denominator parameter).
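The pre-whitening result can also be checked numerically. In the Python/NumPy sketch below (an AR(1) input model and a two-term impulse response are assumed purely for illustration), the scaled CCF between the pre-whitened series recovers the impulse response weights v_k:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 2000
alpha = rng.standard_normal(N)
u = np.zeros(N)
for t in range(1, N):
    u[t] = 0.7 * u[t - 1] + alpha[t]      # AR(1) input: A_u(L) = 1 - 0.7L
y = np.zeros(N)
y[3:] = 1.0 * u[1:-2] + 0.5 * u[:-3]      # true TF: y_t = u_{t-2} + 0.5 u_{t-3}
y += 0.2 * rng.standard_normal(N)

# fit AR(1) to the input by least squares, then filter BOTH series by A_u(L)
phi = (u[:-1] @ u[1:]) / (u[:-1] @ u[:-1])
a_t = u[1:] - phi * u[:-1]                # pre-whitened input (approximately white)
b_t = y[1:] - phi * y[:-1]                # output filtered by the same model

def v_est(k):
    """Impulse response estimate v_k = gamma_ab(k) / gamma_a(0), cf. (6.53)."""
    a0 = a_t - a_t.mean()
    b0 = b_t - b_t.mean()
    return (a0[:len(a0) - k] @ b0[k:]) / (a0 @ a0)

v = [v_est(k) for k in range(6)]
# v[2] is near 1.0 and v[3] near 0.5, while v[0] and v[1] remain near zero
```

Because the pre-whitened input is (approximately) white, only the kth term of (6.53) survives at each lag, so the first non-zero coefficient immediately reveals the pure time delay.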

Identification criteria

Further to the CCF approach, another well known, rather more general, approach to system
identification involves the use of various statistical criteria. These are usually determined
from the fit of a particular TF model to the data, but sometimes also penalise the criterion
by the number of parameters involved. In this regard, CAPTAIN provides three main
criteria: the ubiquitous Coefficient of Determination (R_T^2); the Akaike Information
Criterion (AIC, Akaike, 1974); and the Young Information Criterion (YIC, Young, 1984):

R_T^2 = 1 - σ_e^2 / σ_y^2 ;   σ_y^2 = (1/N) Σ_{k=1}^{N} [ y(k) - ȳ ]^2 ;   ȳ = (1/N) Σ_{k=1}^{N} y(k)        (6.54)

AIC = log_e( σ_e^2 ) + 2h/N                                                      (6.55)

YIC = log_e( σ_e^2 / σ_y^2 ) + log_e{ NEVN } ;   NEVN = (1/h) Σ_{i=1}^{h} ( σ_e^2 · p_ii ) / a_i^2        (6.56)

Here, σ_e^2 is the variance of the residuals, σ_y^2 the variance of the output and h is the
number of estimated parameters in the p parameter vector. With regards to YIC, p_ii is the


ith diagonal element of the covariance matrix P_t (so that σ_e^2 · p_ii can be considered as an
approximate estimate of the variance of the estimated uncertainty on the ith parameter
estimate); and a_i^2 is the square of the ith parameter in the p vector.

We see that the Coefficient of Determination R_T^2 is a statistical measure of how well the
model explains the data: if the variance of the model residuals σ_e^2 is low compared with
the variance of the data σ_y^2, then R_T^2 tends towards unity; while if σ_e^2 is of similar
magnitude to σ_y^2, then it tends towards zero. Note, however, that R_T^2 is based on the
variance of the model response errors and is not the more conventional R^2 based on the
variance of the one step ahead prediction errors. This is because R_T^2 is a more discerning
measure than R^2 for TF model identification: while it is often quite easy for a model to
produce small one step ahead prediction errors, since the model prediction is based on past
measured values of the output variable y_t, it is far more difficult for it to yield small
model response errors, where the model output is based only on the measured input
variable u_t and does not refer to y_t. In other words, R_T^2 is based on the simulation
response, while R^2 is based on the regression response (see also Example 5.1).

In a similar manner to R_T^2, AIC has a component related to the simulation fit, but is
penalised by the number of parameters in the model. In CAPTAIN, it is mainly utilised for
the identification of AR models using aic, since in the TF model context it has largely been
superseded by YIC, a more complex, heuristic criterion.

From the definition of R_T^2, we see that the first term of YIC is simply a relative measure of
how well the model explains the data: the smaller the model residuals, the more negative
the term becomes. The second term, on the other hand, provides a measure of the
conditioning of the instrumental variable cross product matrix, which needs to be inverted
when the IV normal equations (6.25) are solved: if the model is over-parameterised, then it
can be shown that this matrix will tend to singularity and, because of its ill-conditioning,
the elements of its inverse Pt will increase in value, often by several orders of magnitude.
When this happens, the second term in YIC tends to dominate the criterion function,
indicating over-parameterisation. An alternative justification of the YIC can be obtained
from statistical considerations (see e.g. Young, 1989). Although heuristic, the YIC has
proven very useful in practical identification terms over many years.
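For concreteness, the three criteria can be computed in a few lines. The Python sketch below uses invented residuals, parameter values and covariance diagonal simply to show the arithmetic of (6.54)-(6.56); it is not the rivid implementation:

```python
import numpy as np

def criteria(y, e, params, p_diag):
    """R_T^2, AIC and YIC from output y, model (simulation) residuals e,
    the estimated parameters and the diagonal of their covariance matrix."""
    N, h = len(y), len(params)
    s2e, s2y = e.var(), y.var()
    RT2 = 1.0 - s2e / s2y                                    # (6.54)
    AIC = np.log(s2e) + 2.0 * h / N                          # (6.55)
    NEVN = np.mean([s2e * pii / ai**2
                    for pii, ai in zip(p_diag, params)])
    YIC = np.log(s2e / s2y) + np.log(NEVN)                   # (6.56)
    return RT2, AIC, YIC

# invented numbers, for illustration only
y = np.sin(0.3 * np.arange(100))               # "output"
e = 0.1 * (-1.0) ** np.arange(100)             # "residuals", variance 0.01
RT2, AIC, YIC = criteria(y, e, params=[-0.8, 2.0], p_diag=[1e-4, 1e-4])
# a well-fitting, parsimonious model gives RT2 near 1 and a large negative YIC
```

Over-parameterisation would inflate the p_ii terms (often by orders of magnitude), making NEVN, and hence YIC, much less negative.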

The usual way to deal with the identification problem is by calculating these criteria for a
set of possible orders for each of the polynomials in (6.2). The user may then select the
most appropriate structure by utilising a combination of the numerical criteria calculated
by rivid, visualisation of the model fit, impulse response etc. and, given the objectives of
the modelling study, their own judgement.


6.4 Validation of Discrete-Time TF models

Before using the model for the purpose for which it was built, the final stage in a TF
modelling exercise is validation. Does the model provide an adequate representation of the
data? In this regard, the usual objective is that the estimated residuals should have the
standard properties of white noise, i.e. they should behave as a random sequence drawn
from a Gaussian distribution with constant variance and zero mean. The validation tools
available in CAPTAIN are designed to check this hypothesis.

In the context of TF modelling, the two most important tests are:

- Autocorrelation test of the residuals using the function acf.

- Cross-correlation tests between the residuals and the inputs (or pre-whitened inputs)
  using the function ccf.

One hypothesis not greatly highlighted in the chapter so far is that the system and noise
models are independent of each other. In fact, the estimation methods outlined above
assume this. Checking the independence of the residuals with respect to the input variables
is therefore an important validation step. In this regard, suppose that the true model is
given by equation (6.2),

y_t = v(L) u_t + D(L) e_t ,   i.e.,   y_t = [ B(L) / A(L) ] u_t + [ 1 / C(L) ] e_t        (6.57)

Assume now that an incorrect model yields the alternative residuals e*_t, i.e.,

y_t = v*(L) u_t + D*(L) e*_t                                                     (6.58)

These latter residuals may be written as,

e*_t = D*(L)^{-1} [ v(L) - v*(L) ] u_t + D*(L)^{-1} D(L) e_t                     (6.59)

There are two possible outcomes of a failed modelling procedure:

- The system model is correct, but the noise model is incorrect. In this case
  equation (6.59) simplifies to e*_t = D*(L)^{-1} D(L) e_t and the estimated residuals
  will be auto-correlated.

- The system model is incorrect. It is clear that no simplification applies to
  equation (6.59), so the residuals will be cross-correlated with the inputs and
  they will also be auto-correlated. This auto-correlation would not disappear even
  if the noise model were correct.


The importance of these tests is now clear: if the estimated residuals are auto-correlated,
this may be due either to a poorly estimated noise model or to a poorly estimated system
model. The cross-correlation with the inputs, however, clarifies which is the weak part of
the model.
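The whiteness check itself is simple to express. A Python/NumPy sketch of an acf-style test (the ±2/√T bounds are the usual large-sample approximation; the simulated residual series are invented for illustration and this is not the CAPTAIN acf function):

```python
import numpy as np

def acf_check(e, lags=10):
    """Sample autocorrelations of residuals e and a whiteness verdict per lag."""
    e = e - e.mean()
    T = len(e)
    r = np.array([(e[:T - k] @ e[k:]) / (e @ e) for k in range(1, lags + 1)])
    return r, np.abs(r) < 2 / np.sqrt(T)   # True where consistent with white noise

rng = np.random.default_rng(5)
e_white = rng.standard_normal(1000)        # residuals from an adequate model
r_w, ok_w = acf_check(e_white)

e_ar = np.zeros(1000)                      # auto-correlated residuals
for t in range(1, 1000):
    e_ar[t] = 0.8 * e_ar[t - 1] + rng.standard_normal()
r_a, ok_a = acf_check(e_ar)
# most lags of ok_w are True; the first lags of ok_a are clearly False
```

In practice this test would be paired with the cross-correlation test against the inputs, as described above, to decide whether the system or the noise model is at fault.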

6.5 Further Examples

The discussion above has largely focused on developing the various identification and
estimation algorithms, while the simulation examples have been designed to illustrate the
differences between LS, IV and RIV (Tables 6.1 to 6.3). By contrast, the final three
examples utilise both riv and rivid for the analysis of real data.

Example 6.4 Gas furnace

Figure 6.1 illustrates 296 samples of the input gas feed rate and corresponding output CO2
concentration from a gas furnace read at intervals of 9 seconds, taken from Box et al.
(1994). This example is interesting, not only because it is the central example of TF model
estimation cited in the previous reference, but also because the recursive IV analysis below
highlights certain important aspects of the data not pointed out by those authors.

>> load gas.dat
>> y = gas(:, 1); u = gas(:, 2); t = (9 : 9 : 9*296)';
>> subplot(211); plot(t, u); subplot(212); plot(t, y)


Figure 6.1 Input gas rate and output CO2 concentration from a gas furnace.


Box et al. (1994) identify the TF model structure using the CCF of the pre-whitened output
and input variables. This procedure is easily implemented using CAPTAIN, as shown below.
The first step is to find an ARIMA model for the stochastic input, i.e.,

>> u = u - mean(u); y = y - mean(y); % Remove means
>> acf(u, 12); % Identification of input model
>> aic(u, [], 1); % Identification via AIC

The sample autocorrelation and partial autocorrelation functions (Figure 6.2) suggest either
an AR(3) or an AR(4) model, while AIC identifies the AR(4) model.


Figure 6.2 Model identification of stochastic input.

The pre-whitening can be performed using the standard MATLAB filter function,

>> a = getpar(mar(u, 3)); % Estimation of model for input
>> yf = filter(a, 1, y); % Pre-whitening the output
>> uf = filter(a, 1, u); % Pre-whitening the input
>> ccf(yf, uf, 10); % Impulse response of TF

The second and third lines of this code generate the filtered output and input, i.e.,

yf_t = (1 + a1 L + a2 L^2 + a3 L^3) y_t
uf_t = (1 + a1 L + a2 L^2 + a3 L^3) u_t                (6.60)
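The effect of filter(a, 1, x) with a monic AR polynomial is simply the moving average in (6.60), computed with zero initial conditions. As a language-neutral illustration (a Python sketch, not part of the toolbox):

```python
def fir_filter(a, x):
    """y_t = a[0]*x_t + a[1]*x_{t-1} + ..., with zero initial conditions
    (the same convention as MATLAB's filter(a, 1, x))."""
    return [sum(a[j] * x[t - j] for j in range(len(a)) if t - j >= 0)
            for t in range(len(x))]

yf = fir_filter([1.0, 0.5], [1.0, 2.0, 3.0])  # [1.0, 2.5, 4.0]
```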

Based on Figure 6.3, which shows the impulse response between the pre-whitened input
and output, the model structure ascertained by Box and Jenkins is of the form,

y_t = [(b0 + b1 L + b2 L^2) / (1 + a1 L)] u_{t-3}                (6.61)

Figure 6.3 TF impulse response identification of input gas rate and output CO2 concentration.

The identification of the final noise model is based on an estimate of the perturbation
signal, obtained by subtraction of the model response from the output, i.e.,

ξ_t = y_t − [(b0 + b1 L + b2 L^2) / (1 + a1 L)] u_{t-3}                (6.62)
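Written out as a difference equation, the model response subtracted in (6.62) is x_t = −a1 x_{t−1} + b0 u_{t−3} + b1 u_{t−4} + b2 u_{t−5}. The following Python sketch simulates such a response (with illustrative parameter values, not the gas furnace estimates):

```python
def tf_response(a1, b, delay, u):
    """Simulate x_t = -a1*x_{t-1} + b[0]*u_{t-d} + b[1]*u_{t-d-1} + ...
    with zero initial conditions."""
    x = [0.0] * len(u)
    for t in range(len(u)):
        s = -a1 * x[t - 1] if t >= 1 else 0.0
        for j, bj in enumerate(b):
            if t - delay - j >= 0:
                s += bj * u[t - delay - j]
        x[t] = s
    return x

# Impulse response of a first-order TF with a three-sample delay
u = [1.0] + [0.0] * 6
x = tf_response(-0.5, [1.0, 0.0, 0.0], 3, u)  # x[3]=1.0, x[4]=0.5, x[5]=0.25
```

The perturbation estimate is then obtained element-wise as y_t minus this response.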

For the purposes of this example, the following code utilises bj from the MATLAB
System Identification Toolbox in order to obtain the ML estimates. However, the CAPTAIN
function riv can clearly be used instead.

>> TH = bj([y u], [3 1 0 0 3]); % Estimation of TF by ML


>> [A, B, C, D, F] = th2poly(TH); % Polynomial conversion
>> yh = filter(B, A, u); % Filtered output
>> e = y - yh; % Perturbation
>> acf(e, 10); % Noise model identification
>> aic(e, [], 1); % Identification via AIC

For brevity, the output from acf and aic are not shown here. However, they clearly identify
an AR(2) model for the noise. Therefore, the final model is of the form,

y_t = [(b0 + b1 L + b2 L^2) / (1 + a1 L)] u_{t-3} + [1 / (1 + c1 L + c2 L^2)] e_t                (6.63)

The ML and RIV parameter estimates are obtained as follows,

>> TH = bj([y u], [3 0 2 1 3]); % Final TF estimation by ML


>> [PARml, COVml] = th2par(TH);
>> th = riv([y u], [1 3 3 2]); % Estimation by RIV
>> [a, b, c, P] = getpar(th);

These estimates of the model (6.63) are shown in Table 6.4, together with the estimation
results given by Box et al. (1994).

Parameters ML RIV Box et al. (1994)


b0 -0.62 (0.08) -0.52 (0.07) -0.53 (0.08)
b1 -0.27 (0.11) -0.37 (0.10) -0.37 (0.15)
b2 -0.42 (0.11) -0.52 (0.11) -0.51 (0.16)
c1 -1.51 (0.05) -1.53 (0.04) -1.53 (0.05)
c2 0.59 (0.06) 0.63 (0.04) 0.63 (0.05)
a1 -0.59 (0.06) -0.53 (0.05) -0.57 (0.21)
Table 6.4 Estimation of TF models for the gas furnace data by ML and RIV. The results reported by
Box et al. (1994) are also shown. Standard errors are given in parentheses.

Interestingly, the RIV estimates are closer than the ML estimates to those reported by Box
et al. (1994), while the RIV standard errors are much closer to the ML ones than to those
in the original reference. In fact, the RIV standard errors are marginally smaller than in the
ML case. Finally, note that the standard error of the a1 parameter reported by Box et al.
(1994) is considerably higher than for the other two estimation methods; this may well be
an error in their calculations.

All the standard tests discussed above may be performed on the residuals from these
models but, for brevity, this will not be pursued further here. However, one aspect of the
modelling errors deserves particular attention: Figure 6.4 shows that the errors near the
end of the data set are quite large compared with the rest of the sample.
Figure 6.4 Errors from TF model estimated by RIV in Table 6.4.

Such deviation could arise for several reasons, but the most likely explanation is that the
data are non-stationary, particularly towards the end of the experiment. One
straightforward way to check this hypothesis with CAPTAIN, is by means of the dtfm
function introduced in Chapter 4.

>> [tfs, fit, fitse, par] = dtfm(y, u, [1 3 3 2], 0, 0, [], [], 0);
>> subplot(411), plot(t, par( :, 1))
>> subplot(412), plot(t, par( :, 2))
>> subplot(413), plot(t, par( :, 3))
>> subplot(414), plot(t, par( :, 4))

The time varying parameters obtained from this analysis are illustrated in Figure 6.5
below. It would be wrong to draw any firm conclusions from these preliminary results
because no information is given on the detailed nature of the experiment. For example, it
could be that something went wrong at the end of the experiment, or it may be that there
are changes in the dynamic characteristics of the gas furnace arising from some associated
changes in the physical characteristics of the process itself.

[Figure panels: recursive IV estimates of a(t), b0(t), b1(t) and b2(t) against time in minutes.]

Figure 6.5 Recursive estimation by IV methods of TF model in Table 6.4.

The poor definition of the recursive estimates might suggest identifiability problems.
However, the input appears to be persistently exciting, so there are no obvious problems
of this kind. Alternatively, the estimates may be poorly defined because the model
structure selected by Box et al. (1994) is over-parameterised. Both issues warrant further
attention in any subsequent analysis.

Example 6.5 Unemployment rate in the USA

In this example, the relationship between the quarterly rate of unemployment in the US
(from the second quarter of 1948 to the second quarter of 1998) and other macroeconomic
variables is considered. The results shown here are based on Young and Pedregal (1999),
which followed up earlier research by Young (1994) and Young and Pedregal (1997).

The data are illustrated in Figure 6.6, where the top graph shows quarterly variations in
unemployment rate for the USA. Below this are graphs of the quarterly variations of total
private investment and public expenditure, both given as percentage ratios of the Gross
National Product (GNP). These ratios will be called Relative Private Investment ( RPI t )
and Relative Public Expenditure ( RPEt ).

>> load usemp.dat


>> un = usemp(:, 1); gnp = usemp(:, 2);
>> g = usemp(:, 3); pi = usemp(:, 4);
>> RPI = pi./gnp*100; RPE = g./gnp*100;
>> t = (1948 : 0.25 : 1998.25)';
>> clf, subplot(311), plot(t, un, 'k');
>> subplot(312), plot(t, RPI, 'k');
>> subplot(313), plot(t, RPE, 'k');

[Figure panels: USA unemployment rate (%), Relative Private Investment (% of GNP) and
Relative Public Expenditure (% of GNP), 1948(2)-1998(2).]
Figure 6.6 Quarterly unemployment rate in the USA, together with the
Relative Private Investment and Public Expenditure.

These plots suggest that the long term rise in unemployment appears to be due to the
decline in Government spending, rather than to the level of private investment relative to
GNP. On the other hand, RPI t seems to be responsible for the short term fluctuations
in the unemployment rate, where a rise in RPI t yields a drop in unemployment and vice
versa.

A TF model for these relationships takes the form,

y_t = [B1(L) / A(L)] RPI_t + [B2(L) / A(L)] RPE_t + [1 / C(L)] e_t                (6.64)

The first step is to find out the optimal orders of the polynomials involved. These may be
determined using the ccf function, as in the previous example, or by rivid as follows,

>> Z = prepz([un RPI RPE], [], size(un, 1));


>> rivid(Z(1:164, :), [0 1 1 0 0 1; 1 2 2 2 2 2], 1, [3 2 3 0 0 1 0]);

In this case, the best 6 models in terms of YIC are listed in Table 6.5 below.

Parameter orders YIC R2 AIC


[a b1 b2 d1 d2 c]
[1 1 1 0 2 2] -3.13 0.450 0.485
[1 1 1 0 1 2] -3.03 0.454 0.478
[1 1 1 0 1 1] -2.97 0.314 0.706
[1 1 1 0 0 2] -2.89 0.494 0.403
[1 1 1 0 0 1] -2.86 0.349 0.654
[1 1 1 0 2 1] -2.80 0.238 0.812
Table 6.5 Partial output of function rivid applied to the unemployment data.

The model selected for further analysis is the one on the fourth line of Table 6.5. This is
because it has the highest RT2 while, in this case, YIC is very similar for all the models
shown. This model has the form,

y_t = [b1 / (1 + a1 L)] RPI_t + [b2 / (1 + a1 L)] RPE_t + [1 / (1 + c1 L + c2 L^2)] e_t                (6.65)

The RIV estimation results are displayed in Table 6.6 below.

Parameter Estimate Standard Error


a1 -0.514 0.0768

b1 -33.531 4.261

b2 -14.322 3.852

c1 -1.358 0.070

c2 0.414 0.070
Table 6.6 RIV estimation results for the unemployment data.

The adequacy of this model is illustrated by the range of potential tests implemented in
CAPTAIN, including: a plot of the residuals; their associated simple and partial
autocorrelation functions; and the various cross correlation functions. Although the plots
are not shown here, the relevant code is as follows,

>> [th, stats, E] = riv(Z(1 : 164, :), [1 1 1 0 0 2], [3 2 3 0 0 1 0]);


>> [a, b, c, Ps] = getpar(th) % parameter estimates
>> e = filter(c, 1, E); % Residuals
>> acf(e, 12);
>> ccf(e, Z(:, 2), 5);
>> ccf(e, Z(:, 3), 5);

It is interesting to explore what the model tells us about the relative effectiveness of public
and private investment in affecting the level of unemployment. Referring to the two TFs
in the model (Table 6.6), we see that the steady state gain (or the long term multiplier) of
the TF between RPI t and the unemployment rate ( G1 = b1 /(1 + a1 ) = 69.0 ) is 2.34 times
the steady state gain between RPEt and the unemployment rate ( G2 = b2 /(1 + a1 ) = 29.5 ).
Since the steady state gain is the steady level that the output of the TF concerned achieves
following a sustained unit step in the input variable, this means that a 0.01 (1%) permanent
increase in the relative level of private investment would lead to a permanent reduction of
1.45% in the unemployment rate; while a similar permanent increase in the relative level of
Government spending would only lead to a reduction of 0.62%.
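The gain calculations above are easily verified. The short Python check below uses the RIV estimates from Table 6.6; note that the signed gains are negative (increased investment or expenditure reduces unemployment), with the text quoting their magnitudes:

```python
a1, b1, b2 = -0.514, -33.531, -14.322  # RIV estimates from Table 6.6

G1 = b1 / (1 + a1)  # steady state gain RPI -> unemployment, approx -69.0
G2 = b2 / (1 + a1)  # steady state gain RPE -> unemployment, approx -29.5
ratio = G1 / G2     # approx 2.34
```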

A final interesting exercise is to generate forecasts of unemployment from the first quarter
of 1989 to the end of the sample, a period of about ten years. The main problem when
using a TF for forecasting purposes is that, if true out-of-sample (ex-ante) forecasts are
required, it is necessary to forecast the inputs themselves. Obviously, MATLAB and
CAPTAIN incorporate a wide range of tools to complete such forecasts, but experiments of
this kind are left to the reader. Instead, a reference forecast is illustrated in Figure 6.7,
where the exogenous variables are assumed known over the ten year period. This type of
reference forecast illustrates the best possible results that could be obtained from the model
and is useful for comparative purposes.

Figure 6.7 Forecasting exercise from first quarter 1989 to second quarter of 1998.

It is clear that the forecasts are reasonably good and so give some additional confidence in
the model. This forecasting performance depends upon whether, at the forecasting origin,

RPI t can be forecast well into the future based on its past behaviour. Since private
investment is notoriously difficult to forecast, certainly over such a long, ten year ahead
period, we can assume that the forecasting performance of the model will not always be as
good as that illustrated in Figure 6.7.

Example 6.6 Ventilation data re-visited

The final demonstration returns to the ventilation data set introduced in Example 1.2
(Chapter 1). In the earlier analysis, a first order model structure, with a three-sample time
delay, was assumed a priori. By contrast, the present example utilises rivid to assess the
appropriateness of this model structure. To simplify the analysis, the Graphical User
Interface (GUI) is activated by means of the 5th input argument to rivid, as shown below.

>> load vent.dat


>> [z, m] = prepz(vent, [], 25);
>> [th, stats, e] = rivid(z, [1 1 0 0; 3 3 3 0], [], [], 1);

The GUI displays a list of converged models, while the response of the currently
highlighted model (initially the model at the top of the list) is compared with the first
column of z in a separate figure window. In this case, the [1 1 3 0] model considered in
Example 1.2 has the lowest YIC, while the figure window is equivalent to the top graph of
Figure 1.3 (Chapter 1). The response of other models in the list can be quickly examined,
by clicking on the appropriate row.

In addition to the useful ability to immediately see the fit of any model in the list, the GUI
provides two further options. In the first place, a button can be clicked to eliminate models
with a relatively poor fit5. This option can be used to automatically remove occasional
problematic model structures that may have a good YIC, but a relatively low RT2 .
Clicking the same button again returns to the original list of models. Secondly, the models
can be re-sorted in order of RT2 , rather than the default YIC, by selecting the appropriate
radio button.

Finally, when the return key is pressed, the GUI is closed and the rivid output arguments
are returned to the workspace in the normal way. In this case, for example, the model
parameters are obtained using getpar, as shown in Example 1.2.

5 Here, the threshold eliminates models whose fit is more than 10% below the highest RT2 on the list.

6.6 Conclusion

In this chapter, the procedures implemented in CAPTAIN for the identification, estimation
and validation of Multiple-Input, Single-Output (MISO) Discrete Transfer Function (TF)
models have been described. The identification stage is based on the Cross-Correlation
function between the pre-whitened inputs and the output, or by utilising a range of
statistical criteria, primarily the Coefficient of Determination ( RT2 ), Young Information
Criterion (YIC) and Akaike Information Criterion (AIC).

The estimation is usually based on either the Refined Instrumental Variable (RIV)
algorithm or, when a model for the noise is not required, the Simplified Refined
Instrumental Variable (SRIV) algorithm. Since the RIV/SRIV approach may be less
familiar to many readers than the conventional Maximum Likelihood approach, the
algorithms have been discussed in some detail, showing that RIV is a fully optimal
procedure by which both the TF model parameters and the noise model are estimated
simultaneously.

The present chapter completes the discussion of discrete-time models, while the final
chapter considers the equivalent RIV algorithm for the estimation of continuous-time TF
models.


CHAPTER 7
CONTINUOUS-TIME
TRANSFER FUNCTION
MODELS

Chapter 6 introduced a Refined Instrumental Variable (RIV) approach for the estimation of
time invariant (although potentially recursively updated) linear, input-output models,
represented in discrete-time Transfer Function (TF) form. To complete the discussion,
therefore, the present Chapter considers the equivalent RIVC algorithm for the estimation
of continuous-time TF models expressed in terms of the derivative operator.

This approach contrasts with Chapters 2 to 5, which consider the identification of various
time variable and state dependent parameter state space models. Since state space models
originated from the state variable method of describing differential equations, it seems as
though the present text has now come full circle, turning to the problem of directly
identifying such differential equations from data. In many cases, the associated
hyper-parameters for the state space models are obtained by Maximum Likelihood (ML)
estimation. The present Chapter shows how, by virtue of its special exploitation of
adaptive prefilters on the input-output signals, the RIVC method can also be interpreted in
optimal ML terms. In this manner, the Chapter covers the final bullet point in the list of
parameter estimation algorithms (Section 2.3), whilst also providing an opening for the
toolbox user into the field of continuous-time analysis.

CAPTAIN provides two main functions in this context: (i) rivc for parameter estimation
when the model structure is specified by the user; and (ii) rivcid for system identification,
the latter allowing the user to automatically search over a whole range of different model
orders. Both functions provide similar statistical diagnostics to their discrete-time
equivalents, including the Coefficient of Determination RT2 and Young's Identification
Criterion (YIC). As previously discussed in Chapter 6, all these functions return the
modelling results in the form of a special theta matrix, from which the various parameters
and their standard errors may be extracted using getpar. As in the discrete time case, such
parameters may subsequently be utilised for simulation and forecasting through
conventional MATLAB commands or by using SIMULINK (MathWorks, 2001).

7.1 Continuous-Time Estimation

Since the early 1960s, numerous different approaches have been suggested for the
identification and estimation of continuous-time, linear TF models from normal operating
data (see e.g. the reviews by Young, 1981, and Unbehauen and Rao, 1997). However,
CAPTAIN utilises one of the algorithms first suggested many years ago, namely the iterative
or recursive-iterative Refined Instrumental Variable (RIVC) algorithm (Young and
Jakeman, 1980). This approach derives from the discrete-time RIV algorithm introduced in
the previous chapter and is also a logical development of the earlier, more heuristic
methods developed by the present third author (Young, 1970).

The RIVC algorithm has been used for many years in a wide range of practical
applications (e.g. Price et al., 1999). Furthermore, its advantages in comparison with other
algorithms have recently been demonstrated by Young (2002), who re-examines the
simulation example used by Wang and Gawthrop (2001). We will utilise this system in
Section 7.3 below. In particular, the RIVC estimation results obtained by Young (2002) are
compared with those obtained by Wang and Gawthrop, as well as those obtained using
another IV approach, namely the IVGPMF algorithm of Garnier et al. (1995). This third
algorithm uses the so-called Poisson Moment Functional (PMF) implementation of the
State Variable Filter (SVF) concept and it also relates very closely to much earlier work by
Young (1970), who referred to the PMF filter chain as the Method of Multiple Filters
(MMF).

In contrast to these other two algorithms, however, the RIVC approach does not require the
user to specify any aspect of the prefilters other than their dynamic order. Rather these
prefilters, which prove so important in continuous-time TF estimation, are adjusted in an
iterative fashion, so that they can perform two simultaneous functions: first to optimally
filter the data and so make the estimation more statistically efficient (i.e. lower and, in the
Gaussian normal case, minimum variance parameter estimates); and secondly, to generate
the filtered derivatives of the input and output signals.

In addition, the iterative, adaptive mode of solution used by the RIVC algorithm not only
ensures that, on convergence, the estimates have statistically optimum properties, it also
generates information on the parametric error covariance matrix. This information is useful
for subsequent Monte Carlo Simulation (MCS) analysis, as well as providing the standard
error bounds on the parameter estimates.

7.2 The RIVC Algorithm

The theoretical basis for the RIVC method can be outlined by considering the following
Single-Input, Single-Output (SISO) system, although the Multiple-Input, Single-Output
(MISO) extension is straightforward and also implemented in CAPTAIN (see Section 7.4).

x(t) = [B(s) / A(s)] u(t − τ)                (7.1)

y(t) = x(t) + e(t)                (7.2)

Here A(s) and B(s) are polynomials in the derivative operator s = d/dt of the form,

A(s) = s^n + a1 s^(n-1) + ... + a_(n-1) s + a_n
B(s) = b0 s^m + b1 s^(m-1) + ... + b_(m-1) s + b_m                (7.3)

and τ is any pure time delay in time units.

This model structure is denoted by the triad [n, m, τ]. In (7.1), u(t) is the input signal, x(t)
is the noise free output signal and y(t) is the noisy output signal. Initially, the noise e(t)
is considered as zero mean, white noise with Gaussian amplitude distribution and variance
σ^2, although we will see later that this assumption is not restrictive. Finally, it should be
pointed out that the notation above distinguishes all these continuous-time variables from
their discrete-time equivalents, e.g. compare y(t) with yt utilised in earlier chapters.

Likelihood Function

Following the ML approach, an error function that defines the likelihood is given by,

ε(t) = y(t) − [B(s) / A(s)] u(t − τ) = [1 / A(s)] { A(s) y(t) − B(s) u(t − τ) }                (7.4)

which is the basis for the response or output error estimation methods. However, since the
operators commute in this case, the 1/A(s) filter can be taken inside the brackets to yield
the expression,

ε(t) = A(s) y*(t) − B(s) u*(t − τ)                (7.5)

or,

ε(t) = s^n y*(t) + a1 s^(n-1) y*(t) + ... + a_(n-1) s y*(t) + a_n y*(t)
       − b0 s^m u*(t − τ) − b1 s^(m-1) u*(t − τ) − ... − b_m u*(t − τ)                (7.6)

where the superscript * indicates that the associated variable has been prefiltered by
1/A(s). The advantage of this transformation is that (7.5) is now linear in the unknown
parameters ai, i = 1, ..., n; bj, j = 0, ..., m, so that the associated estimation model can be
written in the form:

s^n y*(t) = z*(t)^T a + e(t)                (7.7)

where,

z*(t) = [ −s^(n-1) y*(t) ... −y*(t)   s^m u*(t − τ) ... u*(t − τ) ]^T
a = [ a1 ... a_n   b0 ... b_m ]^T                (7.8)

As a result, all of the prefiltered derivatives appearing as variables in this estimation model
are measurable as the inputs of the integrators that appear in the realization of the prefilter
1/A(s). Thus, provided we assume that A(s) is known, the estimation model (7.7) forms a
basis for the definition of a likelihood function and ML estimation.

There are two problems with this formulation. The obvious one is, of course, that A(s) is
not known a priori. The less obvious one is that, in practical applications, we cannot
assume that the noise e(t) will have the white noise properties assumed above: it is likely
that the noise will be a coloured noise process, say ξ(t). Both of these problems can be
solved by employing a similar approach to that used for discrete-time system estimation. In
other words, a relaxation optimization procedure is devised that adaptively adjusts an
initial estimate Â0(s) of A(s) iteratively until it converges on an optimal estimate of
A(s). Similarly, the coloured noise problem is solved conveniently by exploiting
Instrumental Variable (IV) estimation within this iterative optimization algorithm.

Iterative steps

For brevity, only the essential outline of the RIVC algorithm is described below. For
details, the reader is directed to Chapter 6, which fully describes the discrete-time version
from which RIVC derives. Like most IV methods, RIVC exploits an IV variable x̂(t)
generated from the following auxiliary model (Young, 1970),

x̂(t) = [B̂(s) / Â(s)] u(t − τ)                (7.9)

and an associated IV vector defined as,

x̂*(t) = [ −s^(n-1) x̂*(t) ... −x̂*(t)   s^m u*(t − τ) ... u*(t − τ) ]^T                (7.10)

CAPTAIN handbook D. J. Pedregal, C. J. Taylor and P. C. Young page 146


Chapter 7 Continuous-Time Transfer Function models

The RIVC algorithm can then be summarized as follows.

1. Select the initial Â0(s) polynomial either automatically or manually (see below).

2. Use Â0(s) to generate the prefiltered variables y*(t) and u*(t) and obtain estimates
   Â1(s) and B̂1(s) using linear least squares.

3. Iterate: k = 2 : ni (default ni = 4)

   a) Using Â(k-1)(s) and B̂(k-1)(s) to replace A(s) and B(s), respectively, generate
      both the prefiltered data vector z*(t) and the prefiltered instrumental variable
      vector x̂*(t), the latter using the IV variable x̂(t) generated by the auxiliary
      model (7.9).

   b) Calculate the IV estimate,

      â = C^(-1) b                (7.11)

      C = Σ_{i=1}^{N} x̂*(t_i) z*(t_i)^T                (7.12)

      b = Σ_{i=1}^{N} x̂*(t_i) s^n y*(t_i)                (7.13)

      where t_i denotes the ith of N sampling instants.

4. Generate an estimate of the parametric covariance matrix P̂ using the symmetric
   version of the IV algorithm (Young, 1970b, 1984), i.e.,

   P̂ = σ̂^2 [ Σ_{i=1}^{N} x̂*(t_i) x̂*(t_i)^T ]^(-1)                (7.14)

   where σ̂^2 is the variance of the model residuals. The square roots of the diagonal
   elements of P̂ provide the standard errors on â (see Theorem 7.1 below).
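The IV estimate in step 3(b) is just the solution of a linear system. The following Python toy example (a hypothetical two-parameter static regression, not continuous-time data) illustrates that, with exact instruments and noise-free observations, the solve a = C^(-1) b recovers the true parameters:

```python
def iv_estimate(instruments, regressors, obs):
    """Solve a_hat = C^{-1} b for two parameters, where C = sum_i xhat_i z_i^T
    and b = sum_i xhat_i y_i (cf. (7.11)-(7.13)); 2x2 case solved explicitly."""
    C = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for xh, z, y in zip(instruments, regressors, obs):
        for r in range(2):
            b[r] += xh[r] * y
            for c in range(2):
                C[r][c] += xh[r] * z[c]
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    return [(C[1][1] * b[0] - C[0][1] * b[1]) / det,
            (-C[1][0] * b[0] + C[0][0] * b[1]) / det]

true_a = [2.0, -1.0]
z = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
y = [sum(t * v for t, v in zip(true_a, zi)) for zi in z]  # noise-free obs
a_hat = iv_estimate(z, z, y)  # instruments = regressors here, so a_hat == true_a
```

In RIVC proper, of course, the instruments are the prefiltered auxiliary model outputs rather than the regressors themselves, which is what removes the noise-induced bias.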

Comments on pre-filter initialisation

The initiation of the above RIVC algorithm involves the selection of a suitable A 0 ( s )
prefilter polynomial. This is not a difficult task and there are several approaches available
in CAPTAIN. As shown later, these are specified through the 3rd and 4th input arguments to
rivc and rividc. An automatic option is based on the user specifying a suitable sub-
sampling interval for initial discrete-time estimation (this can often be the actual sampling
interval, since the discrete-time RIV algorithm is quite robust and works well even with

CAPTAIN handbook D. J. Pedregal, C. J. Taylor and P. C. Young page 147


Chapter 7 Continuous-Time Transfer Function models

rapidly sampled data). The discrete-time model obtained in this manner is then
automatically transformed to continuous-time form to provide the A 0 ( s ) polynomial.
Manual options include the user specification of a suitable single pole value to generate an
all-pole MMF/PMF prefilter or a full order prefilter. The latter two options can be
specified without iterative adaption, in which case the former is equivalent to IVGPMF.

Of course, the final estimates are quite robust to the specification of Â0(s), since the
prefilter is thereafter adaptively adjusted by the algorithm: Â0(s) is simply required as a
device to allow for the initial least squares estimation step. In Example 7.1 below, for
instance, automatic Â0(s) selection via a discrete-time model based on the actual
sampling interval is clearly the simplest user-option, but identical results are obtained if the
all-pole filter option (MMF/PMF) is selected for any specified pole value in the range
−0.007 to −6. In effect, therefore, this latter option is simple to use and virtually automatic.
Based on experience, it is the authors' preferred option.

Finally, note that the RIVC algorithm is computationally efficient. Again, in Example 7.1
below, it is over 5 times faster than the IVGPMF algorithm when both are implemented on
the authors' computer system1.

Theoretical Justification for the IV Variable

The following theorem is a generalization of a similar theorem for discrete-time TF models,
proven rigorously by Pierce (1972) and later, using simpler and somewhat less rigorous
prediction error analysis, by Young (1984, pp. 213-215).

Theorem 7.1

(i) If the e(t ) in (7.2) is a zero mean, Gaussian white noise process;

(ii) the parameter values are admissible (i.e. the model is stable and identifiable); and

(iii) u(t) is persistently exciting, then the ML estimate â_N, obtained from the data set of N
samples, possesses a limiting normal distribution, such that the asymptotic covariance
matrix of the estimation errors associated with the estimate â_N is of the form:

P = σ^2 [ Σ_{i=1}^{N} x̂*(t_i) x̂*(t_i)^T ]^(-1)                (7.15)

1 In this regard, the authors are most grateful to Professor Hughes Garnier for providing the IVGPMF
algorithm (as part of the CONTSID Matlab R_Toolbox) and for his advice on the use of the algorithm.

Proof. Modification of the proof in Pierce (1972) and Young (1984) to the continuous-time
case. Although there is no formal proof of convergence, simple arguments based on the
nature of the IV iterations, comprehensive MCS studies and experience over the last 20
years have shown that the algorithm is strongly convergent. Provided the above conditions
are satisfied, therefore, the converged RIVC estimates are optimal in a ML sense.

7.3 The Wang - Gawthrop Example

The present section is based on the same simulation example as recently employed by
Wang and Gawthrop (2001). For brevity, these authors will subsequently be referred to as
WG. The analysis is concerned with the identification and estimation of the following
continuous-time TF model,

y(t) = [(−2s + 1) / (s^3 + 1.6s^2 + 1.6s + 1)] u(t) + e(t)                (7.16)
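As a quick numerical check on the deterministic part of this system, note that its steady state gain is B(0)/A(0) = 1, so the step response should settle at unity. A forward Euler simulation in Python (an illustrative sketch, with the non-minimum-phase numerator −2s + 1 taken from the estimates in Example 7.2; any ODE solver would do):

```python
# x''' + 1.6 x'' + 1.6 x' + x = u,  y = x - 2 x'  (i.e. (-2s+1)/(s^3+1.6s^2+1.6s+1))
dt, T = 0.001, 40.0
x1 = x2 = x3 = 0.0            # x and its first two derivatives
u = 1.0                       # unit step input
for _ in range(int(T / dt)):
    dx3 = u - 1.6 * x3 - 1.6 * x2 - x1
    x1, x2, x3 = x1 + dt * x2, x2 + dt * x3, x3 + dt * dx3
y = x1 - 2.0 * x2             # settles at the steady state gain B(0)/A(0) = 1
```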

The TF (7.16) is enclosed within a feedback loop in series with a switch and a relay for
hysteresis: see WG (2001) for details. A typical noisy data set generated in this manner is
shown in Figure 7.1, which has the same general form and noise level as that used by WG.

Figure 7.1 Typical input (lower panel) and output data obtained from the WG relay experiment.

Example 7.1 Model Identification for the WG Example

The relatively short data set illustrated in Figure 7.1 provides a difficult estimation
challenge, particularly since the input excitation is rather poor (Young, 2002). These data,
which are loaded into the MATLAB workspace using the commands below, have a
sampling interval of 0.0333 seconds.

>> load swg01.dat


>> Ts = 100/3000; % sampling interval (seconds)
>> y = swg01 (:, 1); % output
>> u = swg01 (:, 2); % input
>> t = [0:Ts:1000*Ts-Ts]'; % time (seconds)
>> subplot(211), plot(t, y)
>> subplot(212), plot(t, u)

Assuming no prior knowledge, rivcid is utilised to identify a satisfactory model structure,
searching for up to 4 parameters each for the numerator and denominator polynomials and
specifying a time delay of either zero or unity. Although 13 model structures converge to a
solution, for brevity, only the first five in ascending order of YIC are shown below.

>> flag = [4 Ts 2 0 1]
>> [th, stats, e] = rivcid([y u], [1 1 0; 4 4 1], flag , 1);

___________BEST 5 models___________
den num del YIC Rt2 AIC S2 EVN condP
2 2 1 -7.8975 0.919303 0 1.1587e+000 0 0
2 2 0 -7.7328 0.917128 0 1.1899e+000 0 0
3 2 0 -7.2324 0.940471 0 8.5472e-001 0 0
3 2 1 -7.1765 0.940488 0 8.5449e-001 0 0
4 1 0 -7.0860 0.886135 0 1.6349e+000 0 0

The first element of the flag variable stipulates 4 iterations of the RIVC algorithm, i.e.
ni = 4 in Section 7.2 above, whilst the 2nd element is the sampling interval of the raw data.
The 5th element specifies that the models are sorted according to their YIC, hence the
[2 2 1] structure with YIC = −7.90 appears at the top of the list here. However, this is not
necessarily the most appropriate model: as for the discrete-time case, the models may
instead be sorted by RT2 . The unity 4th input argument to rivcid ensures that the initial
Â0(s) polynomial is transformed from an automatically estimated discrete-time model.
Here, the 3rd and 4th elements of flag set the discrete-time sampling rate (2) and ensure
that the pre-filter is adaptively updated at each iteration (0), respectively.

The output arguments take a similar form to the discrete-time equivalent rivid. In
particular, the parameters and other model information are returned in the th variable, stats
contains various statistical diagnostics and e are the modelling errors. However, in this

CAPTAIN handbook D. J. Pedregal, C. J. Taylor and P. C. Young page 150


Chapter 7 Continuous-Time Transfer Function models

case, we are primarily interested in the displayed list of models. In this regard, it appears
that the [3 2 0] model returns the best compromise between YIC and RT2 , with the models
above this in the list yielding a relatively poor fit. Of course, [3 2 0] is the structure of the
TF model (7.16) used to generate these data, so this is no surprise. This example
demonstrates that no identification criterion should be used in isolation.

Example 7.2 Parameter Estimation for the WG Example

The estimated parameters in the [3 2 0] model are obtained as follows,

>> [th, stats, e] = rivc([y u], [3 2 0], [20 Ts 2 0], 1);
>> [a, b, c, Ps] = getpar(th);
>> P = diag(Ps);
>> [a(2:end); sqrt(P(1:3))']
ans =
1.5967 1.6131 0.9931
0.0868 0.0437 0.0645
>> [b; sqrt(P(4:5))']
ans =
-2.0223 1.0226
0.1154 0.0319

The square roots of the diagonal elements of the estimated covariance matrix provide the
parameter standard errors, shown on the second row of each ans matrix above. These
estimates are very close to the actual parameter values from (7.16) and, not surprisingly,
the model output matches the deterministic output of the true simulated system (supplied
as the third column of the data file) very well, as shown in Figure 7.2.

>> x = swg01(:, 3); % deterministic system output
>> plot(t, x)
>> hold on
>> plot(t, y-e, 'linewidth', 2) % model response
>> plot(t, x-y+e+8) % off-set errors
>> plot(t, ones(size(y))*8)
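In general terms, the standard-error computation above amounts to taking the square
roots of the covariance diagonal. A minimal Python sketch using the values printed above
(the covariance matrix here is reconstructed from the displayed standard errors, purely
for illustration):

```python
import numpy as np

# parameter estimates for the [3 2 0] model, as printed above
theta = np.array([1.5967, 1.6131, 0.9931, -2.0223, 1.0226])
# illustrative covariance matrix whose diagonal reproduces the printed errors
P = np.diag([0.0868, 0.0437, 0.0645, 0.1154, 0.0319]) ** 2
se = np.sqrt(np.diag(P))            # parameter standard errors
rel = 100 * se / np.abs(theta)      # standard error as % of each estimate
```

For this example, every standard error is below 10% of its estimate, consistent with the
tight parameter estimates reported in the text.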

Comparison with other approaches

Young (2002) compares and contrasts these results with those of two other, sub-optimal,
continuous-time TF estimation procedures that have been suggested recently, namely the
WG (2000) and IVGPMF (Garnier et al., 1995) approaches mentioned in Section 7.1. In
particular, an MCS exercise similar to that carried out by WG reveals that the RIVC
algorithm is performing optimally and so verifies the theory behind the methodology.
Furthermore, the RIVC algorithm yields the best mean parameter values and standard
errors of the three methods.


Figure 7.2 Comparison of the deterministic TF output (7.16) and estimated model (thick trace),
together with the error signal, where the latter is off-set by +8.

One of the main differences between the RIVC and both the WG and IVGPMF algorithms
is the way in which the prefilters are chosen. In particular, the optimal RIVC prefilter is
generated via an iteratively updated, adaptive optimization procedure. For the present
example, this yields a prefilter A(s) = s^3 + 1.5967s^2 + 1.6131s + 0.9931 upon convergence
of this iterative procedure (compared with the optimal s^3 + 1.6s^2 + 1.6s + 1.0).

In the WG and IVGPMF approaches, on the other hand, the prefilter is fixed and defined as
1/C(s) using either (i) a much more computationally intensive optimization procedure (see
the WG paper), or (ii) manual selection (IVGPMF). Young (2002) examines the Bode
plots of these filters, where it is apparent that the RIVC filter is more precisely defined in
relation to the passband of the system. Its noise attenuation properties are therefore better
than those of the WG and IVGPMF prefilters, with the result that the statistical efficiency
is correspondingly higher.
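This passband argument can be checked numerically. The sketch below, in Python with
scipy as an illustrative stand-in for the Bode analysis described by Young (2002),
compares the gain of the converged RIVC prefilter 1/A(s) with that of the optimal
prefilter built from the true polynomial:

```python
import numpy as np
from scipy import signal

# Gain of the converged RIVC prefilter 1/A(s) versus the optimal prefilter
# built from the true polynomial s^3 + 1.6s^2 + 1.6s + 1.0.
w = np.logspace(-2, 2, 500)                                    # rad/s
_, h_est = signal.freqs([1.0], [1.0, 1.5967, 1.6131, 0.9931], worN=w)
_, h_true = signal.freqs([1.0], [1.0, 1.6, 1.6, 1.0], worN=w)
gain_db_est = 20 * np.log10(np.abs(h_est))
gain_db_true = 20 * np.log10(np.abs(h_true))
max_gap_db = np.max(np.abs(gain_db_est - gain_db_true))        # near-identical filters
```

The gap between the two gain curves is a fraction of a decibel across four decades of
frequency, confirming how close the adaptively generated prefilter is to the optimum.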

7.4 Multiple-Input Example

The discussion above has focused on a single input, single output simulation example. The
present section, therefore, turns to a multiple-input system based on a winding pilot plant
(Bastogne et al., 1998). Winding systems are in general continuous, nonlinear processes.
They are encountered in a wide variety of industrial plants such as rolling mills in the steel


industry and plants involving web conveyance, including coating, papermaking and polymer
film extrusion. The main role of a winding process is to control the web
conveyance in order to avoid the effects of friction and sliding, as well as the problems of
material distortion, which can also damage the quality of the final product.

Example 7.3 Analysis of winding pilot plant data

For the purposes of the present example, we consider experimental data with three input
signals, illustrated in Figure 7.3. These inputs are based on actuated motor currents and an
angular speed set point. They yield the angular speed output signal illustrated in Figure 7.4,
where the model fit is obtained using CAPTAIN as follows.

>> load wind.dat;
>> dt = 0.01; % sampling interval
>> t = [0:dt:(length(wind)-1)*dt]'; % time vector
>> z = prepz(wind, [], length(wind));
>> flags = [-1, dt, 1, 0, 2];
>> [th, stats, e] = rivcid(z, [1 1 1 1 0 0 0; 2 2 2 2 0 0 0], flags, 1);
>> plot(t, z(:, 1))
>> hold on
>> plot(t, z(:, 1)-e)
>> plot(t, e + 0.2)

In the analysis above, rivcid is utilised to identify and estimate a satisfactory model
structure, searching for up to 2 parameters each for the numerator and denominator
polynomials and specifying a time delay of zero for each input. The first column of wind
contains the output data, while the remaining columns are the input signals. The function prepz
pre-processes these data, in this case by simply removing the mean of the entire series. As
for Example 7.1, the initial A0(s) polynomial is converted from an estimated discrete-time
model, here with a unity discrete-time sampling rate.

The model fit and estimated parameters are found below,

>> RT2 = stats(3)
RT2 =
0.9836
>> [a, b] = getpar(th)
a =
1.0000 21.8398 60.0745
b =
-0.3738 10.3659
0.8663 72.3962
1.7392 14.9255


Figure 7.3 Winding pilot plant input data from Bastogne et al. (1998).

Figure 7.4 Winding pilot plant angular speed output (noisy), TF model (7.17) simulated output
and error signal, where the latter is off-set by +2.


In this case, it is clear that over 98% of the variation in the data is explained by the
following 2nd order TF model,

y(t) = (-0.3738s + 10.3659)/(s^2 + 21.8398s + 60.0745) u1(t)
     + (0.8663s + 72.3962)/(s^2 + 21.8398s + 60.0745) u2(t)          (7.17)
     + (1.7392s + 14.9255)/(s^2 + 21.8398s + 60.0745) u3(t)

where u1 (t ) , u 2 (t ) and u3 (t ) are the second, third and fourth columns of the data matrix.
If required, it is straightforward to solve such a continuous-time TF model using transfer
function blocks in SIMULINK, although this is outside the scope of the present discussion.
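Outside SIMULINK, the same response can be reproduced with any linear simulation
routine. A hedged Python/scipy sketch, simulating (7.17) by superposing the three
single-input responses (a unit step on every input is assumed purely for illustration):

```python
import numpy as np
from scipy import signal

# Simulate the multiple-input continuous-time TF model (7.17) by
# superposing the three single-input responses (the system is linear).
den = [1.0, 21.8398, 60.0745]                 # common denominator
nums = [[-0.3738, 10.3659],                   # numerator for u1
        [0.8663, 72.3962],                    # numerator for u2
        [1.7392, 14.9255]]                    # numerator for u3
t = np.arange(0.0, 5.0, 0.01)
u = np.ones((3, t.size))                      # illustrative unit step on each input

y = np.zeros_like(t)
for num, ui in zip(nums, u):
    _, yi, _ = signal.lsim((num, den), ui, t)
    y += yi
```

With steps on all three inputs, the output settles at the sum of the three steady-state
gains, i.e. (10.3659 + 72.3962 + 14.9255)/60.0745.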

7.5 Conclusions

The present Chapter has outlined the optimal RIVC algorithm for the identification and
estimation of continuous-time TF models. Despite its optimality, the algorithm is easier to
use than the other recently proposed algorithms discussed above, since it does not require
manual specification of the prefilter parameters, and it is also considerably faster
computationally.

It is interesting to note that, despite its obvious advantages, the RIVC algorithm is not
widely used. Perhaps this is because it exploits iteratively adaptive prefilters and so is
incorrectly perceived as too complicated and insufficiently robust for practical usage;
the availability of the present CAPTAIN toolbox may go some way to redress this. The
examples in this chapter demonstrate that such perceptions are clearly mistaken: the RIVC
algorithm involves a simple, rapidly convergent, iterative optimization procedure and
appears to be very robust in practical terms.



CHAPTER 8
TRUE DIGITAL CONTROL

The underlying philosophy of the True Digital Control (TDC) approach is that the entire
design procedure, from the identification and estimation of a suitable model, through to the
practical implementation of the final control algorithm, is carried out in discrete-time.
Therefore, in order to develop a control algorithm of the type used in TDC, we first require
a linearised, discrete-time Transfer Function (TF) representation of the system in question.
The identification and estimation of such models has been discussed in Chapter 6. The
approach yields Proportional-Integral-Plus (PIP) control systems with inherent model
predictive control action (Young et al., 1987; Taylor et al., 1998; 2000).

For now, consider a TF representation of the single-input, single-output system given by
equation (6.2). However, for consistency with numerous earlier publications, the analysis
will utilise the z^-1 notation introduced in Chapter 5 (rather than L). Furthermore, only the
deterministic version of the model is required, as follows,

y_k = (b1 z^-1 + ... + bm z^-m)/(1 + a1 z^-1 + ... + an z^-n) u_k = B(z^-1)/A(z^-1) u_k        (8.1)

where y_k is the measured output (to be controlled) and u_k is the control input (or signal to
the actuator), while A(z^-1) and B(z^-1) are appropriately defined polynomials in the
backward shift operator z^-1. For convenience, any pure time delay will be
accounted for by setting the corresponding leading parameters of the B(z^-1) polynomial to zero.
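This delay convention can be illustrated by simulating (8.1) directly. A minimal Python
sketch with illustrative coefficient values (not taken from any example in this handbook):

```python
import numpy as np
from scipy import signal

# y_k = B(z^-1)/A(z^-1) u_k with illustrative coefficients. The two extra
# leading zeros in b give B(z^-1) = 0.5 z^-3, i.e. a pure time delay of
# three samples in total (b1 = b2 = 0, b3 = 0.5).
b = [0.0, 0.0, 0.0, 0.5]    # coefficients of z^0, z^-1, z^-2, z^-3
a = [1.0, -0.8]             # A(z^-1) = 1 - 0.8 z^-1
u = np.ones(50)             # unit step input from k = 0
y = signal.lfilter(b, a, u)
# output is zero until the delay elapses, then settles at 0.5/(1 - 0.8) = 2.5
```

The first three output samples are exactly zero, reflecting the zero-valued leading
numerator parameters, after which the response rises to its steady-state gain.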

8.1 Non-Minimal State Space (NMSS) form

Equation (8.1) is represented by the following Non-Minimal State Space (NMSS) model,

x_k = F x_{k-1} + g u_{k-1} + d y_{d,k}
y_k = h x_k                                        (8.2)

The n+m dimensional state vector x_k consists of the present and past sampled values of
the output variable y_k, the past sampled values of the input variable u_k and the
integral-of-error state z_k, i.e.,

x_k = [ y_k  y_{k-1} ... y_{k-n+1}  u_{k-1} ... u_{k-m+1}  z_k ]^T        (8.3)

where z_k is the integral-of-error between the reference or command input y_{d,k} and the
sampled output y_k, defined as follows,

z_k = z_{k-1} + y_{d,k} - y_k        (8.4)

The state transition matrix F, input vector g, command input vector d and output vector h
of the NMSS system are subsequently defined below:

F = [ -a1   -a2  ...  -a_{n-1}  -a_n    b2    b3  ...  b_{m-1}   b_m   0
       1     0   ...    0        0      0     0   ...    0        0    0
       0     1   ...    0        0      0     0   ...    0        0    0
       :     :          :        :      :     :          :        :    :
       0     0   ...    1        0      0     0   ...    0        0    0
       0     0   ...    0        0      0     0   ...    0        0    0
       0     0   ...    0        0      1     0   ...    0        0    0
       0     0   ...    0        0      0     1   ...    0        0    0
       :     :          :        :      :     :          :        :    :
       0     0   ...    0        0      0     0   ...    1        0    0
       a1    a2  ...   a_{n-1}  a_n   -b2   -b3  ...  -b_{m-1}  -b_m   1 ]
                                                                         (8.5)
g = [ b1  0  0 ... 0  1  0  0 ... 0  -b1 ]^T
d = [ 0  0  0 ... 0  0  0  0 ... 0  1 ]^T
h = [ 1  0 ... 0  0  0  0 ... 0  0  0 ]

Inherent type 1 servomechanism performance is introduced by means of the integral-of-
error part of the state vector. If the closed-loop system is stable, then this ensures that
steady-state tracking of the command level is inherent in the basic design.

Example 8.1 Non-minimal state space form for the ventilation model

Consider again the ventilation system introduced by Example 1.2 and the associated TF
model given by equation (1.5), now represented using z^-1,

y_k = 79.8 z^-3 / (1 - 0.438 z^-1) u_k        (8.6)

where y k is the airflow (m3/h) and u k is the applied voltage to the fan expressed as a
percentage. The control algorithms developed below require that the model has at least one
sample pure time delay and that the first element of the denominator polynomial is unity.
For these reasons, equation (8.6) will be defined in MATLAB as follows:

>> A = -0.4381 % denominator polynomial with leading unity removed
>> B = [0 0 79.7835] % numerator with one sample time delay removed


The help message for the control functions in CAPTAIN refers to this as the truncated form of
the TF polynomials. For this example, x_k = [y_k  u_{k-1}  u_{k-2}  z_k]^T, while the NMSS matrices
and vectors in equation (8.2) are determined using nmssform as shown below.

>> [F, g, d, h] = nmssform(A, B)
F =
0.4381 0 79.7835 0
0 0 0 0
0 1.0000 0 0
-0.4381 0 -79.7835 1.0000
g =
0
1
0
0
d =
0
0
0
1
h =
1 0 0 0

For this example, b1 = b2 = 0, while b3 = 79.7835 and a1 = -0.4381.
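The construction performed by nmssform can be mirrored directly from equation (8.5).
The Python function below is an illustrative re-implementation (with the hypothetical name
nmss_form, assuming the truncated polynomial convention and m >= 2); it reproduces the
matrices displayed above for the ventilation model:

```python
import numpy as np

def nmss_form(a, b):
    """Build F, g, d, h of equation (8.5) from the truncated polynomials
    a = [a1 .. an] and b = [b1 .. bm] (b1 multiplies z^-1; assumes m >= 2).
    State: [y_k .. y_{k-n+1}, u_{k-1} .. u_{k-m+1}, z_k]."""
    a = np.atleast_1d(np.asarray(a, dtype=float))
    b = np.atleast_1d(np.asarray(b, dtype=float))
    n, m = len(a), len(b)
    dim = n + m
    F = np.zeros((dim, dim))
    F[0, :n] = -a                        # -a1 .. -an act on past outputs
    F[0, n:n + m - 1] = b[1:]            # b2 .. bm act on past inputs
    for i in range(1, n):                # shift register for past outputs
        F[i, i - 1] = 1.0
    for i in range(n + 1, n + m - 1):    # shift register for past inputs
        F[i, i - 1] = 1.0
    F[-1, :-1] = -F[0, :-1]              # integral-of-error row
    F[-1, -1] = 1.0
    g = np.zeros(dim); g[0] = b[0]; g[n] = 1.0; g[-1] = -b[0]
    d = np.zeros(dim); d[-1] = 1.0
    h = np.zeros(dim); h[0] = 1.0
    return F, g, d, h

# ventilation model: A = -0.4381, B = [0 0 79.7835]
F, g, d, h = nmss_form([-0.4381], [0.0, 0.0, 79.7835])
```

Evaluating this for the ventilation model gives exactly the F, g, d and h displayed in the
MATLAB session above.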

8.2 Proportional-Integral-Plus (PIP) control

The PIP control law associated with the NMSS model (8.2) takes the standard state
variable feedback form,

u_k = -k x_k        (8.7)

where k is the n + m dimensional control gain vector,

k = [ f_0  f_1 ... f_{n-1}  g_1 ... g_{m-1}  k_I ]        (8.8)

In more conventional block-diagram terms, the PIP controller (8.7) can be implemented as
shown in Figure 8.1, where it is clear that it can be considered as one particular extension
of the ubiquitous Proportional-Integral (PI) controller, in which the PI action is, in general,
enhanced by the higher order forward path and feedback compensators 1/G(z^-1)
and F1(z^-1), respectively, where G(z^-1) and F1(z^-1) are defined as follows,

F1(z^-1) = f_1 z^-1 + ... + f_{n-1} z^-(n-1)
G(z^-1) = 1 + g_1 z^-1 + ... + g_{m-1} z^-(m-1)        (8.9)


Figure 8.1 The PIP control system implemented in standard feedback form.

Note that the proportional control component (with gain f_0) is often incorporated into
F1(z^-1) to form a single feedback filter F(z^-1) = f_0 + f_1 z^-1 + ... + f_{n-1} z^-(n-1).

Since all the state variables in x k are readily stored in the digital computer, the PIP
controller is straightforward to implement in practice, without resort to the design and
implementation of a deterministic state reconstructor (observer) or a stochastic Kalman
filter. However, because it exploits the power of state variable feedback, PIP control still
allows for well-known strategies such as closed-loop pole assignment or optimisation in
terms of a Linear Quadratic (LQ) cost function of the form,


J = Σ_{i=0}^{∞} { x_i^T Q x_i + r u_i^2 }        (8.11)

where Q is an n+m by n+m matrix and r is a scalar weight on the input. In the latter regard,
it is worth noting that, due to the special structure of the non-minimal state vector, the
elements of the LQ weighting matrices have a particularly simple interpretation, since the
diagonal elements directly define the weights assigned to the measured variables and the
integral-of-error state. For example, Q can be formed conveniently as a diagonal matrix with
elements defined as follows,

Q = diag[ q_1  q_2 ... q_n  q_{n+1} ... q_{n+m-1}  q_{n+m} ]        (8.12)

Here, the user defined output weighting parameters q_1, q_2, ..., q_n and input weighting
parameters q_{n+1}, q_{n+2}, ..., q_{n+m-1} are generally set equal to common values of q_y
and q_u respectively; while q_{n+m} is denoted by q_e to indicate that it provides a weighting
constraint on the integral-of-error state variable z_k. In this formulation, the input weight is
defined as r = q_u. The default PIP controller is then obtained using total optimal control
weights of unity, i.e., q_y = 1/n, q_u = 1/m and q_e = 1. The resulting state variable
feedback gains are obtained from the steady state solution of the well known discrete time
matrix Riccati equation (e.g. Astrom and Wittenmark, 1984).
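This steady-state Riccati solution is readily sketched with standard tools. The Python
example below uses scipy's Riccati solver rather than CAPTAIN's own routines, applied to
the ventilation NMSS matrices of Example 8.1 with the default weights described above,
and confirms that the resulting closed loop is stable:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# LQ design for the ventilation NMSS model of Example 8.1, with the
# default weights q_y = 1/n, q_u = 1/m, q_e = 1 (n = 1, m = 3).
F = np.array([[0.4381, 0.0, 79.7835, 0.0],
              [0.0,    0.0, 0.0,     0.0],
              [0.0,    1.0, 0.0,     0.0],
              [-0.4381, 0.0, -79.7835, 1.0]])
g = np.array([[0.0], [1.0], [0.0], [0.0]])
n, m = 1, 3
Q = np.diag([1.0 / n, 1.0 / m, 1.0 / m, 1.0])   # diagonal weights (8.12)
R = np.array([[1.0 / m]])                       # input weight r = q_u

P = solve_discrete_are(F, g, Q, R)              # steady-state Riccati solution
k = np.linalg.solve(R + g.T @ P @ g, g.T @ P @ F)   # LQ gain vector
poles = np.linalg.eigvals(F - g @ k)            # closed-loop eigenvalues
```

All the closed-loop eigenvalues lie inside the unit circle, as guaranteed by LQ theory for a
stabilizable NMSS pair.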



Figure 8.2 Top: ventilation rate command input and simulated closed-loop response (thick trace).
Bottom: applied voltage expressed as a percentage.

Example 8.2 Pole assignment design for the ventilation model

The MATLAB commands below specify closed-loop poles for the ventilation system (8.6)
and plot the time response. For the purposes of this example, the poles have been
arbitrarily chosen as 0.5, 0.6 ± 0.1i and 0.7 on the complex z-plane.

>> A = -0.4381; B = [0 0 79.7835];
>> k = pip(A, B, [0.5 0.6+0.1i 0.6-0.1i 0.7]); % control gains
>> [acl, bcl, bclu] = pipcl(A, B, k); % closed-loop TF
>> r = zeros(100, 1); r(10:59) = ones(50, 1)+1000; % command input
>> y = filter(bcl, acl, r); subplot(211); plot([y r]) % output variable
>> u = filter(bclu, acl, r); subplot(212); plot(u) % input variable

Note that nmssform is automatically called by the pole assignment function pip and hence
is not explicitly required by the user in this case. The output argument of pip is the control
gain vector defined by equation (8.8), while pipcl determines the closed-loop input and
output polynomials obtained from Figure 8.1 (see e.g. Taylor et al., 1998). The standard
MATLAB function filter uses these to evaluate the time response illustrated in Figure 8.2.

Finally, to confirm that the closed-loop poles are those expected:

>> roots(acl)


ans =
0.7000
0.6000 + 0.1000i
0.6000 - 0.1000i
0.5000
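The same pole-assignment calculation can be checked independently of CAPTAIN. In the
Python sketch below, scipy's place_poles stands in for pip, applied to the NMSS matrices
of Example 8.1:

```python
import numpy as np
from scipy.signal import place_poles

# NMSS matrices for the ventilation model, as returned in Example 8.1.
F = np.array([[0.4381, 0.0, 79.7835, 0.0],
              [0.0,    0.0, 0.0,     0.0],
              [0.0,    1.0, 0.0,     0.0],
              [-0.4381, 0.0, -79.7835, 1.0]])
g = np.array([[0.0], [1.0], [0.0], [0.0]])

poles = [0.5, 0.6 + 0.1j, 0.6 - 0.1j, 0.7]
k = place_poles(F, g, poles).gain_matrix      # state variable feedback gains
achieved = np.linalg.eigvals(F - g @ k)       # closed-loop eigenvalues
```

The achieved closed-loop eigenvalues coincide with the requested poles, mirroring the
roots(acl) check above.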

Example 8.3 Optimal design for the ventilation model

PIP-LQ optimal design is implemented using pipopt, for which the last three input
arguments are multipliers for the diagonal elements of Q in equation (8.12), i.e. q_e, q_u
and q_y in that order. The example below is again based on the ventilation system (8.6),
here with q_e = 0.1, q_u = 6/m = 2 and q_y = 1/n = 1. For complete freedom to specify any Q
(e.g. with non-zero off-diagonal elements) use dlqri instead (see Section 8.4).

>> A = -0.4381; B = [0 0 79.7835];
>> [k, F, g, d, h, Q, R] = pipopt(A, B, .1, 6, 1)
Q =
1.0000 0 0 0
0 2.0000 0 0
0 0 2.0000 0
0 0 0 0.1000
R =
2

Finally, the function gains can be employed at any time to extract the control polynomials
from k. In this case, F(z^-1) = 0.0032, G(z^-1) = 1 + 0.7078 z^-1 + 0.5801 z^-2 and k_I = 0.0034.

>> [fpip, gpip, kpip] = gains(A, B, k)
fpip =
0.0032
gpip =
1.0000 0.7078 0.5801
kpip =
0.0034

If the SIMULINK extension to MATLAB is installed, then it is often more convenient to
use this package rather than pipcl and filter for closed-loop simulation, as illustrated in
Figure 8.3. In this manner, it is straightforward to investigate various PIP control
structures, the response to disturbance inputs and other issues. For example, if a nonlinear
simulation model is available, PIP control can be applied to this model using SIMULINK
(e.g. Taylor and Shaban, 2006).

In this case, CAPTAIN includes a SIMULINK library piplib.mdl for implementing various
PIP control structures, including those discussed in Sections 8.3 and 8.4 below. To open
the library, simply enter its name without the extension, i.e. piplib.


Figure 8.3 Top: dialogue box for PIP control showing the control filters obtained using gains as
demonstrated in Example 8.3. Bottom: SIMULINK diagram for PIP control of Equation (8.6).

8.3 PIP control structures

There are a number of different methods available for implementing the PIP control law, in
addition to that shown by Figure 8.1. In practical applications, for example, the PIP control
law derived from Figure 8.1 is always implemented in the following incremental form,

u_k = u_{k-1} + k_I { y_{d,k} - y_k } - F(z^-1) Δy_k - (G(z^-1) - 1) Δu_k        (8.13)

where Δ = 1 - z^-1 is the difference operator. Equation (8.13) allows for the correction
whereby any lagged u_k variables represent the practically realised input signal, accounting
for any level constraints. The latter are given default values of infinity in Figure 8.3 but
will, in general, be based on the physical system under study.

Equation (8.13) is not only the most obvious and convenient form for digital implementation,
but also provides an inherent means of avoiding integral windup, a problem that may occur
when the controller is subject to saturation on the control input. Without the correction
discussed above, a prolonged period when the input is saturated and unable to keep the


output at the set point causes the integrated error signal to build up, resulting in ever larger
demanded input signals that are not achievable in practice. Hence, when the saturation
finally ends, the controller could take several samples to recover and might even drive the
system into instability. The control algorithms included in piplib are all implemented in an
appropriate incremental form: the user can select 'Look Under Mask' in SIMULINK to
confirm this.

As another example of control structure, it is straightforward to eliminate the inner loop of
Figure 8.1 completely, to form a single forward path TF (or pre-compensation filter), so
that the control algorithm is based on a unity feedback of the output variable, as illustrated
in Figure 8.3. Here, the circumflex notation is introduced to represent the estimated model
parameters, as opposed to the unknown system parameters.

Figure 8.3 The PIP control system implemented in forward path form.

The CAPTAIN block library piplib includes both feedback and forward path forms of PIP
control. These will yield the same closed-loop response in the ideal case, when there is no
model mismatch and no disturbances. Finally, piplib also includes a block for PIP control
with command input anticipation, for which the control gains are obtained using pipcom.
For more information about the relative advantages and disadvantages of these and other
PIP control structures, see Taylor et al. (1998, 2000).

8.4 Multivariable PIP control system design

The extension of the SISO techniques to the multi-input, multi-output (MIMO) case is
relatively straightforward and the present section only gives a brief overview of the results
described by, for example: Young et al. (1994); Taylor and Shaban (2006). Consider the
following p-input, p-output, discrete-time system represented in terms of the left Matrix
Fraction Description or MFD:

y_k = [A(z^-1)]^-1 B(z^-1) u_k    or    A(z^-1) y_k = B(z^-1) u_k        (8.14)

where y_k = [ y_{1,k}  y_{2,k} ... y_{p,k} ]^T and u_k = [ u_{1,k}  u_{2,k} ... u_{p,k} ]^T, while the model
coefficients are A(z^-1) = I + A_1 z^-1 + ... + A_n z^-n and B(z^-1) = B_1 z^-1 + ... + B_m z^-m, in
which A_i (i = 1, 2, ..., n) and B_i (i = 1, 2, ..., m) are p by p matrices. If required, some of the


initial B terms could take null values to accommodate pure time delays in the system. The
state vector for the multivariable NMSS form is then defined as,

x_k = [ y_k^T  y_{k-1}^T ... y_{k-n+1}^T  u_{k-1}^T ... u_{k-m+1}^T  z_k^T ]^T        (8.15)

The integral-of-error vector is z_k = z_{k-1} + y_{d,k} - y_k, where y_{d,k} is the command input
vector, each element being associated with the relevant system output. Having defined the
above state vector, the NMSS representation can be formulated as follows,

x_k = F x_{k-1} + G u_{k-1} + D y_{d,k}
y_k = H x_k                                        (8.16)

F = [ -A_1   -A_2  ...  -A_{n-1}  -A_n    B_2    B_3  ...  B_{m-1}   B_m   0
       I_p    0    ...    0        0      0      0    ...    0        0    0
       0      I_p  ...    0        0      0      0    ...    0        0    0
       :      :           :        :      :      :           :        :    :
       0      0    ...    I_p      0      0      0    ...    0        0    0
       0      0    ...    0        0      0      0    ...    0        0    0
       0      0    ...    0        0      I_p    0    ...    0        0    0
       0      0    ...    0        0      0      I_p  ...    0        0    0
       :      :           :        :      :      :           :        :    :
       0      0    ...    0        0      0      0    ...    I_p      0    0
       A_1    A_2  ...   A_{n-1}  A_n   -B_2   -B_3  ...  -B_{m-1} -B_m   I_p ]

G^T = [ B_1^T  0  0 ... 0  I_p  0  0 ... 0  -B_1^T ]
D^T = [ 0  0  0 ... 0  0  0  0 ... 0  I_p ]
H = [ I_p  0  0 ... 0  0  0  0 ... 0  0 ]
Finally, Ip denotes a (p by p) identity matrix and 0 is an appropriately defined matrix of
zeros. Inherent type 1 servomechanism performance is introduced by means of the
integral-of-error state vector and, consequently, if the closed-loop system is stable, then
steady-state decoupling is inherent in the design. The multivariable state variable feedback
control law is defined in the usual fashion, i.e.,

u_k = -K x_k        (8.17)

where K is the control gain matrix. The equation for the closed-loop system becomes,

x_k = (F - GK) x_{k-1} + D y_{d,k}        (8.18)


and the poles (or eigenvalues) can be arbitrarily assigned if and only if the pair [F, G] is
completely controllable. However, it is well known that assignment of the closed-loop
poles of a multivariable system does not, in itself, uniquely specify the feedback gain
matrix K. Nevertheless, as in the SISO situation, the NMSS form can be used as a foundation
for the design of PIP-LQ optimal controllers. Here, the requirement is to design a control
gain matrix which minimises the following quadratic performance criterion:


J = Σ_{i=0}^{∞} ( x_i^T Q x_i + u_i^T R u_i )        (8.19)

where Q and R are symmetric positive semi-definite and symmetric positive definite
weighting matrices, respectively. The feedback gain matrix which minimises the cost
function is again determined by the steady state solution of the discrete time matrix Riccati
equation (e.g. Astrom and Wittenmark, 1984).

Multivariable PIP control system design using CAPTAIN

As discussed in Chapter 6, the CAPTAIN identification routines riv and rivid are used to
estimate TF models for single output, multiple input systems. These are straightforward to
combine into the required multivariable form (8.14). In particular, CAPTAIN includes
mfdform and nmssform to convert the TF models into MFD form (8.14) and hence into
the equivalent NMSS form (8.16). The LQ solution is obtained with dlqri as follows:

K = dlqri(F, G, Q, R)

Here, the output argument is the control gain matrix K; the 1st and 2nd input arguments
are the state transition and input matrices F and G from equations (8.16); and, finally, the
last two input arguments are the LQ weighting matrices Q and R. The multivariable PIP
control algorithm is most straightforwardly implemented using SIMULINK. In the latter
regard, piplib includes blocks for multivariable PIP control implemented in either
feedback or forward path form. The necessary settings are determined using mpipinit.

CAPTAIN provides on-line demonstrations of all these functions in action, whilst numerous
publications illustrate the application of both SISO and multivariable PIP control to a wide
range of difficult practical examples. An incomplete list includes: Chotai et al. (1991);
Young et al. (1994); Gu et al. (2004); Taylor et al. (2004); Taylor and Shaban (2006).



BIBLIOGRAPHY

Akaike, H. (1974) A new look at the statistical model identification, IEEE Transactions on
Automatic Control, AC-19, 716-723.
Akaike, H. (1980) Seasonal adjustment by a bayesian modeling, Journal of Time Series
Analysis, 1, 1, 1-13.
Ansley, C.F. (1979) An algorithm for the exact likelihood of a mixed auto-regressive
moving average process, Biometrika, 66, 59-65.
Astrom, K.J., and Wittenmark, B. (1984) Computer controlled systems: theory and design,
Prentice-Hall information and system sciences series.
Balke, N.S. (1993) Detecting level shifts in time series, Journal of Business and Economic
Statistics, 11, 1, 81-92.
Bastogne, T., Noura, H. Sibille, P. and Richard, A. (1998) Multivariable identification of
winding process by subspace methods for a tension control. Control Engineering
Practice, 6, 9, 1077-1088.
Beck, M.B. and Young, P.C. (1975) A Dynamic Model for DO-BOD Relationships in a
Non-Tidal Stream, Water Research, 9, 769-776
Biran, A. and Breiner, M. (1995), MATLAB for Engineers, Addison Wesley.
Box, G.E.P. and Jenkins, G.M. (1970), Time Series Analysis: Forecasting and Control,
revised edn, San Francisco: Holden-Day, 1976.
Box, G.E.P., Hillmer, S.C. and Tiao, G.C. (1978) Analysis and modelling of seasonal time
series, in A. Zellner (ed), Seasonal Analysis of Economic time Series, Washington D.
C.: US Dept. of Commerce-Bureau of the Census, 309-334.
Box, G.E.P., Jenkins, G.M. and Reinsel, G.C. (1994) Time Series Analysis, Forecasting
and Control, Englewood Cliffs, New Jersey: Prentice Hall International.
Brown, R.L., Durbin, J. and Evans, J.M. (1975) Techniques of testing the constancy of
regression relationships over time. Journal of the Royal Statistical Society B, 37,
141-92.
Bryson, A.E. and Ho, Y.C. (1969) Applied Optimal Control, Optimization, Estimation and
Control, Waltham: Blaisdell Publishing Company.

Chotai, A., Young, P.C. and Behzadi, M.A. (1991) Self-adaptive design of a non-linear
temperature control system, special issue on self-tuning control, IEE Proceedings
Control Theory and Applications, 38, 41-49.
Casals J., Jerez M. and Sotoca S. (2000) Exact Smoothing for stationary and non-stationary
time series, International Journal of Forecasting, 16, 59-69.
Cobb, G.W. (1978) The problem of the Nile: conditional solution to a change-point
problem, Biometrika, 65, 243-251.
De Jong, P. (1988) The Likelihood for a State Space Model, Biometrika, 75, 1, 165-169.
De Jong, P. (1991) Stable algorithms for the State Space model, Journal of Time Series
Analysis, 12, 2, 143-157.
Durbin, J. and Koopman, S.J. (2001) Time series analysis by state space methods, Oxford:
Oxford University Press.
Etter, D.M. (1993), Engineering problem solving with MATLAB, Prentice-Hall.
Findley, D.F., Monsell, B.C., Bell, W.R., Otto, M.C. and Chen, B.C. (1996) New
capabilities and methods of the X-12 ARIMA seasonal adjustment program, U.S.
Bureau of the Census, mimeo.
Garnier H., Sibille, P. and Richard, A. (1995) Continuous-time canonical state-space
model identification via Poisson moment functionals, Proceedings 34th IEEE
Conference on Decision and Control (CDC95), New Orleans - USA, 3004-3009.
Gu, J., Taylor J. and Seward, D. (2004) The automation of bucket position for the
intelligent excavator LUCIE using the Proportional-Integral-Plus (PIP) control strategy,
Journal of Computer-Aided Civil and Infrastructure Engineering, 19, 16-27.
Harrison, P.J. and Stevens, C.F. (1976) Bayesian Forecasting, Journal Royal Statistical
Society, Series B., 38, 205-247.
Harvey, A.C. (1989), Forecasting Structural Time Series Models and the Kalman Filter,
Cambridge: Cambridge University Press.
Hillmer, S.C. and Tiao, G.C. (1982) An ARIMA-Model based approach to seasonal
adjustment, Journal of the American Statistical Association, 77, 63-70.
Hillmer, S.C., Bell, W.R. and Tiao, G.C. (1983) Modelling Considerations in the Seasonal
Adjustment of Economic Time Series, in A. Zellner (Ed.), Applied Time Series Analysis
of Economic Data, Washington D. C.: US Dept. of Commerce-Bureau of the Census,
74-100.
Hodrick, T. and Prescott, E. (1997) Post-war US business cycles: an empirical
investigation, Journal of Money, Credit and Banking, 29, 1-16.
Hu, J., Kumamaru, K. and Hirasawa, K. (2001) A quasi-ARMAX approach to modelling
nonlinear systems, International Journal of Control, 74, 1754-1766.


Imbrie, J., Boyle, E.A., Clemens, S.C., Duffy, A., Howard, W.R., Kukla, G., Kutzback, J.,
Martinson, D.G., McIntyre, A., Mix, A.C., Molfino, B., Morley, J.J., Peterson, L.C.,
Pisias, N.G., Prell, W.L., Raymo, M.E., Shackleton, N.J. and Toggweiler, J.R. (1992)
On the structure and origin of major glaciation cycles: 1. Linear responses to
Milankovitch forcing. Paleoceanography, 7, 701-738.
Jakeman, A.J. and Young, P.C. (1981) Recursive filtering and the inversion of ill-posed
causal problems, Utilitas Math, 25, 351-376.
Jarque, C.M. and A.K. Bera (1980) Efficient tests for normality, homoskedasticity and
serial independence of regression residuals, Economic Letters, 6, 255-259.
Jarvis, A.J., Young, P.C., Taylor, C.J. and Davies, W.J. (1999) An analysis of the dynamic
response of stomatal conductance to a reduction in humidity over leaves of cedrella
odorata, Plant, Cell and Environment, 22, 913-924.
Kalman, R.E. (1960), A new approach to linear filtering and prediction problems, ASME
Transactions Journal Basic Engineering, 83-D , 95-108.
Kalman, R.E. and Bucy, R.S. (1961) New results in linear filtering and prediction theory,
ASME Transactions, Journal of Basic Engineering, 83-D, 95.
Koopman, S.J., A.C. Harvey, J.A. Doornik and Shephard, N. (2000) STAMP 6.0:
Structural Time Series Analyser Modeller and Predictor, Timberlake Consultants.
Ljung, G.M. and Box, G.E.P. (1978) On a measure of lack of fit in time series models,
Biometrika, 65, 297-303.
Maravall, A. and Gómez, V. (1998) Programs TRAMO and SEATS, Instructions for the
User (Beta Version: June 1998), Madrid: Bank of Spain.
Mathworks, The, (2001) Matlab, the language of technical computing. Version 6.1
(Release 12.1). The MathWorks Inc. (www.mathworks.com).
Mees, A. I., Aihara, K., Adachi, M., Judd, K., Ikeguchi, T. and Matsumoto, G. (1992)
Deterministic prediction and chaos in squid axon response, Physics Letters A,
169, 41-45.
Ng, C.N. and Young, P.C. (1990) Recursive estimation and forecasting of nonstationary
time series, Journal of Forecasting, 9, 173-204.
Parkinson, S. and Young, P.C. (1998) Uncertainty and sensitivity in global carbon cycle
modelling, Climate Research, 9, 157-174.
Pierce, D.A. (1972) Least squares estimation in dynamic disturbance time-series models,
Biometrika, 59, 73-78.
Price, L., Young, P., Berckmans, D., Janssens, K. and Taylor, J. (1999) Data-Based
Mechanistic Modelling (DBM) and Control of Mass and Energy transfer in agricultural
buildings, Annual Reviews in Control, 23, 71-82.

Price, L.E., Goodwill, P., Young, P.C. and Rowan, J.S. (2000) A data-based mechanistic
modelling (DBM) approach to understanding dynamic sediment transmission through
Wyresdale Park Reservoir, Lancashire, UK, Hydrological Processes, 14, 1, 63-78.
Price, L.E., Bacon, M.A., Young, P.C. and Davies, W.J. (2001) High-resolution analysis of
tomato leaf elongation: the application of novel time-series analysis techniques, Journal
of Experimental Botany, 52, 362, 1925-1932.
Priestley, M.B. (1989) Spectral Analysis and Time Series, 2 vols. (6th printing), London
and New York: Academic Press.
Ratto, M., Tarantola, S., Saltelli, A. and Young, P.C. (2004) Accelerated estimation of
sensitivity indices using State Dependent Parameter models, 4th International
Conference on Sensitivity Analysis of Model Output, SAMO 2004, March 8-11, Santa
Fe, NM, USA.
Schweppe, F. (1965) Evaluation of likelihood function for Gaussian signals, IEEE
Transaction Information Theory, 11, 61-70.
Shackley, S., Young, P.C., Parkinson, S. and Wynne, B. (1998) Uncertainty, complexity
and concepts of good science in climate change modelling: are GCMs the best tools?
Climatic Change, 38, 159-205.
Sims, C.A. (1980) Macroeconomics and reality, Econometrica, 48, 1-48.
Taylor, C.J. and Shaban, E.M. (2006) Multivariable Proportional-Integral-Plus (PIP)
control of the ALSTOM nonlinear gasifier simulation, IEE Proceedings Control Theory
and Applications, 153, 3, 277-285.
Taylor, C.J., Chotai, A. and Young, P.C. (1998) Proportional-Integral-Plus (PIP) control of
time delay systems, IMECHE Proceedings Journal of Systems and Control
Engineering, 212, Part I, 37-48.
Taylor, C.J., Chotai, A. and Young P.C. (2000) State space control system design based on
non-minimal state-variable feedback : Further generalisation and unification results,
International Journal of Control, 73, 1329-1345.
Taylor, C.J., Leigh, P., Price, L., Young, P.C., Berckmans, D., Janssens, K., Vranken, E.
and Gevers, R. (2004) Proportional-Integral-Plus (PIP) control of ventilation rate in
agricultural buildings, Control Engineering Practice, 12, 2, 225-233.
Taylor, C.J., Pedregal, D.J., Young, P.C. and Tych, W. (2007) Environmental Time Series
Analysis and Forecasting with the Captain Toolbox, Environmental Modelling and
Software, 22, 797-814.
Tych, W., Young, P.C., Pedregal, D. and Davies, J. (1999) A software package for multi-
rate unobserved component forecasting of telephone call demand. International Journal
of Forecasting.

Unbehauen, H. and Rao, G.P. (1997) Identification of continuous-time systems: a tutorial,
Proceedings 11th IFAC Symposium on System Identification, Kitakyushu, Japan,
1023-1049.
Wang, L. and Gawthrop, P. (2001) On the estimation of continuous-time TFs,
International Journal of Control, 74, 889-904.
West, M. and Harrison, J. (1989) Bayesian Forecasting and Dynamic Models, New York:
Springer-Verlag.
Ye, W., Jakeman, A.J. and Young, P.C. (1998) Identification of improved rainfall-runoff
models for an ephemeral low yielding Australian catchment, Environmental Modelling
and Software, 13, 59-74.
Young, P.C. (1970) An instrumental variable method for real-time identification of a noisy
process, Automatica, 6, 271-287.
Young, P.C. (1978) A general theory of modeling for badly defined dynamic systems, in
G.C. Vansteenkiste (ed.), Modeling, Identification and Control in Environmental
Systems, North Holland: Amsterdam, 103-135.
Young, P.C. (1981) Parameter estimation for continuous-time models - a survey,
Automatica, 17, 23-39.
Young, P.C. (1983) The validity and credibility of models for badly defined systems, in
M.B. Beck and G. Van Straten (eds.), Uncertainty and Forecasting of Water Quality,
Springer Verlag: Berlin, 69-100.
Young, P.C. (1984) Recursive Estimation and Time-Series Analysis, Berlin: Springer-
Verlag.
Young, P.C. (1985) The instrumental variable method: a practical approach to
identification and system parameter estimation. In: Barker, H. A. and Young, P.C.
(Eds.), Identification and System Parameter Estimation, Pergamon: Oxford, 1-15.
Young, P.C. (1993a) Concise Encyclopedia of Environmental Systems, Oxford: Pergamon
Press.
Young, P.C. (1993b) Time variable and state dependent modelling of nonstationary and
nonlinear time series. In: Subba Rao. T. (Ed.), Developments in time series analysis,
Chapman and Hall: London, 374-413.
Young, P.C. (1994) Time-variable parameter and trend estimation in non-stationary
economic time series, Journal of Forecasting, 13, 179-210.
Young, P.C. (1998a) Data-based mechanistic modelling of engineering systems, Journal of
Vibration and Control, 4, 5-28.

Young, P.C. (1998b) Data-based mechanistic modelling of environmental, ecological,
economic and engineering systems, Environmental Modelling and Software, 13, 105-122.
Young, P.C. (1999a) Nonstationary time series analysis and forecasting, Progress in
Environmental Science, 1, 3-48.
Young, P.C. (1999b) Data-based mechanistic modelling, generalised sensitivity and
dominant mode analysis, Computer Physics Communications, 117, 113-129.
Young, P.C. (2000) Stochastic, dynamic modelling and signal processing: time variable
and state dependent parameter estimation. In: W. J. Fitzgerald et al. (Eds.) Nonlinear
and nonstationary signal processing, Cambridge University Press: Cambridge, 74-114.
Young, P.C. (2001a) The identification and estimation of nonlinear stochastic systems. In:
A.I. Mees (Ed.), Nonlinear Dynamics and Statistics, Birkhauser: Boston.
Young, P.C. (2001b) Comment on A quasi-ARMAX approach to the modelling of
nonlinear systems by J. Hu et al., International Journal of Control, 74, 1767-1771.
Young, P.C. (2002) Optimal iv identification and estimation of continuous-time TF
models, International Federation of Automatic Control Triennial World Congress,
Barcelona.
Young, P.C. and Benner, S. (1991) microCAPTAIN 2 User handbook, Centre for Research
on Environmental Systems and Statistics, Lancaster University.
Young, P.C. and Beven, K.J. (1994) Data-based mechanistic modelling and the rainfall-
flow nonlinearity, Environmetrics, 5, 335-363.
Young, P.C. and Jakeman, A.J. (1980) Refined instrumental variable methods of recursive
time-series analysis: part III, extensions, International Journal of Control, 31, 741-764.
Young, P.C. and Lees, M.J. (1993) The Active Mixing Volume (AMV): a new concept in
modelling environmental systems, Chapter 1 in V. Barnett and K.F. Turkman (eds.),
Statistics for the Environment, J. Wiley: Chichester, 3-44.
Young, P.C. and Minchin, P. (1991) Environmetric time-series analysis: modelling natural
systems from experimental time-series data, Int. J. Biol. Macromol., 13, 190-201.
Young, P.C. and Pedregal, D.J. (1996) Recursive fixed interval smoothing and the
evaluation of LIDAR measurements: A comment on the paper by Holst et al.
Environmetrics, 7, 417-427.
Young, P.C. and Pedregal, D.J. (1997) Data-Based Mechanistic Modelling, in C. Heij et al.
(eds.), System Dynamics in Economic and Financial models, Chichester: J. Wiley.
Young, P.C. and Pedregal, D.J. (1998) Data-based mechanistic modelling (of macro
economic systems). Chapter 6 in C. Heij et al (Eds.), System Dynamics in Economic and
Financial Models, J. Wiley: Chichester and New York.

Young, P.C. and Pedregal, D.J. (1999a) Macro-economic relativity: government spending,
private investment and unemployment in the USA 1948-1998, Structural Change and
Economic Dynamics, 10, 359-380.
Young, P.C. and Pedregal, D.J. (1999b) Recursive and en bloc approaches to signal
extraction. Journal of Applied Statistics, 26, 103-128.
Young, P.C., Behzadi, M.A., Wang, C.L. and Chotai, A. (1987) Direct digital and adaptive
control by input-output, state variable feedback pole assignment, Int. J. of Control, 46,
1867-1881.
Young, P.C., Lees, M.J., Chotai, A., Tych, W. and Chalabi, Z. (1994) Modelling and PIP
control of a glasshouse microclimate, Control Eng. Practice, 2, 591-604.
Young, P., Parkinson, S. and Lees, M. (1996) Simplicity out of complexity: Occam's
razor revisited, Journal of Applied Statistics, 23, 165-210.
Young, P.C., Jakeman, A.J. and Post, D.A. (1997) Recent advances in the data-based
modelling and analysis of hydrological systems. Water Science and Technology, 36,
99-116.
Young, P.C., Pedregal, D.J. and Tych, W. (1999) Dynamic Harmonic Regression, Journal
of Forecasting, 18, 369-394.
Young, P.C., Price, L.E., Berckmans, D. and Janssens, K. (2000) Recent developments in
the modelling of imperfectly mixed airspaces, Computers and Electronics in
Agriculture, 26, 3, 239-254.
Young, P.C., McKenna, P. and Bruun, J. (2001) Identification of non-linear stochastic
systems by state dependent parameter estimation, International Journal of Control, 74,
1837-1857.



APPENDIX 1
REFERENCE GUIDE

Installation instructions and conditions of use are given in the preface. Since CAPTAIN is
largely a command line toolbox, it is assumed that the reader is already familiar with basic
MATLAB usage. However, the examples given in the body of the text, together with the
on-line demonstrations, should enable the inexperienced user to get started straight away.

CAPTAIN is usually distributed as a mixture of pre-parsed pseudo-code (P-files) and
conventional M-files. The user accessible M-files automatically call the P-files as required.
Appendix 1 lists each M-file in alphabetical order, showing the calling syntax and a brief
description of the associated input and output arguments. Straightforward examples of
function usage are provided in some cases. However, it should be stressed that, for brevity,
Appendix 1 does not discuss the background theory, acknowledge previous publications or
provide any worked examples. Furthermore, any models described are given in a
simplified form. For the formal definitions of these models and examples, the reader is
directed to the relevant chapters of the text by the See Also section of each entry.

On-line help information for CAPTAIN follows MATLAB conventions. For example, to
obtain a full list of functions, type help captain in the Command Window, where captain
is the name of the installation directory. Similarly, the brief calling syntax for each
function is obtained by entering its name without any input arguments, while more
information is provided using the standard help command. For example, the command
help period gives information about the CAPTAIN function for estimating the periodogram
of a series. In the latter case, each input argument is described in turn, followed by the
output arguments and any other information. Default values for any optional inputs are
given in brackets, whilst any necessary inputs are listed with an asterisk (*).

Such on-line help messages in CAPTAIN are kept deliberately concise, so that the
experienced user can find information quickly. For this reason, the present chapter
provides more descriptive information about each of the options. However, these pages are
best utilised in conjunction with the on-line help, since the exact calling syntax and default
values may depend on a particular CAPTAIN toolbox version and MATLAB release.
Appendix 1 Reference Guide

ACF
Sample Autocorrelation and Partial Autocorrelation functions. Computes these two
functions and graphs them against the lag. Additional outputs are the standard deviations of
both functions, together with the Ljung-Box Q autocorrelation test.

Synopsis

tab = acf(y, ncoef, s, out, par)

Description

The sample autocorrelation function measures the linear correlation between a time series
and several past values. The representation of these coefficients against the lag is called the
Autocorrelation Function (ACF). The Partial Autocorrelation Function (PACF) also
measures the linear correlation between different lags of a variable, but when all the
intermediate lags have been taken into account simultaneously.

The time series vector y (column) is the only compulsory input to this function.

The input argument ncoef is the number of lags to include in the graphical/tabular output.
A typical value is the length of the series divided by four. Input s is the seasonal period of
the time series, which adjusts the plot of all the seasonal coefficients in order to make the
identification of such lags more straightforward. Input out turns the tabular output on or
off and par is the number of parameters in the model, assuming that the series y is
effectively the residuals from a previously estimated model. This input is necessary in
order to use the correct degrees of freedom of the distribution of the Ljung-Box Q
autocorrelation statistic under the null hypothesis.

The output tab is the output table shown in the MATLAB command window when out is
set to 1. Here, the columns are the ACF and its standard error; the Q statistic and
its probability value (the probability to the right of the statistic under its
distribution); and the PACF and its standard errors.

Example

>> acf(randn(200, 1), 24, 4);
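
The ACF, PACF and Ljung-Box Q statistic follow standard textbook definitions. As a
language-neutral sketch of the underlying calculation (plain Python for illustration only,
not CAPTAIN code, whose implementation details may differ):

```python
def sample_acf(y, ncoef):
    # r_k = sum_t (y_t - ybar)(y_{t+k} - ybar) / sum_t (y_t - ybar)^2
    n = len(y)
    ybar = sum(y) / n
    c0 = sum((v - ybar) ** 2 for v in y)
    return [sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k)) / c0
            for k in range(1, ncoef + 1)]

def ljung_box_q(r, n):
    # Q = n(n+2) * sum_k r_k^2 / (n - k); approximately chi-squared under the
    # null of no autocorrelation (degrees of freedom reduced by the number of
    # estimated parameters, the 'par' input, when y is a residual series)
    return n * (n + 2) * sum(rk ** 2 / (n - k) for k, rk in enumerate(r, 1))

y = [1.0, 2.0, 3.0, 4.0, 5.0]
r = sample_acf(y, 2)        # [0.4, -0.1] for this short ramp
q = ljung_box_q(r, len(y))
```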

See Also

References and background theory are given in Chapter 2. See also Examples 3.1, 3.2 and
6.4. Related functions include arspec, aic, ccf and statist.

AIC
Akaike Information Criterion (AIC) for Auto-Regressive (AR) models. Computes a range
of AR models and returns the best one according to AIC.

Synopsis

arpoly = aic(y, mod, out)

Description

The time series vector y (column) is the only compulsory input to this function.

The second input argument mod defines the range of AR model orders for which to
compute the AIC. It may be a scalar, in which case the function will search from order 1 to
the specified order; or it may be a vector of dimension 2 indicating the minimum and
maximum AR orders to search in. The default value is the minimum of N/2 and 32, where
N is the series length. Finally, out specifies whether the function should provide tabular and
graphical output (1) or not (0 - default).

The function returns arpoly, i.e. the AR polynomial of the model selected by the AIC.

Examples

>> arpoly = aic(y);

Calculates the AR coefficients of the best model according to AIC, searching for model
orders 1 to 32 if the number of samples in y is greater than 64, or from order 1 to half the
size of the series otherwise.

>> arpoly = aic(y, [20 30], 1);

Selects the best AR model according to AIC, searching for model orders between 20 and
30 and showing the tabular and graphical outputs.
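
One standard way to scan AR orders, sketched below in plain Python, is to fit each model
from the sample autocovariances via the Levinson-Durbin recursion and select the order
minimising AIC(p) = N ln(sigma^2_p) + 2p. This is an illustrative sketch only; CAPTAIN's
internal AR estimator may differ:

```python
import math

def levinson(r, p):
    # Solve the Yule-Walker equations for AR order p, given autocovariances
    # r[0..p]; returns the AR polynomial [1, a1, ..., ap] and the prediction
    # error variance.
    a = [1.0] + [0.0] * p
    e = r[0]
    for k in range(1, p + 1):
        ref = -sum(a[j] * r[k - j] for j in range(k)) / e
        a = [a[j] + ref * a[k - j] for j in range(k + 1)] + a[k + 1:]
        e *= 1.0 - ref * ref
    return a, e

def best_ar_order(r, pmax, n):
    # AIC(p) = n*ln(e_p) + 2p, minimised over p = 1..pmax
    return min(range(1, pmax + 1),
               key=lambda p: n * math.log(levinson(r, p)[1]) + 2 * p)

# Autocovariances of an exact AR(1) process with coefficient 0.5:
r = [1.0, 0.5, 0.25]
a, e = levinson(r, 2)   # a = [1, -0.5, 0], e = 0.75: the lag-2 term vanishes
```

With these autocovariances the AIC correctly prefers order 1, since the extra parameter at
order 2 buys no reduction in the prediction error variance.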

See Also

References and background theory are given in Chapter 3. See also Examples 3.2 and 4.2.
Related functions include mar, univ and univopt.

ARSPEC
Computes the Auto-Regressive (AR) spectrum of a time series.

Synopsis

[amp, t, pks, amps] = arspec(y, nar, out, N, v)

Description

The time series vector y (column) is the only compulsory input to this function.

The second input argument nar is the desired AR model order on which the spectrum
estimation is based. If it is not supplied, is an empty matrix or is zero, the AR order is
automatically selected via the AIC criterion (see aic). The next input argument out is a
vector of dimension 2 that controls the tabular and graphical output of the function, allows
for a log scale, or selects a range of frequencies over which the AR spectrum should be
estimated. In particular, a value of 1 for out(1) sets the graphical and text output on, a
value of -1 turns the graphical output off, while both graphics and text are off for a value
of 0. The power axis is always plotted on a logarithmic scale, but a logarithmic frequency
axis may be selected by setting out(2) to 1. Any other value of out(2) is regarded as a
desired period, with the graphical output then limited to periods between 2 and out(2).

N may be a scalar or a vector. When scalar, it refers to the number of points into which the
frequency axis should be divided (default value is 1032). Alternatively, a vector is
interpreted as the frequency axis itself, normalised between 0 and 0.5 (corresponding to
periods from infinity down to 2). In order to make the visual inspection of the graphical output more
straightforward, v can be supplied as a vector of any size, indicating the periods at which
vertical lines should be plotted.

The function returns amp, the estimated AR spectrum; t, the frequency axis at which the
spectrum is estimated; pks, the frequencies at which peaks occur; and amps, the amplitudes
of the peaks. All these outputs are shown by the optional tabular output.

Examples

>> arspec(y)

Displays the AR spectrum selecting the optimal AR order by AIC.

>> arspec(y, 24, [1 1], 500, [12 6])

Computes the results for an AR(24) spectrum. The spectrum is calculated on 500
frequencies, with both tabular and graphical output generated. For the latter case, the
spectrum uses a logarithmic scale for the frequency axis and plots two vertical lines at
periods of 12 and 6.
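
The AR spectrum itself follows directly from the fitted AR polynomial:
S(f) = sigma^2 / |A(e^(-i*2*pi*f))|^2, evaluated on a grid of normalised frequencies
between 0 and 0.5. A minimal sketch of that evaluation (plain Python, not toolbox code):

```python
import cmath

def ar_spectrum(arpoly, var, freqs):
    # S(f) = var / |A(exp(-i*2*pi*f))|^2, with arpoly = [1, a1, ..., ap]
    return [var / abs(sum(a * cmath.exp(-2j * cmath.pi * f * k)
                          for k, a in enumerate(arpoly))) ** 2
            for f in freqs]

flat = ar_spectrum([1.0], 2.0, [0.0, 0.25, 0.5])   # white noise: flat at var
s0 = ar_spectrum([1.0, -0.5], 1.0, [0.0])[0]       # AR(1) at f=0: 1/|1-0.5|^2
```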

See Also

References and background theory are given in Chapter 3. See also Examples 3.1, 3.2, 3.3
and 3.4. Related functions include aic, acf, period, mar and dhropt.

BOXCOX
Optimal Box-Cox transformation for homoskedasticity. This function also plots the range-
mean plot and standard error-mean plot.

Synopsis

xt = boxcox(y, trans, bin, out)

Description

This function produces several diagnostics in order to look for time varying mean and
variance in a time series. It computes several Box-Cox transformations and assesses
each of them in terms of several criteria, including the standard error of the transformed
variable (which should be a minimum); the ratio of the mean to the standard error
(maximum); and the likelihood function (maximum).

In addition, it provides two graphical plots, useful to check for stationarity in the mean and
variance simultaneously. In order to build these graphs, the time series is divided into
several bins and the range, standard deviation and mean of each bin is estimated. The plots
consist of representing the mean against the standard deviation of each bin (standard error-
mean plot) and the range against the mean (range-mean plot).

The time series vector y (column) is the only compulsory input to this function.

The input trans is a vector of real values that selects the particular transformations for
which the criteria are computed; bin is the number of bins into which the variable is
divided or, if a vector, the limits of each bin; finally, out sets the output on or off.

This function returns the variable transformed according to the likelihood criterion.

Examples

>> boxcox(y);
>> boxcox(y, (-1 : 0.1 : 1), 20);
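
The Box-Cox family and the profile log-likelihood that is maximised take the standard
form below; a plain-Python sketch under the usual definitions (for positive data),
illustrative rather than the toolbox implementation:

```python
import math

def boxcox_transform(y, lam):
    # x = (y^lam - 1)/lam for lam != 0, and x = ln(y) for lam = 0 (y > 0)
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def boxcox_loglik(y, lam):
    # Profile log-likelihood: -n/2 * ln(sigma2(lam)) + (lam - 1) * sum(ln y)
    n = len(y)
    x = boxcox_transform(y, lam)
    m = sum(x) / n
    s2 = sum((v - m) ** 2 for v in x) / n
    return -0.5 * n * math.log(s2) + (lam - 1.0) * sum(math.log(v) for v in y)

y = [1.2, 3.4, 0.8, 5.1, 2.3, 4.0, 1.7, 2.9]
best = max((l / 10.0 for l in range(-10, 11)),
           key=lambda l: boxcox_loglik(y, l))   # grid search as in the example
```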

See Also

Related functions include cusum and histon.

CAPTDEMO
Captain Toolbox demonstrations.

Synopsis

captdemo

Description

This command initialises the standard MATLAB Demo Window for access to the on-line
demonstrations. It provides basic background information about CAPTAIN, slideshows and
numerous Command Line demos. The latter demos utilise the MATLAB Command
Window for input and output, as well as generating graphs in a separate figure window.

See Also

See also Section 1.3.

CCF
Sample Cross-Correlation Function between two variables.

Synopsis

tab = ccf(y, x, ncoef, s, title, out)

Description

The CCF measures the linear correlation between the selected output and input variables at
different lags. It is defined for positive and negative values of the lag. Positive values mean
that the output leads the input and vice versa. The function displays a standardised plot of
both the time series and the CCF in graphical format (with confidence bands so that a
significance test for each coefficient may be completed) and numerical output.

The time series vectors y (column) and x are the only compulsory inputs to this function.

The input argument ncoef is an indication of the number of coefficients to calculate. The
total number of coefficients is 2*ncoef+1, because it starts at -ncoef and ends at ncoef,
including lag 0, where the latter is the contemporaneous linear correlation between both
variables.

Other inputs to this function are s, the seasonal period that is used in the graphs for a
quick location of such values; title is a string variable indicating the required figure title;
and out sets the graphical and numerical output on or off, depending on whether the user
wishes only to compute the CCF or prefers to visualise the output.

The output is a table displayed on the MATLAB command window. Here, the columns are
the CCF values; their standard deviations; the Ljung-Box Q statistic of cross-correlation;
and its probability value, i.e. the probability to the right of the statistic under its
distribution.

Examples

>> x = [zeros(100, 1); ones(100, 1)];
>> y = filter([2 0.8], [1 0.8], x) + randn(200, 1);
>> ccf(y, x, 10);
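
The sample CCF is the standardised cross-covariance at each lag. The sketch below
(plain Python, not toolbox code) follows the convention in the text: the value at positive
lag k correlates the input at time t with the output at time t + k, so a pure delay in the
output shows up as a peak at a positive lag:

```python
def sample_ccf(y, x, ncoef):
    # Returns 2*ncoef + 1 values for lags -ncoef..ncoef; element [ncoef] is lag 0
    n = len(y)
    ybar, xbar = sum(y) / n, sum(x) / n
    sy = sum((v - ybar) ** 2 for v in y) ** 0.5
    sx = sum((v - xbar) ** 2 for v in x) ** 0.5
    out = []
    for k in range(-ncoef, ncoef + 1):
        if k >= 0:   # input x(t) against later output y(t+k)
            s = sum((x[t] - xbar) * (y[t + k] - ybar) for t in range(n - k))
        else:        # output y(t) against later input x(t-k)
            s = sum((x[t - k] - xbar) * (y[t] - ybar) for t in range(n + k))
        out.append(s / (sx * sy))
    return out

x = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
y = [0.0] + x[:-1]          # output is the input delayed by one sample
r = sample_ccf(y, x, 2)     # peak at lag +1, i.e. at index ncoef + 1
```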

See Also

References and background theory are given in Chapter 6. See also Examples 4.1 and 6.4.
Related functions include acf and statist.

CREATETH
Modifies the theta-matrix to specify parametric uncertainty for Monte Carlo Simulation.

Synopsis

th = createth(th, P, a, b)

Description

Here, th is the theta-matrix with information about the transfer function model structure,
estimated parameters and their estimated accuracy (see help theta), while P is the
parameter covariance matrix. The 3rd and 4th input arguments are the truncated form
system denominator a (with assumed leading unity) and numerator b (with assumed unit
delay) polynomials. If these are omitted then the parameter values in th are left unchanged
and only the covariance matrix component is modified. The function returns a modified th
matrix. It is typically used to evaluate the robustness of Proportional-Integral-Plus (PIP)
control systems using Monte Carlo Simulation.

Example

>> th = riv([y u], [1 1 1 0])
>> [a, b, c, P] = getpar(th)
>> th = createth(th, P*10)
>> [aa, bb] = mcpar(th, 50)
>> v = pipopt(aa(1, :), bb(1, :), 1, 1, 1)
>> for f=1:50
>> [acl, bcl]=pipcl(aa(f, :), bb(f, :), v);
>> y(:, f)=filter(bcl, acl, ones(30, 1));
>> end
>> plot(y)

Generate 50 Monte Carlo realisations from an estimated transfer function model and
evaluate the robustness of a closed-loop PIP control system. In this example, createth is
used to scale up the parametric uncertainty by a factor of 10.
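
The Monte Carlo step amounts to drawing parameter vectors from a multivariate normal
distribution with mean given by the estimates and covariance P. A minimal sketch of that
sampling for a two-parameter case (plain Python with a hand-coded 2x2 Cholesky factor;
the parameter values and covariance below are illustrative, not CAPTAIN output):

```python
import math
import random

def mc_realisations(theta, P, n, seed=0):
    # Draw n samples from N(theta, P) for a 2-parameter model, using the
    # Cholesky factor L of P, so that theta + L*z has covariance P.
    l11 = math.sqrt(P[0][0])
    l21 = P[1][0] / l11
    l22 = math.sqrt(P[1][1] - l21 ** 2)
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((theta[0] + l11 * z1, theta[1] + l21 * z1 + l22 * z2))
    return draws

theta = (0.8, 0.5)                    # e.g. estimated a and b parameters
P = [[0.04, 0.01], [0.01, 0.09]]      # illustrative parameter covariance
draws = mc_realisations(theta, P, 20000)
```

Each draw would then be used to build a closed-loop system and simulate its response, as
in the MATLAB example above.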

See Also

References and background theory are given in Chapters 6 and 8. See also mcpar.

CUSUM
CUSUM and CUSUMSQ tests for constancy of the mean and variance of a time series.

Synopsis

[CUSUM, CUSUMSQ] = cusum(y)

Description

This function provides the usual CUSUM (cumulative sum) and CUSUM of squares of a
given time series. The CUSUM is computed as the cumulative sum of the standardised
series. If the mean changes along the series there will be runs of positive or negative
observations; here, the cumulative sum will drift up or down the zero line. The formal
assessment is completed on the basis of the 5% confidence bands that are also plotted in
the graphical output. Crossing any of the boundaries is an indication of lack of constancy
in the mean.

The CUSUM of squares is similar, except that the cumulative sum is performed on the
squares of the standardised series. It therefore always increases as time increases, but if
the time series is homoskedastic this increase should be linear, i.e. it should not depart
from the 45° line.

The input y is just the time series, and the outputs CUSUM and CUSUMSQ are the
cumulative sum and the cumulative sum of squares, respectively.

Example

>> cusum(randn(200, 1));
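
Both statistics reduce to running sums of the standardised series; a plain-Python sketch of
the quantities described above (confidence bands omitted; illustrative rather than toolbox
code):

```python
def cusum_stats(y):
    # CUSUM: running sum of the standardised series (drifts away from zero if
    # the mean changes). CUSUMSQ: running sum of its squares, normalised so the
    # final value is 1 (linear in time if the variance is constant).
    n = len(y)
    m = sum(y) / n
    s = (sum((v - m) ** 2 for v in y) / n) ** 0.5
    z = [(v - m) / s for v in y]
    c, acc = [], 0.0
    for v in z:
        acc += v
        c.append(acc)
    total = sum(v * v for v in z)
    csq, acc = [], 0.0
    for v in z:
        acc += v * v
        csq.append(acc / total)
    return c, csq

c, csq = cusum_stats([2.0, 1.0, 3.0, 2.5, 1.5, 2.0, 3.5, 0.5])
```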

See Also

Related functions include boxcox and histon.

DAR
Dynamic Auto-Regression (DAR) and time frequency analysis. Computes the DAR model
for user specified model structure and optionally plots the time frequency spectra.

Synopsis

[fit, fitse, par, parse, comp, e, H, y0] = ...
      dar(y, na, TVP, nvr, alpha, P0, x0, sm, ALG, PL, res, Nc)

Description

The time series vector y (column) is the only compulsory input to this function. The
function automatically handles missing values in y. In fact, y may be appended with
additional NaNs to forecast or backcast beyond the original series. It is usually preferable
to standardise y before using this function although this is left up to the user, see e.g.
help stand. The remaining input arguments are optional. The AR model structure is
defined by na, which is a scalar or vector listing the required past output variables used in
the model. For example, [1:5, 20] specifies a model based on y(t-1) to y(t-5) plus a
y(t-20) component (i.e. subset AR).

TVP is a vector specifying the model associated with each AR model parameter, listed in
order of increasing powers of the backward shift operator L. Choices include an RW/AR(1)
model by default (0) or an IRW/SRW model (1). For the case of AR(1) or SRW models,
alpha less than unity specifies the additional parameter, while the default value of unity
implies a RW or IRW model. For example, a 1st order autoregressive process requires
TVP set to zero and 0<alpha<1, where alpha is the AR(1) parameter. Similarly, for a
SRW model, TVP is set to unity and 0<alpha<1, where alpha is the smoothing parameter.

nvr is a vector of NVR hyperparameters for each regressor where, for example, zero
(default) implies time invariant parameters. The initial state vector and diagonal of the
P-matrix may be specified using x0 and P0, with default values of 0 and 1e5 respectively.
FIS may be turned off by changing sm from its default unity to 0. In this case, the model fit
and estimated parameters are their filtered values. This speeds up the algorithm and
reduces memory usage in cases when smoothing is not required. Finally, either the P (0) or
default Q (1) smoothing algorithms are selected using the ALG input argument. Here, the
latter is often more robust for RW/IRW models, while SRW models require use of the
former. In general, should convergence problems be encountered, changing the algorithm
in this manner may help.

Use PL to specify a time frequency spectra graph. The default value of zero calculates the
spectra and returns the output argument H but does not display the graph, while -1 returns
an empty H, which saves memory and computation time. A value of 1 graphs a 3d coloured
surface with contours, 2 a 2d contour plot and 3 a stacked plot. The associated input arguments res
and Nc control the dimension of H, namely 2^res by length(par), and the number of
contours (PL = 2) or stacking distance (PL = 3) respectively. Experiment with res and Nc
to obtain the best plot for different computer systems and data sets. The value of Nc is
ignored when PL = 0 or PL = 1.

If the lengths of TVP, nvr, alpha, P0 or x0 are less than the AR model order, then they are
automatically expanded to the correct dimensions by using the final element of the
specified input vector. For example, if the model has 3 AR parameters but TVP is defined
as [1 0], then TVP is automatically expanded to [1 0 0]. Similarly, a scalar P0 implies an
identity matrix scaled by this value.

The function returns the model fit (with the same dimensions as y) and parameters par
(one column for each regressor), together with the associated standard errors in each case,
fitse and parse. It also returns each of the linear components of the model comp, the
normalised innovations sequence e and interpolated data y0, where the latter consist of the
original series with any missing data replaced by the model. Note that fit is the sum of the
columns in comp, while the normalised innovations are padded with initial NaNs to ensure
that the vector is the same size as y. If statistical tests on the innovations are required,
remove these NaNs with the command e = e(~isnan(e)). Finally, H is the DAR spectrum;
for example, use surf(H) to plot the time spectra as a surface.

Examples

>> fit = dar(y, [1 12], [1 0], 0.001, [0.95 1])

AR type model for the output y(t) = a(1,t)y(t-1) + a(2,t)y(t-12), with an SRW model
(alpha = 0.95) for the first TVP a(1,t) and an RW model for a(2,t). NVR = 0.001 in both
cases.

>> dar(y, [1:3], [], [0.1 0])

3rd order AR model with RW models for the parameters, using NVR = 0.1 for the a(1,t)
parameter and 0 for all the rest.
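
At the core of DAR is a Kalman filter in which each AR coefficient evolves as a
stochastic state (e.g. a random walk), with the NVR fixing the ratio of the
parameter-variation variance to the observation-noise variance. A heavily simplified
scalar sketch for a single RW parameter (plain Python; the toolbox additionally performs
FIS smoothing, handles missing data and supports the other TVP models):

```python
def rw_param_filter(y, x, nvr, a0=0.0, p0=1e5):
    # Filtered estimates of a_t in y_t = a_t * x_t + e_t, with a_t a random
    # walk of NVR = q / sigma^2 (observation noise variance normalised to 1).
    a, p, est = a0, p0, []
    for yt, xt in zip(y, x):
        p = p + nvr                        # prediction: RW adds variance nvr
        k = p * xt / (xt * xt * p + 1.0)   # Kalman gain
        a = a + k * (yt - xt * a)          # correction with the innovation
        p = (1.0 - k * xt) * p
        est.append(a)
    return est

# Noise-free data from a constant parameter a = 2: the estimate converges to 2
x = [1.0, 2.0, 1.5, 3.0, 2.5, 1.0, 2.0, 3.0]
est = rw_param_filter([2.0 * v for v in x], x, nvr=0.0)
```

A non-zero nvr lets the estimate track a genuinely time-varying parameter, at the cost of
more variance in the estimates; nvr = 0 recovers the time-invariant (recursive least
squares) case.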

See Also

References and background theory are given in Chapter 4. See also Example 4.2. Related
functions include daropt, darsp, fcast, stand, darx and dtfm.

DARDEMO
Demonstration script for Dynamic Auto-Regression (DAR).

Synopsis

dardemo

Description

Type the name of this script and press return to run the on-line demo.

DAROPT
Hyper-parameter estimation for Dynamic Auto-Regression (DAR) analysis.

Synopsis

[nvr, alpha, opts, parse] = ...
      daropt(y, na, TVP, meth, nvrc, alphac, nvr0, alpha0, opts, ALG, tab, P0)

Description

The time series vector y (column) is the only compulsory input to this function. The
function automatically handles missing values in y. In fact, y may be appended with
additional NaNs to forecast or backcast beyond the original series. It is usually preferable
to standardise y before using this function although this is left up to the user, see e.g.
help stand. The remaining input arguments are optional. The AR model structure is
defined by na, which is a scalar or vector listing the required past output variables used in
the model. For example, [1:5, 20] specifies a model based on y(t-1) to y(t-5) plus a
y(t-20) component (i.e. subset AR).

TVP is a vector specifying the model associated with each regression parameter, listed in
order of higher powers of the backward shift operator L. Choices include a RW/AR(1)
model by default (0) or a IRW/SRW model (1). meth is the estimation method, where the
default Maximum Likelihood ml may be replaced by f# to compute the sum of squares
of the #-step-ahead forecasting errors.

nvrc defines the constraints for each NVR, where -2 implies free estimation, -1
constrained estimation (all parameters with nvrc = -1 are equal) and >=0 implies the
associated NVR is constrained to this value (it is not estimated). alphac defines similar
constraints for each parameter (-2, -1, or >=0 as for nvrc). Initial NVR and hyper-
parameters may be specified using nvr0 and alpha0 respectively. For example, to
optimise for a RW or IRW model, ensure alphac and alpha0 are unity (the default). To
optimise component i for a SRW, set the i'th element of TVP to unity and alphac(i) to -2,
-1, or fixed at 0<alphac<1. This normally produces an improved fit to the spectrum, but
computation time is longer.

Optimisation options may be set using opts (type help foptions for details), while ALG
specifies the optimisation algorithm: fmins (0), fminu (1) or leastsq (2). Here, ALG
selects between the more efficient gradient search methods of fminu (see help fminu) and
the more robust (especially for discontinuous problems) direct search methods of fmins
(see help fmins). When meth = f# there is the additional option of using leastsq (see


help leastsq). Note that if ALG = 2 and meth = ml, then an error occurs, since leastsq
cannot be used in the Maximum Likelihood case. The Optimisation Toolbox for MATLAB
is required to use fminu or leastsq.

Finally, tab defines the display options and P0 specifies the initial diagonal of the P
matrix. Here, if tab = 1 or 2, then the final results are displayed in tabular form.
Additionally, if tab = 2, a window appears during optimisation showing the latest value of
the Likelihood Function or the Sum-of-Squares for the #-step-ahead forecasting errors.
When ALG = 0 (fmins) and tab = 2, a stop button will appear below the update window:
click to terminate the optimisation and return the current estimates.

If the lengths of TVP, nvrc, alphac, nvr0, alpha0 or P0 are less than the AR model
order, then they are automatically expanded to the correct dimensions by using the final
element of the specified input vector. For example, if na specifies 3 parameters but TVP
is defined as [1 0], then TVP is automatically expanded to [1 0 0]. Similarly, a scalar P0
implies an identity matrix scaled by this value.

The function returns vectors of NVR and hyperparameters, nvr and alpha respectively.
opts provides confirmation of the options utilised, together with the number of function
evaluations etc. (type help foptions for details). For the case of AR(1) or SRW models,
alpha less than unity specifies the additional parameter, while the default value of unity
implies a RW or IRW model. Finally, parse are the standard errors of the NVR and alpha
(if optimised) hyper-parameters. However, computation time can sometimes be greatly
reduced if this 4th output argument is omitted from the function call.

Examples
>> nvr = daropt(y, [1 12], 0, [0 -2])

AR type model yt = a1,t yt-1 + a2,t yt-12 with a RW model for both parameters and a1,t
assumed constant (NVR fixed at zero).
>> nvr = daropt(y, [1:3])

3rd order AR model, with a RW model for all the parameters and the 3 NVR hyper-
parameters estimated simultaneously.
>> daropt(y, [1:3], [], 'f4 36')

Optimisation based on sum of the squares of the 4-step-ahead forecasting errors with initial
conditions from the AR(36) spectrum.
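
The optimised hyper-parameters are normally passed straight back to dar for the final
analysis. The following sketch assumes y has already been standardised and uses the
subset AR structure from the first example above:

>> [nvr, alpha] = daropt(y, [1 12], [1 0])
>> fit = dar(y, [1 12], [1 0], nvr, alpha)

Here, daropt estimates the NVR and alpha hyper-parameters and dar then computes the
model fit with these values.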

See Also

References and background theory are given in Chapter 4. See also Example 4.2. Related
functions include dar, fcast and stand.


DARSP
Dynamic Auto-Regression (DAR) spectra plot. Computes the time spectra plot using time
varying DAR parameters.

Synopsis

[H, ph] = darsp(par, res, PL, Nc, dt)

Description

The parameter matrix par from DAR is the only compulsory input to this function.

Use PL to specify the type of time spectra graph. The default zero calculates the spectra
and returns the output arguments but does not display the graph. A value of 1 graphs a 3d
coloured surface with contours, 2 a 2d contour plot and 3 a stacked plot. The associated
input arguments res and Nc control the dimension of H, namely 2^res by length(par), and
the number of contours (PL = 2) or stacking distance (PL = 3) respectively. Experiment
with res and Nc to obtain the best plot for different computer systems and data sets. The
value of Nc is ignored when PL = 0 or PL = 1. Finally, dt is the sampling rate (e.g.
seconds) used when plotting the graph.

The output arguments H and ph are the DAR spectrum (for example, use surf(H) to plot
the time spectra as a surface) and phase (in radians) respectively.

Example

>> [fit, comp, par] = dar(y, p)
>> darsp(par)
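
The plotting options may be varied without re-estimating the model, since darsp works
directly from the parameter matrix par. For example, assuming par has already been
returned by dar:

>> [H, ph] = darsp(par, 7, 0)
>> darsp(par, 7, 2, 20)
>> darsp(par, 7, 3, 2)

The first call computes a 2^7 by length(par) spectrum without displaying the graph, the
second graphs a 2d contour plot with 20 contours and the third a stacked plot with
stacking distance 2.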

See Also

References and background theory are given in Chapter 4. See also Example 4.2. Related
functions include dar, daropt, aic and arspec.


DARX
Dynamic AutoRegressive multi-eXogenous (DARX) variables analysis.

Synopsis

[tfs, fit, fitse, par, parse, comp, e, y0] =
darx(y, u, nn, TVP, nvr, alpha, P0, x0, sm, ALG)

Description

The time series vector y (column) and input matrix u are specified by the user. Each
column of u represents an input signal. The function automatically handles missing values
in y. In fact, y may be appended with additional NaNs to forecast or backcast beyond the
original series, as long as appropriate values for u are also specified. The remaining input
arguments are optional. The DARX model structure is defined by nn, which takes the form
[n, m, δ] where, in transfer function terms, n and m are the number of denominator and
numerator parameters respectively, while δ is the number of samples time delay. A first
order model with unity time delay and one numerator parameter [1, 1, 1] is utilised by
default.

TVP is a vector specifying the model associated with each DARX model parameter, listed
in order of each denominator parameter and then the numerator parameters for each input,
i.e. a1,t, …, an,t, b0,t, …, bm,t for the single input, single output example in equation (4.10).
Choices include a RW/AR(1) model by default (0) or a IRW/SRW model (1). For the case
of AR(1) or SRW models, alpha less than unity specifies the additional parameter, while
the default value of unity implies a RW or IRW model. For example, a 1st order
autoregressive process requires TVP set to zero and 0<alpha<1, where alpha is the AR(1)
parameter. Similarly, for a SRW model, TVP is set to unity and 0<alpha<1, where alpha
is the smoothing parameter.

nvr is a vector of NVR hyperparameters for each regressor where, for example, zero
(default) implies time invariant parameters. The initial state vector and diagonal of the
P-matrix may be specified using x0 and P0, with default values of 0 and 1e5 respectively.
FIS may be turned off by changing sm from its default unity to 0. In this case, the model fit
and estimated parameters are their filtered values. This speeds up the algorithm and
reduces memory usage in cases when smoothing is not required. Finally, either the P (0) or
default Q (1) smoothing algorithms are selected using the ALG input argument. Here, the
latter is often more robust for RW/IRW models, while SRW models require use of the
former. In general, should convergence problems be encountered, changing the algorithm
in this manner may help.


If the lengths of TVP, nvr, alpha, P0 or x0 are less than the total number of parameters,
then they are automatically expanded to the correct dimensions by using the final element
of the specified input vector. For example, if the DARX model has 3 parameters but TVP
is defined as [1 0], then TVP is automatically expanded to [1 0 0]. Similarly, a scalar P0
implies an identity matrix scaled by this value.

The function returns the simulation response tfs (with the same dimensions as y),
regression fit and parameters par (one column for each parameter), together with the
associated standard errors in the latter two cases, fitse and parse. Here, tfs is based on
feeding the input signal through the model (the output signal is not used, except to
establish the initial conditions), while fit represents the 1-step ahead predictions and is
equivalent to the fit returned by dlr.

The function also returns each of the linear components of the model comp, i.e. the
components associated with each input and output and their past values, the normalised
innovations sequence e and interpolated data y0, where the latter consist of the original
series with any missing data replaced by the model. Note that fit is the sum of the columns
in comp, while the normalised innovations are padded with initial NaNs to ensure that the
vector is the same size as y. If statistical tests on the innovations are required, remove these
NaNs with the command e = e(~isnan(e)).

Example

>> darx(y, [u1 u2], [1 1 1 2 3], 0, 0.001)

Difference equation yt = at yt-1 + b1,t u1,t-2 + b2,t u2,t-3 with RW models for all three
parameters (NVR = 0.001).
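
Forecasts follow directly from the NaN handling described above, provided the inputs
are available over the forecast horizon. For example, assuming u1 and u2 are known (or
themselves forecast) for a further 12 samples:

>> yf = [y; nan(12, 1)];
>> [tfs, fit] = darx(yf, [u1 u2], [1 1 1 2 3], 0, 0.001);
>> fit(end-11:end)

The final line extracts the 12 forecast values of the 1-step ahead predictions; the
associated standard errors are obtained from fitse in the same way.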

See Also

References and background theory are given in Chapter 4. See also Example 4.4. Related
functions include darxopt, fcast, stand, dar and dtfm.


DARXDEMO
Demonstration script for Dynamic AutoRegressive multi-eXogenous (DARX) variables
analysis.

Synopsis

darxdemo

Description

Type the name of this script and press return to run the on-line demo.


DARXOPT
Hyper-parameter estimation for Dynamic AutoRegressive multi-eXogenous (DARX)
variables analysis.

Synopsis

[nvr, opts, parse]
= darxopt(y, u, nn, TVP, meth, nvrc, nvr0, opts, ALG, tab, P0)

Description

The time series vector y (column) and input matrix u are specified by the user. Each
column of u represents an input signal. The function automatically handles missing values
in y. In fact, y may be appended with additional NaNs to forecast or backcast beyond the
original series, as long as appropriate values for u are also specified. The remaining input
arguments are optional. The DARX model structure is defined by nn, which takes the form
[n, m, δ] where, in transfer function terms, n and m are the number of denominator and
numerator parameters respectively, while δ is the number of samples time delay. A first
order model with unity time delay and one numerator parameter [1, 1, 1] is utilised by
default.

TVP is a vector specifying the model associated with each DARX parameter, listed in
order of each denominator parameter and then the numerator parameters for each input, i.e.
a1,t, …, an,t, b0,t, …, bm,t for the single input, single output example in equation (4.10).
Choices include a RW model by default (0) or a IRW model (1). meth is the estimation
method, where the default Maximum Likelihood ml may be replaced by f# to compute
the sum of squares of the #-step-ahead forecasting errors. nvrc defines the constraints for
each NVR, where -2 implies free estimation, -1 constrained estimation (all parameters with
nvrc = -1 are equal) and >=0 implies the associated NVR is constrained to this value (it is
not estimated). Initial NVR hyper-parameters may be specified using nvr0.

Optimisation options may be set using opts (type help foptions for details), while ALG
specifies the optimisation algorithm: fmins (0), fminu (1) or leastsq (2). Here, ALG
selects between the more efficient gradient search methods of fminu (see help fminu) and
the more robust (especially for discontinuous problems) direct search methods of fmins
(see help fmins). When meth = f# there is the additional option of using leastsq (see
help leastsq). Note that if ALG = 2 and meth = ml, then an error occurs, since leastsq
cannot be used in the Maximum Likelihood case. The Optimisation Toolbox for MATLAB
is required to use fminu or leastsq.


Finally, tab defines the display options and P0 specifies the initial diagonal of the P
matrix. Here, if tab = 1 or 2, then the final results are displayed in tabular form.
Additionally, if tab = 2, a window appears during optimisation showing the latest value of
the Likelihood Function or the Sum-of-Squares for the #-step-ahead forecasting errors.
When ALG = 0 (fmins) and tab = 2, a stop button will appear below the update window:
click to terminate the optimisation and return the current estimates.

If the lengths of TVP, nvrc, nvr0 or P0 are less than the total
number of parameters, then they are automatically expanded to the correct dimensions by
using the final element of the specified input vector. For example, if the DARX model has
3 parameters but TVP is defined as [1 0], then TVP is automatically expanded to [1 0 0].
Similarly, a scalar P0 implies an identity matrix scaled by this value.

The function returns nvr, the vector of NVR hyper-parameters. opts provides confirmation
of the options utilised, together with the number of function evaluations etc. (type help
foptions for details). Finally, parse are the standard errors of the NVR hyper-parameters.
However, computation time can sometimes be greatly reduced if this 3rd output argument
is omitted from the function call.

Example

>> darxopt(y, [u1 u2], [1 1 1 2 3], 0, [], [0 -2])

Difference equation yt = a yt-1 + b1,t u1,t-2 + b2,t u2,t-3 with RW models for all three
parameters, but with the denominator parameter a assumed constant (NVR fixed at zero).
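
As with the other optimisation routines in CAPTAIN, the estimated NVRs are normally
fed straight back into the associated estimation function, in this case darx:

>> nvr = darxopt(y, [u1 u2], [1 1 1 2 3], 0, [], [0 -2]);
>> [tfs, fit, fitse, par, parse] = darx(y, [u1 u2], [1 1 1 2 3], 0, nvr);

Here, the second call estimates the time variable parameters using the optimised
hyper-parameters from the first.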

See Also

References and background theory are given in Chapter 4. Related functions include darx,
fcast, stand, foptions and fmins.


DEL
Matrix of delayed variables.

Synopsis

yd = del(y, n)

Description

The time series vector y (column) is specified by the user, while the optional parameter n
(default unity) is the maximum lag required. The output yd is a matrix with n columns,
where the first column is y lagged by one sample, the second column is y lagged by two
samples and so on.
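
Example

>> y = (1:5)';
>> yd = del(y, 2)

This returns a matrix with 2 columns, in which the first column is y lagged by one
sample and the second column is y lagged by two samples. Note that the leading elements
of each column necessarily refer to values before the start of the series, so inspect
these initial rows before use.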

See Also

See also Examples 4.1 and 4.3.


DHR
Dynamic Harmonic Regression (DHR) analysis. Computes the DHR model for user
specified periodic or seasonal components.

Synopsis

[fit, fitse, tr, trse, comp, e, amp, phs, ts, tsse, y0, dhrse]
= dhr(y, P, TVP, nvr, alpha, P0, x0, sm, ALG, Int, IntD)

Description

The time series vector y (column) and associated m periodic components P are specified by
the user. Set the first element of P to include a trend. For example, [0 12] implies a trend
and a seasonal component for monthly data. The function automatically handles missing
values in y. In fact, y may be appended with additional NaNs to forecast or backcast
beyond the original series. The remaining input arguments are optional.

TVP is a vector specifying the model associated with each regression parameter, listed in
the same order as the elements of P. Choices include a RW/AR(1) model by default (0) or
a IRW/SRW model (1). For the case of AR(1) or SRW models, alpha (α < 1) specifies the
additional parameter, while the default value of unity implies a RW or IRW model. For
example, a 1st order autoregressive process requires TVP set to zero and 0<alpha<1,
where alpha is the AR(1) parameter. Similarly, for a SRW model, TVP is set to unity and
0<alpha<1, where alpha is the smoothing parameter. Finally, a LLT model is obtained by
using RW and IRW trends simultaneously, i.e. with P set to [0 0].

nvr is a vector of NVR hyperparameters for each regressor where, for example, zero
(default) implies time invariant parameters. The initial state vector and diagonal of the
P-matrix may be specified using x0 and P0, with default values of 0 and 1e6 respectively.
FIS may be turned off by changing sm from its default unity to 0. In this case, the model fit
and estimated parameters are their filtered values. This speeds up the algorithm and
reduces memory usage in cases when smoothing is not required. Also, either the P (0) or
default Q (1) smoothing algorithms are selected using the ALG input argument. Here, the
latter is often more robust for RW/IRW models, while SRW models require use of the
former. In general, should convergence problems be encountered, changing the algorithm
in this manner may help. Int allows for sharp (discontinuous) local changes in the
parameters at the user supplied intervention points. These need to be defined either
manually or by some detection method for sharp local changes. Here, Int should take the
same dimensions as y, with positive values indicating variance intervention required.
Finally, IntD gives the diagonal of the variance intervention matrix.


If the lengths of TVP, nvr, alpha, P0 or x0 are less than the length of P, then they are
automatically expanded to the correct dimensions by using the final element of the
specified input vector. For example, if P has 3 elements but TVP is defined as [1 0], then
TVP is automatically expanded to [1 0 0]. Similarly, a scalar P0 implies an identity matrix
scaled by this value.

The function returns the model fit (with the same dimensions as y), trend tr and total
seasonal component ts (i.e. the sum of all the seasonal components, which are returned
individually as a matrix comp), together with the associated standard errors in each case,
fitse and trse and tsse. It also returns the normalised innovations sequence e, amplitude
amp and phase phs of the harmonic components, and the interpolated data y0, where the
latter consist of the original series with any missing data replaced by the model. Finally,
dhrse are the standard errors of all the components. Note that the normalised innovations
are padded with initial NaNs to ensure that the vector is the same size as y. If statistical
tests on these are required, remove the NaNs with the command e = e(~isnan(e)).

Examples

>> fit = dhr(y, [0 12./(1:6)], [1 0], [0.001 0.01], [0.95 1])

SRW trend model (NVR = 0.001, α = 0.95), together with 6 periodic components (i.e. 12
and associated harmonics) each modelled with a RW (NVR = 0.01).

>> fit = dhr(y, [0 12./(1:6)], [1 0], [0.001 0.01], [1 0.95]);

IRW model for the trend and AR(1) for the harmonics (α = 0.95).

>> fit = dhr(y, [0 0 12./(1:6)], [1 0 1])

LLT model for the trend and IRW for the harmonics.

>> Int = zeros(size(y));
>> Int(52) = 1;
>> fit = dhr(y, [0 12./(1:6)], [], [], [], [], [], [], [], Int)

Instructs the algorithm to reset the P0 matrix at the 52nd data point.
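
Forecasting follows from the NaN handling described above. For example, to forecast a
monthly series 12 samples beyond the end of the data:

>> yf = [y; nan(12, 1)];
>> [fit, fitse] = dhr(yf, [0 12./(1:6)], [1 0], [0.001 0.01]);
>> fit(end-11:end)

The final line extracts the 12 forecast values, while the associated standard errors
are obtained from fitse in the same manner.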

See Also

References and background theory are given in Chapter 3. See also Examples 3.1 and 3.3.
Related functions include dhropt, fcast and stand.


DHRDEMO
Demonstration script for Dynamic Harmonic Regression (DHR).

Synopsis

dhrdemo

Description

Type the name of this script and press return to run the on-line demo.


DHROPT
Hyper-parameter estimation for Dynamic Harmonic Regression (DHR) analysis.

Synopsis

[nvr,alpha,opts,amp,parse] =
dhropt(y,P,TVP,meth,nvrc,alphac,nvr0,alpha0,opts,ALG,tab,tf,Int)

Description

The time series vector y (column) and associated m periodic components P are specified by
the user. Set the first element of P to include a trend. For example, [0 12] implies a trend
and a seasonal component for monthly data. The function automatically handles missing
values in y. In fact, y may be appended with additional NaNs to forecast or backcast
beyond the original series. The remaining input arguments are optional.

TVP is a vector specifying the model associated with each regression parameter, listed in
the same order as the elements of P. Choices include a RW/AR(1) model by default (0) or
a IRW/SRW model (1). The 4th input argument meth selects the estimation method, where
the default frequency domain optimisation based on the AR(24) spectrum may be replaced
by an AR(meth) spectrum by specifying a positive scalar. A negative scalar implies
Maximum Likelihood in time domain, with -meth used as the order of the AR spectrum to
estimate the initial conditions. Finally, specifying f# computes the sum of squares of the
#-step-ahead forecasting errors, with the initial conditions obtained from a frequency
domain optimisation using the AR(24) spectrum, or f# n to specify the AR(n) spectrum.

nvrc defines the constraints for each NVR, where -2 implies free estimation, -1
constrained estimation (all parameters with nvrc = -1 are equal) and >=0 implies the
associated NVR is constrained to this value (it is not estimated). alphac defines similar
constraints for each parameter (-2, -1, or >=0 as for nvrc). Initial NVR and hyper-
parameters may be specified using nvr0 and alpha0 respectively. For example, to
optimise for a RW or IRW model, ensure alphac and alpha0 are unity (the default). To
optimise component i for a SRW, set the i'th element of TVP to unity and alphac(i) to -2,
-1, or fixed at 0<alphac<1. This normally produces an improved fit to the spectrum, but
computation time is longer.

Optimisation options may be set using opts (type help foptions for details), while ALG
specifies the optimisation algorithm: fmins (0), fminu (1) or leastsq (2). Here, ALG
selects between the more efficient gradient search methods of fminu (see help fminu) and
the more robust (especially for discontinuous problems) direct search methods of fmins


(see help fmins). When meth = f# there is the additional option of using leastsq (see
help leastsq). Note that if ALG = 2 and meth = ml, then an error occurs, since leastsq
cannot be used in the Maximum Likelihood case. The Optimisation Toolbox for MATLAB
is required to use fminu or leastsq.

To set the display options, if tab = 1 or 2, then the final results are displayed in tabular
form. Additionally, if tab = 2, a window appears during optimisation showing the latest
value of the Likelihood Function or the Sum-of-Squares for the #-step-ahead forecasting
errors. When ALG = 0 (fmins) and tab = 2, a stop button will appear below the update
window: click to terminate the optimisation and return the current estimates.

For frequency domain optimisation, tf may be a scalar or a vector. When scalar it refers to
the number of points in which the frequency axis should be divided (default value is 1032).
Alternatively, a vector is treated as the frequency axis itself, normalised between 0
and 0.5 (corresponding to periods from ∞ down to 2 samples), while a matrix specifies
the frequency axis (1st column) and AR spectrum amplitude directly.

Finally, Int allows for sharp (discontinuous) local changes in the parameters at the user
supplied intervention points. These need to be defined either manually or by some
detection method for sharp local changes. Here, Int should take the same dimensions as y,
with positive values indicating variance intervention required. Note that this option does
not influence the NVR estimates when meth>0 (the default option).

If the lengths of TVP, nvrc, alphac, nvr0 or alpha0 are less than m, then they are
automatically expanded to the correct dimensions by using the final element of the
specified input vector. For example, if P has 3 elements but TVP is defined as [1 0],
then TVP is automatically expanded to [1 0 0].

The function returns vectors of NVR and hyperparameters, nvr and alpha respectively.
opts provides confirmation of the options utilised, together with the number of function
evaluations etc. (type help foptions for details). For the case of AR(1) or SRW models,
alpha less than unity specifies the additional parameter, while the default value of unity
implies a RW or IRW model. amp returns the spectra used in the optimisation in the form
[t, am, ampm], where t is the frequency axis at which the spectra are evaluated, am is the
empirical spectrum and ampm is the fitted model spectrum. The associated spectra can be
graphed with e.g. semilogy(t, am, t, ampm).


Finally, parse are the standard errors of the NVR and alpha (if optimised) hyper-
parameters. However, computation time can sometimes be greatly reduced if this 4th
output argument is omitted from the function call.

Examples

>> dhropt(y, [0 12./(1:6)], [1 0], 24, -2, [-2 1])

Optimise for a SRW trend, together with 6 periodic components (12 and the associated
harmonics) each modelled with a RW (alpha fixed at 1).

>> dhropt(y, [0 12./(1:6)], [1 0])

IRW model for the trend and RW for harmonics.

>> dhropt(y, [0 12./(1:6)], 1, 24, -2, -2)

SRW model for the trend and harmonics.

>> dhropt(y, [0 12./(1:6)], [1 0], 24, -2, [1 -2])

IRW model for the trend and AR(1) for the harmonics.

>> dhropt(y, [0 12./(1:6)], [1 0], 24, [-2 -1])

IRW model for the trend and RW for the harmonics, but where all the harmonics are
constrained to have the same NVR.

>> dhropt(y, [0 0 12./(1:6)], [1 0 1])

LLT model for the trend and IRW for the harmonics.

>> dhropt(y, [0 12./(1:6)], [], 'f4 36')

Optimisation based on sum of the squares of the 4-step-ahead forecasting errors with the
initial conditions from the AR(36) spectrum. In this case, the default ALG is automatically
revised to unity (fminu).
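
In practice, dhropt and dhr are used as a pair: the hyper-parameters are first estimated
from the data and then passed to dhr for signal extraction or forecasting. For example:

>> [nvr, alpha] = dhropt(y, [0 12./(1:6)], [1 0])
>> [fit, fitse, tr, trse] = dhr(y, [0 12./(1:6)], [1 0], nvr, alpha)

Here, the trend tr and its standard error trse are returned by the second call, based on
the NVR and alpha values optimised by the first.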

See Also

References and background theory are given in Chapter 3. See also Examples 3.1 and 3.3.
Related functions include dhr, fcast and stand.


DLQRI
Iterative linear quadratic regulator design.

Synopsis

[k, p] = dlqri(a, b, q, r, del)

Description

Has similar functionality to dlqr from the MATLAB Control Toolbox. Implemented in an
iterative form to deal with singular state transition matrices.

The state transition matrix a, input vector b (or matrix in the multivariable case), state
weighting matrix q and input weighting scalar r (or matrix in the multivariable case) are
specified by the user. The convergence tolerance del may be optionally specified but is
usually left at the default value of 1e-8.

The function returns the converged control gain vector k (or matrix in the multivariable
case) and P matrix from the discrete time matrix Riccati equation.

Examples

>> [A, B] = nmssform(-0.9, [0 0.5])
>> k = dlqri(A, B, diag([1 0.5 1]), 0.5)
>> k(end) = -k(end)

Determine linear quadratic Proportional-Integral-Plus (PIP) control gain vector for a 1st
order system with 2 samples pure time delay. The third line of code puts k into the
conventional format for PIP control, i.e. with the integral control action in the forward path
of the block diagram (pipopt returns this format automatically). A multivariable example
is shown below, where a and b are matrices of parameters.

>> [amfd, bmfd] = mfdform(a, b)
>> [A, B] = mfd2nmss(amfd, bmfd)
>> k = dlqri(A, B, eye(size(A)), eye(size(B, 2)))

See Also

References and background theory are given in Chapter 8. Related functions include pip,
pipopt, pipcom, pipcl, piplib, mfd2nmss, mpipqr and mpipinit.


DLR
Dynamic Linear Regression (DLR) analysis. Computes the DLR model for user specified
regressors or exogenous inputs.

Synopsis

[fit, fitse, par, parse, comp, e, y0] =
dlr(y, z, TVP, nvr, alpha, P0, x0, sm, ALG)

Description

The time series vector y (column) and associated m regressors z are specified by the user.
Here, z has the same number of rows as y, with a column for each regressor. The function
automatically handles missing values in y. In fact, y may be appended with additional
NaNs to forecast or backcast beyond the original series. The remaining input arguments are
optional.

TVP is a vector specifying the model associated with each regression parameter, listed in
the same order as the columns of z. Choices include a RW/AR(1) model by default (0) or a
IRW/SRW model (1). For the case of AR(1) or SRW models, alpha less than unity
specifies the additional parameter, while the default value of unity implies a RW or IRW
model. For example, a 1st order autoregressive process requires TVP set to zero and
0<alpha<1, where alpha is the AR(1) parameter. Similarly, for a SRW model, TVP is set
to unity and 0<alpha<1, where alpha is the smoothing parameter.

nvr is a vector of NVR hyperparameters for each regressor where, for example, zero
(default) implies time invariant parameters. The initial state vector and diagonal of the
P-matrix may be specified using x0 and P0, with default values of 0 and 1e5 respectively.
FIS may be turned off by changing sm from its default unity to 0. In this case, the model fit
and estimated parameters are their filtered values. This speeds up the algorithm and
reduces memory usage in cases when smoothing is not required. Finally, either the P (0) or
default Q (1) smoothing algorithms are selected using the ALG input argument. Here, the
latter is often more robust for RW/IRW models, while SRW models require use of the
former. In general, should convergence problems be encountered, changing the algorithm
in this manner may help.

If the lengths of TVP, nvr, alpha, P0 or x0 are less than m, then they are automatically
expanded to the correct dimensions by using the final element of the specified input vector.
For example, if z has 3 columns but TVP is defined as [1 0], then TVP is automatically
expanded to [1 0 0]. Similarly, a scalar P0 implies an identity matrix scaled by this value.


The function returns the model fit (with the same dimensions as y) and parameters par
(one column for each regressor), together with the associated standard errors in each case,
fitse and parse. It also returns each of the linear components of the model comp, the
normalised innovations sequence e and interpolated data y0, where the latter consist of the
original series with any missing data replaced by the model. Note that fit is the sum of the
columns in comp, while the normalised innovations are padded with initial NaNs to ensure
that the vector is the same size as y. If statistical tests on the innovations are required,
remove these NaNs with the command e = e(~isnan(e)).

Examples

>> fit = dlr(y, [ones(size(u)) u], [0 1], 0.001, [0.95 1])

Regression type model yt = c1,t + c2,t ut with an AR(1) model (α = 0.95) for the first
TVP c1,t and an IRW model for c2,t. NVR = 0.001 in both cases.

>> fit = dlr(y, z, 0, [0.001 0])

RW model for all the regressors, with the NVR = 0.001 for the first regressor and 0
(constant parameters) for the remainder.
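
The interpolation of missing data may be illustrated as follows, where a gap is
introduced into the series for the purposes of the example:

>> y(20:25) = NaN;
>> [fit, fitse, par, parse, comp, e, y0] = dlr(y, z);
>> y0(20:25)

The final line extracts the model-based interpolation over the artificial gap, since y0
consists of the original series with any missing data replaced by the model.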

See Also

References and background theory are given in Chapter 4. See also Examples 1.1, 4.1 and
4.3. Related functions include dlropt, fcast and stand.


DLRDEMO
Demonstration script for Dynamic Linear Regression (DLR).

Synopsis

dlrdemo

Description

Type the name of this script and press return to run the on-line demo.


DLROPT
Hyper-parameter estimation for Dynamic Linear Regression (DLR) analysis.

Synopsis

[nvr, alpha, opts, parse] = dlropt(y, z, TVP, meth, nvrc, alphac, nvr0, alpha0, opts, ALG, tab, P0)

Description

The time series vector y (column) and associated m regressors z are specified by the user.
Here, z has the same number of rows as y, with a column for each regressor. The function
automatically handles missing values in y. In fact, y may be appended with additional
NaNs to forecast or backcast beyond the original series. The remaining input arguments are
optional.

TVP is a vector specifying the model associated with each regression parameter, listed in
the same order as the columns of z. Choices include a RW/AR(1) model by default (0) or an
IRW/SRW model (1). meth is the estimation method, where the default Maximum
Likelihood 'ml' may be replaced by 'f#' to compute the sum of squares of the #-step-ahead
forecasting errors.

nvrc defines the constraints for each NVR, where -2 implies free estimation, -1
constrained estimation (all parameters with nvrc = -1 are equal) and >=0 implies the
associated NVR is constrained to this value (it is not estimated). alphac defines similar
constraints for each parameter (-2, -1, or >=0 as for nvrc). Initial NVR and hyper-
parameters may be specified using nvr0 and alpha0 respectively. For example, to
optimise for a RW or IRW model, ensure alphac and alpha0 are unity (the default). To
optimise component i for a SRW, set the i'th element of TVP to unity and alphac(i) to -2,
-1, or fixed at 0<alphac<1. This normally produces an improved fit to the spectrum, but
computation time is longer.

Optimisation options may be set using opts (type help foptions for details), while ALG
specifies the optimisation algorithm: fmins (0), fminu (1) or leastsq (2). Here, ALG
selects between the more efficient gradient search methods of fminu (see help fminu) and
the more robust (especially for discontinuous problems) direct search methods of fmins
(see help fmins). When meth = 'f#' there is the additional option of using leastsq (see
help leastsq). Note that if ALG = 2 and meth = 'ml', then an error occurs, since leastsq
cannot be used in the Maximum Likelihood case. The Optimisation Toolbox for MATLAB
is required to use fminu or leastsq.


Finally, tab defines the display options and P0 specifies the initial diagonal of the P
matrix. Here, if tab = 1 or 2, then the final results are displayed in tabular form.
Additionally, if tab = 2, a window appears during optimisation showing the latest value of
the Likelihood Function or the Sum-of-Squares for the #-step-ahead forecasting errors.
When ALG = 0 (fmins) and tab = 2, a stop button will appear below the update window:
click to terminate the optimisation and return the current estimates.

If the lengths of TVP, nvrc, alphac, nvr0, alpha0 or P0 are less than m, then they
are automatically expanded to the correct dimensions by using the final element of the
specified input vector. For example, if z has 3 columns but TVP is defined as [1 0], then
TVP is automatically expanded to [1 0 0]. Similarly, a scalar P0 implies an identity matrix
scaled by this value.

The function returns vectors of NVR and hyperparameters, nvr and alpha respectively.
opts provides confirmation of the options utilised, together with the number of function
evaluations etc. (type help foptions for details). For the case of AR(1) or SRW models,
alpha less than unity specifies the additional parameter, while the default value of unity
implies a RW or IRW model. Finally, parse are the standard errors of the NVR and alpha
(if optimised) hyper-parameters. However, computation time can sometimes be greatly
reduced if this 4th output argument is omitted from the function call.

Examples

>> nvr = dlropt(y, [ones(size(u)) u], 0, [], [0 -2])

Regression type model y(t) = c1,t + c2,t·u(t) with a RW model for both parameters and c1,t
assumed constant (NVR fixed at zero).

>> nvr = dlropt(y, [ones(size(u)) u], [0 1])

RW for c1,t and IRW for c2,t, estimating both NVR parameters simultaneously
(identification problems may arise).

>> nvr = dlropt(y, z, [], 'f4 36')

Optimisation based on sum of the squares of the 4-step-ahead forecasting errors with initial
conditions from the AR(36) spectrum.
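
An SRW model is not shown in the examples above. As a minimal sketch (with hypothetical data y and u, and argument positions following the synopsis above), the NVR and α hyper-parameters for the second regressor can be freed for optimisation as follows:

```matlab
% Hypothetical sketch: SRW model for the second parameter c2,t.
% TVP = [0 1] selects RW for c1,t and IRW/SRW for c2,t.
% alphac = [1 -2] fixes alpha = 1 (RW) for c1,t and frees alpha for c2,t.
z = [ones(size(u)) u];
[nvr, alpha] = dlropt(y, z, [0 1], 'ml', [-2 -2], [1 -2]);
fit = dlr(y, z, [0 1], nvr, alpha);
```

The optimised nvr and alpha are then passed straight to dlr, as in the dlr examples.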

See Also

References and background theory are given in Chapter 4. See also Examples 1.1, 4.1 and
4.3. Related functions include dlr, fcast and stand.


DTFM
Multi-variable Dynamic Transfer Function (DTF) Estimation using Instrumental Variables.

Synopsis

[tfs, fit, fitse, par, parse, comp, e, y0] = dtfm(y, u, nn, TVP, nvr, P0, x0, sm, ALG, niv)

Description

The time series vector y (column) and input matrix u are specified by the user. Each
column of u represents an input signal. The function automatically handles missing values
in y. In fact, y may be appended with additional NaNs to forecast or backcast beyond the
original series, as long as appropriate values for u are also specified. The remaining input
arguments are optional. The DTF model structure is defined by nn, which takes the form
[n, m, δ] where, in transfer function terms, n and m are the number of denominator and
numerator parameters respectively, while δ is the number of samples time delay. A first
order model with unity time delay and one numerator parameter [1, 1, 1] is utilised by
default.

TVP is a vector specifying the model associated with each DTF model parameter, listed
in order of each denominator parameter and then the numerator parameters for each input,
i.e. a1,t, …, an,t, b0,t, …, bm,t for the single input, single output example in equation (4.15).
Choices include a RW model by default (0) or an IRW model (1). nvr is a vector of NVR
hyperparameters for each regressor where, for example, zero (default) implies time
invariant parameters. The initial state vector and diagonal of the P-matrix may be specified
using x0 and P0, with default values of 0 and 1e5 respectively. FIS may be turned off by
changing sm from its default unity to 0. In this case, the model fit and estimated
parameters are their filtered values. This speeds up the algorithm and reduces memory
usage in cases when smoothing is not required. Finally, either the P (0) or default Q (1)
smoothing algorithms are selected using the ALG input argument. In general, should
convergence problems be encountered, changing the algorithm may help.

If the lengths of TVP, nvr, P0 or x0 are less than the total number of parameters,
then they are automatically expanded to the correct dimensions by using the final element
of the specified input vector. For example, if the DTF model has 3 parameters but TVP is
defined as [1 0], then TVP is automatically expanded to [1 0 0]. Similarly, a scalar P0
implies an identity matrix scaled by this value.


The function returns the simulation response tfs (with the same dimensions as y),
regression fit and parameters par (one column for each parameter), together with the
associated standard errors in the latter two cases, fitse and parse. Here, tfs is based on
feeding the input signal through the model (the output signal is not used, except to
establish the initial conditions), while fit represents the 1-step ahead predictions and is
equivalent to the fit returned by dlr.

The function also returns each of the linear components of the model comp, i.e. the
components associated with each input and output and their past values, the normalised
innovations sequence e and interpolated data y0, where the latter consist of the original
series with any missing data replaced by the model. Note that fit is the sum of the columns
in comp, while the normalised innovations are padded with initial NaNs to ensure that the
vector is the same size as y. If statistical tests on the innovations are required, remove these
NaNs with the command e = e(~isnan(e)).

Example

>> dtfm(y, [u1 u2], [1 1 1 2 3], 0, 0.001)

Difference equation yt = -at·yt-1 + b1,t·u1,t-2 + b2,t·u2,t-3 with RW models for all three
parameters (NVR = 0.001).

See Also

References and background theory are given in Chapter 4. See also Example 4.4. Related
functions include dtfmopt, fcast, stand and darx.


DTFMDEMO1
Demonstration script for Dynamic Transfer Function (DTF) Estimation using Instrumental
Variables.

Synopsis

dtfmdemo1

Description

Type the name of this script and press return to run the on-line demo.


DTFMDEMO2
Demonstration script for multivariable Dynamic Transfer Function (DTF) Estimation using
Instrumental Variables.

Synopsis

dtfmdemo2

Description

Type the name of this script and press return to run the on-line demo.


DTFMOPT
Hyper-parameter estimation for Multi-variable Dynamic Transfer Function (DTF)
Estimation using Instrumental Variables.

Synopsis

[nvr, opts, parse] = dtfmopt(y, u, nn, TVP, meth, nvrc, nvr0, opts, ALG, tab, P0)

Description

The time series vector y (column) and input matrix u are specified by the user. Each
column of u represents an input signal. The function automatically handles missing values
in y. In fact, y may be appended with additional NaNs to forecast or backcast beyond the
original series, as long as appropriate values for u are also specified. The remaining input
arguments are optional. The DTF model structure is defined by nn, which takes the form
[n, m, δ] where, in transfer function terms, n and m are the number of denominator and
numerator parameters respectively, while δ is the number of samples time delay. A first
order model with unity time delay and one numerator parameter [1, 1, 1] is utilised by
default.

TVP is a vector specifying the model associated with each DTF parameter, listed in order
of each denominator parameter and then the numerator parameters for each input, i.e.
a1,t, …, an,t, b0,t, …, bm,t for the single input, single output example in equation (4.15).
Choices include a RW model by default (0) or an IRW model (1). meth is the estimation
method, where the default Maximum Likelihood 'ml' may be replaced by 'f#' to compute
the sum of squares of the #-step-ahead forecasting errors. nvrc defines the constraints for
each NVR, where -2 implies free estimation, -1 constrained estimation (all parameters with
nvrc = -1 are equal) and >=0 implies the associated NVR is constrained to this value (it is
not estimated). Initial NVR hyper-parameters may be specified using nvr0.

Optimisation options may be set using opts (type help foptions for details), while ALG
specifies the optimisation algorithm: fmins (0), fminu (1) or leastsq (2). Here, ALG
selects between the more efficient gradient search methods of fminu (see help fminu) and
the more robust (especially for discontinuous problems) direct search methods of fmins
(see help fmins). When meth = 'f#' there is the additional option of using leastsq (see
help leastsq). Note that if ALG = 2 and meth = 'ml', then an error occurs, since leastsq
cannot be used in the Maximum Likelihood case. The Optimisation Toolbox for MATLAB
is required to use fminu or leastsq.


Finally, tab defines the display options and P0 specifies the initial diagonal of the P
matrix. Here, if tab = 1 or 2, then the final results are displayed in tabular form.
Additionally, if tab = 2, a window appears during optimisation showing the latest value of
the Likelihood Function or the Sum-of-Squares for the #-step-ahead forecasting errors.
When ALG = 0 (fmins) and tab = 2, a stop button will appear below the update window:
click to terminate the optimisation and return the current estimates.

If the lengths of TVP, nvrc, nvr0 or P0 are less than the total
number of parameters, then they are automatically expanded to the correct dimensions by
using the final element of the specified input vector. For example, if the DTF model has 3
parameters but TVP is defined as [1 0], then TVP is automatically expanded to [1 0 0].
Similarly, a scalar P0 implies an identity matrix scaled by this value.

The function returns nvr, the vector of NVR hyper-parameters. opts provides confirmation
of the options utilised, together with the number of function evaluations etc. (type help
foptions for details). Finally, parse are the standard errors of the NVR hyper-parameters.
However, computation time can sometimes be greatly reduced if this 3rd output argument
is omitted from the function call.

Example

>> dtfmopt(y, [u1 u2], [1 1 1 2 3], 0, [], [0 -2])

Difference equation yt = -a·yt-1 + b1,t·u1,t-2 + b2,t·u2,t-3 with RW models for all three
parameters, but with the denominator parameter a assumed constant (NVR fixed at zero).

See Also

References and background theory are given in Chapter 4. See also Example 4.4. Related
functions include dtfm, fcast, stand, foptions and fmins.


FCAST
Prepare data for forecasting and interpolation.

Synopsis

x = fcast(y, n)

Description

The time series vector y (column), together with the options n are selected by the user to
generate an equivalent time series x but with Not-a-Number (NaN) variables added to
indicate to the other CAPTAIN toolbox modelling functions where forecasting, backcasting
and/or interpolation is required. For the input argument n, a scalar p or vector [0 p]
appends the series with p NaNs, while [p 0] prepends the series with p NaNs for
backcasting and [p q] replaces samples p to q with NaNs for interpolation exercises.
Several such operations may be combined by supplying one row of n per operation.

Example

>> x = fcast(y, [14 19; 22 22; 0 10; 5 0]); fit=dhr(x, 0, 1)

Interpolate using dhr over samples 14 to 19 and sample 22 of the original data set, forecast
10 samples beyond the last datum and backcast 5 samples before the start of the series.
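
The simpler forms of the n argument can be sketched individually (assuming a column vector y is already in the workspace):

```matlab
% Hypothetical sketches of the individual n argument forms.
xf = fcast(y, 10);       % append 10 NaNs for forecasting (same as [0 10])
xb = fcast(y, [5 0]);    % prepend 5 NaNs for backcasting
xi = fcast(y, [14 19]);  % replace samples 14 to 19 with NaNs for interpolation
```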

See Also

See also Examples 2.2, 2.3, 3.1 and 3.2. Related functions include stand and nan.


GAINS
Extracts Proportional-Integral-Plus (PIP) control polynomials from single-input, single-
output control gain vector.

Synopsis

[f, g, k, r] = gains(a, b, v)

Description

Truncated form system numerator b (with assumed unit delay) and denominator a (with
assumed leading unity) polynomials are used to determine the order of the PIP control
polynomials. Here, v is the control gain vector (e.g. returned by pip or pipopt), while f, g
and k are the output feedback polynomial, denominator polynomial for the input filter and
integral gain respectively. If v is obtained using pipcom, then r is the optional command
input polynomial.

Example

>> v = pipopt(-0.9, [0 0.5], 1, 1, 1)
>> [f, g, k] = gains(-0.9, [0 0.5], v)

Determine linear quadratic optimal PIP control polynomials, here with a scalar
proportional gain, 1st order input filter and integral gain for a 1st order system with 2
samples pure time delay.

See Also

References and background theory are given in Chapter 8. See also Example 8.3. Related
functions include pip, pipopt, pipcom, pipcl and piplib.


GETPAR
Returns transfer function polynomials from a previously estimated θ-matrix.

Synopsis

[a, b, c, P, d] = getpar(th)

Description

Returns the denominator a (including the leading unity), numerator b (including any zero
elements to represent the pure time delay) and noise c polynomials, together with the noise
covariance P matrix and (in the continuous time case) number of delays d. These variables
are all extracted from a previously estimated θ-matrix (see help theta). Each row of b
represents the numerator polynomial for each input variable.

Example

>> th = riv([y u], [1 1 1 0]);
>> [a, b] = getpar(th)

Estimate a transfer function model using riv and extract the polynomials.

See Also

See also Examples 7.2 and 7.3. Related functions include riv, rivid, rivc, rivcid, mar,
prepz, scaleb and theta.


HISTON
Histogram superimposed over the Gaussian distribution, with Bera-Jarque normality test.

Synopsis

beraj = histon(y, nbar, Title)

Description

A histogram of the time series is plotted and the normal theoretical distribution with the
same mean and variance is superimposed. This allows for a visual inspection of the
distribution of the variable. A formal statistical test, the Bera-Jarque test, is also shown,
together with its probability value.

The time series vector y (column) is the only compulsory input to this function. The
number of bars to plot in the histogram may be selected by nbar, and Title produces a title
in the figure.

The output is the value of the Bera-Jarque statistic and its probability value.

Examples

>> histon(randn(200, 1));
>> histon(rand(200, 1));

See Also

See also Example 2.4.


IRWSM
Integrated Random Walk (IRW) smoothing and decimation.

Synopsis

[t, deriv, err, filt, h, w, y0] = irwsm(y, TVP, nvr, Int, dt)

Description

The time series vector y (column) is specified by the user. The function automatically
handles missing values in y. In fact, y may be appended with additional NaNs to forecast
or backcast beyond the original series. The remaining input arguments are optional.

TVP is a vector specifying the model required. Choices include a RW model (0), an IRW
model by default (1) or a double integrated random walk model (2). nvr is the NVR
hyperparameter for the model where, for example, zero implies a straight line for the
smoothed series. The default value for nvr is determined by 1605*(1/(2*dt))^4. Int allows
for sharp (discontinuous) local changes in the smoothed series at the user supplied
intervention points. These need to be defined either manually or by some detection method
for sharp local changes. Here, Int should take the same dimensions as y, with positive
values indicating variance intervention required. Finally, dt specifies the sampling rate,
where the default unity ensures that the sampling rate is the same as for the input series
(the function only smoothes the signal), while an integer greater than 1 will decimate the
series by an appropriate degree.

The function returns the smoothed series t (for dt = 1 this variable will have the same
dimensions as y) and the derivatives deriv, together with the associated standard errors
err. It also returns the filter weights filt, the corresponding frequency response h and the
associated frequency values w. Finally, the interpolated data y0 consist of the original
series with any missing data replaced by the model.
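
The decimation option can be sketched as follows; this assumes a daily series y, smoothed and resampled to a weekly rate with dt = 7:

```matlab
% Hypothetical sketch: IRW smoothing of a daily series y with decimation.
% With dt = 7, the returned trend t contains roughly length(y)/7 samples,
% i.e. one smoothed value per week, plus the associated derivatives and
% standard errors at the decimated rate.
[t, deriv, err] = irwsm(y, 1, 0.001, [], 7);
```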

Example

>> load steel.dat
>> plot([steel irwsm(steel, 1, 0.001, [87 106])]);

In this example an IRW trend is calculated with intervention points at samples 87 and 106.

See Also

References and background theory are given in Chapter 2. See also Examples 2.1, 2.2, 2.4,
3.2 and 3.3. Related functions include irwsmopt, fcast, stand, dhr, dhropt and sdp.


IRWSMOPT
Hyper-parameter estimation for Integrated Random Walk (IRW) smoothing and
decimation.

Synopsis

[nvr, opts, parse] = irwsmopt(y, TVP, meth, Int, opts, ALG)

Description

The time series vector y (column) is specified by the user. The function automatically
handles missing values in y. In fact, y may be appended with additional NaNs to forecast
or backcast beyond the original series. The remaining input arguments are optional. TVP is
a vector specifying the model required. Choices include a RW model (0), an IRW model
by default (1) or a double integrated random walk model (2). The 3rd input argument meth
selects the estimation method, where the default Maximum Likelihood 'ml' may be
replaced by 'f#' to compute the sum of squares of the #-step-ahead forecasting errors. Int
allows for sharp (discontinuous) local changes in the parameters at the user supplied
intervention points. These need to be defined either manually or by some detection method
for sharp local changes. Here, Int should take the same dimensions as y, with positive
values indicating variance intervention required. Note that this option does not influence
the NVR estimates when meth>0 (the default option).

Optimisation options may be set using opts (type help foptions for details), while ALG
specifies the optimisation algorithm: fmins (0), fminu (1) or leastsq (2). Here, ALG
selects between the more efficient gradient search methods of fminu (see help fminu) and
the more robust (especially for discontinuous problems) direct search methods of fmins
(see help fmins). When meth = 'f#' there is the additional option of using leastsq (see
help leastsq). Note that if ALG = 2 and meth = 'ml', then an error occurs, since leastsq
cannot be used in the Maximum Likelihood case. The Optimisation Toolbox for MATLAB
is required to use fminu or leastsq. The function returns the hyperparameter nvr and opts,
where the latter provides confirmation of the options utilised, together with the number of
function evaluations etc. (type help foptions for details). Finally, parse are the standard
errors of the NVR and hyper-parameter. Computation time can sometimes be greatly
reduced if this 3rd output argument is omitted from the function call.

Example

>> load steel.dat;
>> nvr = irwsmopt(steel, 1, 'f12', [87 106]);
>> plot([steel irwsm(steel, 1, nvr, [87 106])]);


In this example, the NVR parameter for an IRW trend of the steel series is optimised using
the minimisation of the 12-step-ahead forecasting errors.

See Also

References and background theory are given in Chapter 2. See also Examples 2.2, 2.4, 3.2
and 3.3. Related functions include irwsm, dhropt, stand, foptions and fmins.


KALMANFIS
Fixed Interval Smoother for general state space system.

Synopsis

[inn, yhat, xhat, Py, Px, vr] = kalmanfis(y, u, Phi, E, H, Q, R, Gam, D, P0, x0, sm, Int, IntD)

Description

This function gives the user access to quite general Kalman Filter (KF) and Fixed Interval
Smoothing (FIS) algorithms for the state space assimilation and forecasting of uniformly
sampled time series data. The function is included for an experienced user of the toolbox
who wishes to access the KF/FIS algorithms directly, without resort to the shells for
implementing numerous standard model types (such as dlr, dhr, dar and darx).

The user provides the data (often input-output data given by u and y respectively), as well
as the variance/covariance hyper-parameters that define the stochastic inputs and
observational error. Refer to the on-line help for the state space model structure and
covariance matrices, i.e. Phi, E, H, Q, R, Gam and D. The initial state vector and diagonal
of the P-matrix may be specified using x0 and P0, with default values of 0 and 1e5
respectively. FIS may be turned off by changing sm from its default unity
to 0. In this case, the model fit and estimated parameters are their filtered values. Int
allows for sharp (discontinuous) local changes in the parameters at the user supplied
intervention points. Here, Int should take the same dimensions as y, with positive values
indicating variance intervention required. Finally, IntD gives the diagonal of the variance
intervention matrix.

The function generates the KF (filtered, forward pass) and FIS (smoothed, backward pass)
estimates of the state variables xhat and output yhat. The innovations series inn, together
with the covariance matrix of innovations (filtered or smoothed) Py, covariance matrix of
states (filtered or smoothed) Px and residual variance vr are also returned.
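
As a minimal illustrative sketch (not one of the handbook examples), a scalar random walk plus noise model, i.e. x(t) = x(t-1) + eta(t) with observation y(t) = x(t) + e(t), can be passed directly to the function; the matrix arguments here are assumed scalars and the remaining input arguments are assumed to take their default values:

```matlab
% Hypothetical sketch: local level (random walk plus noise) model.
% State equation:       x(t) = x(t-1) + eta(t),  var(eta) = Q
% Observation equation: y(t) = x(t) + e(t),      var(e) = R
Phi = 1; E = 1; H = 1;   % scalar state space system matrices
Q = 0.01; R = 1;         % NVR = Q/R = 0.01
[inn, yhat, xhat] = kalmanfis(y, [], Phi, E, H, Q, R);
```

With sm left at its default, xhat and yhat are the FIS (smoothed) state and output estimates for this simple trend model.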

Often, the state space model required for this tool will be generated by prior RIVID/RIV
identification and estimation of a transfer function model that is then converted into the
required discrete-time state space form for kalmanfis. This is illustrated in a demonstration
script kfisdemo concerned with the assimilation of rainfall-flow data and the estimation of
unobserved states, where the latter are associated with the surface and groundwater flow
components that combine to produce the measured river flow.


Example

>> kfisdemo

Rainfall-flow analysis and forecasting example based on data from the ephemeral Canning
River in Western Australia.

See Also

Chapter 2 describes the general state space approach. Related functions include dlr, dhr,
dar, darx, dtfm, irwsm and sdp.


KFISDEMO
Demonstration script for Fixed Interval Smoother for general state space system.

Synopsis

kfisdemo

Description

Type the name of this script and press return to run the on-line demo.


MAR
Univariate Auto-Regressive model estimation using Least Squares.

Synopsis

[th, e, y0] = mar(y, na, a0, P0)

Description

The time series vector y (column) is specified by the user. The function automatically
handles missing values in y by switching to recursive mode. In fact, y may be appended
with additional NaNs to forecast or backcast beyond the original series. The AR model
structure is defined by na, which is a scalar or vector listing the required past output
variables used in the model. For example, [1:5, 20] specifies a model based on yt-1 to
yt-5 plus a yt-20 component (i.e. subset AR). a0 and P0 are the optional initial conditions
for the parameters and covariance matrix respectively, by default a vector of zeros and an
identity matrix with diagonal elements of 100. Note that specifying a 3rd input argument
forces a recursive calculation, rather than the default en bloc solution.

The output argument th contains information about the model structure, estimated
parameters and their estimated accuracy. The toolbox function getpar is generally used to
extract the parameter estimates and associated covariance matrix. The error series e are
defined by the model response subtracted from the data, i.e. the model response may be
found from y e. Finally, the interpolated data y0 consist of the original series with any
missing data replaced by the model.
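
The subset AR option described above might be sketched as follows, here with a hypothetical seasonal component at lag 12:

```matlab
% Hypothetical sketch: subset AR with lags 1, 2 and a seasonal lag 12,
% i.e. the model uses y(t-1), y(t-2) and y(t-12) only.
[th, e, y0] = mar(y, [1:2, 12]);
a = getpar(th);   % extract the estimated AR polynomial from the theta-matrix
```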

Example

>> y = filter(1, [1 1.2 0.8], randn(100, 1));
>> getpar(mar(y, 2))

See Also

Related functions include aic, getpar, theta, univ and univopt.


MCPAR
Parameters for Monte Carlo Simulation.

Synopsis

[aa, bb] = mcpar(th, nt, ndat, plevel)

Description

Here, th is the θ-matrix with information about the transfer function model structure,
estimated parameters and their estimated accuracy (see theta) and nt is the number of
Monte Carlo realisations. The function returns, in truncated form, the system numerator bb
(with assumed unit delay) and denominator aa (with assumed leading unity) polynomials,
with each row representing one Monte Carlo realisation. Typically the function is used to
evaluate the robustness of Proportional-Integral-Plus (PIP) control systems.

The first row contains the deterministic parameters vector contained in th. The covariance
information from th is used in either of two ways. When the routine is called with two
input arguments, the pseudo-Gaussian random deviations of parameters are returned.
Secondly, when all four arguments are used, the parameters are generated uniformly over a
confidence ellipsoid determined by the covariance matrix eigenvectors and confidence
parameter found from the F-distribution with n.d.f. (nt, ndat-nt) at plevel of confidence. If
plevel>=1, this parameter is treated as the number of normalized standard deviations.

Example

>> th = riv([y u], [1 1 1 0])
>> [a, b, c, P] = getpar(th)
>> th = createth(th, P*10)
>> [aa, bb] = mcpar(th, 50)
>> v = pipopt(aa(1, :), bb(1, :), 1, 1, 1)
>> for f=1:50
>> [acl, bcl]=pipcl(aa(f, :), bb(f, :), v);
>> y(:, f)=filter(bcl, acl, ones(30, 1));
>> end
>> plot(y)

Generate 50 Monte Carlo realisations from an estimated transfer function model and
evaluate the robustness of a closed-loop PIP control system. In this example, createth is
used to scale up the parametric uncertainty by a factor of 10.

See Also

References and background theory are given in Chapters 6 and 8. See also createth.


MFD2NMSS
Multivariable Non-Minimum State Space (NMSS) form used for Proportional-Integral-
Plus (PIP) control system design.

Synopsis

[F, G, D, H] = mfd2nmss(amfd, bmfd, inc)

Description

The Matrix Fraction Description, for which amfd represents the denominator parameters
and bmfd the numerator parameters, is used to determine the NMSS form of a
multivariable system. Optional inclusion of a positive inc forces the routine to return the
regulator NMSS form without an integral-of-error state vector. The output arguments are
the state transition matrix F, input matrix G, command input matrix D and observation
matrix H.

Example

>> [amfd, bmfd] = mfdform(a, b)
>> [F, G, D, H] = mfd2nmss(amfd, bmfd)

Multivariable NMSS form including integral-of-error state vector.

See Also

References and background theory are given in Chapter 8. Related functions include
mfdform, mpipqr, mpipinit, piplib and dlqri.


MFDFORM
Returns Matrix Fraction Description (MFD) form.

Synopsis

[amfd, bmfd] = mfdform(a, b)

Description

Returns the MFD form, for which amfd represents the denominator parameters and
bmfd the numerator parameters. The denominator matrix polynomial a is formed row by
row without the initial unity and is padded with zeros to uniform length. The numerator
matrix polynomial b is also formed row by row, padded with zeros and with unit delay
assumed. The format required for a and b is illustrated by the on-line multivariable
Proportional-Integral-Plus (PIP) control demonstration pipdemo3.

Example

>> [amfd, bmfd] = mfdform(a, b)
>> [F, G, D, H] = mfd2nmss(amfd, bmfd)

Returns non-minimal state space form for a multivariable system.

See Also

References and background theory are given in Chapter 8. Related functions include
mfd2nmss, mpipqr, mpipinit, piplib and dlqri.


MPIPINIT
Initialise block diagram for multivariable Proportional-Integral-Plus (PIP) control.

Synopsis

[pipfb, pipfp] = mpipinit(amfd, bmfd, k)

Description

Returns MATLAB cells for initialising both feedback pipfb and forward path pipfp
multivariable PIP control blocks in piplib (SIMULINK library). The input arguments are
the Matrix Fraction Description, for which amfd represents the denominator parameters
and bmfd the numerator parameters, together with the control gain matrix k.

Example

>> [amfd, bmfd] = mfdform(a, b)
>> [A, B] = mfd2nmss(amfd, bmfd)
>> k = dlqri(A, B, eye(size(A)), eye(size(B, 2)))
>> [pipfb, pipfp] = mpipinit(amfd, bmfd, k)
>> piplib

Multivariable PIP control design; open the SIMULINK library for PIP control.

See Also

References and background theory are given in Chapter 8. Related functions include
mfdform, mfd2nmss, mpipqr, piplib and dlqri.


MPIPQR
Linear Quadratic weights for multivariable Proportional-Integral-Plus (PIP) control.

Synopsis

[Q, R] = mpipqr(amfd, bmfd, ew, uw, xw)

Description

The first two input arguments are the Matrix Fraction Description, for which amfd
represents the denominator parameters and bmfd the numerator parameters. The final
three input arguments are the total weights assigned to the integral-of-error state variables
ew, input state variables uw and output state variables xw. The latter three take default
values of unity. The function returns the state Q and input R weighting matrices.

Example

>> [amfd, bmfd] = mfdform(a, b)
>> [A, B] = mfd2nmss(amfd, bmfd)
>> [Q, R] = mpipqr(amfd, bmfd)
>> k = dlqri(A, B, Q, R)

Multivariable PIP control with unity total weights.

See Also

References and background theory are given in Chapter 8. Related functions include
mfdform, mfd2nmss, mpipinit, piplib and dlqri.


NMSSFORM
Non-Minimal State Space (NMSS) form for single input, single output system.

Synopsis

[F, g, d, h] = nmssform(a, b, inc)

Description

Truncated form system numerator b (with assumed unit delay) and denominator a (with
assumed leading unity) polynomials are used to determine the NMSS form. Optional
inclusion of a positive inc forces the routine to return the regulator NMSS form without an
integral-of-error state vector. The output arguments are the state transition matrix F, input
vector g, command input vector d and observation vector h.
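The matrices implied by this description can be sketched in a few lines. This is a hedged Python illustration of one common servomechanism NMSS parameterisation, with assumed state vector [y(k) .. y(k-n+1), u(k-1) .. u(k-m+1), z(k)] and integral-of-error state z; the toolbox's exact ordering and sign conventions may differ:

```python
def nmss_sketch(a, b):
    """Build F, g, d, h for truncated a = [a1..an], b = [b1..bm].

    Assumed state: [y(k)..y(k-n+1), u(k-1)..u(k-m+1), z(k)], where
    z(k+1) = z(k) + yd(k+1) - y(k+1) is the integral-of-error.
    """
    n, m = len(a), len(b)
    dim = n + m                      # n outputs, m-1 past inputs, 1 integrator
    F = [[0.0] * dim for _ in range(dim)]
    for i in range(n):               # y(k+1) = -a1 y(k) - ... + b2 u(k-1) + ...
        F[0][i] = -a[i]
    for j in range(1, m):
        F[0][n + j - 1] = b[j]
    for i in range(1, n):            # shift register for past outputs
        F[i][i - 1] = 1.0
    for j in range(2, m):            # shift register for past inputs
        F[n + j - 1][n + j - 2] = 1.0
    for c in range(dim - 1):         # integral-of-error row
        F[dim - 1][c] = -F[0][c]
    F[dim - 1][dim - 1] = 1.0
    g = [0.0] * dim
    g[0] = b[0]                      # b1 u(k) enters the output directly
    if m > 1:
        g[n] = 1.0                   # u(k) becomes the new u(k-1) state
    g[dim - 1] = -b[0]
    d = [0.0] * dim
    d[dim - 1] = 1.0                 # command input drives the integrator
    h = [0.0] * dim
    h[0] = 1.0                       # the output is the first state
    return F, g, d, h

# First order system with two samples pure time delay:
F, g, d, h = nmss_sketch([-0.9], [0.0, 0.5])

# Sanity check: iterating x(k+1) = F x(k) + g u(k) with u = 1 and yd = 0
# reproduces the difference equation y(k+1) = 0.9 y(k) + 0.5 u(k-1).
x = [0.0, 0.0, 0.0]
for _ in range(3):
    x = [sum(F[i][j] * x[j] for j in range(3)) + g[i] for i in range(3)]
print(round(x[0], 4))  # → 0.95
```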

Example

>> [F, g, d, h] = nmssform(-0.9, [0 0.5])

Find NMSS form for a 1st order system with 2 samples pure time delay.

See Also

References and background theory are given in Chapter 8. See also Example 7.1. Related
functions include pip, pipopt, pipcom, pipcl and piplib.


PERIOD
Estimate periodogram.

Synopsis

[per, f] = period(y, out, vec)

Description

The time series vector y (column) is specified by the user. The optional argument out
controls the tabular and graphical output, where out(1) turns the output on (1 by default) or
off (0) and out(2) specifies a log (1 by default) or normal scale (0).

With regard to the output arguments, per represents the estimated periodogram, while f is
the frequency axis at which the periodogram is estimated.
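The computation behind a raw periodogram can be sketched as follows. This is a hedged Python illustration using a direct DFT for clarity; the toolbox routine's scaling, frequency grid and output options differ:

```python
import math
import cmath

def raw_periodogram(y):
    """Squared DFT magnitude over N, at frequencies j/N cycles per
    sample for j = 0..N//2.  O(N^2) for clarity, not speed."""
    N = len(y)
    per = []
    for j in range(N // 2 + 1):
        c = sum(y[k] * cmath.exp(-2j * math.pi * j * k / N)
                for k in range(N))
        per.append(abs(c) ** 2 / N)
    return per

# A cosine at 4 cycles per 16 samples peaks at frequency index 4,
# i.e. at 0.25 cycles per sample.
y = [math.cos(2 * math.pi * 4 * k / 16) for k in range(16)]
per = raw_periodogram(y)
print(max(range(len(per)), key=per.__getitem__))  # → 4
```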

Example

>> y = filter(1, [1 1.2 0.8], randn(100, 1));
>> period(y);

See Also

See also Example 3.2. Related functions include arspec.


PIP
Univariate Proportional-Integral-Plus (PIP) pole assignment.

Synopsis

v = pip(a, b, p, chk)

Description

Truncated form system numerator b (with assumed unit delay) and denominator a (with
assumed leading unity) polynomials are used to determine the PIP control gain vector v for
a vector of pole positions p. If the length of p is less than the required number of poles,
then it is automatically expanded by repeating the last pole. For example, if p is a scalar,
then all the closed-loop poles are assigned to this value. If chk is positive, a and b are
tested for coprimeness (by default, this step is omitted).

Example

>> v = pip(-0.9, [0 0.5], 0)

Deadbeat PIP control for a 1st order system with 2 samples pure time delay.

See Also

References and background theory are given in Chapter 8. See also Example 8.2. Related
functions include gains, pipopt, pipcom, pipcl and piplib.


PIPCL
Closed-loop transfer functions for univariate Proportional-Integral-Plus (PIP) control.

Synopsis

[acl, bcl, bclu, bclv] = pipcl(a, b, v, am, bm)

Description

Truncated form system numerator b (with assumed unit delay) and denominator a (with
assumed leading unity) polynomials and PIP control gain vector v (e.g. obtained using pip
or pipopt) are used to determine the closed-loop transfer functions. If bm (numerator) and
am (denominator) are omitted, the PIP feedback structure is used; otherwise, these are the
control model polynomials (again in truncated form) required for the PIP forward path
structure with mismatch. The command to output (Yd => y) transfer function is given by
bcl (numerator) and acl (denominator); the command to input (Yd => u) transfer function
is given by bclu (numerator) and acl; and the load disturbance transfer function is given by
bclv (numerator) and acl. Here, Yd represents the command input, y the controlled output
and u the control input, while bcl, acl, bclu and bclv are various polynomials in the
backward shift operator. These can be used with filter to find the closed-loop input and
output variables.

Without disturbances the closed-loop transfer functions are as follows: u = (bclu/acl).Yd
and y = (bcl/acl).Yd; for a load disturbance V, the output is the sum of two components,
i.e. y = (bcl/acl).Yd + (bclv/acl).V.

To investigate other types of disturbances, multivariable systems or the effect of
disturbances on the input signal, use the SIMULINK library piplib. Note that bm/am
represents the control model and so is not required for the feedback PIP control structure.
In the case of the forward path structure, the internal model is given by the transfer
function bm/am. Finally, the output arguments {acl, bcl, bclu, bclv} are all polynomials
represented in their "full" form, including the leading unity for the denominator and unit
time delay for the numerator. By contrast, the input polynomials {a, b, am, bm} should all
be supplied in "truncated" form, i.e. not including the leading unity and with one less time
delay. This follows the notation of other PIP functions such as pipopt.


Example

>> v = pip(-0.9, [0 0.5], 0)
>> [acl, bcl] = pipcl(-0.9, [0 0.5], v);
>> y=filter(bcl, acl, ones(20, 1));
>> plot(y)

Plot response of deadbeat PIP control system.

See Also

References and background theory are given in Chapter 8. See also Example 8.2. Related
functions include gains, pip, pipopt, pipcom and piplib.


PIPCOM
Univariate Proportional-Integral-Plus (PIP) control with command input anticipation.

Synopsis

[v, F, g, d, h, Q, r] = pipcom(a, b, ew, uw, xw, N)

Description

Truncated form system numerator b (with assumed unit delay) and denominator a (with
assumed leading unity) polynomials are used to determine the control gain vector v based
on PIP-LQ design with command input anticipation, where ew, uw and xw are the total
weights for the integral-of-error, input and output state variables respectively; and N is the
number of samples into the future required for the command level. If required, the function
also returns the non-minimal state space form state transition matrix F, input vector g,
command input vector d and observation vector h, together with the state weighting matrix
Q and scalar input weight r.

Example

>> v = pipcom(-0.9, [0 0.5], 1, 1, 1, 10)

PIP-LQ control with 10 step ahead command anticipation for a 1st order system with 2
samples pure time delay.

See Also

References and background theory are given in Chapter 8. Related functions include gains,
pip, pipopt, pipcl and piplib.


PIPDEMO1
Demonstration script for Proportional-Integral-Plus (PIP) control system design of a
system with two samples time delay.

Synopsis

pipdemo1

Description

Type the name of this script and press return to run the on-line demo.


PIPDEMO2
Demonstration script for Proportional-Integral-Plus (PIP) control system design for global
CO2 emissions.

Synopsis

pipdemo2

Description

Type the name of this script and press return to run the on-line demo.


PIPDEMO3
Demonstration script for multivariate Proportional-Integral-Plus (PIP) control design.

Synopsis

pipdemo3

Description

Type the name of this script and press return to run the on-line demo.


PIPLIB
SIMULINK library for Proportional-Integral-Plus (PIP) control.

Synopsis

piplib

Description

Type the name of this library and press return to open. Includes blocks for univariate and
multivariable PIP control in both feedback and forward path structure.


PIPOPT
Univariate Proportional-Integral-Plus Linear Quadratic (PIP-LQ) optimal control.

Synopsis

[v, F, g, d, h, Q, r] = pipopt(a, b, ew, uw, xw)

Description

Truncated form system numerator b (with assumed unit delay) and denominator a (with
assumed leading unity) polynomials are used to determine the control gain vector v based
on PIP-LQ design, where ew, uw and xw are the total weights for the integral-of-error,
input and output state variables respectively. If required, the function also returns the non-
minimal state space form state transition matrix F, input vector g, command input vector d
and observation vector h, together with the state weighting matrix Q and scalar input
weight r.

Example

>> v = pipopt(-0.9, [0 0.5], 1, 1, 1)

PIP-LQ control for a 1st order system with 2 samples pure time delay.

See Also

References and background theory are given in Chapter 8. See also Example 8.3. Related
functions include gains, pip, pipcom, pipcl and piplib.


PRBS
Pseudo Random Binary Signal generator.

Synopsis

y = prbs(N, dt, J)

Description

The PRBS y is a function of the switching period dt (an integer > 2) and the state of the
random number generator J (an integer), while N is the length of the series.
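The generation idea can be sketched as follows. This is a hypothetical Python illustration; the toolbox routine's exact sequence for a given J will differ:

```python
import random

def prbs_sketch(N, dt, J):
    """Binary signal of length N whose level (+1 or -1) may only
    switch every dt samples; J seeds the random number generator."""
    rng = random.Random(J)
    y = []
    while len(y) < N:
        y.extend([rng.choice([-1.0, 1.0])] * dt)
    return y[:N]

u = prbs_sketch(20, 4, 1)
print(u[:8])
```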

See Also

Related functions include riv and rivc.


PREPZ
Prepare data for input-output modelling.

Synopsis

[z, m, s] = prepz(z, sel, bas, nz, sc, dt)

Description

The only compulsory input argument is z, a matrix of input-output data in which the first
column is the output and the remaining columns are inputs.

The remaining input arguments are optional, with default values set in case they are not
specified: sel = [st en] selects a sub-sample of the original data for modelling, from st to
en (default is the entire time span); bas tells the function to either subtract the initial value
from each series (1, the default), subtract the mean of the first n values (n), or leave the
series unchanged (0); nz adds nz initial values to each series (default is 0); sc turns on or
off the scaling of all inputs to the same magnitude as the output (default is 0, off); and dt
determines the subsampling interval (default is 1, i.e. no subsampling).

The output arguments are z, the data prepared according to the input values supplied by the
user; m is the vector of subtracted baselines; and s is the vector of input gains. Note that m
and s are typically used as input arguments for the function scaleb.
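The default baseline handling (bas = 1) amounts to subtracting the first sample of each series, which can be sketched as follows (a hypothetical Python illustration, not the toolbox code):

```python
def subtract_initial(z):
    """Subtract the initial value of each column of the data matrix z,
    returning the adjusted data and the vector of baselines."""
    m = list(z[0])
    zp = [[row[j] - m[j] for j in range(len(m))] for row in z]
    return zp, m

# Three samples of an output and one input, column by column:
z = [[10.0, 2.0], [11.0, 2.5], [13.0, 3.0]]
zp, m = subtract_initial(z)
print(zp[0], m)  # first row becomes zeros; m holds the baselines
```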

Example

>> z = prepz([y u1 u2]);

Subtract the initial value from y, u1 and u2.

See Also

References and background theory are given in Chapters 6 and 7. See also Examples 7.2
and 7.3. Related functions include riv, rivid, rivc, rivcid, getpar, scaleb, stand and theta.


RECONST
Reconstructs a series with jumps at intervention points.

Synopsis

yr = reconst(y, Int)

Description

The purpose of this function is to reconstruct the behaviour of a series under the hypothesis
that several given sudden jumps would never have happened.

The first input argument y is the original time series, while Int is a vector of intervention
points at which the jumps are observed. The output argument yr is the modified time
series.

Example

>> reconst([zeros(10,1); ones(10,1)], 11)

This command returns a vector of zeros, because the jump in the data between samples 10
and 11 is removed by the reconst function.

See Also

See also Example 2.3. Related functions include irwsm and dhr.


RIV
Estimation of a backward shift operator multi-input, single-output (MISO) discrete time
transfer function using the Refined Instrumental Variable (RIV) and related algorithms.

Synopsis

[th, stats, e, var, Ps, Pc, y0, AH, AHse, PH, Pr]
= riv(z, nn, flags, a0, P0)

Description

The argument z is a matrix consisting of the output variable in the first column and the
input (or inputs) in the remaining k columns, where k is the number of input variables.

The model orders are passed to the function by means of the nn argument. This is a
vector with the format [na nb nd nc], where na is the order (a scalar) of the common
denominator polynomial; nb is a vector (of dimension k) of orders for all the numerator
polynomials; nd is a vector (similarly of dimension k) of delays for all the inputs in the
model; and nc is the order of the AR model for the perturbations (scalar). These are the
compulsory input arguments, while the rest are optional and are set to default values when
they are not supplied by the user.

The input argument flags is a vector of values [Ni Ft Nr Lr Rc Stb Yini] in which: Ni is
the number of IV/SRIV iterations (set to 3 by default); Ft specifies the filtering, where 1
indicates prefiltering using a stabilised denominator polynomial, 2 uses a stabilised
denominator polynomial for both the instruments and the prefiltering (default) and 0 turns
the filtering operation off; Nr is the number of RIV iterations (for the co-ordination
between the system and noise models); Lr sets the linear regression method to either
standard least squares (0) or a SVD/QR algorithm with tolerance equal to Lr (in case
collinearity problems are suspected); and Rc switches between the en-block (0 default)
and recursive (1) solutions. Also, Stb specifies options for the stabilisation of the filter and
model polynomial, where 0 implies no stabilisation, 1 (default) stabilises the filter and
instrument generation, and 2 stabilises the filter only (enabling estimation of marginally
unstable systems). Finally, Yini specifies the initial conditions, where 0 (default) uses the
initial output values, whereas 1 takes the mean value of the initial output values.

If there are any missing (nan) values in the output, the algorithm automatically switches to
recursive mode. If only some of the values in flags require changing, the rest may be set to
1 (default values). The final input arguments (a0 and P0) are the initial conditions for the
mean and covariance of the parameters. These arguments are ignored in en-block mode.


The first output argument th provides information about the estimated model in the form of
a standard theta matrix (see help theta), from which the estimated polynomials may be
recovered using getpar (see help getpar).

The second output argument (stats) is a vector of various criteria useful for the evaluation
of the model, i.e. [cond(P), YIC, RT2, AIC, S2, o2, EVN, Ybar, RT2AR, YICa, YICt].
Here, YIC, RT2 and AIC are defined by equations (6.46), (6.47) and (4.48). For each
of these criteria, the model fit is determined from the input-output part of the model, while
the RT2AR term above refers to the RT2 for the noise model. In this regard, the overall fit
may be calculated from RT2+RT2AR*(1-RT2). In stats, cond(P) refers to the
conditioning of the Pt covariance matrix, S2 and o2 are the variance of the residuals and
output variable respectively, EVN is the log of the average parameter standard errors and
Ybar is the mean of y0 (see below). Finally, YICa is based on the asymmetric matrix,
while the total YIC using the full quadratic form is given by YICt.
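As a small numerical illustration of the overall fit formula (the values for RT2 and RT2AR below are made up, not from a real run):

```python
# Overall fit combining the input-output model fit RT2 with the fit
# RT2AR of the AR model to the remaining noise (illustrative values).
RT2, RT2AR = 0.90, 0.50
overall = RT2 + RT2AR * (1 - RT2)
print(round(overall, 4))  # → 0.95
```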

The model output errors are stored in e, with variance var. Ps and Pc are the covariance
matrix of the system and noise model parameter estimates respectively. Finally, y0
recovers the interpolated data, i.e. the missing output observations are replaced by the
estimated values.

The remaining outputs are returned for the recursive solution only. The recursive
estimation parameters history is given by AH, with the recursive estimates of the standard
deviations given by AHse. Similarly, PH is the recursive estimates covariance matrix
evolution, e.g. AH(:, 24) is the parameters vector estimate at sample 24, with the standard
deviations AHse(:, 24) and covariance matrix PH(:, 23*size(AH, 1)+(1:size(AH, 1))).
Finally, Pr is the symmetric RIV covariance matrix estimate.

Example

>> th = riv([y u], [2 1 3 0], [-1 0])

The model has one input and there is no noise AR structure. The numerator is of order one
with a delay of three samples, while the denominator is of order two. The model is
estimated using three IV iterations and no filtering.

See Also

References and background theory are given in Chapter 6. Related functions include rivid,
rivc, rivcid, getpar, prepz, scaleb and theta.


RIVC
Estimation of a continuous time Multi-Input, Single-Output (MISO) transfer function
model using the Simplified Refined Instrumental Variable (SRIV) algorithm.

Synopsis

[th, stats, e, thd, statsd] = rivc(z, nn, flags, c)

Description

The argument z is a matrix consisting of the output variable in the first column and the
input (or inputs) in the remaining k columns, where k is the number of input variables.

The model orders are passed to the function by means of the nn argument. This is a
vector with the format [na nb nd], where na is the order (a scalar) of the common
denominator polynomial; nb is a vector (of dimension k) of orders for all the numerator
polynomials; and nd is a vector (similarly of dimension k) of delays for all the inputs in the
model. These are the compulsory input arguments, while the rest are optional and are set to
default values when they are not supplied by the user.

The input argument flags is a vector of values [Ni dt ddt cf] in which: Ni is the number of
SRIV iterations (set to 3 by default); dt is the sampling interval for continuous time
estimation (1); ddt is the sampling interval for initial discrete time identification; and cf
specifies either constant (1, default) or adaptive (0) prefiltering.

The fourth input argument c selects the prefiltering polynomial. In the default case (1), a
discrete time filter is estimated and converted to continuous time. Setting c to zero or a
negative scalar uses a stable first order filter created automatically from this pole value.
This is equivalent to Young's method of multiple filters or the Poisson-Moment-Function
(PMF) method. Finally, to specify the filter directly, set c to the required polynomial (a
vector of length na).

The first output argument th provides information about the estimated model in the form of
a standard theta matrix (see help theta), from which the estimated polynomials may be
recovered using getpar (see help getpar).

The second output argument (stats) is a vector of various criteria useful for the evaluation
of the model, i.e. [cond(P), YIC, RT2, 0, S2, o2, 0, Ybar]. Here, YIC and RT2 are
defined by equations (6.46) and (6.47), while cond(P) refers to the conditioning of the Pt
covariance matrix, S2 and o2 are the variance of the residuals and output variable
respectively, and Ybar is the mean of the output. Note that some elements of stats are zero,
for compatibility with the equivalent output argument of the function riv.

The model output errors are stored in e, while thd is the theta matrix (see help theta) for
the initial discrete time model and statsd is the stats vector for the discrete time model
(see help riv for a definition of this latter stats vector).

Example

>> th = rivc([y u], [2 1 3], [2 -1 -1 1])

For the output data y and input data u, estimate a 2nd order model with 1 numerator
parameter and 3 time delays, using 2 SRIV iterations and a constant, automatically
estimated, prefilter.

See Also

References and background theory are given in Chapter 7. See also Example 7.2. Related
functions include riv, rivid, rivcid, getpar, prepz, scaleb and theta.


RIVCDEMO
Demonstration script for estimation of continuous time transfer function models using the
Simplified Refined Instrumental Variable (SRIV) algorithm.

Synopsis

rivcdemo

Description

Type the name of this script and press return to run the on-line demo.


RIVCID
Identification of a continuous time multi-input, single-output (MISO) transfer function
model using the Simplified Refined Instrumental Variable (SRIV) algorithm, together with
various statistical criteria.

Synopsis

[th, stats, e, rr] = rivcid(z, nn, flags, c)

Description

This function estimates a set of models for a user-specified range of orders, selecting the
best one according to YIC, RT2, AIC or EVN (see below). The function generates a table of
the 20 best models (if this many solutions have converged) according to the selected
criterion. It also provides an optional Graphical User Interface in order to make the
identification procedure simpler. The function complements (and automatically calls as a
subroutine) rivc. For this reason, the input and output arguments of rivcid and rivc are
similar.

The argument z is a matrix consisting of the output variable in the first column and the
input (or inputs) in the remaining k columns, where k is the number of input variables.

The model orders are passed to the function by means of the nn argument. This is a
matrix with the format [nafrom nbfrom ndfrom; nato nbto ndto], where
nafrom and nato are, respectively, the smallest and highest orders (scalars) of the common
denominator polynomial; nbfrom and nbto are row vectors (of dimension k) of the
minimum and maximum orders for all the numerator polynomials; ndfrom and ndto are
vectors (similarly of dimension k) of the minimum and maximum delays for all the inputs
in the model. These are the compulsory input arguments, while the rest are optional and are
set to default values when they are not supplied by the user.

The input argument flags is a vector of values [Ni dt ddt cf Sc] in which: Ni is the number
of SRIV iterations (set to 3 by default); dt is the sampling interval for continuous time
estimation (1); ddt is the sampling interval for initial discrete time identification; and cf
specifies either constant (1, default) or adaptive (0) prefiltering. Finally, Sc specifies the
identification criterion (see below), i.e. YIC (1, default), RT2 (2), AIC (3) or EVN (4). If
only some of the values in flags require changing, the rest may be set to 1 for their default
values.

The fourth input argument c selects the prefiltering polynomial. In the default case (1), a
discrete time filter is estimated and converted to continuous time. Setting c to zero or a
negative scalar uses a stable first order filter created automatically from this pole value.
This is equivalent to Young's method of multiple filters or the Poisson-Moment-Function
(PMF) method. Finally, to specify the filter directly, set c to the required polynomial (a
vector of length na).

The first output argument th provides information about the estimated model in the form of
a standard theta matrix (see help theta), from which the estimated polynomials may be
recovered using getpar (see help getpar).

The second output argument (stats) is a vector of various criteria useful for the evaluation
of the model, i.e. [cond(P), YIC, RT2, 0, S2, o2, 0, Ybar]. Here, YIC and RT2 are
defined by equations (6.46) and (6.47), while cond(P) refers to the conditioning of the Pt
covariance matrix, S2 and o2 are the variance of the residuals and output variable
respectively and Ybar is the mean of the output. Note that some elements of stats are zero,
for compatibility with the equivalent output argument of the function rivid.

The model output errors are stored in e. Finally, rr is the table of the best 20 models
selected, as it appears in the MATLAB command window.

Example

>> th = rivcid([y u], [1 1 1; 3 3 5], [-1 -1 -1 -1 2])

The optimal model is estimated, searching among all the possibilities in the range
[1 1 1] to [3 3 5], with the models sorted in order of RT2. In other words, 1 to 3
denominator and numerator parameters, together with 1 to 5 time delays.

See Also

References and background theory are given in Chapter 7. See also Examples 7.1 and 7.3.
Related functions include riv, rivid, rivc, getpar, prepz, scaleb and theta.


RIVDEMO1
Demonstration script for estimation of backward shift operator discrete time transfer
function models using the Refined Instrumental Variable (RIV) and related algorithms.

Synopsis

rivdemo1

Description

Type the name of this script and press return to run the on-line demo.


RIVDEMO2
Demonstration script for identification of backward shift operator discrete time transfer
function models using the Refined Instrumental Variable (RIV) and related algorithms.

Synopsis

rivdemo2

Description

Type the name of this script and press return to run the on-line demo.


RIVDEMO3
Demonstration script for identification of backward shift operator discrete time transfer
function models using the Refined Instrumental Variable (RIV) and related algorithms
using the graphical interface.

Synopsis

rivdemo3

Description

Type the name of this script and press return to run the on-line demo.


RIVID
Identification of a backward shift operator multi-input, single-output (MISO) discrete time
transfer function using the Refined Instrumental Variable (RIV) and related algorithms,
together with various statistical criteria.

Synopsis

[th, stats, e, var, Ps, Pc, y0, rr] = rivid(z, nn, sc, flags, g)

Description

This function estimates a set of models for a user-specified range of orders, selecting the
best one according to YIC, RT2, AIC or EVN (see below). The function generates a table of
the 20 best models (if this many solutions have converged) according to the selected
criterion. It also provides an optional Graphical User Interface in order to make the
identification procedure simpler. The function complements (and automatically calls as a
subroutine) riv. For this reason, the input and output arguments of rivid and riv are
similar.

The argument z is a matrix consisting of the output variable in the first column and the
input (or inputs) in the remaining k columns, where k is the number of input variables.

The model orders are passed to the function by means of the nn argument. This is a
matrix with the format [nafrom nbfrom ndfrom ncfrom; nato nbto ndto ncto], where
nafrom and nato are, respectively, the smallest and highest orders (scalars) of the common
denominator polynomial; nbfrom and nbto are row vectors (of dimension k) of the
minimum and maximum orders for all the numerator polynomials; ndfrom and ndto are
vectors (similarly of dimension k) of the minimum and maximum delays for all the inputs
in the model; and ncfrom and ncto are the minimum and maximum orders of the AR
model for the perturbations (scalar). These are the compulsory input arguments, while the
rest are optional and are set to default values when they are not supplied by the user.

The set of models estimated comprises all possible combinations of orders, ranging
from the minimum values for each polynomial order to the maximum. For example, if nn
is specified as [0 1 2 0; 2 2 3 0], then the model has a single input and the number of
models estimated is 12, i.e. [0 1 2 0]; [1 1 2 0]; [2 1 2 0]; [0 2 2 0];
[1 2 2 0]; [2 2 2 0]; [0 1 3 0]; [1 1 3 0]; [2 1 3 0]; [0 2 3 0];
[1 2 3 0]; and [2 2 3 0].
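This enumeration is simply a Cartesian product over the order ranges, which can be sketched as follows (a hypothetical Python illustration; the internal ordering rivid uses may differ):

```python
from itertools import product

def order_combinations(nn_from, nn_to):
    """All order vectors between nn_from and nn_to, element by element
    - the search space a routine like rivid works through."""
    ranges = [range(lo, hi + 1) for lo, hi in zip(nn_from, nn_to)]
    return [list(c) for c in product(*ranges)]

combos = order_combinations([0, 1, 2, 0], [2, 2, 3, 0])
print(len(combos))  # → 12
```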


The third input argument sc specifies the identification criterion (see below for details),
i.e. YIC (1, default), RT2 (2), AIC (3) or EVN (4).

The input argument flags is a vector of values [Ni Ft Nr Lr Rc Stb Yini] in which: Ni is
the number of IV/SRIV iterations (set to 3 by default); Ft specifies the filtering, where 1
indicates prefiltering using a stabilised denominator polynomial, 2 uses a stabilised
denominator polynomial for both the instruments and the prefiltering (default) and 0 turns
the filtering operation off; Nr is the number of RIV iterations (for the co-ordination
between the system and noise models); Lr sets the linear regression method to either
standard least squares (0) or a SVD/QR algorithm with tolerance equal to Lr (in case
collinearity problems are suspected); and Rc switches between the en-block (0 default)
and recursive (1) solutions. If there are any missing (nan) values in the output, the
algorithm automatically switches to recursive mode. Also, Stb specifies options for the
stabilisation of the filter and model polynomial, where 0 implies no stabilisation, 1
(default) stabilises the filter and instrument generation, and 2 stabilises the filter only
(enabling estimation of marginally unstable systems). Finally, Yini specifies the initial
conditions, where 0 (default) uses the initial output values, whereas 1 takes the mean value
of the initial output values. If only some of the values in flags require changing, the rest
may be set to 1 for their default values.

Any positive value for the final input argument g activates a basic Graphical User Interface
that makes the selection process easier. It consists of a list of the best models and a plot of
the data with the model fit. The user may change the model and the selection criterion by a
simple mouse click, and a graphical window displays the fit of the highlighted
model. When the return key is pressed, the rivid output arguments are returned to the
workspace in the normal way.

The first output argument th provides information about the estimated model in the form of
a standard theta matrix (see help theta), from which the estimated polynomials may be
recovered using getpar (see help getpar).

The second output argument (stats) is a vector of various criteria useful for the evaluation
of the model, i.e. [cond(P), YIC, RT2, AIC, S2, o2, EVN, Ybar, RT2AR]. Here, YIC,
RT2 and AIC are defined by equations (6.46), (6.47) and (6.48). For each of these
criteria, the model fit is determined from the input-output part of the model, while the
RT2AR term above refers to the RT2 for the noise model. In this regard, the overall fit may
be calculated from RT2+RT2AR*(1-RT2). In stats, cond(P) refers to the conditioning of
the Pt covariance matrix, S2 and o2 are the variance of the residuals and output variable
respectively, EVN is the log of the average parameter standard errors and, finally, Ybar is
the mean of y0 (see below).
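The combination of the two fits can be checked with simple arithmetic. The following is a language-neutral sketch in Python (not toolbox code) of the formula above:

```python
def overall_fit(rt2, rt2ar):
    """Combined coefficient of determination, following the formula
    RT2 + RT2AR*(1 - RT2): the noise (AR) model explains a fraction
    RT2AR of the variance left unexplained by the system model."""
    return rt2 + rt2ar * (1.0 - rt2)

# If the system model explains 90% of the variance and the AR model
# explains 50% of the remainder, the overall fit is 0.95.
print(overall_fit(0.9, 0.5))
```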


The model output errors are stored in e, with variance var; Ps and Pc are the covariance
matrix of the system and noise model parameter estimates respectively; y0 recovers the
interpolated data, i.e. the missing output observations are replaced by the estimated values;
and, finally, rr is the table of the best 20 models selected, as it appears in the MATLAB
command window.

Example

>> th = rivid([y u], [1 1 1 0; 3 3 5 0], 2, [], 1)

The optimal model is estimated, searching among all the possibilities in the range
[1 1 1 0] to [3 3 5 0], with the models sorted in order of RT2. In other words, 1 to 3
denominator and numerator parameters, 1 to 5 time delays and no model for the noise. The
default SRIV algorithm is utilized. In this example, the GUI is activated so that the
preferred model can be selected with a mouse click.

See Also

References and background theory are given in Chapter 6. Related functions include riv,
rivc, rivcid, getpar, prepz, scaleb and theta.


SCALEB
Rescale numerator polynomials.

Synopsis

bt = scaleb(b, s)

Description

The input argument b consists of the numerator polynomials of a Transfer Function model,
while s contains the input scaling factors (see help prepz). The output is the rescaled numerator
polynomials with the steady state gain adjusted to match the original data.

Example

>> [z, m, s] = prepz([y u], [], [], [], 1);


>> [a, b] = getpar(riv(z, [2 1 3 0]));
>> bt = scaleb(b, s)

Here, prepz scales the input to the same numerical range as the output (y). A model is
estimated using riv. Finally, the numerator polynomial is rescaled for use in forecasting or
other applications.
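As an illustration of the bookkeeping involved, the sketch below (Python, hypothetical; the toolbox handles this internally) assumes prepz multiplied the input by the scaling factor s, so each numerator coefficient must be multiplied by s to recover the original units:

```python
def rescale_numerator(b, s):
    """Rescale a numerator polynomial estimated against a scaled input
    (u*s) back to the original input units: each coefficient, and hence
    the steady state gain, is multiplied by s. A hypothetical sketch of
    scaleb's role, assuming prepz scales the input by multiplication."""
    return [bi * s for bi in b]

b_scaled = [0.5, 0.25]                   # estimated on the scaled input
print(rescale_numerator(b_scaled, 2.0))  # [1.0, 0.5]
```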

See Also

Related functions include riv, rivid, rivc, rivcid, getpar, prepz and theta.


SDP
Non-parametric State Dependent Parameter (SDP) analysis and backfitting.

Synopsis

[fit, fitse, par, parse, xs, pars, parses, rsq, nvre, y0]
= sdp(y, z, x, TVP, nvr, opts, P0, x0, nvr0)

Description

The time series vector y (column), together with the associated m regressors z and states x
are specified by the user. Here, z and x have the same number of rows as y, with a column
for each regressor/state. The function automatically handles missing values in y. In fact, y
may be appended with additional NaNs to forecast or backcast beyond the original series.
The remaining input arguments are optional.

TVP is a vector specifying the model associated with each regression parameter, listed in
the same order as the columns of z. Choices are a RW model by default (0) or an IRW
model (1). nvr is a vector of NVR hyperparameters for each regressor where, for example,
zero (default) implies time invariant parameters. Negative values imply that the NVR
hyperparameters are automatically optimised over the first -nvr backfitting steps. opts is a
vector of options: [iter con meth sm ALG plotopt].

Here, iter is the number of backfitting iterations (default 10); con is the backfitting
convergence threshold (0.001); meth is the optimisation estimation method, which may be
either ML (0, default) or, for a positive integer, the sum of squares of the meth-step-ahead
forecasting errors; sm specifies whether FIS smoothing is on (1, default) or off (0:
here, the model fit and estimated parameters are their filtered values, which speeds up the
algorithm and reduces memory usage in cases when smoothing is not required); ALG
selects either the P (0) or Q (1 default) smoothing algorithm (should convergence
problems be encountered, changing the algorithm in this manner may help); and, finally,
plotopt selects graphical display of the results during estimation (1) or turns this option off
(0 default). If only some of the values in opts require changing, the rest may be set to -1
for their default values.

The initial state vector and diagonal of the P-matrix may be specified using x0 and P0,
with default values of 0 and 1e5 respectively. Finally, nvr0 is a vector of initial NVR
hyperparameters, utilised by the first backfitting step (default 0.0001).


If the lengths of TVP, nvr, P0 or x0 are less than m, then they are automatically expanded
to the correct dimensions by using the final element of the specified input vector. For
example, if z has 3 columns but TVP is defined as [1 0], then TVP is automatically
expanded to [1 0 0]. Similarly, a scalar P0 implies an identity matrix scaled by this value.

The function returns the model fit (with the same dimensions as y) and parameters par
(one column for each regressor), together with the associated standard errors in each case,
fitse and parse. It also returns the vectors of sorted states, parameters and standard errors,
xs, pars and parses respectively. Finally, rsq is a measure of model fit (R2), nvre are the
final NVR estimates and y0 the interpolated data, where the latter consists of the original
series with any missing data replaced by the model.

Example

>> sdp(y, [u1 u2], [x1 x2], [0 1], [-2 -1])

Regression type model with two inputs: yt = c1t(x1t)u1t + c2t(x2t)u2t. Here, the first
SDP c1t is represented by a RW, with dependent state x1t, while the second SDP c2t is
represented by an IRW, with dependent state x2t. The NVR for c1t is optimised at the
second backfitting iteration and the NVR for c2t at the first.

See Also

References and background theory are given in Chapter 5. Related functions include fcast,
stand, dlr and dhr.


SDPDEMO
Demonstration script for State Dependent Parameter (SDP) analysis and backfitting.

Synopsis

sdpdemo

Description

Type the name of this script and press return to run the on-line demo.


SHADE
Plot shaded confidence bounds.

Synopsis

H = shade(t, fit, fitse, y)

Description

Graphing shell for plotting model estimates fit, shaded standard errors fitse and data y
against time t, returning the graphics handle H.

Example

>> kfisdemo

The above demo illustrates use of this function.


STAND
Standardise or de-standardise a matrix by columns.

Synopsis

[x, my, sy] = stand(y, my, sy)

Description

When the inputs my and sy are not supplied, this function generates standardised variables
by subtracting the mean from each column of y and dividing by its standard deviation. In
this case, the function returns the standardised version of y in the output argument x,
together with a vector of means (my) and standard deviations (sy).

If, on the contrary, the input vectors or scalars my and sy are included in the call (i.e. there
are three input arguments), then the input matrix y is de-standardised column by column. In
this case, each column of y is multiplied by the standard deviation, the mean is added and
the de-standardised signal is returned as the output argument x.
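The round trip can be sketched for a single column as follows (a minimal Python illustration of the same convention, not the toolbox code):

```python
import math

def stand(y, my=None, sy=None):
    """Standardise y when my/sy are absent; de-standardise when supplied.
    Single-column sketch of the toolbox convention described above."""
    if my is None:
        n = len(y)
        my = sum(y) / n
        sy = math.sqrt(sum((v - my) ** 2 for v in y) / (n - 1))
        return [(v - my) / sy for v in y], my, sy
    return [v * sy + my for v in y]

x, my, sy = stand([2.0, 4.0, 6.0])   # zero mean, unit standard deviation
yy = stand(x, my, sy)                # recovers [2.0, 4.0, 6.0]
```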

Example

>> y = randn(100, 2);


>> [x, my, sy] = stand(y);
>> mean(x), std(x)
>> yy = stand(x, my, sy);

See Also

See also Example 4.2. Related functions include fcast and prepz.


STATIST
Sample descriptive statistics.

Synopsis

tab = statist(y, out)

Description

This function returns various statistics for each column vector of variables in the matrix y.
The list of statistics includes: the number of samples; number of missing observations;
minimum; first quartile; median; third quartile; maximum; mean; geometric mean; range;
interquartile range; standard deviation; variance; mean/standard deviation; skewness;
kurtosis; and the Jarque-Bera test.
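For example, the skewness, kurtosis and Jarque-Bera entries in this list can be computed from sample moments. The Python sketch below uses the standard moment definitions (the toolbox's exact small-sample conventions may differ):

```python
def jarque_bera(y):
    """Jarque-Bera normality statistic n/6 * (S^2 + (K-3)^2 / 4),
    where S and K are the moment-based skewness and kurtosis."""
    n = len(y)
    m = sum(y) / n
    m2 = sum((v - m) ** 2 for v in y) / n
    m3 = sum((v - m) ** 3 for v in y) / n
    m4 = sum((v - m) ** 4 for v in y) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

# A symmetric two-point sample: skewness 0, kurtosis 1, so JB = n/6.
print(jarque_bera([1.0, -1.0, 1.0, -1.0]))
```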

The only compulsory input to this function is y. The second input argument (out) turns
the on screen display of these statistics on (1 default) or off (0).

The output argument tab is simply the table shown in the command window, returned as a
matrix.

Example

>> statist(randn(100, 4))

See Also

Related functions include histon, acf and ccf.


THETA
Information about the Captain Toolbox theta matrix.

Synopsis

theta

Description

In the toolbox, theta is a matrix containing information about the transfer function model
structure, estimated parameters and their estimated accuracy. It is generated by a number
of Captain Toolbox modelling functions. The toolbox function getpar is generally used to
extract the parameter estimates and associated covariance matrix.

See Also

References and background theory are given in Chapter 3. See also Examples 3.2 and 4.2.
Related functions include mar, univ and univopt.


UNIV
Trend + Auto-Regressive (AR) model analysis. Computes the combined Trend + AR
model for user specified perturbation and trend models.

Synopsis

[fit,fitse,trend,trendse,comp,y0] = univ(y, ARp, TVP, nvr, ARt, Int, sm)

Description

The time series vector y (column) is specified by the user. The function automatically
handles missing values in y. In fact, y may be appended with additional NaNs to forecast
or backcast beyond the original series. The remaining input arguments are optional,
although most should normally be specified using outputs from the univopt function.

ARp and ARt are the AR polynomials for the perturbations and the trend models (if
required), while TVP is a scalar specifying the model associated with the trend. Options
for TVP include a RW/AR(1) model by default (0) or an IRW/SRW model (1). nvr is the
NVR value for the trend. Int allows for sharp (discontinuous) local changes in the
parameters at the user supplied intervention points. These need to be defined either
manually or by some detection method for sharp local changes. Here, Int should take the
same dimensions as y, with positive values indicating variance intervention required. FIS
may be turned off by changing sm from its default unity to 0. In this case, the model fit and
estimated parameters are their filtered values. This speeds up the algorithm and reduces
memory usage in cases when smoothing is not required.

By selecting appropriate options for the trend and the perturbations, a wide range of overall
model structures are available using this function, including trend only models (RW, IRW,
IAR, DIAR, etc.), AR models, or various combinations of the two (RW+AR; IRW+AR;
IAR+AR; DIAR+AR; etc.), as discussed in Chapter 3.

The function returns the model fit (with the same dimensions as y), the trend and the
perturbational (AR) component comp, together with the associated standard errors in each case, fitse
and trendse. It also returns the interpolated data y0, where the latter consists of the original
series with any missing data replaced by the model.

Examples

>> fit = univ(y, ARp, 1, 1e-3)

IRW trend model with 1e-3 NVR and polynomial ARp.


>> [fit, fitse, trend, trendse, comp] = univ(y, ARp, 1, nvr, ARt);

DIAR trend + AR model with ARp and ARt polynomials for the perturbations and the
trend components. All the components and standard errors are returned.

See Also

References and background theory are given in Chapter 3. See also Example 3.2. Related
functions include univopt, aic, acf and mar.


UNIVDEMO
Demonstration script for Trend + Auto-Regressive (AR) model.

Synopsis

univdemo

Description

Type the name of this script and press return to run the on-line demo.


UNIVOPT
Hyper-parameter estimation for Trend + Auto-Regressive (AR) models.

Synopsis

[nvr, ARp, ARt, ARpse, ARtse] = univopt(y, p, TVP, nvr0, tar, out, Int)

Description

The time series vector y (column) is specified by the user. The function automatically
handles missing values in y. In fact, y may be appended with additional NaNs to forecast
or backcast beyond the original series. The remaining input arguments are optional,
although in most practical cases they should be specified by the user (see below).

The input argument p is a vector indicating the lags that must be used in the auto-regression
for the perturbational component about the trend. TVP is a scalar specifying the
model associated with the trend. Choices include a RW/AR(1) model by default (0) or an
IRW/SRW model (1). nvr0 is the NVR value for the trend. It is normally constrained by
the user to avoid identification problems. Although univopt provides default values, in
most practical instances these input arguments should be specified by the user, since it is
unlikely that the defaults will yield a sensible model for a given data set (see Chapter 3).

When a IAR/DIAR trend is required, the trend AR model order is governed by tar which
utilizes the same syntax as p above. For IAR trends set TVP to 0, while for DIAR trends
set TVP to 1; in both these cases, tar is the vector of indices for the AR trend polynomial.
The next input argument, out, specifies tabular display of the results (1) or not (0). Finally,
Int allows for sharp (discontinuous) variance intervention for local changes in the trend at
the user supplied intervention points. These need to be defined either manually or by some
detection method for sharp local changes. Here, Int should take the same dimensions as y,
with positive values indicating variance intervention required.

The function returns nvr, the corrected value of the trend NVR based on nvr0 and
the final estimate of the innovations variance. ARp and ARt are the AR polynomials for
the perturbations and the trend AR models, when required. ARpse and ARtse are the
standard errors of the perturbation and trend AR parameters.

Examples

>> [nvr, ARp] = univopt(y, [1:20], 1, 1e-3)

IRW trend + AR(20) model. The trend NVR is constrained to 1e-3.


>> [nvr, ARp, ARt, ARpse, ARtse] = univopt(y, [1 6 12], 1, 1e-3, [1:2])

DIAR trend + AR model. The AR model for the perturbation component contains just the
first, sixth and twelfth lags, while the AR model for the trend is second order (both parameters
are estimated).

See Also

References and background theory are given in Chapter 3. See also Example 3.2. Related
functions include univ, aic, acf, mar, irwsm and irwsmopt.



APPENDIX 2
DATA SETS, FUNCTIONS
AND ABBREVIATIONS

The table below lists the demonstration data sets and user accessible functions included in
CAPTAIN, together with some common abbreviations used in both the text and the on-line
help. The functions are organised by category, corresponding roughly to the chapters of
this book, although the various diagnostic and auxiliary functions are listed separately.

Data (.dat) Description

adv Expenditure and response to advertising (arbitrary units).


air Monthly air passengers 1949-1961 (thousands).
cam River Cam daily sunlight (hours/day), DO (mg/l) and temperature (C).
canningflow Flow data from the Canning River, Western Australia (cumecs).
canningrain Rainfall data from the Canning River, Western Australia (mm).
canningtemp Temperature data from the Canning River, Western Australia (C).
canningtime Time data for the Canning River data.
cars Monthly UK car drivers killed and seriously injured 1969-1984.
chamber Ventilation rate (m/s) and fan voltage (percent) (2s samples).
co2 Monthly CO2 concentration at Mauna Loa (ppm).
gas CO2 concentration (%) and input gas rate.
globalco2 Global CO2 levels (1st column) and emissions.
nile Nile River annual volume 1871-1970 (10^8 cubic meters).
photo CO2 assimilation (umol/m2/s) and ambient CO2 (Pa) (1 minute samples).
sdar Simulated data: sawtooth changing frequency.
sdarx Simulated data: TF model with changing gain.
sdtfm1 Simulated data: TF model with changing time constant.
sdtfm2 Simulated data: TF model with two inputs.
squid Electrical signal from the giant axon of a squid (see Mees et al., 1992).
steel Quarterly UK steel consumption 1953-1992 (thousand tons).
swg01 Simulated data based on Wang and Gawthrop (2001).
traffic Accumulated traffic flow and velocity data for 15 locations (1990).

usemp USA unemployment, investment and expenditure 1948-1998 (%).


usgdp Seasonally adjusted quarterly US Gross Domestic Product 1947-2002
(1e9 1996 US dollars). Bureau of Economic Analysis. National Income
and Product Accounts Tables.
vent Ventilation rate (m3/hour) and fan voltage (%) (2 second samples).
wind Motor speed data from a winding pilot plant (0.01 second samples).

Function (.m) Unobserved Component Models

dhr Dynamic Harmonic Regression (DHR) analysis.


dhropt DHR hyper-parameter estimation.
irwsm Integrated Random Walk (IRW) smoothing.
irwsmopt IRW hyper-parameter estimation.
univ Trend with AR component.
univopt Trend with AR hyper-parameter estimation.

Function (.m) Time Variable and State Dependent Parameter Models

dar Dynamic Auto-Regression and time frequency analysis.


daropt DAR hyper-parameter estimation.
darsp DAR spectra plot.
darx DAR eXogenous variables analysis.
darxopt DARX hyper-parameter estimation.
dlr Dynamic Linear Regression (DLR) analysis.
dlropt DLR hyper-parameter estimation.
dtfm Dynamic Transfer Function (DTF) analysis.
dtfmopt DTF hyper-parameter estimation.
sdp State Dependent Parameter (SDP) analysis.

Function (.m) Transfer Function Models

getpar Extract parameters from theta matrix.


riv Discrete-time Transfer Function (TF) model estimation.
rivid Discrete-time TF order identification.
rivc Continuous-time TF model estimation.
rivcid Continuous-time TF order identification.

Function (.m) Identification and Diagnostics

acf Autocorrelation and Partial Autocorrelation.


aic Akaike Information Criterion (AIC).
arspec Auto-Regression (AR) spectrum.
boxcox Box-Cox transformation for homoskedasticity.
ccf Sample Cross-Correlation Function.
cusum Cusum recursive test for time varying mean.
histon Histogram superimposed over Normal distribution.
mar Auto-Regressive model estimation.


period Periodogram estimation.


statist Sample descriptive statistics.

Function (.m) Univariate control system design

dlqri Iterative linear quadratic regulator design


gains Proportional-Integral-Plus polynomials
nmssform Non-Minimal State Space form
pip PIP pole assignment
pipcl PIP closed-loop transfer functions
pipcom PIP with command input anticipation
piplib Simulink library for PIP control
pipopt PIP Linear Quadratic optimal

Function (.m) Multivariate control system design

mfdform Matrix Fraction Description form


mfd2nmss Multivariable non-minimum state space form
mpipinit Initialise Simulink diagram
mpipqr Linear Quadratic weightings for PIP control

Function (.m) Auxiliary Functions

createth Creates theta matrix from parameters.


del Matrix of delayed variables.
fcast Prepare data for forecasting.
kalmanfis Kalman Filter and Fixed Interval Smoother.
mcpar Parameters for Monte Carlo analysis.
prbs Pseudo Random Binary Signal generator.
prepz Prepare data for input-output modelling.
reconst Reconstructs a series with jumps.
scaleb Rescale numerator polynomials after using prepz.
shade Plot shaded confidence bounds.
stand Standardise or de-standardise matrix.
theta Information about the theta matrix.

Function (.m) Demos.

captdemo Demonstrations and background information.


chapt1 Figures from Chapter 1 of handbook.
chapt2 Figures from Chapter 2 of handbook.
chapt3 Figures from Chapter 3 of handbook.
chapt4 Figures from Chapter 4 of handbook.
chapt5 Figures from Chapter 5 of handbook.
chapt6 Figures from Chapter 6 of handbook.
chapt7 Figures from Chapter 7 of handbook.
chapt8 Figures from Chapter 8 of handbook.


dardemo DAR command line demo (sdar.dat).


darxdemo DARX command line demo (sdarx.dat).
dlrdemo DLR command line demo (adv.dat).
dhrdemo DHR command line demo (air.dat).
dtfmdemo1 DTFM command line demo (sdtfm1.dat).
dtfmdemo2 DTFM command line demo (sdtfm2.dat).
kfisdemo KALMANFIS command line demo (canningxxxx.dat).
pipdemo1 Univariate PIP control design.
pipdemo2 PIP control design for global CO2 (globalco2.dat).
pipdemo3 Multivariable PIP control design.
rivcdemo RIVC command line demo (wind.dat).
rivdemo1 RIV command line demo (simulated data).
rivdemo2 RIVID command line demo (simulated data).
rivdemo3 RIVID command line demo (photo.dat).
sdpdemo SDP command line demo (simulated data).
univdemo UNIV command line demo (air.dat).

Model (.mdl) Library and demos.

piplib SIMULINK library for PIP control.


delaypip PIP control example (called by pipdemo1).
driveopen Coupled drives in open-loop (pipdemo3)
drivepip Coupled drives in closed-loop (pipdemo3)
chapt5sim Chapter 5 nonlinear model.
chapt8sim Chapter 8 PIP control.

Abbreviation In full

ACF Autocorrelation Function.


AIC Akaike Information Criterion.
AR(p) Auto-Regression model (pth order).
ARMA AutoRegressive-Moving Average.
CCF Cross Correlation Function.
COD Coefficient of Determination.
BOD Biochemical Oxygen Demand.
BSM Basic Structural Model.
DAR Dynamic Auto-Regression.
DARX Dynamic Auto-Regression with eXogenous variables.
DBM Data Based Mechanistic (analysis).
DHR Dynamic Harmonic Regression.
DIAR Double Integrated Auto-Regressive.
DLR Dynamic Linear Regression.
DO Dissolved Oxygen.
DTF Dynamic Transfer Function.
FIS Fixed Interval Smoother.
FISIV Fixed Interval Smoothing Instrumental Variable.


GRW Generalised Random Walk.


HP Hodrick-Prescott (filter).
IAR Integrated Auto-Regressive.
IRW Integrated Random Walk.
IV Instrumental Variable.
KF Kalman Filter.
LLT Local Linear Trend.
LQ Linear Quadratic.
LS Least Squares.
MFD Matrix Fraction Description.
MISO Multiple Input Single Output.
MIMO Multi Input Multi Output.
ML Maximum Likelihood.
NAN Not-A-Number.
NLS Nonlinear Least Squares.
NMSS Non-Minimum State Space.
NVR Noise Variance Ratio.
PACF Partial Autocorrelation Function.
PIP Proportional-Integral-Plus.
RIV Recursive Instrumental Variable.
RW Random Walk.
SDP State Dependent Parameter.
SISO Single Input Single Output.
SPECMAP Spectral Mapping.
SRIV Simplified Recursive Instrumental Variable.
SRW Smoothed Random Walk.
TDC True Digital Control.
SS State Space.
TF Transfer Function.
TVP Time Variable Parameter.
UC Unobserved Components.
WLS Weighted Least Squares.
YIC Young Identification Criterion.

