Recursive Subspace Identification with Predictive Control
- a nuclear norm approach

Literature Survey

Bhagyashri Telsang
February 14, 2016

Delft Center for Systems and Control
Faculty of Mechanical, Maritime and Materials Engineering (3mE), Delft University of Technology

Copyright. All rights reserved.

Table of Contents

Preface

1 Introduction
  1-1 Goals of the survey
  1-2 Structure of the report

2 Subspace Identification
  2-1 Nuclear norm based subspace identification
    2-1-1 Nuclear norm for identification
    2-1-2 Nuclear norm optimization methods
    2-1-3 Robust subspace identification
    2-1-4 Selection of the hyper-parameters for nuclear norm optimization
    2-1-5 Summary
  2-2 Recursive subspace identification
  2-3 Update/downdate

3 Controller formulation and validation
  3-1 Adaptive model predictive control
  3-2 Benchmark models

4 Conclusion and Future Work
  4-1 Future Work

Bibliography

Glossary
  List of Acronyms


Preface

This document reports the work carried out as a literature survey, which is preliminary to the thesis. The main objective of this work is to analyze the development of the considered fields and their interconnections, and in the process, to point out potential directions of future development, which will form the topic of the thesis.
I would like to thank dr.ir. Jan-Willem van Wingerden and Prof.dr.ir. M. Verhaegen for giving me the opportunity to carry out this survey, which gave me good insight into the field. I would also like to thank Sachin Navalkar for his constant help and supervision throughout the duration of the work and the compilation of this document.

Delft University of Technology


February 14, 2016


Chapter 1
Introduction

Until the introduction of the state-space representation of systems in 1960 by Rudolf Kalman, control design techniques were mostly based on frequency or step response analysis. The development of the state-space representation not only made the analysis of Multi Input Multi Output (MIMO) systems easier but also laid the foundation for state-space model based controllers. The Linear-Quadratic optimal control technique was the cornerstone for model-based controller design techniques [1]. The formulation of such controllers pushed the control community to address issues in then non-classical fields like process control. The classical fields in which the control community was active were generally electrical, mechanical, aerospace or electro-mechanical, for which obtaining a plant model was relatively easy. The extension of
control into the then new non-classical fields created the need for obtaining models without
physical modeling of the system. System identification developed in order to address this
need. Along with the rise of system identification, this need also led to the development
of data-based control techniques which either use the model obtained from identification or
formulate the controller directly using plant data without the use of a model. The focus of
this document will be on the former development.
Much of the work in the field of statistics for time-series identification was carried out in the
19th century. However, in the control community, system identification developed as a field
of its own with the introduction of two seminal papers - [2] and [3], each paving the way for
two different branches: Subspace IDentification (SID) and Prediction Error Methods (PEM).
Plenty of different methods were being proposed, due to which the field looked more like a bag of tricks than a unified subject [4]. However, eventual analysis revealed that all of them can be classified into either subspace techniques or prediction-error methods [1]. This unstructured, rapid development of the field required the techniques to be categorized in terms of the steps involved and the approach taken. The survey by Åström in [4] characterizes the identification problem using three quantities: a class of models, a class of input signals and a performance criterion. This characterization is more formally
introduced, in [5], as three basic steps of an identification cycle as shown in Figure 1-1. An
appropriate experiment setup is first designed to acquire the data from the system that has to
be identified. The data has to be such that it captures the behavior of the system accurately.

Figure 1-1: Steps carried out in a system identification cycle, [5]

In order to ensure that the behavior is captured at all frequencies, the condition of persistency of excitation, first introduced in [3], is enforced on the input to the system. Using prior knowledge, a model structure, like ARX, ARMA, ARMAX etc., has to be chosen. The
criteria for the minimization of the cost function must be chosen. However, it is important to
note that this particular breakdown of the steps of identification applies to PEM; subspace
methods typically do not use a particular model structure. More on the characterization of subspace methods will be dealt with in Chapter 2. With these three main characterizations, marked in red in Figure 1-1, the identification cycle is implemented. The process can
be iterative if the estimated model does not pass the validation test. Different criteria and
techniques to validate a model are given in Section 3-2.
The purpose of identification can either be for analysis of the system or obtaining a model
in order to control the system. Irrespective of the purpose, the wide characterization of the
process of identification remains the same, although different factors need to be considered
based on the purpose. If identification is carried out in order to achieve control of the system,
then there lies an inherent assumption that the design can be divided into two steps: identification and control [4]. This assumption, known as the Separation Hypothesis, must be verified for every design. The assumption implies that the results of the identification are used in controller design. The validation of the Separation Hypothesis must be carried out in order to obtain answers to the following questions:
- Are the results of the identification accurate enough in order to achieve satisfactory control action?
- Will the plant remain unchanged from the time when it is identified till the time when the control action is delivered?

Unfortunately, much theory for this validation does not exist as of today. However, a sound
mathematical theory exists for the Separation Principle which facilitates the design of a
feedback controller using the states estimated by an observer. For example, if the observer gain satisfies the Doyle-Stein condition, then the separation principle holds for that particular design. Just as the Separation Hypothesis divides the design into identification and control, the Separation Principle divides the design into observer design and controller design. The question of whether the same theory extends to identification and control remains to be answered.
The purpose of identification plays a role also in evaluating the accuracy of the obtained
model. If identification is carried out for the analysis of the underlying system, then the
accuracy should be judged based on the obtained model parameters. If the purpose is to
control the system then, under the validity of separation hypothesis, the accuracy should be
judged based on the output response of the system.
Based on the different aspects involved in the characterization of the identification technique,
distinct parametric and non-parametric types of identification were developed. As the name
suggests, parametric techniques express the criterion in terms of the model parameters. Most criteria are based on the prediction error of the underlying system, with the model parameters treated as optimization variables so as to minimize this error. This formed the class of PEM, taking its roots in the maximum likelihood method proposed in [3]. However,
class of PEM, taking its roots in the maximum likelihood method proposed in [3]. However,
this branch was mainly developed by Ljung in [5], who changed the outlook towards the
field by viewing identification as a design problem, in the sense that search for the "closest"
model instead of the true model was performed. He clearly separated two independent
concepts: the parametric model structure and the identification criterion, thus laying down
the prediction-error framework. The steps for identification as given in Figure 1-1 hold for
PEM but not for SID techniques.
The main drawback of PEM is that they are not easily extendable to MIMO systems; SID methods emerged as a solution to this. The focus of this document will be on SID
methods, which are introduced and reviewed in detail in Chapter 2. A study of the most recent developments in subspace identification reveals that nuclear norm based techniques are
rapidly making their place in the literature. Use of the nuclear norm in SID facilitates the
combination of two steps in SID methods, namely estimation of a high order model and model
order reduction, into one step through rank minimization. The subspace resulting from the
rank minimization can be directly used to formulate a control law, without the knowledge of
the model order. The field of model predictive control provides a framework to carry out the
same.

1-1 Goals of the survey

The purpose of this literature survey is to review the fields of nuclear norm SID and predictive control, and to point out the open questions in them. The main goal of this survey is to identify gaps in the literature and accordingly propose different alternatives to close these gaps. Based on the study of existing techniques, the following goals are defined.
Based on the study of existing techniques, the following goals are defined.
Goal 1: Review the recent developments in the field of SID and identify the areas where
development is necessary.

Goal 2: Choose a controller design technique to suit the requirements posed by the
identification method and review the developments in that field.
Goal 3: Accordingly, formulate and motivate the problem statement for the thesis
project.

1-2 Structure of the report

Review and analysis of the developments in subspace identification is taken up in Chapter 2, which is necessary to identify the key places where there is room for improvement. Section 2-1 first reviews the evolution of subspace identification techniques that use the nuclear norm as the minimization criterion. It further points out the general structure of usage of the
nuclear norm and explains a breakdown of the structure. The section concludes by listing
different algorithms to solve the nuclear norm problem and various applications that benefit
from the use of nuclear norm. Section 2-2 reviews the development of recursive techniques
in subspace identification. This is done by first analyzing the requirements of a recursive
algorithm, and then comparing different existing techniques through their advantages and
disadvantages. Alternatives to SVD using which a subspace can be tracked are listed and
studied in Section 2-3.
Chapter 3 focuses on the control and validation of the design. Starting with the motivation to use Model Predictive Control (MPC) in this design, Section 3-1 briefly notes the developments in MPC from its basic techniques to currently existing adaptive solutions. Methods for evaluating the performance of the to-be-formulated method are surveyed in Section 3-2.
Based on the review of the progress in the considered fields, the direction of further development is identified in Chapter 4, and accordingly a proposal for a thesis topic is made. Based on the entire survey, the future work that needs to be carried out is listed at the end.


Chapter 2
Subspace Identification

Although the modeling and analysis of MIMO systems could be easily carried out using the state-space description of systems introduced in the 1960s, MIMO identification was still an open problem until the late 1980s [1]. PEM did not facilitate identification of MIMO systems in a straightforward manner, unlike the SISO case. This was mainly because most PEM were based on a transfer function model of the system, which is limited to SISO systems. Subspace identification methods emerged mainly to fill this gap. Major developments in this regard happened in the early 1990s; the introduction of the three main SID methods - N4SID, MOESP and CVA in [6], [7] and [8] respectively - gave rise to the field of
subspace identification. In these methods, unlike PEM, a particular model structure is not
chosen in order to carry out the identification. The fact that most SID methods are based on
state-space representation of the system facilitates their straightforward extension to MIMO
identification as well.
Another angle of comparison between PEM and SID methods is by examination of their
optimality. While [9] states that there are missing links in the guaranteed optimality of
subspace methods, comparison of PEM and SID in [10] confirms the suboptimal nature of
subspace methods through experiments. SID methods do not have a performance criterion
to be minimized, whereas PEM focus on minimizing a criterion globally but are susceptible
to getting stuck in local minima. Hence, although theoretically prediction-error methods are
more accurate than subspace methods, the same result does not always hold when used in
practical situations.
Most SID methods consist of the following steps [11]:
1. Regression or projection: estimation of high-order models
2. Model reduction: the obtained high-order model is reduced to a lower dimensional subspace
3. Parameter estimation: realization of the system matrices
Algorithms such as MOESP and N4SID follow the pattern of the above defined steps; their
comparison can be found in [12]. These techniques require estimation of the model order of
the underlying system. This is done either by examination of dominant singular values or
by some model order estimation criteria like Akaike Information Criterion (AIC). Then the
estimated order is used in further steps of identification for realization of the system matrices.
A main drawback of these methods is that they are limited to open-loop identification. This
is because of the assumption of no correlation between noise and the input, which is true for
open-loop systems but not for closed-loop systems. Hence, when these techniques are used
for identification when data is acquired in closed loop, the resulting estimates are biased.
This was the main reason for the development of the branch of Predictor-Based Subspace
IDentification (PBSID) methods. ARX modeling and subspace identification were combined
to result in a technique called SSARX, [13]. The method in [14] avoided the problem
of biased estimates by estimating the innovation sequence so that there is no correlation
term between input and noise. These methods combine the idea of PEM and SID and yield
techniques that result in unbiased estimates for both open and closed loop identification.
As seen before, another feature of these subspace methods is that the estimation of the order of the system is done manually, by examination of the dominant singular values. This can be automated by combining the identification with rank minimization. In the process, the optimal order of the system is computed and utilized accordingly. The optimality of the order depends on the definition of the problem; roughly, it is the trade-off between model complexity (implying high order) and accuracy of identification. This trade-off can be mathematically included in the identification problem by the use of a rank minimization term through the nuclear norm, as given in the next section.
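As a small illustration of the manual step being automated (a hypothetical NumPy sketch, not taken from the cited works), the order can be picked as the number of singular values above a chosen relative threshold instead of by visual inspection; the matrix, noise level and 5% threshold below are illustrative assumptions:

```python
import numpy as np

# Hypothetical example: a rank-3 estimate corrupted by small noise
rng = np.random.default_rng(0)
true_rank = 3
G = rng.standard_normal((20, true_rank)) @ rng.standard_normal((true_rank, 20))
G_noisy = G + 0.01 * rng.standard_normal((20, 20))

# Automated version of "inspecting the dominant singular values":
# keep every singular value above 5% of the largest one
s = np.linalg.svd(G_noisy, compute_uv=False)
order = int(np.sum(s > 0.05 * s[0]))
print(order)
```

With a clear gap between the dominant and noise-induced singular values, the threshold recovers the underlying rank; in an identification algorithm this step would be replaced by the rank minimization discussed next.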

2-1 Nuclear norm based subspace identification

A class of subspace identification techniques that use the nuclear norm to obtain a lower dimensional subspace describing the underlying system has been proposed recently. Development in this direction started with [15], in which it was proved that the nuclear norm can be used as a heuristic for the rank of a matrix. Reduction of the rank, or model complexity, is a desirable feature as low order models are easier to handle. But if the reduced order is significantly
smaller than the actual system order, then the estimated model will deviate too much from
the actual plant.
The amount of rank minimization while maintaining desired accuracy can be quantified
through regularization by including rank minimization as only a part of the optimization
criterion. In other words, the use of regularization retains the parameters of the system that
are significant (analogous to dominant singular values) while considering accuracy [16]. In
a broad sense, regularization is using additional information in order to solve ill-conditioned
problems. In the case of system identification, maintaining the desired output (and hence
accuracy) can be considered as the additional information.
Regularization can be introduced in different mathematical forms - for example, Tikhonov is
an L2 regularization. Use of a prior probability in identification carried out in the Bayesian
framework is another type of regularization, which gives lower probability to more complex
models. Nuclear norm, employed for rank minimization, is an L1 type regularization. The
heuristic of using nuclear norm for rank minimization is attractive primarily because it forms
a convex envelope of the rank function. This property can be exploited in order to obtain a convex optimization problem as the minimization criterion in identification.
The nuclear norm of a matrix, also called the Ky Fan or trace norm, is defined as the sum of its singular values. The rank minimization problem, which in terms of the l0 norm reads min ||σ(X)||_0 with σ(X) the vector of singular values of X, can be approximated by the l1 relaxation min ||σ(X)||_1, which is exactly the nuclear norm. This approximation is useful because it eliminates the l0 norm, whose minimization is regarded as an NP-hard problem, by relaxing it to the convex l1 norm.
The attractive mathematical features and well established properties of the l1 norm stem
from the basic fact that the norm is a summation function for which elementary operations
like differentiation can be performed easily. This feature is desirable in an optimization
problem as the search for minimum can be carried out analytically. However, approximation
of the rank minimization problem by use of l1 norm in place of l0 norm penalizes the larger
singular values more than the smaller ones. In order to address this imbalance, weighting is
considered. Also, the use of weights enables extra degrees of freedom in manipulating the
nature of the problem to have the desired mathematical properties. However, one must be
careful with the choice of weights as use of inappropriate weights might cause loss of convex
properties in the heuristic [17].
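As a quick numerical illustration (a NumPy sketch, not part of any cited method; the matrix is an illustrative random example), the nuclear norm is simply the l1 norm of the vector of singular values, while the rank is its l0 norm:

```python
import numpy as np

# A rank-2 matrix built as the product of two thin random factors
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))

sigma = np.linalg.svd(A, compute_uv=False)   # singular values of A
nuclear = sigma.sum()                        # ||A||_* = sum of singular values
rank = int(np.sum(sigma > 1e-10))            # l0 "norm" of sigma

print(rank)  # 2
print(bool(np.isclose(nuclear, np.linalg.norm(sigma, 1))))  # True
```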

2-1-1 Nuclear norm for identification

Most of the nuclear norm based subspace identification techniques combine the first two steps:
regression/projection and model order reduction, by employing the nuclear norm. For a
given number of measurements, it was proved in [18] that the low rank solution describing
the underlying system can be recovered by solving Eq. (2-1). All the techniques use variations
of Eq. (2-1) as the objective function of the minimization criterion.
    min_ŷ  ||F(ŷ)||_* + λ ||ŷ − y||_2^2        (2-1)

where F(ŷ) is a function whose rank has to be minimized while keeping ŷ (the estimated value) close to y (the measured value), with the trade-off governed by the regularization parameter λ. Different choices of F(ŷ) describe the different subspaces pertaining to the underlying system, whose order will be minimized. Once the optimization is complete, the system matrices (and Kalman gain) can be explicitly computed using the obtained lower dimensional subspace F(ŷ).
All the techniques differ in their choice of:
- the function F(ŷ),
- the optimization algorithm employed to solve Eq. (2-1),
- the regularization parameter λ, or
- the method employed to extract the system matrices.
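As a concrete, minimal sketch (not taken from the surveyed papers), the structure of Eq. (2-1) can be illustrated in NumPy for the simple hypothetical choice where F(ŷ) is a Hankel matrix built from ŷ, solved with a basic ADMM splitting of the kind used in [19] and [20]; the function names, the test signal and all parameter values are illustrative assumptions:

```python
import numpy as np

def hankel(y, m):
    # m x (len(y)-m+1) Hankel matrix of the sequence y
    n = len(y) - m + 1
    return np.array([[y[i + j] for j in range(n)] for i in range(m)])

def hankel_adjoint(X):
    # adjoint of the Hankel map: sum entries along anti-diagonals
    m, n = X.shape
    y = np.zeros(m + n - 1)
    for i in range(m):
        for j in range(n):
            y[i + j] += X[i, j]
    return y

def svt(X, tau):
    # singular value soft-thresholding: proximal operator of the nuclear norm
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def nn_denoise(y, m, lam=10.0, rho=1.0, iters=200):
    # ADMM on: min_yhat ||H(yhat)||_* + lam * ||yhat - y||_2^2,
    # splitting X = H(yhat) with scaled dual variable U
    counts = hankel_adjoint(np.ones((m, len(y) - m + 1)))  # multiplicity of each y_k
    yhat, U = y.copy(), np.zeros((m, len(y) - m + 1))
    for _ in range(iters):
        X = svt(hankel(yhat, m) - U, 1.0 / rho)            # X-update
        yhat = (2 * lam * y + rho * hankel_adjoint(X + U)) / (2 * lam + rho * counts)
        U = U + X - hankel(yhat, m)                        # dual update
    return yhat

# Noisy exponentially damped sinusoid: H(y_clean) is low rank
t = np.arange(30)
rng = np.random.default_rng(0)
y = np.exp(-0.05 * t) * np.sin(0.8 * t) + 0.05 * rng.standard_normal(30)
yhat = nn_denoise(y, m=10, lam=10.0)

def objective(z, lam=10.0, m=10):
    return np.linalg.svd(hankel(z, m), compute_uv=False).sum() + lam * np.sum((z - y) ** 2)

print(round(objective(y) - objective(yhat), 4))
```

Since the measurement y itself is feasible, any improvement in the objective corresponds to a Hankel matrix of lower nuclear norm at a controlled distance from the data, which is exactly the trade-off encoded by λ in Eq. (2-1).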
In [19], F(ŷ) is a weighted product of a projection of the Hankel matrix of the output y, as given in Eq. (2-2)
Table 2-1: Comparison of different SID methods (the weighting matrices W1 and W2 and the dimension (p, q) of F(ŷ) for MOESP, N4SID, IVM and CVA; the exact entries are given in [19]). For MOESP and N4SID the dimension (p, q) is (r, N), while for IVM and CVA it is (r, 2s).

    F(ŷ) = W1 (1/N) Y_{s,r,N} Π_{U⊥_{s,r,N}} Φ^T W2        (2-2)

where W1 and W2 are the weighting matrices, Y_{s,r,N} is the Hankel matrix of y, Π_{U⊥_{s,r,N}} is a projection matrix onto the nullspace of U_{s,r,N}, and Φ = [U_{0,s,N}; Y_{0,s,N}] is the instrumental variable containing past inputs and outputs. Depending on the choice of time indexes in the Hankel matrix and the weighting matrices W1 and W2, different methods like N4SID, MOESP, CVA or the Instrumental Variable Method (IVM) can be arrived at, as given in [19]. Table 2-1 lists the weighting matrices for these SID methods and also gives the dimension of the function F for each method. It can be observed from Eq. (2-2) that the dimension of F(ŷ) ∈ R^{p×q} depends on the weighting matrices. As F forms a part of the optimization problem, the computational complexity of the algorithm in turn depends on the size of F(ŷ). As s is much smaller than N, IVM and CVA are computationally more favorable than MOESP and N4SID.

For N2SID in [20], the function considered is

    F(ŷ) = Ys − Tu,s Us − Ty,s Ys

which is the product of the extended observability matrix and the state sequence, hence representing the column space of the system. N2SID is preferred to the identification method proposed in [19] because it does not use any projections and hence uses all of the available information. Here, Tu,s and Ty,s are lower triangular Toeplitz matrices containing the system matrices and Kalman gain describing the system in innovation form. The Ys, Tu,s and Ty,s obtained from the solution of Eq. (2-1) give an approximation of the underlying system. Note that the model order
reduction step is not required here because the approximation is already low rank. This is
due to the use of nuclear norm in the objective function of the optimization problem. Using
SVD, the model order is computed as the number of dominant singular values and is used
in the computation of system matrices. This step can be skipped if knowledge of the system
matrices is not explicitly required for the controller design. More on this topic will be dealt with in Section 2-3.

2-1-2 Nuclear norm optimization methods

The minimization problem in Eq. (2-1) can be solved using Semi-Definite Programming (SDP) solvers, as in [21]. One of the standard algorithms for SDP problems was proposed in [22], which uses an interior-point method to minimize the nuclear norm function by exploiting the problem structure. However, due to the scaling problems of SDP solvers, leading to increased
computational burden, several alternative methods have been proposed. In both [19] and [20], the Alternating Direction Method of Multipliers (ADMM) is used. A slightly different approach to solving the nuclear norm minimization problem was proposed in [23]. This method uses an Augmented Lagrangian with ADMM, which results in a first-order iterative algorithm, thus avoiding the computational complexity of interior-point methods.
To address the nuclear norm minimization problem at large scales, for which interior-point methods are not amenable, an algorithm based on the SVD was proposed in [24]. The presented method, which turned out to be a powerful tool, is called the Singular Value Thresholding (SVT) algorithm. As the name suggests, the method imposes soft-thresholding on the singular values of the matrix whose nuclear norm has to be minimized. Consider the SVD of a matrix X as
X = UΣV^T. The soft-thresholding operator D_τ on X is defined as:

    D_τ(X) = U D_τ(Σ) V^T,    D_τ(Σ) = diag({σ_i − τ}_+)        (2-3)

In other words, each singular value σ_i that is greater than τ is retained, and the corresponding difference σ_i − τ is utilized; the remaining singular values are set to zero. This effectively shrinks the singular values towards zero; hence the operator, called the shrinkage operator, in effect reduces the nuclear norm of X. This operator is the crux of the algorithm, and the resulting overall method is first-order.
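The shrinkage operator of Eq. (2-3) is only a few lines of NumPy; the following sketch (with an illustrative random matrix, not data from any cited work) shows that it shifts every singular value down by τ and clips at zero:

```python
import numpy as np

def svt(X, tau):
    # D_tau from Eq. (2-3): soft-threshold the singular values of X by tau
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
Y = svt(X, 1.0)

s_before = np.linalg.svd(X, compute_uv=False)
s_after = np.linalg.svd(Y, compute_uv=False)
print(bool(np.allclose(s_after, np.maximum(s_before - 1.0, 0.0))))  # True
```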
An approach that uses gradient-type minimization was proposed in [25]. It involves the construction of a search point in order to accelerate the convergence rate. However, the accelerated gradient-type method fails to solve large-scale problems. To address this issue, an algorithm that uses active subspace selection was proposed in [26]. This method alternates between identifying active row and column subspaces, and solving the reduced problem within the active subspace. It is also shown that the active subspace never changes in the neighborhood of a global minimum. This approach opened up more solutions in the area of matrix completion for large-scale problems. For example, a nuclear norm regularized technique which relies on soft-thresholding of the SVD was presented in [27].
For an appropriately weighted l1 norm, as mentioned earlier, the convex properties are maintained. A class of algorithms, Iterative Reweighted Least Squares (IRLS-p) for 0 ≤ p ≤ 1, has been proved to recover the low rank matrices defining the underlying system. Convergence analysis for IRLS-p is given in [28]. Along with the proof of convergence, the paper mentions that a necessary and sufficient condition for low rank recovery is the Null Space Property, which is defined as follows.
Given τ > 0, the linear map A : R^{m×n} → R^p satisfies the Null Space Property of order r if for every Z ∈ N(A) \ {0} we have

    σ_1(Z) + · · · + σ_r(Z) < τ (σ_{r+1}(Z) + · · · + σ_n(Z))        (2-4)

where N(A) denotes the null space of A and σ_i(Z) denotes the i-th largest singular value of Z.
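The inequality in Eq. (2-4) is easy to evaluate numerically for a single matrix (checking the full property would require testing every nonzero Z in the null space of A); the helper below and its diagonal test matrix are a hypothetical illustration:

```python
import numpy as np

def nsp_inequality(Z, r, tau):
    # Check the inequality of Eq. (2-4) for one matrix Z:
    # sum of the r largest singular values < tau * sum of the rest
    s = np.linalg.svd(Z, compute_uv=False)
    return bool(s[:r].sum() < tau * s[r:].sum())

Z = np.diag([5.0, 1.0])          # singular values are 5 and 1
print(nsp_inequality(Z, 1, 10.0))  # True:  5 < 10 * 1
print(nsp_inequality(Z, 1, 2.0))   # False: 5 >= 2 * 1
```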
When this condition is satisfied, the paper proves that IRLS-1 is guaranteed to recover the
low rank matrix. A different version, sIRLS-p (s for short) is also proposed in the paper. The
proposed technique is a first-order method for locally minimizing a smooth approximation to the rank function.
On the other hand, when the use of certain weights makes the optimization problem non-convex, there are solutions to address the non-convexity. To solve such concave problems, an algorithm called NESTA was presented in [17]. The proposed solution is based on Nesterov's algorithm and uses Singular Value Thresholding to locally find the minimum at each iteration.

2-1-3 Robust subspace identification

Similar in nature to matrix completion problems, the nuclear norm can also be employed in cases where the identification dataset has some missing points or outliers. Such techniques fall under the category of robust subspace identification with respect to data, [29] and [30]. The approach taken in [30] is similar to the minimization strategy used in [19], except that the main aim initially is to recover the missing measured values in y. This is done by solving the minimization with the strict constraint ŷ = y on the available measurements, without guaranteeing a low rank solution. Once the missing values are recovered, the entire set is used for nuclear norm minimization.
In [29], an error term v is introduced in the minimization function Eq. (2-1) such that v captures the outliers and missing values in the measured dataset y. Nuclear norm minimization takes place simultaneously with the estimation of the error term, by having both ŷ and v as optimization variables. The paper also gives a method to compute upper and lower bounds on the regularization parameter λ such that the estimation is optimal.

2-1-4 Selection of the hyper-parameters for nuclear norm optimization

More generic, kernel-based strategies for choosing the regularization parameter and weighting matrices are given in [31]. The scheme for finding appropriate weighting matrices is based
on oblique projection of the future output data matrix on the row space generated by near
past input data matrix. Once the weighting matrices are found, they are kept fixed and a
minimization is performed to obtain the regularization parameters. This strategy is based
on a method of identification introduced in [32], which is in the Bayesian framework. In
this method, a Gaussian prior is formulated on the unknown impulse response of the underlying system. Through the estimation of hyper-parameters defining the prior, unknown
impulse response parameters are estimated. This method also accommodates translation to lower dimensional spaces through the projection of nonparametric estimates onto a suitable low dimensional parametric space.
Based on the concept of estimating hyper-parameters instead of directly estimating impulse response parameters, an identification method was proposed in [33]. Unlike the other methods discussed in this section, the heuristic employed here for rank minimization is a non-convex function: the l0 norm of x, ||x||_0, is replaced by the log-det heuristic, Σ_i log|x_i|, which is a concave function of x. After including this heuristic for fixed regularization parameters in the objective function, an analytic minimizer is first obtained. Then the regularization parameters are treated as hyper-parameters and are estimated using marginal likelihood maximization.
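As a tiny numerical illustration of the surrogate (the vector and the smoothing constant eps are hypothetical; eps is added so that zero entries do not produce log(0)):

```python
import numpy as np

eps = 1e-6
x = np.array([3.0, 0.5, 0.0, 0.0])

l0 = int(np.count_nonzero(x))                     # ||x||_0: number of nonzeros
logdet = float(np.sum(np.log(np.abs(x) + eps)))   # concave log-det surrogate

print(l0)  # 2
```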

2-1-5 Summary

In this section, the nuclear norm heuristic for the rank minimization problem was considered and different methods to solve it were examined. However, due to their computationally heavy nature, these methods cannot be easily implemented in real-time; this poses serious limitations on their use in an adaptive controller, where identification has to be performed on-line. To overcome this issue, internal changes in the algorithms are also needed. To that end, it is desirable to first understand the requirements and general structure of recursive algorithms. Hence, it is important to survey some recursive subspace identification algorithms, which need not be based on the nuclear norm.

2-2 Recursive subspace identification

System identification plays a crucial role in most types of adaptive controllers, which require
the underlying system to be identified in real-time. As this is done online, generally at every
time instant, it is necessary to employ an identification algorithm that has low computational
complexity. However, most subspace identification algorithms involve the computation of an
SVD, which makes them too computationally expensive to implement at every time instant.
This issue was recognized in the area of subspace identification soon after the most important
SID methods were published in the 1990s, and a number of solutions were proposed.
In the field of signal processing, an algorithm called Projection Approximation Subspace
Tracking (PAST) was proposed in [34] to recursively track the underlying subspace using the
measurements from every time instant. As the name suggests, this is an approximation
algorithm: W^T(t)x(t) is approximated as W^T(t−1)x(t), where W(t) is the signal subspace
and x(t) is the measurement vector at time instant t. Computation of the signal subspace
W(t), through use of the SVD, at time instant t is avoided by using the signal subspace from
the last time instant as an approximation for the current subspace. Before moving on to the
next time instant the subspace is updated as W(t) = W(t−1) + c(t), where c(t) is a correction
term. This approximation causes a loss of orthonormality of the signal subspace, the deviation
depending on the SNR of the measurements. The algorithm is computationally efficient as
there is no use of the SVD or any other computationally cumbersome operation. Although
the algorithm employs a forgetting factor in order to afford tracking capability in a
non-stationary environment, the underlying assumption in the method, that the signal subspace
does not change drastically at every time instant, might not always hold.
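The PAST recursion above can be sketched in a few lines. This is a minimal illustration,
assuming an RLS-style formulation of the update W(t) = W(t−1) + c(t) as in [34]; the
variable names and toy dimensions are chosen here for illustration.

```python
import numpy as np

def past_update(W, P, x, beta=0.99):
    """One PAST step: W^T(t)x(t) is approximated by W^T(t-1)x(t), and the
    subspace estimate W (n x r) is corrected as W(t) = W(t-1) + c(t).
    P is the r x r inverse correlation matrix of the projections y = W^T x,
    and beta is the forgetting factor for non-stationary data."""
    y = W.T @ x                       # projection onto the previous subspace
    h = P @ y
    g = h / (beta + y @ h)            # RLS-type gain
    e = x - W @ y                     # residual, drives the correction c(t)
    W_new = W + np.outer(e, g)        # rank-one correction of the subspace
    P_new = (P - np.outer(g, h)) / beta
    return W_new, P_new
```

Tracking data that lies in a 2-dimensional subspace of R^5 for a few hundred steps drives the
columns of W toward a basis of that subspace, which is no longer exactly orthonormal, as
noted above.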
Nevertheless, due to its attractive computational complexity, PAST was introduced in the
system identification community for different variants of MOESP [35]. As in MOESP, this
recursive technique computes an RQ factorization of the past input-output data vector. But
this factorization is not computed explicitly at every time instant, instead it is updated using
an appropriate set of Givens rotations. Hence, the only input required for this algorithm is
the measurement data vector that becomes available every time instant. Another buildup of
the PAST algorithm in the field of system identification was proposed in [36]. This algorithm
adopts gradient type subspace tracking to search for a global minimizer. It uses the same
minimization problem as in PAST, but the modifications involved in this algorithm render it
more robust to the choice of initial values than the corresponding PAST one.
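The idea of updating a factorization with Givens rotations instead of refactorizing can be
illustrated with a generic row-append update of a triangular factor. This is a sketch of the
mechanism, not the exact RQ recursion of [35]; the function names are chosen here for
illustration.

```python
import numpy as np

def givens(a, b):
    """Return (c, s) such that [c s; -s c] applied to (a, b) zeroes b."""
    if b == 0.0:
        return 1.0, 0.0
    r = np.hypot(a, b)
    return a / r, b / r

def r_update(R, x):
    """Update the upper-triangular factor R (with A^T A = R^T R) when a new
    data row x is appended to A, using one sweep of Givens rotations
    instead of recomputing the factorization from scratch."""
    R = R.copy()
    x = np.array(x, dtype=float)
    for i in range(R.shape[0]):
        c, s = givens(R[i, i], x[i])            # rotation zeroing x[i]
        Ri, xi = R[i, i:].copy(), x[i:].copy()
        R[i, i:] = c * Ri + s * xi
        x[i:] = -s * Ri + c * xi
    return R
```

One sweep costs O(n^2) for n columns, against O(mn^2) for refactorizing the full m x n data
matrix, which is the computational saving such recursions rely on.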

As all the modifications of PAST still involve approximation, a recursive subspace
identification technique using the propagator method was proposed in [37]. Let a system with state
s(t) ∈ R^n and output z(t) ∈ R^M be described by Eq. (2-5):

z(t) = Γ s(t) + b(t)    (2-5)

where b(t) ∈ R^M is the measurement noise and Γ ∈ R^{M×n} is the observability matrix.
Assume that the rank n is known. Split Γ and z(t) as

Γ = [ Γ1 ]  } n            z(t) = [ z1(t) ]  } n
    [ Γ2 ]  } M−n                 [ z2(t) ]  } M−n

such that Γ1 consists of the n independent rows of Γ. As the rank is n, Γ2 is linearly dependent
on Γ1 and the relationship can be expressed as Γ2 = P^T Γ1, where P is the propagator. This
implies a splitting of the observation space into signal and noise subspaces. When there is no
measurement noise, Eq. (2-6) holds and the propagator can be computed directly:

z2(t) = P^T z1(t)    (2-6)

In the presence of noise the relation is not straightforward, and the cost function in
Eq. (2-7) has to be minimized in order to estimate the propagator:

J(P) = E ||z2 − P^T z1||^2    (2-7)

where E is the expectation operator. The propagator P is unique because of the convex
nature of the cost function J. This method results in biased estimates when the noise
covariance matrix is not proportional to the identity matrix. To address this issue, an
improvement was proposed in [38]; the new method is called EIVPM. A detailed
comparison of the computational complexity of PAST and different variants of the Propagator
Method (PM) is given in Table 8 in [39]. From that information, it is
clear that both PAST and PM have a computational complexity lower than that of the SVD,
hence meeting the main requirement of a recursive method. Further, it can be concluded
that, despite not being an approximation algorithm, PM has a lower computational complexity
than the corresponding PAST variants.
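The propagator estimate itself reduces to a linear least-squares problem. A minimal sketch,
replacing the expectation in J(P) by a sample average over a batch of output snapshots (the
batch form and the function name are assumptions made here for illustration; [37] carries out
the minimization recursively):

```python
import numpy as np

def estimate_propagator(Z, n):
    """Estimate the propagator P from a batch of output snapshots.
    Z is M x T (one column per time instant); the first n rows are taken
    as z1 and the remaining M-n rows as z2.  P minimizes the sampled
    version of J(P) = E||z2 - P^T z1||^2."""
    Z1, Z2 = Z[:n], Z[n:]
    # least-squares solution of Z2 = P^T Z1, i.e. P^T = Z2 Z1^+
    Pt = Z2 @ np.linalg.pinv(Z1)
    return Pt.T
```

In the noise-free case this recovers P exactly, so that Γ2 = P^T Γ1; with noise whose
covariance is not proportional to the identity, the bias mentioned above appears and
EIVPM [38] is preferable.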
Apart from being an approximation algorithm, PAST cannot handle incomplete observations.
To fill this gap, the PETRELS algorithm was proposed in [40]. It is a second-order stochastic
gradient descent algorithm when the observed data is partial, and reduces to PAST otherwise.
In this approach, the underlying low-dimensional subspace is identified by minimizing a
geometrically discounted sum of projection residuals on the observed entries at each time
instant. If missing entries are required, they are reconstructed using least-squares estimation.
In other words, this algorithm involves matrix completion in the initial steps to handle sparse
data. Although it has nice convergence properties like PAST, it is also an approximation
algorithm.
Table 2-2: Summary of a few recursive algorithms

Algorithm   Approximation   Features
PAST        Yes             Computational complexity lower than SVD
PM          No              Computational complexity lower than PAST
PETRELS     Yes             Build-up of PAST to handle incomplete observations
GROUSE      No              Handles sparse data but fails if the data is noisy or has outliers
GRASTA      No              Build-up of GROUSE to solve all of its problems

To address the issue of sparse data, GROUSE was proposed in [41]. Instead of obtaining
full-dimensional data and then finding a lower-dimensional subspace to describe it, GROUSE
looks for a lower-dimensional subspace directly from the incomplete data. This is not an
approximation algorithm, but it assumes that the low rank value is known. The set of all
subspaces of a particular dimension is known as a Grassmannian. Along the gradient of the
employed cost function, a short curve in the Grassmannian of the assumed low rank is followed
to find the required subspace. As this algorithm performs first-order stochastic gradient
descent on the Grassmannian, it needs only a rank-one update per iteration. But due to the
existence of barriers in its search path, GROUSE faces the risk of getting trapped in local
minima. Another potential problem is that its performance strongly depends on proper tuning
of the step size. However, the main limiting factor of the algorithm is the form of the objective
function it employs, which is l2-norm based. Due to this, it faces problems when the data has
outliers or is corrupted with noise.
To overcome the problems associated with GROUSE, GRASTA was proposed in [42]. It is
based on the l1 norm, thus eliminating the problems faced in the presence of outliers and
noise. A method to adaptively vary the step size is proposed along with it. To solve the
optimization problem, it uses an Augmented Lagrangian with ADMM. Note that the nuclear
norm is the l1 norm of the singular values, so both GRASTA and the nuclear norm subspace
identification methods are based on l1 minimization.
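The rank-one geodesic step of GROUSE described above can be sketched as follows. This is
a simplified illustration using the greedy step angle atan(||r||/||p||); the exact step-size rules
are in [41], and the function name is chosen here for illustration.

```python
import numpy as np

def grouse_step(U, idx, x_obs):
    """One GROUSE-style update: a rank-one geodesic step on the
    Grassmannian of d-dimensional subspaces of R^n, driven by a partially
    observed vector.  U is n x d with orthonormal columns, idx lists the
    observed coordinates and x_obs their values."""
    w = np.linalg.lstsq(U[idx], x_obs, rcond=None)[0]  # fit on observed rows
    p = U @ w                                          # prediction, all rows
    r = np.zeros(U.shape[0])
    r[idx] = x_obs - p[idx]                            # zero-padded residual
    pn, rn = np.linalg.norm(p), np.linalg.norm(r)
    wn = np.linalg.norm(w)
    if rn < 1e-12 or wn < 1e-12:
        return U                                       # already consistent
    theta = np.arctan(rn / pn)                         # greedy geodesic step
    return U + np.outer((np.cos(theta) - 1.0) * p / pn
                        + np.sin(theta) * r / rn, w / wn)
```

Because the residual r is orthogonal to the columns of U, each step preserves orthonormality
of U exactly, in contrast to the PAST approximation.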

Table 2-2 summarizes the algorithms mentioned in this section along with their key
features. A pattern observed in the literature on recursive identification is that many
build-up algorithms are proposed in order to overcome problems associated with an original
algorithm. However, in the process the computational complexity often increases; generally
the new algorithm has a complexity greater than or equal to that of the old one, so it is
important to note the complexity of the new algorithm explicitly.

Different recursive algorithms that keep track of the underlying subspace were surveyed
in this section. It was noted that all the algorithms try to bypass the use of the SVD in
order to have favorable computational complexity. However, the nuclear norm heuristic
involves computation of an SVD, so when implementing it recursively, bypassing the SVD to
track the subspace is not feasible (unless the nuclear norm is implemented via the log-det
heuristic). Therefore, it is necessary to find alternatives that give the same results as the
SVD but have lower computational complexity. Review and analysis of such alternatives is
the topic of the next section.
2-3 Update/downdate

To carry out recursive identification, new measurement data becomes available at every time
instant, and the subspace representing the underlying system needs to be recomputed
accordingly. In order to maintain a constant size of the involved matrices, for every new
measurement that gets appended, the oldest measurement must be discarded. The computed
subspace should remain accurate as the measurement values change.
For example, consider the implementation of N2SID in [21]. The input and output data
measurements are expressed in the Hankel structure. Consider t = N as the current time
instant. At this time, the Hankel expression of the input is:

         [ u(1)      u(2)      ...   u(N−s+1) ]
         [ u(2)      u(3)      ...   u(N−s+2) ]
Us(t) =  [  :         :                 :     ]
         [ u(s)      u(s+1)    ...   u(N)     ]
In the next time instant, at t = N + 1, new measurement data u(N + 1) becomes available
and has to be included in the Hankel form. In order to append the latest data and maintain
a constant size of the Hankel matrix, the oldest data must be discarded. Accordingly, the
Hankel structure at t = N + 1 will be:

         [ u(2)      u(3)      ...   u(N−s+2) ]
         [ u(3)      u(4)      ...   u(N−s+3) ]
Us(t) =  [  :         :                 :     ]
         [ u(s+1)    u(s+2)    ...   u(N+1)   ]
This change at every time instant can be viewed as discarding the first column, shifting
the remaining columns to the left, and appending a last column containing the latest data.
Under these changes the Hankel structure is preserved, and this property must be exploited. All
the involved measurement Hankel matrices are changed in the same manner at every time
instant.
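The shift described above, dropping the oldest column and appending a new last column that
reuses all but one entry of the previous column, can be sketched as:

```python
import numpy as np

def hankel(u, s):
    """Build the s-row Hankel matrix Us of the sequence u."""
    N = len(u)
    return np.array([u[i:N - s + 1 + i] for i in range(s)])

def shift_hankel(H, u_new):
    """Slide the Hankel window one time instant: discard the first column,
    shift left, and append a last column ending in the latest sample."""
    col = np.concatenate([H[1:, -1], [u_new]])   # new column reuses old data
    return np.column_stack([H[:, 1:], col])
```

Only the single new sample enters the matrix; everything else is a cheap shift, which is
exactly the structure an update/downdate scheme should exploit.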
The underlying subspace must be known at all time instants. One way is to compute
an SVD at every time instant and perform the entire procedure; but this is computationally
expensive, and hence an alternative method has to be used. The other way is
to keep track of the subspace. The main aim here is to compute the initial subspace (use of
this computationally expensive step is allowed initially as this is a one-time procedure), then
track it throughout just by updates and downdates, without explicitly performing SVD-like
operations at every time instant. The main requirement, apart from maintaining accuracy, is
that the update-downdate procedure should be computationally favorable.
Keeping this in mind, several alternatives to SVD that compute the underlying subspace
and track it are looked into. Consider a matrix X whose subspace needs to be computed
and tracked, but the available matrix X′ is corrupted by noise V as X′ = X + V. Obtaining
the SVD of X is straightforward; the accuracy of the SVD of X′ is questionable when the small
singular values of X are comparable to or smaller than the noise level [43]. Therefore, the first
step in obtaining an accurate decomposition is to separate the small singular values from the noise

level. This can be achieved by having a large number of measurements. In [43], it is pointed
out that doubling the number of measurements increases the smallest singular value three-fold,
making it easier to distinguish from the noise level.
Obtaining extreme singular values and vectors in the presence of noise was dealt with in [44],
which concludes that the Lanczos-type bidiagonalization algorithm is more efficient, especially
if the matrix under consideration has a structure like Toeplitz or Hankel. Unfortunately, most
papers that present an alternative to this algorithm do not give a comparison of the two.
A parallel approach to obtaining the decomposition is the rank-revealing QR factorization, as
in [45], where the matrix is decomposed as XΠ = QR. Here Π is a permutation matrix and R is
an upper-triangular matrix which groups the large and small singular values of X in blocks.
Using Π and R, different subspaces of X can be computed. When appending new data and
discarding old data, the permutation matrix Π is updated using an incremental condition
estimate and R is updated using hyperbolic Householder transformations, which are J-unitary.
An intermediary between SVD and QR decomposition is URV which was proposed in [46].
In order to distinguish small singular values of X from noise, a tolerance level can be used.
A quantitative choice for computation of the tolerance is formulated in terms of a forgetting
factor and a noise size. The URV method decomposes X as in Eq. (2-8). U and V contain
the singular vectors of X, and unlike SVD the intermediate matrix containing the singular
values is not in diagonal form. R contains the large singular values and F and G are such
that they satisfy Eq. (2-9).
"

R
X=U
0
q

F
VT
G

||F ||2 + ||G||2 tolerance

(2-8)

(2-9)

Singular vectors are updated by the use of plane rotations and singular values are updated
using condition estimates. The reliability of this decomposition depends on the underlying
condition estimator used. A related method, the ULV decomposition, was proposed in [47].
The main difference is that in this decomposition the matrix containing the singular values
is lower-triangular. Although the paper claims that ULV results in a higher-quality estimate
of the null space, it does not include any conclusive proof of this.
The QR updating scheme is extended to subspace tracking in [48]. The derived fast SVD
updating technique is a combination of QR updating and a Jacobi-type SVD. The method
accommodates subspace tracking in a direct manner by updating the involved matrices
through rotations alone. Even to maintain orthonormality, the method uses rotations instead
of more expensive reorthogonalization procedures. While the SVD has a computational
complexity of O(n^3), this method is O(n^2) while revealing all the information that the SVD
does. This algorithm is free of the use of condition estimates in all of its steps.
A method, although less cited, for approximating a low-rank matrix was proposed in [49].
The presented method does not use condition estimates in any step; instead it uses a J-unitary
matrix and a 2-norm approximation to obtain a corresponding low-rank matrix. The J-unitary
matrix under consideration is obtained by a set of hyperbolic rotations, and it is crucial to
keep track of the signature of these matrices. The signature here is the sign of the elements
in the matrix, which in turn imply the direction of rotation. The computational complexity of
the method is similar to that of a QR factorization, and the method directly produces a
description of the column space of the approximant.
The main reason for consideration of the SVD in detail is because of its use in nuclear norm
minimization which is of immense importance in this survey. Consider the implementation
of ADMM in [19] and the minimizer X in Eq. (14) of the paper. The related equations from
the paper are repeated here for clarity. The singular values σ, obtained from Eq. (2-10), that
exceed the threshold 1/t are used in forming the minimizer:

A(x) − B + Z/t = U diag(σ) V^T    (2-10)

X = U diag(max(0, σ − 1/t)) V^T    (2-11)

Observe that this operator is similar in nature to the Singular Value Thresholding operator
that was dealt with in detail in Section 2-1. Therefore, operations like those in [24] can be
used here for faster implementation. However, even those operations rely on the computation
of a full or partial SVD. As the associated singular vectors are not explicitly used and only
the soft-thresholding operator is required, the step of computing the SVD can be skipped
altogether. One such algorithm that directly produces the result of the soft-thresholding
operator is presented in [50]. The algorithm uses only matrix inversions and additions, and
has a computation time less than half that of a full SVD.
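For reference, Eq. (2-11) is the singular value soft-thresholding (proximal) operator of the
nuclear norm. A direct sketch, computed here with a plain SVD for clarity, even though
methods such as [50] produce the same result without forming one:

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding: shrink the singular values of M by
    tau and drop those that become non-positive, as in Eq. (2-11) with
    tau = 1/t.  This is the proximal operator of tau * (nuclear norm)."""
    U, sig, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(sig - tau, 0.0)) @ Vt
```

Replacing this routine by an SVD-free equivalent is exactly the kind of internal change a
recursive nuclear norm method needs.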
A recursive subspace identification algorithm that employs nuclear norm variant of PBSID as
the cost function was recently proposed in [51]. Using ADMM as the optimization method,
the algorithm updates the Markov parameters defining the underlying system by nuclear
norm minimization at every time instant. The computational burden of computing an SVD at
each time instant is reduced here by thresholding the singular values, resulting in lower
overall complexity.
Starting with an introduction to subspace identification, developments in SID were reviewed.
Nuclear norm SID was studied, after which different recursive subspace techniques were
surveyed. With the aim of implementing N2SID recursively, different alternatives to the SVD were noted
in this section. The next part of the survey focuses on formulating a control methodology
for an identified system. Following that, the document lists out ways to validate a newly
proposed algorithm, and in the process surveys different available benchmark models.

Chapter 3
Controller formulation and validation

During the review of developments in the field of subspace identification, a few aspects
that still need to be formulated were highlighted. A few years after subspace identification
techniques were introduced in the early 1990s, it was noticed that they cannot be implemented
recursively with favorable computational complexity. Hence, by the end of the decade,
identification methods were proposed that were either recursive versions of previously proposed
subspace identification algorithms [35] or entirely new [37]; a
detailed review is given in Section 2-2.
Around the same time, in the early 2000s, [15] proved that rank minimization can be replaced
by a nuclear norm heuristic. This led to a series of developments in the field of identification,
as noted in Section 2-1. However, recursive versions of these nuclear norm based subspace
identification algorithms have not yet appeared in the literature. This is mainly because the
nuclear norm is based on singular values, and recursive computation of the SVD is expensive.
A fast way to compute SVD or an alternative to SVD is necessary in order to formulate a
recursive version of nuclear norm subspace identification techniques.
An attractive feature of subspace identification techniques is that the system matrices are
explicitly recovered from the computed extended observability matrix or an equivalent
subspace, depending on the kind of technique under consideration. Note that this recovery is
generally the last step of the algorithm, and it is only for this step that knowledge of the
system order is required. If identification is performed for the purpose of model analysis,
then it is necessary to extract the system matrices so that parameters can be evaluated. If
the purpose of identification is to control the underlying system, then explicit computation
of the system matrices is not required, provided an appropriate controller design methodology
is selected.
Section 3-1 shows that Model Predictive Control is one such technique where the control law
can be formulated directly in terms of the acquired subspace, without having to compute the
system matrices explicitly. The to-be formulated design should be tested on a standard model
or an experimental setup so that its effectiveness can be analyzed and quantified. In order
to do so, the standard model should first be decided on, which is done through a survey in
Section 3-2.
3-1 Adaptive model predictive control

As seen at the beginning of Chapter 2, SID methods involve three steps, the third being the
extraction of the system matrices. This step requires knowledge of the order of the system
in order to define the matrix dimensions. The system order has to be determined either
through examination of the dominant singular values or by using model order estimation
criteria. Either choice adds extra operations and hence computation, so it is desirable to
skip this step unless it is absolutely necessary.
If the purpose of identification is analysis of the underlying system, then this step is crucial.
But if identification is carried out in order to control the system, then this step is not
essential, as we will see in this section.
Most controller designs, like PID, require knowledge of the system model, for which the order
has to be estimated. Model Predictive Control (MPC) generally assumes knowledge of the
system matrices, but enables controller design even when the system matrices are not
explicitly known. Consider the discrete-time system given in Eq. (3-1).
x(k+1) = Ax(k) + Bu(k)
y(k)   = Cx(k) + Du(k)    (3-1)

The typical method of designing a control law for such a system in the MPC framework
involves stacking the outputs and inputs from the current time instant k over the next Np
time instants, where Np is called the prediction horizon. The stacked output Y can then be
expressed as

Y = C̄ x(k) + D̄ U    (3-2)

where

Y = [ y(k); y(k+1); ... ; y(k+Np) ],    U = [ u(k); u(k+1); ... ; u(k+Np) ],

C̄ = [ C; CA; CA^2; ... ; CA^Np ]

and D̄ is the lower block-triangular Toeplitz matrix of Markov parameters

      [ D            0     ...   0 ]
      [ CB           D     ...   0 ]
D̄ =   [ CAB          CB    ...   0 ]
      [ :                  ...   : ]
      [ CA^(Np−1)B   ...   CB    D ]

For the stacked system, the objective function is defined in terms of the requirements and
constraints. For an unconstrained system with the objective of reference tracking, a general
optimization function is defined as

J = (Yref − Y)^T (Yref − Y) + U^T U    (3-3)

where Yref is a stacked vector of desired output values. The control law is derived as the
U that minimizes the objective function given in Eq. (3-3). With an initial estimate of the
state, the minimizing control input vector is derived at each time instant and only the current
value is implemented. The process is repeated at each time instant, and hence this principle
is called the receding horizon principle.
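The construction above can be sketched as follows. The names Cbar and Dbar stand for C̄
and D̄, and lam generalizes the unit control-effort weight of Eq. (3-3); all names are chosen
here for illustration.

```python
import numpy as np

def prediction_matrices(A, B, C, D, Np):
    """Build Cbar and Dbar in Y = Cbar x(k) + Dbar U (Eq. (3-2)) for the
    system of Eq. (3-1), stacking y(k) through y(k+Np)."""
    p, m = C.shape[0], B.shape[1]
    N = Np + 1
    Cbar = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(N)])
    Dbar = np.zeros((N * p, N * m))
    for i in range(N):                     # block row
        for j in range(i + 1):             # lower block-triangular Toeplitz
            blk = D if i == j else C @ np.linalg.matrix_power(A, i - j - 1) @ B
            Dbar[i * p:(i + 1) * p, j * m:(j + 1) * m] = blk
    return Cbar, Dbar

def mpc_law(Cbar, Dbar, x, Yref, lam=1.0):
    """Unconstrained minimizer of ||Yref - Y||^2 + lam ||U||^2; with
    lam = 1 this is the minimizer of Eq. (3-3)."""
    G = Dbar.T @ Dbar + lam * np.eye(Dbar.shape[1])
    return np.linalg.solve(G, Dbar.T @ (Yref - Cbar @ x))
```

Note that only C, the powers CA^i, and the Markov parameters CA^(i)B enter the design;
this is the property that SPC exploits by taking them directly from the identification step.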
The important point to notice here is that the control law is in terms of C̄ and D̄, not directly
in terms of the system matrices A, B, C, D. This is a very attractive feature because the model order

need not be computed, and the extraction of system matrices during identification can be
skipped. Not only does this save computation, but it also avoids approximating the model order. This
feature is the backbone of controller design in Subspace Predictive Control (SPC) which was
first introduced in [52]. It is precisely because of this feature that the use of MPC is considered
here.
The field of MPC started mainly in process control industries to handle constraints, for
example to enforce quality and production rates. The structure of implementation of MPC
allowed it to deal easily with multivariable systems as well. In its early years, MPC used
methods based on impulse and step response parameters to model and control the system under
consideration. Dynamic Matrix Control (DMC), proposed in [53], and Generalized Predictive
Control (GPC), proposed in [54], were the most widely used control methodologies. A detailed
history and comparison of the two methods can be found in the survey [55].
MPC, as we see today, is mostly based on state-space representation of the underlying system.
Extension from step response parameters based to state-space based methods was brought
about by various papers like [56]. Interpretation of MPC in the state-space framework allows
straightforward generalization to multivariable complex cases with general stochastic disturbances and measurement noise. The paper [57] proposed one such interpretation of MPC
in the state-space framework but it involved solving a Riccati equation of potentially large
order. The order is generally equal to the number of step response coefficients times the
number of outputs, and hence large. Another attempt to bridge the gap was proposed
in [56], which makes use of the structure in the step response model to reduce the order of
the Riccati equation. While this method is for the infinite-horizon case, it is also applicable
to the finite-horizon case, but at the cost of steady-state error. It also makes provision for
handling integrated and doubly integrated white noise disturbances in a simple manner.
After about 15 years of MPC being recognized as a field, the survey paper [55] pointed
out the key issues in the field by analyzing the history of MPC and connecting it with the
status at the time. This brought to light various issues that need to be considered
while designing the controller, closed-loop stability being one such issue. The paper pointed
out that an algorithm which minimizes an open-loop objective function might inadvertently
drive the closed-loop system to instability. This is because of employing the receding horizon
principle due to which the implemented control inputs might differ from the control input
sequence computed at a particular time. This could be taken care of by introducing slack
variables to soften the hard constraints or by choosing a sufficiently large horizon. For an
inherently unstable system, it is necessary to precalculate the feasible region in the presence
of input saturation constraints. Some algorithms for doing the same were proposed in [58]
and [59].
As the name suggests, design of MPC relies on the availability of a model for the system
to be controlled. This requires the model to accurately represent the underlying system.
Unfortunately this is not always possible. Hence there was a need to design the controller
while considering model uncertainties, thus combining robust control with MPC. The Campo
algorithm proposed in [60] did not bridge all the gaps, as it did not guarantee robust
stability. Another approach to the problem of unavailable accurate models is
to use an adaptive controller in which repeated identification is performed. One such robust
MPC was proposed in [61], which involves a subspace predictor to obtain information about
the underlying system indirectly, in terms of Markov parameters rather than explicit
state-space matrices. The control methodology employed assigns weights through the use
of a mixed-sensitivity objective function and derives the controller that minimizes its
H∞ norm. The control law is ensured to be feasible without driving the control inputs beyond
the acceptable range. This design methodology does not consider model uncertainties,
but accuracy is maintained through repeated identification.
An adaptive strategy is useful not only for linear systems with uncertain models, but also
for nonlinear systems. One such adaptive strategy to control nonlinear systems was proposed
in [62]. The underlying idea in this technique is linearizing the nonlinear model at different
operating regions and designing a controller for each region. The final control input is taken as
a weighted average of all the controllers, the weights reflecting the current region of operation.
The adaptive nature of this technique lies in the choice of weighting at each control instant.
However, the methodology is redundant with the use of multiple models for one nonlinear
system. Moreover, if the system reaches a region of operation that is not defined by any
combination of the previously identified models, then satisfactory control action cannot be
attained without performing identification again. This results in a loss of the adaptive nature
of the algorithm in such situations.
Several other papers addressed the need for an adaptive controller; [63] proposed
adaptive Internal Model Control. In this technique, the controller is parameterized in terms
of the model of the system, hence the name internal model, using Youla parameterization.
The adaptive formulation of this method requires on-line estimation of the coefficients of the
internal model and the control law is in terms of these estimated parameters. Adaptive
nature here reflects the identification of the model online; the controller parameterizations
stay fixed. The paper also presents a robust version of the technique along with proving its
asymptotic stability properties.
Eventually, subspace-based MPC designs evolved. One such method was proposed in [64],
which makes use of the latent-variable method Principal Component Analysis (PCA) in order
to model the system in a reduced-dimensional space.
MPC is also suitable for time-varying systems, for which the design can rely on the receding
horizon principle or on recursive identification. To address the stability issue associated with
the design of the controller, the use of constraints is a reliable and accepted approach; proofs
can be found in [65] and [66]. One such technique that ensures stability using constraints was
proposed in [67]. This method identifies the system on-line, updating the model only when
the prediction error improves after the update. This method also uses subspace identification
to model the system.
To deal with the issue of requiring an accurate model for the design of MPC, a set of
methods was proposed that does not explicitly use a model of the system. An adaptive
control method based only on an output estimator, and not on a model, was proposed in
[68]. In this method, the output ŷk is estimated directly from the past input uk−1 and past
output yk−1 as ŷk = â yk−1 + b̂ uk−1, where the parameters â and b̂ are adaptively adjusted.
Using the estimated output and the parameters, a control law for the future is derived by
optimization.
Most of the controller design methodologies listed above require the system matrices explicitly.
The example shown in the beginning of this section demonstrates that an explicit model is not
necessary for the design of a control law. This stream of MPC, called model-free design or

SPC, was first proposed in [52]. SPC, in its original form, consists of system identification
from data retrieved under open-loop conditions, which, if used under closed-loop conditions,
results in biased estimates. Hence, as a solution to this, a control technique for a system
identified under closed-loop conditions was proposed by the same authors in [69]. This
technique combines the steps of identification and control in an elegant manner. The name
model-free derives from the fact that the system matrices are never computed explicitly. The
identification step gives the subspace defining the underlying system, in the form of an
observability matrix and Markov parameters. Connecting with the demonstrated example, the
subspace derived from identification is captured in C̄ and D̄. The controller is designed using
these parameters directly, thus avoiding the need to explicitly compute the system matrices.
Following the survey of SID in Chapter 2, different ways of combining identification with
control were surveyed in this section. The requirements of different control methodologies
along with their advantages and disadvantages were also noted. In order to validate such
methodologies and to bring out their features and applicability, testing them on a standard
platform is necessary. To do so, an appropriate standard platform must first be selected,
which is the topic of the next section.

3-2 Benchmark models

A model obtained from any identification method must be validated before it is used for
analysis or controller design. This is a way to evaluate how closely the estimated mathematical
model describes the underlying system. There are various ways to carry out validation of a
model; it can be evaluated in a worst-case sense or in terms of average deviation. The
general idea is to compare the system output y(k) with the model output ŷ(k) for the same
input. A few ways to mathematically perform the comparison are listed below. Let the two
outputs be compared over N values.
Maximum absolute error:
\[ J_{\mathrm{MAE}} = \max_{1 \le k \le N} \, |y(k) - \hat{y}(k)| \]
Sum of squared errors:
\[ J_{\mathrm{SSE}} = \sum_{k=1}^{N} \left( y(k) - \hat{y}(k) \right)^2 \]
Mean squared error:
\[ J_{\mathrm{MSE}} = \frac{1}{N} J_{\mathrm{SSE}} \]
Root mean squared error:
\[ J_{\mathrm{RMSE}} = \sqrt{J_{\mathrm{MSE}}} \]
Variance accounted for:
\[ J_{\mathrm{VAF}} = \left( 1 - \frac{\mathrm{Var}\!\left(y(k) - \hat{y}(k)\right)}{\mathrm{Var}\!\left(y(k)\right)} \right) \cdot 100\% \]

where Var(x) is the variance of x. The VAF considers the variance of the outputs, hence evaluating the model in terms of the second moment of their probability distributions. More criteria exist that assess the model in terms of other statistical properties; as the VAF assessment is effective enough in practice, it is the most widely used criterion for this task.
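For concreteness, the criteria above can be computed in a few lines of Python. This is a minimal sketch; the function name is illustrative, and clipping the VAF at 0% is a common convention rather than part of the definitions above.

```python
import math

def validation_metrics(y, y_hat):
    """Validation criteria comparing measured output y with model output y_hat."""
    N = len(y)
    e = [yk - yhk for yk, yhk in zip(y, y_hat)]        # prediction errors
    var = lambda x: sum((xk - sum(x) / len(x)) ** 2 for xk in x) / len(x)
    J_MAE = max(abs(ek) for ek in e)                   # maximum absolute error
    J_SSE = sum(ek ** 2 for ek in e)                   # sum of squared errors
    J_MSE = J_SSE / N                                  # mean squared error
    J_RMSE = math.sqrt(J_MSE)                          # root mean squared error
    J_VAF = max(0.0, (1.0 - var(e) / var(y)) * 100.0)  # variance accounted for (%)
    return {"MAE": J_MAE, "SSE": J_SSE, "MSE": J_MSE, "RMSE": J_RMSE, "VAF": J_VAF}
```

For a perfect model, where \(\hat{y}(k) = y(k)\) for all k, every error criterion is zero and the VAF is 100%.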
Along with evaluating a model estimated from an identification method, it is also necessary
to evaluate the method itself. Once a new identification technique is proposed, it needs to be
compared against existing state-of-the-art methods in order to evaluate its performance. To do so, there has to be a common platform on which all comparisons are made, so that the evaluation is fair and unbiased. Benchmark data repositories and models exist to provide this common platform. Although no single universal model can serve to compare all methods, exhaustive work has been done to make a wide range of such models available, meeting the requirements of different methods: a methodology may, for example, be applicable only to LTI or LPV systems, or be specific to nonlinear models.
DaISy [70] is a database repository that provides an extensive number of datasets from fields such as the process industry and electrical/electronic, mechanical and biomedical systems. These datasets are intended for testing newly proposed system identification methods. The National Institute of Standards and Technology (NIST) provides, in [71], datasets for assessing the accuracy of linear regression models, among other fields. Apart from the availability of databases, benchmark criteria have been proposed to evaluate methods in specific fields; for example, [72] takes the minimum-variance criterion as the benchmark for the evaluation of Linear Quadratic Gaussian (LQG) controllers.
For the different categories, benchmark models are available either as simulation models or as experimental setups. A high-purity distillation column, as in [73], has been used to test linear and Wiener model identification techniques [74]. In the field of process control, the "Tennessee Eastman" benchmark model was proposed in [75]; a set of Fortran subroutines is also available to simulate the process. EPFL provided a laboratory setup - an active hydro-suspension system - that was posed as a benchmark problem to be solved, and all the solutions were published in the special issue of the European Journal of Control in 2003. Since then it has been widely used as a benchmark model for individual works as well [76].
The Active Vibration Isolation System (AVIS) benchmark has been proposed in [77] with the aim of comparing the capability of different black-box, linear time-invariant identification techniques to model complex industrial systems. For a linear time-varying and parameter-varying system, a dataset is provided by [78]. This dataset is exhaustive in the sense that it includes different typical excitation scenarios, including band-limited noise and random-phase multisines with sparsely excited frequencies. On a slightly different note, [79] provides a dataset for an office-like simulated environment, which can be used as a benchmark for testing data-driven techniques.
The technique that is proposed for the thesis work is intended to be used in combination with
MPC so that the result is an inherently adaptive controller. The underlying goal here is that
even when the system slowly changes, the resulting methodology should be able to account
for the changes and accordingly manipulate the control effort. It is therefore necessary to test the to-be-formulated identification technique on datasets obtained from LTI and LPV systems. Since the datasets provided in [78] and DaISy meet this requirement, they will be used for the assessment of the technique.
Based on the survey of the considered fields, possible developments are listed and, accordingly, a problem statement is defined in Chapter 4. Further, in order to obtain a solution to the stated problem, a break-down of the work is given.

Chapter 4
Conclusion and Future Work

Starting with a brief review of the origins of the field of System Identification, its branching into PEM and SID was noted. Subspace techniques were selected for further study.
The reasons and circumstances under which SID field started out were initially investigated,
after which the dominant techniques in the field were examined and their features briefly compared. The problems with SID techniques soon came to light: they were too computationally expensive to be implemented on-line. This led to the development of recursive subspace techniques, which turned out to be promising.
After probing into the recent developments of the field, nuclear norm based subspace identification techniques were seen to be developing rapidly. The nuclear norm was proven to be a heuristic for rank minimization problems, and a guarantee of obtaining low-rank solutions was also shown. The use of the nuclear norm, which is a type of regularization, combined the steps of projection/regression with order estimation. It also converted the optimization problem into an l1 minimization, which is desirable due to the associated properties, as seen in Chapter 2.
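As a small numerical reminder of this connection, the nuclear norm is the l1 norm of the vector of singular values, while the rank is the number of nonzero singular values; the following is a hedged illustration (assuming NumPy is available, with an illustrative rank tolerance):

```python
import numpy as np

np.random.seed(0)
# Construct a 6x6 matrix of rank 2 as a product of thin factors.
A = np.random.randn(6, 2) @ np.random.randn(2, 6)
s = np.linalg.svd(A, compute_uv=False)  # singular values, descending

rank = int(np.sum(s > 1e-10))    # rank: number of nonzero singular values (l0)
nuclear_norm = float(np.sum(s))  # nuclear norm: sum of singular values (l1)
# Minimizing the l1 norm of s is the convex surrogate for minimizing the l0,
# which is why nuclear norm regularization promotes low-rank solutions.
print(rank)  # 2
```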
However, like original SID techniques, the computational burden associated with this heuristic surfaced. The solution of this problem is recognized as the direction for development in
this field. The algorithm proposed in [51], which is based on PBSID, is one of the solutions
in this direction and can be used as a standard to evaluate the to-be-formulated method.
The main reason for the use of recursive techniques is to identify the system on-line so that
the model can be supplied to an adaptive controller. To this end, the study of controller design techniques was performed and the motivation for using MPC in place of other control techniques was given. Various adaptive formulations of MPC were surveyed. The validity of combining identification and control design must be verified through the Separation Hypothesis; however, it was noted that, as of today, the theory to carry out this validation does not exist.
In order to validate the to-be-formulated methodology and evaluate its performance, a survey
on validation tests and benchmark criteria was carried out. The availability of different data
repositories that can be used to assess the potential of the method and compare it with other
state-of-the-art techniques was noted. Based on the features of the data repositories, suitable
ones were picked to be used further.
4-1 Future Work

With the review of the history and the required developments, the future work necessary to bridge the existing gap is:

Formulate a recursive version of a nuclear norm based subspace technique and combine it with control law formulation using an MPC strategy.

The above stated problem can be solved by breaking it down into the following steps:
- Derive the conditions for verifying the Separation Hypothesis.
- Formulate a computationally less expensive alternative to the use of the SVD to solve the nuclear norm minimization problem.
- Track the subspace describing the underlying system while adding new and deleting old measurements at every time instant.
- Devise a recursive subspace identification technique by combining the above two points.
- Validate the technique with the chosen identification database repositories.
- Design a control law in the MPC framework with the available subspace.
- Combine identification and control law in one step, so that an inherently adaptive controller is in play at every time instant.
- Test the entire methodology on a benchmark model.

Bibliography

[1] Gevers, M, A personal view on the development of system identification, IEEE Control
Systems, Vol 26, Issue 6, 93-105, 2006.
[2] Ho, B. L. and Kalman, R. E, Effective construction of linear state-variable models from
input output functions, Regelungstech Prozess-Datenverarbeitung, 14, 545-548, 1965.
[3] Åström, K. J and Bohlin, T, Numerical identification of linear dynamic systems from normal operating records, Theory of Self-Adaptive Control Systems, 96-111, 1965.
[4] Åström, K. J and Eykhoff, P, System identification - a survey, Automatica, Vol 7, Issue 2, 123-162, 1971.
[5] Ljung, L, System identification - theory for the user. 1987.
[6] Van Overschee, P and Moor, B, N4SID: Subspace algorithms for the identification of
combined deterministic-stochastic systems, Automatica, 30, 75-93, 1994.
[7] Verhaegen, M and Dewilde, P, Subspace model identification part I: The output-error
state space model identification class of algorithms, International Journal of Control,
56, 1187-1210, 1992.
[8] Larimore, W. E., Canonical variate analysis in identification, filtering and adaptive
control, Proceedings of 29th Conference on Decision and Control, 596-604, 1990.
[9] Van Overschee, P and Moor, B, Subspace Identification for Linear Systems. 1996.
[10] Favoreel, W, Moor, B, and Van Overschee, P, Subspace state space system identification
for industrial processes, Journal of Process Control 10, 149 - 155, 2000.
[11] Qin, J, An overview of subspace identification, Computers and chemical engineering
30.10 (2006): 1502-1513, 2006.
[12] Verhaegen, M and Verdult, V, Filtering and System Identification. 2007.
[13] Jansson, M, Subspace identification and ARX modeling, IFAC Symposium on System
Identification, 2003.
[14] Qin, J and Ljung, L, Closed-loop subspace identification with innovation estimation,
2003.
[15] Fazel, M, Hindi, H, and Boyd, S, A rank minimization heuristic with application to
minimum order system approximation, Proceedings of American Control Conference,
4734-4739, 2001.
[16] Ljung, L, On the use of Regularization in System Identification, Department of Electrical Engineering, Linköping University, 1992.
[17] Xu, J, Reweighted nuclear norm minimization for matrix completion, 2011.
[18] Candès, E. J and Tao, T, The power of convex relaxation: Near-optimal matrix completion, IEEE Transactions on Information Theory, 2053-2080, 2009.
[19] Hansson, A, Liu, Z, and Vandenberghe, L, Subspace system identification via weighted
nuclear norm optimization, IEEE Decision and Control, 3439 - 3444, 2012.
[20] Verhaegen, M and Hansson, A, N2SID: Nuclear Norm Subspace Identification, CoRR,
http://arxiv.org/abs/1401.4273, 2015.
[21] Verhaegen, M and Hansson, A, Nuclear Norm Subspace Identification (N2SID) for short
data batches, IFAC World Congress, 2014.
[22] Liu, Z and Vandenberghe, L, Semidefinite programming methods for system realization
and identification, Proceedings of the Joint 48th IEEE Conference on Decision and
Control and 28th Chinese Control Conference, 4676-4681, 2009.
[23] Ayazoglu, M and Sznaier, M, An algorithm for fast constrained nuclear norm minimization and applications to systems identification, IEEE Decision and Control, 3469-3475,
2012.
[24] Cai, J. F, Candès, E, and Shen, Z, A Singular Value Thresholding Algorithm for Matrix
Completion, SIAM Journal on Optimization, Vol. 20, 2008.
[25] Ji, S and Ye, J, An accelerated gradient method for trace norm minimization, Proceedings of the 26th International Conference on Machine Learning, 2009.
[26] Hsieh, C.J and Olsen, P, Nuclear norm minimization via active subspace selection,
Proceedings of the 31st International Conference on Machine Learning, Beijing, China,
2014.
[27] Mazumder, R, Hastie, T, and Tibshirani, R, Spectral regularization algorithms for
learning large incomplete matrices, Journal of Machine Learning Research 11, 22872322, 2010.
[28] Mohan, K and Fazel, M, Iterative reweighted algorithms for matrix rank minimization,
Journal of Machine Learning Research 13, 3441-3473, 2012.
[29] Sadigh, D, Ohlsson, H, Sastry, S, and Seshia, S, Robust subspace system identification
via weighted nuclear norm optimization, CoRR, http://arxiv.org/abs/1312.2132, 2013.
[30] Liu, Z, Hansson, A, and Vandenberghe, L, Nuclear norm system identification with
missing inputs and outputs, Systems and Control Letters, Vol 62, Issue 8, 2013.
[31] Chiuso, A, Chen, T, Ljung, L, and Pillonetto, G, Regularization strategies for nonparametric system identification, IEEE conference on Decision and Control, 6013-6018,
2013.
[32] Pillonetto, G and Nicolao, G, A new kernel-based approach for linear system identification, Automatica, Vol 46, Issue 1, 2009.
[33] Prando, G, Chiuso, A, and Pillonetto, G, Bayesian and regularization approaches
to multivariable linear system identification: the role of rank penalties, CoRR,
http://arxiv.org/abs/1409.8327, 2014.
[34] Yang, B, Projection approximation subspace tracking, IEEE Transactions on Signal
Processing, Vol 43, Issue 1, 95-107, 1995.
[35] Lovera, M, Gustafsson, T, and Verhaegen, M, Recursive subspace identification of linear
and non-linear wiener state-space models, Automatica, Vol 36, Issue 11, 2000.
[36] Oku, H and Kimura, H, Recursive 4SID algorithms using gradient type subspace tracking, Automatica, Vol 38, Issue 6, 2001.
[37] Mercre, G, Lecoeuche, S, and Lovera, M, Recursive subspace identification based on
instrumental variable unconstrained quadratic optimization, International Journal of
Adaptive Control and Signal Processing, Wiley, 2004, 18, 771-797, 2004.
[38] Mercre, G, Lecoeuche, S, and Vasseur, C, A new recursive method for subspace identification of noisy systems: EIVPM, 13th IFAC Symposium on System Identification,
2003.
[39] Mercre, G, Bako, L, and Lecoeuche, S, Propagator-based methods for recursive subspace model identification, Signal Processing, 88, 2007.
[40] Chi, Y, Eldar, Y, and Calderbank, R, PETRELS: Parallel subspace estimation and
tracking by recursive least squares from partial observations, IEEE International Conference on Acoustics, Speech and Signal Processing, 3301-3304, 2013.
[41] Balzano, L, Nowak, R, and Recht, B, Online identification and tracking of subspaces
from highly incomplete information, CoRR, http://arxiv.org/abs/1006.4046, 2010.
[42] He, J, Balzano, L, and Lui, J, Online robust subspace tracking from partial information,
CorR, http://arxiv.org/abs/1109.3827, 2011.
[43] Van Der Veen, A. J, Deprettere, E. F, and Swindlehurst, A. L, Subspace-Based Signal
Analysis using Singular Value Decomposition, Proceedings of IEEE, Vol. 81, No. 9,
1993.
[44] Comon, P and Golub, G, Tracking a Few Extreme Singular Values and Vectors in Signal
Processing, Proceedings of IEEE, Vol. 78, No. 8, 1990.
[45] Bischof, C. H and Shroff, G, On Updating Signal Subspaces, IEEE Transactions on


Signal Processing, Vol. 40, No. 1, 1992.
[46] Stewart, G. W, An Updating Algorithm for Subspace Tracking, IEEE Transactions on
Signal Processing, Vol. 40, No. 6, 1992.
[47] Stewart, G. W, Updating a rank-revealing ULV decomposition, Siam Journal on Matrix Analysis and Applications, Vol. 14, No. 2, pp. 494-499, 1993.
[48] Moonen, M, Van Dooren, P, and Vandewelle, J, A Singular Value Decomposition updating algorithm for subspace tracking, Siam Journal on Matrix Analysis and Applications,
Vol. 13, No. 4, pp. 1015-1038, 1992.
[49] Van Der Veen, A. J, A Schur method for low-rank matrix approximation, Siam Journal
on Matrix Analysis and Applications, Vol. 17, No. 1, pp. 139-160, 1996.
[50] Cai, J. F and Osher, S, Fast Singular Value Thresholding without Singular Value Decomposition, 2010.
[51] Navalkar, S and Wingerden, J. W, Nuclear Norm-Enhanced Recursive Subspace Identification: Closed-loop Estimation of Rapid Variations in System Dynamics, Paper for
review, 2015.
[52] Favoreel, W, Moor, B, and Van Overschee, P, Model-free subspace-based LQG-design,
Proceedings of ACC, Vol 5, 3372-3376, 1999.
[53] Cutler, C. R and Ramaker, B. L, Dynamic matrix control - a computer control algorithm, AIChE 86th National Meeting, 1979.
[54] Clarke, D. W, Mohtadi, C, and Tuffs, P. S, Generalized predictive control - Part I: the
basic algorithm, Automatica, 23, 1987.
[55] Morari, M and Lee, J, Model predictive control: past, present and future, Computers
and Chemical Engineering, Vol 23, Issues 4-5, 1998.
[56] Lee, J, Morari, M, and Garcias, C, State-space interpretation of model predictive control, Automatica, Vol 30, Issue 4, 1994.
[57] Li, S, Lim, K. Y, and Fisher, D. G, A state space formulation for model predictive
control, AIChE Journal, 35, 241-249, 1989.
[58] Gilbert, E. G and Tan, K. T, Linear systems with state and control constraints: the
theory and application of maximal output admissible sets, IEEE Transactions on Automatic Control, 36 (9), 1991.
[59] Zheng, Z. Q and Morari, M, Control of linear unstable systems with constraints, Proceedings of the American Control Conference, 3704-3708, 1995.
[60] Campo, P. J and Morari, M, Robust model predictive control, Proceedings of the
American control conference, 1987.
[61] Woodley, B, How, J, and Kosut, R, Subspace based direct adaptive H-infinity control,
2001.
[62] Dougherty, D and Cooper, D, A practical multiple model adaptive strategy for singleloop MPC, Control Engineering Practice 11, 2003.
[63] Datta, A and Ochoa, J, Adaptive internal model control: Design and stability analysis,
Automatica, Vol 32, Issue 2, 1996.
[64] Cerrillo, J and MacGregor, J, Latent variable MPC for trajectory tracking in batch
processes, Journal of Process Control, Vol 15, Issue 6, 2005.
[65] Keerthi, S. S and Gilbert, E. G, Optimal infinite-horizon feedback laws for a general class of constrained discrete-time systems: stability and moving-horizon approximations, Journal of Optimization Theory and Applications, 1988.
[66] Bemporad, A, Chisci, L, and Mosca, E, On the stabilizing property of the zero terminal
state receding horizon regulation, Automatica, 30(12), 1994.
[67] Luo, X, Jiang, M, and Chen, X, Online subspace-based constrained adaptive predictive
control with state-space model, Journal of Convergence Information Technology, Vol 8,
Issue 2, 1-10, 2013.
[68] Mizumoto, I and Fujimoto, Y, Adaptive predictive control system design with an adaptive output estimator, IEEE 51st Conference on Decision and Control, 5434-5441, 2012.
[69] Favoreel, W, Moor, B, and Gevers, M, Closed-loop model-free subspace based LQG
design, 1999.
[70] Moor, B, DaISy: Database for the Identification of Systems, http://homes.esat.kuleuven.be/~smc/daisy/, 2012.
[71] NIST, NIST Statistical Reference Datasets (STRD) Description, http://www.itl.nist.gov/div898/strd/general/bkground.html, 2013.
[72] Grimble, M. J, Controller performance benchmarking and tuning using generalised minimum variance control, Automatica 38, 2002.
[73] Skogestad, S, Dynamics and control of distillation columns: a tutorial introduction,
Transaction of the Institution of Chemical Engineers, Part A, Chemical Engineering
Research and Design 75, 539-562, 1997.
[74] Bloemen, H. H. J, Chou, C. T, Van den Boom, T. J. T, Verdult, V, Verhaegen, M, and
Backx, T. C, Wiener model identification and predictive control for dual composition
control of a distillation column, Journal of Process Control 11, 601-620, 2001.
[75] Downs, J. J and Vogel, E. F, A plant-wide industrial process control problem, Computers and chemical Engineering, Vol. 17, No. 3, 245-255, 1993.
[76] Yin, S, Luo, H, and Ding, S, Real-time implementation of fault-tolerant control systems
with performance optimization, IEEE transactions on Industrial Electronics, Vol. 61,
No. 5, 2014.
[77] Voorhoeve, R, Rietschoten, A, Geerardyn, E, and Oomen, T, Identification of HighTech Motion Systems: An Active Vibration Isolation Benchmark, Preprints of the 17th
IFAC Symposium on System Identification, 2015.
[78] Lataire, J, Louarroudi, E, Pintelon, R, and Rolain, Y, Benchmark data on a linear timeand parameter-varying system, 17th IFAC Symposium on System Identification, 2015.
[79] Risuleo, R, Molinari, M, Bottegal, G, Hjalmarsson, H, and Johansson, K. H, A benchmark for data-based office modeling: challenges related to CO2 dynamics, 17th IFAC
Symposium on System Identification, 2015.


Glossary

List of Acronyms
SID       Subspace IDentification
MIMO      Multi Input Multi Output
PEM       Prediction Error Methods
SISO      Single Input Single Output
N4SID     Numerical algorithms for Subspace State Space System Identification
MOESP     MIMO Output-Error State Space model identification
CVA       Canonical Variate Analysis
SVD       Singular Value Decomposition
AIC       Akaike Information Criterion
PBSID     Predictor-Based Subspace IDentification
ARX       Auto Regressive eXogenous
ARMA      Auto Regressive Moving Average
ARMAX     Auto Regressive Moving Average with eXogenous input
SSARX     SubSpace ARX
SNR       Signal-to-Noise Ratio
N2SID     Nuclear Norm based Subspace Identification
IVM       Instrumental Variable Method
SDP       Semi-Definite Programming
PAST      Projection Approximation Subspace Tracking
EIVPM     Extended Instrumental Variable Propagator Method
PETRELS   Parallel Estimation and Tracking by REcursive Least Squares
GROUSE    Grassmannian Rank-One Update Subspace Estimation
GRASTA    Grassmannian Robust Adaptive Subspace Tracking Algorithm
ADMM      Alternating Direction Method of Multipliers
MPC       Model Predictive Control
DMC       Dynamic Matrix Control
GPC       Generalized Predictive Control
PCA       Principal Component Analysis
LTI       Linear Time-Invariant
LPV       Linear Parameter-Varying
PM        Propagator based Methods
DaISy     Database for Identification of Systems
NIST      National Institute of Standards and Technology
LQG       Linear Quadratic Gaussian
EPFL      École Polytechnique Fédérale de Lausanne
AVIS      Active Vibration Isolation System
PID       Proportional Integral Derivative
VAF       Variance Accounted For
SPC       Subspace Predictive Control
