
Journal of Process Control 17 (2007) 539–550
www.elsevier.com/locate/jprocont

A multi-variate Hammerstein model for processes with input directionality


Gerrit Harnischmacher, Wolfgang Marquardt
Lehrstuhl für Prozesstechnik, RWTH-Aachen, D-52064 Aachen, Germany

Received 4 August 2006; received in revised form 4 December 2006; accepted 4 December 2006

Abstract

A new formulation of a block-structured model based on the Hammerstein operator is presented for the identification of multi-variate systems with input directionality. In contrast to the existing formulations for multi-variate Hammerstein models, the proposed structure offers the possibility to independently model the dynamic and nonlinear characteristics of the system and at the same time preserves the possibility to use the new efficient algorithms developed for the identification of single-input Hammerstein models. Further, the formulation allows for a representation of arbitrary static nonlinear coupling of input variables with a considerably lower number of parameters compared to existing formulations. The new model structure is applied to the identification of a fluid catalytic cracking (FCC) unit and significantly outperforms all previous multi-variate Hammerstein model structures by reducing the prediction error by over 50%.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Hammerstein model; Multi-variable block-oriented model; Nonlinear identification; Block-structured model

1. Introduction

Hammerstein models have been used in systems identification since the 1960s [22] and have been successfully applied to control in such different fields as neuroprosthesis [17] and chemical engineering [30]. They may be viewed as the prototype of the class of block-structured models [24] and were the first for which an efficient identification algorithm existed [22]. Because of the independence of the nonlinear static and linear dynamic elements, the identification problem can be greatly simplified for single-input single-output (SISO) models if the nonlinearity is known [23] or if the characteristics of different excitation signals are exploited [3]. This model structure is particularly attractive for process systems modeling because steady-state information is often available from process design or from historical data [23,27]. In that case, plant tests are only needed to generate data for a standard linear identification

Corresponding author. Tel.: +49 241 8094668; fax: +49 241 8092326. E-mail address: marquardt@lpt.rwth-aachen.de (W. Marquardt).

experiment. The proposed formulation extends the possibility to use these sources of information to the multi-variate case. While there is a wealth of papers treating multi-input multi-output (MIMO) Hammerstein models (see Section 2.2), applications of these models are rare. At the same time, the published control applications based on single-input Hammerstein models are inherently multi-variable. This apparent inconsistency may be attributed to the limitations of existing MIMO Hammerstein model formulations. As will be shown in Section 2, existing formulations either do not sufficiently capture the system nonlinearity or lead to very demanding parameter identification problems. A new formulation accounting for these problems as well as the question of the under-modeling of real processes is developed and analyzed in Section 3. Strictly speaking, the new formulation results in a complexity beyond the Hammerstein operator, as most existing MIMO Hammerstein models do. However, it maintains the independence of the linear and nonlinear elements known from SISO Hammerstein models, which can be exploited for identification and control applications, for

0959-1524/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.jprocont.2006.12.001


G. Harnischmacher, W. Marquardt / Journal of Process Control 17 (2007) 539–550

example, in [17,30]. Section 4 demonstrates the performance of the proposed model structure in a simulation study, before the paper is concluded in Section 5.

2. Background

2.1. The Hammerstein operator

The Hammerstein operator is at the core of Hammerstein modeling. It refers to an integral equation studied by Hammerstein [15] and has been used by Narendra and Gallman [22] in their seminal work on Hammerstein model identification. Narendra and Gallman considered systems that can be expressed as

$$y = H[u(t)] = \int_X K(s)\, f[u(t-s)]\, \mathrm{d}s. \tag{1}$$
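As a rough numerical illustration of (1), the integral can be approximated by a Riemann sum: the input is first passed through the static map f and the result is then convolved with the kernel K. The sketch below assumes a causal exponential kernel K(s) = exp(-s) truncated at a finite horizon and a cubic map f. Both choices, as well as the function name `hammerstein_response`, are illustrative and not taken from the paper.

```python
import numpy as np

def hammerstein_response(u, f, kernel, ds):
    """Approximate y(t) = int K(s) f(u(t - s)) ds on a uniform grid of step ds.

    u      : 1-D array with samples of the input u(t)
    f      : static nonlinear map, applied elementwise
    kernel : 1-D array with samples K(0), K(ds), K(2 ds), ...
    Samples before the start of the record contribute nothing to the sum.
    """
    v = f(np.asarray(u, dtype=float))       # static nonlinearity first
    y = np.convolve(v, kernel)[: len(v)]    # then the linear convolution
    return y * ds

ds = 0.01
s = np.arange(0.0, 10.0, ds)                # truncated integration horizon
kernel = np.exp(-s)                         # assumed kernel K(s) = exp(-s)
f = lambda x: x ** 3                        # assumed static map
u = np.full(2000, 2.0)                      # constant input u(t) = 2
y = hammerstein_response(u, f, kernel, ds)
```

For a constant input the response settles close to f(u) times the integral of the kernel, which is about 1 for the kernel assumed here. This reflects the separation of static gain and dynamics that the Hammerstein structure relies on.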

This operator defines a single-input single-output, time-invariant, nonlinear impulse response model defined on an input space X. Narendra and Gallman [22] also introduced the very intuitive block-diagram notation for the Hammerstein model depicted in Fig. 1, which has since been used extensively in the discussion of block-structured models. Most of the literature on Hammerstein models is concerned with discrete-time models. Using the block-diagram notation of Fig. 1, the scalar discrete-time Hammerstein model is generally defined as

$$y_k = \sum_{i=1}^{n_a} a_i\, y_{k-i} + \sum_{i=0}^{n_b} b_i\, v_{k-i}, \tag{2a}$$

$$v_k = N(u_k), \tag{2b}$$

where $u_k$ and $y_k$ are the input and output sampled at times $t = t_k$, $k = 0, 1, \ldots, K$. $N(u_k): \mathbb{R}^1 \to \mathbb{R}^1$ is an arbitrary, nonlinear, memoryless function mapping the scalar input $u_k$ into the nonmeasurable scalar intermediate variable $v_k$. The parameters a and b in Eq. (2a) define a linear time-invariant dynamic system. The summation index i runs from 1 to $n_a = \dim(a)$ for the delayed outputs and accordingly from 0 to $n_b = \dim(b) - 1$ for the current and delayed inputs. Analogous definitions of the number of summands are used in all subsequent models. An additional term accounting for disturbances at the output may be added [20], but will be omitted here for the sake of simplicity.

2.2. Multi-variate Hammerstein models

Multi-output Hammerstein models can be designed by simply using one Hammerstein model for each output. In contrast, the design of a multi-input Hammerstein model is not straightforward, as will be shown below. This paper therefore only deals with the multi-input single-output (MISO) case for the sake of simplicity, as it straightforwardly generalizes to the MIMO case. The following considerations therefore deal with identifying multi-variate processes using an appropriate block-structured model.

It follows from (1) that multi-variate formulations of Hammerstein models can consider vector-valued input variables u as well as matrix kernels K(s) and vector functions f[·]. Models with matrix kernels K(s) and vector functions f[·] have been introduced for scalar inputs u as Uryson models [12], referring to a corresponding integral equation studied by Uryson, or as generalized Hammerstein models, where each function $f_j[\cdot]$ is a basis function. A multi-input Hammerstein model is defined by an operator of type (1) with a vector-valued input signal u. When considering vector-valued input variables, matrix kernels K(s), and vector functions f[·], their respective dimensions are obviously independent degrees of freedom of the model structure.

The SISO Hammerstein model (2) straightforwardly extends to the MISO Hammerstein model

$$y_k = \sum_{i=1}^{n_a} a_i\, y_{k-i} + \sum_{i=0}^{n_b} b_i\, N(u_{k-i}) \tag{RB}$$

considered by Rollins et al. [25,26] by replacing $N(u_k)$ by $N(u_k): \mathbb{R}^{\dim(u)} \to \mathbb{R}^1$. The block-diagram structure is depicted in Fig. 1 (right); we will term it RB model for further reference.

Fig. 1. Discrete-time SISO Hammerstein model (left, cf. (2)) and MISO Hammerstein model (right, cf. (RB)) in block-diagram notation: N(·) static nonlinear element, L linear dynamic element.

However, only a very limited variety of dynamic behavior can be represented as the dynamic element is of scalar type (see, e.g., the chemical reactor example in [26]). In fact, a linear multi-input model is capable of representing more complex dynamic behavior, namely dynamics which depend on the direction of the input vector. It is well known that such input directionality is a defining characteristic of multi-variate systems. As an example consider the response of the fluid catalytic cracking (FCC) unit we treat as a sample process in Section 4. Fig. 2 (right) shows the response to subsequent steps in the two input variables. Clearly, the dynamic response to the step at $t = 0$ changes qualitatively with the direction of the step in the input space. To show that this is indeed a multi-variate effect and not a nonlinear effect, the other input is stepped at $t = 150$, showing that this qualitative difference in the response is independent of the starting point. To address the problem of input directionality, a multi-input Hammerstein model based on separate nonlinearities,

$$y_k = \sum_{j=1}^{n_u} y_{j,k}, \tag{KUa}$$

$$y_{j,k} = \sum_{i=1}^{n_{a_j}} a_{j,i}\, y_{j,k-i} + \sum_{i=0}^{n_{b_j}} b_{j,i}\, N_j(u_{j,k-i}), \tag{KUb}$$

has been introduced by Kortmann and Unbehauen [20]. Its block-diagram structure is depicted in Fig. 3 and we will term it KU model for further reference.

Fig. 2. FCC unit steady-state gain function (left) and step responses (right, solid line: $R_{ai} \to R_{rc}$, dotted line: $R_{rc} \to R_{ai}$). Axes: $T_{ra}$ [°F], $R_{rc}$ [ton/min], $R_{ai}$ [Mlb/hr].

Fig. 3. Block diagrams of KU model (left, cf. (KU)) and EJL model (right, cf. (EJL)).

The KU model consists of a set of $n_u = \dim(u)$ scalar Hammerstein models containing scalar nonlinear maps $N_j(u_j): \mathbb{R}^1 \to \mathbb{R}^1$ driven by scalar inputs $u_j$, where the output is calculated as the sum of the outputs $y_j$ of the model in channel j. The KU model can obviously represent input-directional dynamics via the linear systems differing for each channel, but it cannot represent any nonlinear coupling among the input variables. Such nonlinear coupling is another defining characteristic of nonlinear multi-variable processes. Again, consider the FCC unit example, the steady-state gain surface of which is also depicted in Fig. 2 (left). With nonlinear gains modeled independently in each direction, slope and curvature of the gain functions in one direction would be independent of the other input. This is clearly not the case in the real process. Consequently, Eskinat et al. [11] found the capabilities of the KU model to be limited when applied to a real process. The strength of the model, however, lies in the ease of identification. Each channel of the model contains a scalar Hammerstein model. Hence, all techniques for scalar Hammerstein model identification remain applicable. The KU model has been extended to parallel channels of Wiener–Hammerstein systems [4] and to networks of scalar block-structured models [8]. State-space rather than input–output representations have also been used for the linear element [21]. A MISO Hammerstein model based on a combined nonlinearity,

$$y_k = \sum_{j=1}^{n_u} y_{j,k}, \tag{EJLa}$$

$$y_{j,k} = \sum_{i=1}^{n_{a_j}} a_{j,i}\, y_{j,k-i} + \sum_{i=0}^{n_{b_j}} b_{j,i}\, N_j(u_{k-i}), \tag{EJLb}$$

has been introduced by Eskinat et al. [11]. Its block-diagram structure is depicted in Fig. 3 (right); we will term it EJL model for further reference. While the authors consider dim(u) nonlinear functions $N_j(u): \mathbb{R}^{\dim(u)} \to \mathbb{R}^1$, the model structure has later been extended to an arbitrary number of nonlinear elements, independent of dim(u) [21,28], making the structure even more similar to the Uryson model discussed below. Arbitrary basis functions [14], neural networks [27], and support vector machines [13] were proposed for the nonlinear element. State-space models have also been employed for the linear element [21]. In contrast to the KU model, the EJL model can represent the nonlinear coupling of input variables. Eskinat et al. [11] found it to be a much better representation of the multivariable distillation column they studied. Based on the model structure, the EJL model should more accurately be termed a multi-input Uryson model [12]. The Uryson model is defined as a set of $n_N$ parallel Hammerstein models, with different nonlinear and linear elements in each channel but driven by the same input. These properties hold for the EJL model, except that $n_N$ was originally restricted to dim(u). However, it is a


straightforward exercise to transform the EJL model into a Uryson model by developing the nonlinear element N(·) into a basis function expansion. Then, a Uryson model containing one of the $n_N$ basis functions $f_j$ in each channel,

$$y_k = \sum_{j=1}^{n_N} y_{j,k}, \tag{5a}$$

$$y_{j,k} = \sum_{i=1}^{n_{a_j}} a_{j,i}\, y_{j,k-i} + \sum_{i=0}^{n_{b_j}} b_{j,i}\, f_j(u_{k-i}), \tag{5b}$$

is straightforwardly obtained. Note that the parameters $a_j$ and $b_j$ in (5b) are fundamentally different from the respective parameters in (EJLb). The structure of the EJL model implies that identification concepts based on an independent identification of the static and dynamic elements (e.g., [3,23,25]) are not applicable. To identify the linear element, an input signal would be required which excites only one $f_j(\cdot)$ in (5b) or one $N_j(\cdot)$ in (EJLb). Since all nonlinear elements are driven by the same input, this is generally not possible. Instead, if a suitable set of basis functions is known, the overparameterization method of Chang and Luus [7] straightforwardly extends to the multi-variate case: If $f_j(\cdot)$ are known basis functions, the Uryson model (5) is linear in its identifiable parameters and hence can be obtained by linear identification. This method has been used in combination with different sets of basis functions, for example [13,27]. However, for large sets of basis functions, for example, neural networks, this leads to challenging identification problems. In summary, while offering the greatest modeling flexibility of the three existing multi-variate Hammerstein structures, the EJL model loses the main advantage of Hammerstein modeling, namely the ease of identification.

3. Hammerstein model for input-directional dynamics

In this section we develop a new multi-variate Hammerstein model structure. It is designed such that the nonlinear element N(·) can be an arbitrary nonlinear function representing the steady-state behavior of the process, which can then be identified independently using the method of Rollins et al. [25]. For this method to be applicable, we assume the steady-state gain of the system to be different from zero, which is the case for most physical systems, but may exclude some cases, such as certain electrical circuits. We decouple the dynamic response of the model with respect to all inputs, such that the linear elements can be identified independently using the method of Bai [3].

3.1. Model derivation

We start our model derivation from a MISO Hammerstein model of type (RB). We reformulate the linear model of (RB) to the parallel structure

$$y_k = y_{S,k} + y_{D,k}, \tag{6a}$$

$$y_{S,k} = N(u_k), \tag{6b}$$

$$y_{D,k} = \sum_{i=1}^{n_{a_D}} a_{D,i}\, y_{D,k-i} + \sum_{i=0}^{n_{b_D}} b_{D,i}\, N(u_{k-i}), \tag{6c}$$

comprising a static channel $y_S$ and a dynamic channel $y_D$, where $a_D$ and $b_D$ can be determined analytically to match the linear element of (RB). The static channel (6b) completely defines the static behavior of the model including the nonlinear coupling of input variables, while (6c) defines the dynamic deviation thereof. As a consequence of this reformulation
$$\sum_{i=0}^{n_{b_D}} b_{D,i} = 0 \tag{7}$$

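holds for arbitrary values of the parameters a and b. To decouple the response of the dynamic channel (6c) with respect to its scalar inputs $u_j$, we represent the delayed inputs $N(u_{k-i})$ to the linear element (6c) by the Taylor-series expansion of N(·) at $u_k$:

$$N(u_{k-i}) = N(u_k + \Delta u_{k-i}) = N(u_k) + \sum_{l=1}^{\infty} \frac{1}{l!} \left( \sum_{j=1}^{n_u} \Delta u_{j,k-i} \frac{\partial}{\partial u_j} \right)^{l} N(u) \Bigg|_{u=u_k} \tag{8}$$

with $\Delta u_{j,k-i}$ being the jth element of

$$\Delta u_{k-i} = u_{k-i} - u_k. \tag{9}$$

The products of the differential operators in (8) are to be interpreted as higher order differentiation operators, for example,

$$\left( \frac{\partial}{\partial u_{j_1}} \frac{\partial}{\partial u_{j_2}} \right) N(u) \Bigg|_{u=u_k} = \frac{\partial^2 N(u)}{\partial u_{j_1}\, \partial u_{j_2}} \Bigg|_{u=u_k}. \tag{10}$$

Next to the term $N(u_k)$, (8) contains terms depending only on one input $\Delta u_{j,k-i}$ as well as coupling terms, that is, terms with $j_1 \ne j_2$. We pragmatically neglect the latter and discuss the consequences below. This reduces the dynamic channel to

$$y_{D,k} = \sum_{i=1}^{n_{a_D}} a_{D,i}\, y_{D,k-i} + \sum_{i=0}^{n_{b_D}} b_{D,i}\, v_{D,k-i}, \tag{11a}$$

$$v_{D,k-i} = N(u_k) + \sum_{j=1}^{n_u} \sum_{l=1}^{\infty} \frac{1}{l!} \left( \Delta u_{j,k-i} \frac{\partial}{\partial u_j} \right)^{l} N(u) \Bigg|_{u=u_k}. \tag{11b}$$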
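The effect of neglecting the coupling terms can be checked numerically. For an analytic map N, the single-input Taylor terms in direction j sum to $N(u_k + \Delta u_j e_j) - N(u_k)$, with $e_j$ the unit vector in direction j, so the decoupled approximation can be evaluated without forming the series. The sketch below assumes the illustrative map $N(u) = u_1^2 + u_1 u_2$, for which the only neglected term is $\Delta u_1 \Delta u_2$. The map, the numerical values and the helper name `decoupled` are assumptions made for this example.

```python
import numpy as np

def N(u):
    """Assumed static map with input coupling: N(u) = u1^2 + u1*u2."""
    return u[0] ** 2 + u[0] * u[1]

def decoupled(N, u_k, du):
    """N(u_k) plus the single-input contributions, coupling terms dropped.

    For an analytic N the Taylor terms in direction j sum to
    N(u_k + du_j e_j) - N(u_k), which is used here instead of the series.
    """
    base = N(u_k)
    total = base
    for j in range(len(u_k)):
        e_j = np.zeros(len(u_k))
        e_j[j] = 1.0
        total += N(u_k + du[j] * e_j) - base
    return total

u_k = np.array([1.0, 2.0])
du_both = np.array([0.5, -0.3])   # both inputs deviate: coupling matters
du_one = np.array([0.5, 0.0])     # only input 1 deviates: no coupling

err_both = N(u_k + du_both) - decoupled(N, u_k, du_both)
err_one = N(u_k + du_one) - decoupled(N, u_k, du_one)
```

With both inputs deviating, the error equals the neglected cross term $\Delta u_1 \Delta u_2$; with a single deviating input it vanishes.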

We can now split the sum over j into $n_u$ independent channels each containing the terms which depend only on $\Delta u_{j,k-i}$. Since (7) holds, we can add $N(u_k)$ to every channel obtaining

$$y_{D,k} = \sum_{j=1}^{n_u} y_{D,j,k}, \tag{12a}$$

$$y_{D,j,k} = \sum_{i=1}^{n_{a_D}} a_{D,i}\, y_{D,j,k-i} + \sum_{i=0}^{n_{b_D}} b_{D,i}\, v_{j,k-i}, \tag{12b}$$

$$v_{j,k-i} = \left( N(u_k) + \sum_{l=1}^{\infty} \frac{1}{l!} \left( \Delta u_{j,k-i} \frac{\partial}{\partial u_j} \right)^{l} N(u) \Bigg|_{u=u_k} \right). \tag{12c}$$

Fig. 4. Example of inputs reformulated to the time discretization $\Delta t_m$ according to (17). (Traces of $u_1$, $u_2$, $u_3$ over time, with grid points $t_m$ and $t_k$, $t_{k+1}$, $t_{k+2}$.)

The input $v_{j,k-i}$ to each linear element j is now equal to the Taylor series expansion of N(u) in the direction of $u_{j,k-i}$ at $u_k$. Hence, we can simplify the expression for $v_{j,k-i}$ to

$$v_{j,k-i} = N(u_k + \Delta u_{j,k-i}\, e_j) \tag{13}$$

with $e_j = (0, \ldots, 1, \ldots, 0)^T$ being the unit vector in direction j. In (12b) we can allow for different dynamic elements in each channel. This yields a model where N(·) can be an arbitrary function, yet input directionality can be modeled and the linear and nonlinear elements can be identified independently:

$$y_k = N(u_k) + \sum_{j=1}^{n_u} y_{D,j,k}, \tag{14a}$$

$$y_{D,j,k} = \sum_{i=1}^{n_{a_{D,j}}} a_{D,j,i}\, y_{D,j,k-i} + \sum_{i=0}^{n_{b_{D,j}}} b_{D,j,i}\, N(u_k + \Delta u_{j,k-i}\, e_j). \tag{14b}$$

However, this model suffers from a decoupling error incurred by neglecting the coupling terms in (11). This error can become large for strongly coupled systems. The coupling terms neglected in (11) are indeed all equal to zero if only one input $j^*$ is excited, that is, if

$$\Delta u_{j,k-i} = 0 \quad \forall j: j \ne j^* \tag{15}$$

is introduced in (8). With (7) the decoupling error of (14) is zero if (15) holds for all $i \le n_{b_{D,j}}$. For arbitrary inputs, the decoupling error can be eliminated by a second reformulation of the model. We define a new sampling time

$$\Delta t_m = \frac{\Delta t_k}{\sum_{j=1}^{n_u} n_{b_{D,j}}} \tag{16}$$

and refer to the discretization grids with sampling times $\Delta t_m$ and $\Delta t_k$ as m-grid and k-grid, respectively. We reformulate the model (14) to the new m-grid. The parameters $a_{D,j}$ and $b_{D,j}$ are determined such that the original time constants remain unchanged. On the m-grid, we define the reformulated inputs

$$u_{j,m} = \begin{cases} u_{j,k-1} & \forall t_m: t_k \le t_m < t_k + \sum_{i=1}^{j} n_{b_{D,i}}\, \Delta t_m, \\ u_{j,k} & \forall t_m: t_k + \sum_{i=1}^{j} n_{b_{D,i}}\, \Delta t_m \le t_m < t_{k+1}. \end{cases} \tag{17}$$

An example of this input reformulation with $n_u = 3$ and $n_{b_{D,j}} = 2$, $j = 1, \ldots, 3$ is depicted in Fig. 4. For the reformulated inputs, (15) holds for all j and $i \le n_{b_{D,j}}$. Hence, all coupling terms in (8) are zero. The remaining error is the delay incurred by the definition of the inputs in (17). It is bounded above by $\Delta t_k$ and well acceptable for a properly chosen sampling interval $\Delta t_k$. Finally, determine the output $y_k$ from the output of the reformulated model by

$$y_k = y_m \quad \text{with} \quad t_m = t_{k+1} - \Delta t_m. \tag{18}$$

Note that the sampling interval $\Delta t_k$ of the input $u_k$ and output $y_k$ from the process remains unchanged by this reformulation. We can then state the Hammerstein model based on deviation dynamics for the measurements on the k-grid as

$$y_k = y_m \quad \text{with} \quad t_m = t_{k+1} - \Delta t_m, \tag{HMa}$$

$$y_m = N(u_m) + \sum_{j=1}^{n_u} y_{D,j,m}, \tag{HMb}$$

$$y_{D,j,m} = \sum_{i=1}^{n_{a_{D,j}}} a_{D,j,i}\, y_{D,j,m-i} + \sum_{i=0}^{n_{b_{D,j}}} b_{D,j,i}\, v_{j,m-i}, \tag{HMc}$$

$$v_{j,m-i} = N(u_m + \Delta u_{j,m-i}\, e_j), \tag{HMd}$$

$$u_{j,m} = \begin{cases} u_{j,k-1} & \forall t_m: t_k \le t_m < t_k + \sum_{i=1}^{j} n_{b_{D,i}}\, \Delta t_m, \\ u_{j,k} & \forall t_m: t_k + \sum_{i=1}^{j} n_{b_{D,i}}\, \Delta t_m \le t_m < t_{k+1}. \end{cases} \tag{HMe}$$

We will refer to this model as HM model in the sequel. The block-diagram structure of this model, depicted in Fig. 5, is much more complex than those of the other MISO Hammerstein models, but as we will see below, the properties of the model are much more similar to the simple SISO Hammerstein model than to those of existing MISO formulations.

3.2. Model properties

The structure of the HM model allows the identification of input-directional dynamics in multi-variate processes. This property is combined for the first time with the possibility to identify arbitrary nonlinear static models. At the same time the independence of the nonlinear and linear elements is preserved. This allows for simplified identification of the model, increases interpretability and facilitates the incorporation of application-specific nonlinear maps.

3.2.1. Input-directional dynamics

The HM model accounts for linear dynamic coupling of input variables. This property is shared by the KU model,

Fig. 5. Block diagram of new HM model: N(·), nonlinear static map; $L_j(\cdot)$, linear transfer function in channel j; $H_{j,i}(\cdot)$, reformulation of inputs according to (17) including zero order hold of i sampling steps $\Delta t_m$; $R_j$, calculation of output on the k-grid from output on the m-grid according to (18).

but not by the RB model. The EJL model can exhibit input-directional dynamics to the extent of the corresponding Uryson model, which includes the linear case discussed below. However, for the linear case the EJL model is equal to the KU model, because no nonlinear couplings of input variables exist. The example process in Section 4 also shows dynamic behavior which may not be represented as a linear combination of the responses to changes in each input. Such behavior may be modeled in block-structured form only by adding additional linear and nonlinear elements in sequence [8], which is beyond the complexity of the Hammerstein model and poses challenging identification problems.

Theorem 1. The HM model contains a linear dynamic model with $n_u$ inputs as a special case and can account for the same linear input-directional dynamics.

Proof. For the linear case, with $\varphi$ a vector of constants,

$$N(u) = \varphi^T u \tag{20}$$

holds. The dynamic channels of (HMc) reduce to

$$y_{D,j,m} = \sum_{i=1}^{n_{a_{D,j}}} a_{D,j,i}\, y_{D,j,m-i} + \sum_{i=0}^{n_{b_{D,j}}} b_{D,j,i} \left( \varphi^T u_m + \varphi_j\, \Delta u_{j,m-i} \right). \tag{21}$$

With (7) and (9) this straightforwardly reduces to

$$y_{D,j,m} = \sum_{i=1}^{n_{a_{D,j}}} a_{D,j,i}\, y_{D,j,m-i} + \sum_{i=0}^{n_{b_{D,j}}} b_{D,j,i}\, \varphi_j\, u_{j,m-i}. \tag{22}$$

With the static channel of (HM) being

$$y_{S,m} = \varphi^T u_m \tag{23}$$

for the linear case and the equivalence of (RB) and (6), (HMb)–(HMd) reduce to

$$y_m = \sum_{j=1}^{\dim(u)} y_{j,m}, \tag{24a}$$

$$y_{j,m} = \sum_{i=1}^{n_{a_j}} a_{j,i}\, y_{j,m-i} + \sum_{i=0}^{n_{b_j}} b_{j,i}\, \varphi_j\, u_{j,m-i}, \tag{24b}$$

which is a linear multi-input model. The definitions of $u_m$ in (HMe) and $y_k$ in (HMa) are linear functions. Since in (24) all channels are independent, these reformulations amount to simple delay operators. The modeling of these by an appropriate choice of $b_j$ is standard in linear discrete-time modeling. □

3.2.2. Arbitrary nonlinear static models

For the HM model the independence of the dynamic channels is preserved regardless of the structure of the nonlinear element. This property is unique to the new HM model, as the KU model explicitly restricts the structure of the nonlinear element and the dynamic channels of the EJL model are only independent when it degenerates to the KU model.

Theorem 2. For the excitation of one single input variable $u_{j^*}$, the HM model reduces to a single-input single-output Hammerstein model containing $L_{j^*}$ as the linear element regardless of the structure of the nonlinear element.
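Theorem 2 can be illustrated numerically on the k-grid form (14), for which the decoupling error vanishes when only one input is excited. The following is a minimal sketch under assumptions that are not part of the paper: two inputs, first-order dynamic channels with $b_{D,j} = (-1, 1)$ so that (7) holds, the illustrative map $N(u) = u_1^2 + u_1 u_2$, and the hypothetical function names `simulate_parallel` and `simulate_siso`. With these choices, channel 1 together with the static channel is algebraically equal to the SISO Hammerstein model $y_k = a\, y_{k-1} + (1 - a)\, N(u_{k-1})$.

```python
import numpy as np

def N(u):
    """Assumed static map with nonlinear input coupling."""
    return u[0] ** 2 + u[0] * u[1]

def simulate_parallel(u, a_D, b_D):
    """Static channel N(u_k) plus one first-order dynamic channel per input.

    u   : array of shape (K, n_u)
    a_D : feedback coefficient of each channel, shape (n_u,)
    b_D : input coefficients (b_D0, b_D1) of each channel, shape (n_u, 2),
          each row summing to zero
    """
    K, n_u = u.shape
    y = np.zeros(K)
    y_D = np.zeros((K, n_u))
    for k in range(K):
        u_prev = u[k - 1] if k > 0 else u[0]     # process starts at rest
        for j in range(n_u):
            shifted = u[k].copy()
            shifted[j] = u_prev[j]               # u_k + du_{j,k-1} e_j
            y_D_prev = y_D[k - 1, j] if k > 0 else 0.0
            y_D[k, j] = (a_D[j] * y_D_prev
                         + b_D[j, 0] * N(u[k]) + b_D[j, 1] * N(shifted))
        y[k] = N(u[k]) + y_D[k].sum()
    return y

def simulate_siso(u1, u2_const, a):
    """SISO Hammerstein model y_k = a y_{k-1} + (1 - a) N(u_{k-1})."""
    v = np.array([N(np.array([x, u2_const])) for x in u1])
    y = np.empty(len(u1))
    y[0] = v[0]                                  # start at steady state
    for k in range(1, len(u1)):
        y[k] = a * y[k - 1] + (1.0 - a) * v[k - 1]
    return y

rng = np.random.default_rng(0)
u1 = rng.uniform(-1.0, 1.0, size=200)            # only input 1 is excited
u = np.column_stack([u1, np.full(200, 0.5)])     # input 2 held constant
a_D = np.array([0.8, 0.6])
b_D = np.array([[-1.0, 1.0], [-1.0, 1.0]])

y_parallel = simulate_parallel(u, a_D, b_D)
y_siso = simulate_siso(u1, 0.5, a_D[0])
```

The channel of the constant input stays at zero throughout, so the two trajectories coincide up to rounding even though N couples both inputs.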


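Proof. Since (7) holds for all linear elements,

$$\Delta u_{j,m-i} = 0 \;\; \forall m \;\Rightarrow\; y_{D,j,m} = 0 \;\; \forall m. \tag{25}$$

Hence, if $\Delta u_{j,m-i} = 0\ \forall m$ holds for all $j \ne j^*$, i.e. all $u_{j,k}$ but $u_{j^*,k}$ are constant for all k, (HMb)–(HMd) reduce to

$$y_m = N(u_m) + y_{D,j^*,m}, \tag{26a}$$

$$y_{D,j^*,m} = \sum_{i=1}^{n_{a_{D,j^*}}} a_{D,j^*,i}\, y_{D,j^*,m-i} + \sum_{i=0}^{n_{b_{D,j^*}}} b_{D,j^*,i}\, v_{j^*,m-i}, \tag{26b}$$

$$v_{j^*,m-i} = N(u_m + \Delta u_{j^*,m-i}\, e_{j^*}). \tag{26c}$$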

Again with the equivalence of (RB) and (6), (26) is equivalent to a SISO Hammerstein model. With $u_{j,m} = \text{const.}\ \forall j: j \ne j^*$, (HMa) and (HMe) again introduce only simple delay operators, the modeling of which by suitable choice of $b_j$ is standard in discrete-time Hammerstein modeling. □

3.2.3. Independence of nonlinear and linear elements

Just as in the scalar Hammerstein model, the structures of the nonlinear and linear elements of the HM model are independent. This property is shared by the KU model and the RB model, but not by the EJL model, unless it degenerates to the KU model.

Lemma 1. The static response of the HM model is determined entirely by the nonlinear element, regardless of the structure of the linear elements. The linear elements of the HM model can be determined by identifying a linear model of the system and replacing its gain $\hat{\varphi}$ by 1.

Proof. The first part is a direct consequence of the equivalence of (RB) and (6). The second part is a direct consequence of Theorem 1, because replacing N(u) by a constant gain $\varphi$ directly leads to a linear model with gain $\varphi$. □

3.2.4. Complexity and interpretability

Single-input Hammerstein models comprise a very intuitive yet narrow generalization of a linear model. In contrast to Wiener models for example, the dynamic behavior of the Hammerstein model is completely defined by its linear element. The nonlinear element provides the steady-state gain, and hence the intermediate variable of the SISO Hammerstein model has a physical meaning: It is the future steady-state value of the output corresponding to the current input. This interpretability is fully lost by the EJL model. It is preserved by the new model as well as the RB and KU models. The KU model as well as the new HM model also preserve the close ties to the linear model, such that step response experiments can be used for the identification of their linear elements.
Hence, much more easily interpretable data can be used for the identification in contrast to the more complex excitation signals required for the identification of the RB model.

To discuss model complexity, we assume that the nonlinear element can be represented by an expansion in $n_N$ basis functions. The linear elements contain $\dim(\theta_L) = \dim([a^T, b^T]^T)$ parameters. The EJL model contains the most parameters with $\dim(\theta_{EJL}) = n_u n_N + \sum_{j=1}^{n_u} \dim(\theta_{L,j})$. The HM model contains only $\dim(\theta_{HM}) = n_N + \sum_{j=1}^{n_u} \dim(\theta_{L,j})$ parameters. For an example with three inputs, 15 basis functions in the nonlinearity and second order linear elements, the HM model contains 27 parameters, which is less than 50% of the 57 parameters of the corresponding EJL model. The RB model is obviously the least complex, containing $\dim(\theta_{RB}) = n_N + \dim(\theta_L)$ parameters, or 19 in the example stated above, which is 30% less than the number of parameters of the HM model. However, this reduction comes at the cost of greatly reduced modeling flexibility. The basis function expansion of the KU model contains only those basis functions which are $\mathbb{R}^1 \to \mathbb{R}^1$. Its complexity is therefore not directly comparable to the other three models.

3.3. Identification method for the new HM model

As stated in (13), the nonlinearity N(·) is the same in all channels. Theorem 1 states that it may be of arbitrary structure. We may therefore identify a nonlinear estimator $\hat{N}(u): \mathbb{R}^{\dim(u)} \to \mathbb{R}^1$ independently from the linear elements using the method of Rollins et al. [25]. Therefore, for a suitable choice of steady-state experiments with inputs $u_l$ and process outputs $\tilde{y}_l$ we minimize

$$J_N = \sum_{l} \left\lVert \tilde{y}_l - \hat{N}(u_l) \right\rVert. \tag{27}$$
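If the estimator is chosen linear in its parameters, the minimization of (27) becomes an ordinary least-squares problem, provided the norm is taken as the squared 2-norm. The following sketch illustrates this under several assumptions that are not part of the paper: a quadratic polynomial basis in two inputs, synthetic noise-free steady-state data standing in for plant experiments, and the hypothetical helper names `basis`, `fit_static_map` and `N_hat`.

```python
import numpy as np

def basis(u):
    """Assumed quadratic basis in two inputs: 1, u1, u2, u1^2, u1*u2, u2^2."""
    u = np.atleast_2d(u)
    u1, u2 = u[:, 0], u[:, 1]
    return np.column_stack([np.ones_like(u1), u1, u2, u1**2, u1 * u2, u2**2])

def fit_static_map(u_ss, y_ss):
    """Least-squares estimate of the basis weights from steady-state data."""
    theta, *_ = np.linalg.lstsq(basis(u_ss), y_ss, rcond=None)
    return theta

def N_hat(u, theta):
    """Evaluate the fitted static map at one or several input vectors."""
    return basis(u) @ theta

# Synthetic steady-state experiments on a 5 x 5 grid (stand-in for plant data)
g = np.linspace(-1.0, 1.0, 5)
u_ss = np.array([[x1, x2] for x1 in g for x2 in g])
y_ss = 1.0 + 2.0 * u_ss[:, 0] - u_ss[:, 1] + 0.5 * u_ss[:, 0] * u_ss[:, 1]

theta = fit_static_map(u_ss, y_ss)
y_check = N_hat(np.array([0.5, 0.5]), theta)
```

Because only steady-state data enter this step, no parameter of the linear elements is involved. Estimators that are nonlinear in their parameters, such as neural networks, would require an iterative solver instead.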

$\hat{N}(u)$ can be any nonlinear function parameterized by some $\theta$ including, for example, polynomials or neural networks. Hence, identification of N(u) comprises the choice of an appropriate estimator $\hat{N}(u)$ and determination of its parameters via (27). As stated in Theorem 2, the HM model reduces to a single-input Hammerstein model for excitation of one single input. Hence, the method of Bai [3] directly applies to each dynamic channel of the HM model. Since the nonlinearity of the model is not excited by the step or the pseudo random binary sequence (PRBS) used for identification, the nonlinearity may be replaced by its secant approximation between the lower and upper values of the step or PRBS. First, we determine an estimate of the dynamics in channel j after excitation with $u_k = u_{k_0} + \Delta u_{j,k}\, e_j$, with $\Delta u_{j,k}$ a PRBS or step signal, by solving

$$\min_{a_j, b_j} \; J_{L_j} = \sum_{k=1}^{K} \left\lVert \tilde{y}_k - y_k \right\rVert \tag{28a}$$

$$\text{s.t.} \quad y_k = \sum_{i=1}^{n_{a_j}} a_{j,i}\, y_{k-i} + \sum_{i=0}^{n_{b_j}} b_{j,i} \left( \varphi_j\, u_{j,k-i} + c_{0,j} \right), \tag{28b}$$

where $\varphi_j$ and $c_{0,j}$ are the parameters of the secant approximation of N(u) between the lower and upper bounds of the step or PRBS. The linear system in channel j of the HM model is then obtained by setting $\varphi_j = 1$, $c_{0,j} = 0$ according to Lemma 1. Then, we perform the reformulations


discussed in Section 3.1 to eliminate the decoupling error and to obtain the parallel structure of the HM model. The proposed identification method identifies both the linear and nonlinear elements without knowledge of the other. Hence, errors in the identification of one element, which are inevitable due to the structural under-modeling of real processes, will not carry over into the identification of the other element. This two-stage identification method is especially suitable for applications in the chemical industry, where the steady-state behavior of the process is either known [23] from first-principles modeling or can be identified much more easily than the dynamic behavior [25,27]. In principle, the HM model can also be identified using the modified algorithm of Narendra and Gallman, which is suitable for identification of the EJL model [11]. However, in this case the structure of the model is not exploited.

3.4. Application specific nonlinear maps

Originally, polynomial nonlinear maps were used in Hammerstein systems identification [22] because of their simplicity. However, these are limited in the nonlinear behavior they can identify. Precisely, they tend to oscillate when saturation is to be represented, that is, the gain of an actuator approaching zero as the unit approaches a physical limitation. This can be overcome by means of other sets of basis functions or rigorous models. If the identification of the linear and nonlinear elements is not independent, as in the case of the EJL model, identification algorithms need to be tailored to the respective nonlinear maps. In the case of the HM model, the identification of the two elements is independent. Therefore, rigorous steady-state process models can be incorporated in the block-structured HM model without further modification.
Furthermore, any other nonlinear function approximation such as artificial neural networks (ANN) can be used in connection with the new HM model structure together with existing identification algorithms without the need of any further modification. In this section we will consider three nonlinear maps, which are more complex than polynomial representations: artificial neural networks, sparse grid representations, and rigorous models. They will also be benchmarked on the simulation example in Section 4.

A popular choice for a set of basis functions to approximate a nonlinear function is the use of an ANN with one hidden layer, which may be represented as

$$N(u) = \sum_{j=1}^{S} \alpha_j\, g_j\!\left( \beta_j^T u + c_j \right) + d, \tag{29}$$

where $\alpha$ and d are the weights and bias of the output layer and $\beta_j$, $c_j$, and $g_j(\cdot)$ are the weights, biases, and transfer functions of the S neurons j of the hidden layer. Such networks are known to be able to represent any continuous nonlinear function to arbitrary precision [9]. The function approximation is continuous with continuous first-order derivatives. The latter allows these models to be used as constraints in dynamic optimization problems together with standard gradient-based solvers (see [16] for an application with the new HM model). The main drawback of the ANN representation is that the identification is nonlinear in the parameters and can therefore converge to a local minimum. The popularity of ANN in nonlinear function representation has motivated the development of tailored identification algorithms for their incorporation in SISO Hammerstein models [1,27].

Multi-dimensional spline models feature the same universal approximation capabilities as ANN, but result in a linear parameter estimation problem [29]. However, the number of basis functions increases exponentially with the number of input variables, because the basis functions are defined on a full discretization grid of the input space. We therefore propose to use a much more efficient representation based on sparse grid approximation [5,6]. The sparse grid representation of the nonlinear map

$$N(u) = \sum_{s \in I} w_s\, f_s(u) \tag{30}$$

is the weighted sum of the approximations on a minimal set I of subgrids

$$f_s(u) = \sum_{j \in K} \theta_{s,j}\, \phi_{s,j}(u) \tag{31}$$

with weights $w_s$ and parameters of the subgrids $\theta_{s,j}$. The sparse-grid approximation uses local basis functions $\phi_{s,j}(\cdot)$, which are derived from a one-dimensional piecewise linear basis by a tensor product construction. Details on discretization and regularization of the sparse grid can be found in [5,6,19], for example. While the sparse-grid representation is favorable with respect to identification, its first-order derivatives are not continuous. Therefore, they cannot easily be used in connection with gradient-based solvers, which require continuous first-order derivatives. No tailored identification algorithm for Hammerstein models incorporating sparse grids exists. The new model structure, therefore, for the first time allows the easy incorporation of sparse-grid representations into a multi-variate Hammerstein model.

For SISO Hammerstein models, Pearson and Pottmann [23] have proposed to use rigorous steady-state models for the nonlinear element. These models are developed by defining a set of equations modeling the physical principles underlying the process such as mass and energy balances. In the chemical industry such models are oftentimes available from process design. Their reuse greatly reduces the identification problem as only a linear system remains to be identified. Rigorous models cannot be used for the nonlinear elements of the KU model, because the latter are restricted to $\mathbb{R}^1 \to \mathbb{R}^1$. In the EJL model the use of rigorous models for $N_j(\cdot)$ would lead to all $y_j$ being equal. In that case the EJL model could easily be transformed into the RB model. The proposed HM model, therefore, for the

G. Harnischmacher, W. Marquardt / Journal of Process Control 17 (2007) 539550


first time allows the use of these models in identifying input-directional multi-variate systems using Hammerstein models. While the reuse of rigorous models reduces the identification effort, the computational load of these models can be considerable. For use in nonlinear model predictive control, this increased computational load can become prohibitive.

The three nonlinear maps discussed above can be considered as application-specific examples: ANN for use in optimization problems, sparse-grid representations for ease of identification, and rigorous models to reuse existing process knowledge. Beyond these, a wealth of other nonlinear function approximation methods exists, which we will not treat in detail in this paper. The three approximations discussed above will be compared on the simulation example in Section 4, but any nonlinear approximation that can be identified from input-output data can be used with the HM model structure.

4. Application to a simulated FCC unit

In this section we will benchmark the new model against the existing structures on a simulated fluid catalytic cracking (FCC) unit depicted in Fig. 6, which consists of four coupled units. The FCC is an industrially relevant process and several rigorous models exist in the open literature. We use the model originally developed by Kurihara in an unpublished dissertation and comprehensively discussed by Denn [10]. This model has been validated and used for control of a real unit by Ansari and Tade [2]. For brevity, we will not restate the equations here. The nomenclature and units used in the sequel are the same as those of Denn [10], where the complete model may be found. Ansari and Tade [2] also state the complete model, but with some typographical errors and a slightly different notation. Detailed process descriptions can be found in both references.

4.1. Simulated FCC unit

The main manipulated variables of the process are the air flowrate $R_{ai}$ and the catalyst circulation rate $R_{rc}$, while

the feed rate $R_{tf}$ and feed temperature $T_{fp}$ are treated as disturbances. For simplicity we will restrict our discussion to two-input single-output models, as we can visualize the nonlinear static maps for these models. We therefore treat $R_{rc}$ and $R_{ai}$ as inputs to the system. To control the main quality variable, the cracking severity, several controlled variables have been explored due to the complex dynamics of the system. However, the riser outlet temperature $T_{ra}$ is directly related to the cracking severity and has recently been used for control [18]. We will therefore treat it as the output variable.

The reactor-regenerator section of the FCC unit (cf. Fig. 6) shows complex behavior. The static response is coupled and highly nonlinear. We consider a range of inputs $R_{ai} \in [390, 420]$ Mlb/h and $R_{rc} \in [40, 42]$ ton/min, for which the nonlinear static map of the process is shown in Fig. 2 (left). While the process exhibits nearly linear behavior at low $R_{ai}$ or high $R_{rc}$, it becomes highly nonlinear at high $R_{ai}$ and low $R_{rc}$. In fact, it even becomes unstable (the reaction extinguishes) at higher $R_{ai}$ and lower $R_{rc}$ than considered here. The dynamic response of the process shows a strong input directionality. Fig. 2 (right) shows the response to subsequent steps in both inputs. Clearly, the dynamic behavior changes qualitatively with the direction of the step in the input space regardless of the starting point. Repeating this experiment with different starting points yields the same qualitative difference in the responses to $R_{ai}$ and $R_{rc}$.

4.2. Hammerstein models

The process is identified using the new HM model structure as well as the previously developed structures RB, KU and EJL discussed in Section 2.2 with polynomial nonlinear maps to assess their suitability for identifying the dynamic behavior of the system.
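For the polynomial case, fitting a two-input static map to steady-state data is a linear least-squares problem. The sketch below assumes the monomial basis $u_1^p u_2^q$ with $p, q \le 2$ (one possible reading of a polynomial of second order in each input), inputs scaled to $[-1, 1]$ for conditioning, and synthetic steady-state data generated from a made-up coefficient vector rather than from the FCC model. The helper name `design_matrix` is hypothetical.

```python
import numpy as np

def design_matrix(u1, u2, order=2):
    """Columns u1**p * u2**q for p, q = 0..order (p varies slowest)."""
    return np.column_stack(
        [u1**p * u2**q for p in range(order + 1) for q in range(order + 1)]
    )

# Uniform grid over the considered input ranges (61 x 41 points).
R_ai, R_rc = np.meshgrid(np.linspace(390.0, 420.0, 61),
                         np.linspace(40.0, 42.0, 41), indexing="ij")

# Scale both inputs to [-1, 1] to keep the regression well conditioned.
u1 = ((R_ai - 405.0) / 15.0).ravel()
u2 = ((R_rc - 41.0) / 1.0).ravel()

# Synthetic "steady-state measurements" from made-up coefficients
# (illustration only, not the FCC steady-state map).
theta_true = np.array([1.0, -0.3, 0.0, 0.5, 0.2, 0.0, 0.0, -0.1, 0.0])
Phi = design_matrix(u1, u2)
y_s = Phi @ theta_true

# Least-squares estimate of the polynomial parameters.
theta_hat = np.linalg.lstsq(Phi, y_s, rcond=None)[0]
residual_norm = np.linalg.norm(Phi @ theta_hat - y_s)
```

Because the synthetic data lie exactly in the span of the basis, the fit recovers the coefficients; with data from a saturating process map a residual would remain, which is the limitation of polynomial maps noted in Section 3.4.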
Subsequently, the capability of the HM model to incorporate independently identified, arbitrary nonlinear functions is exploited, and the polynomial representation is replaced by the three representations discussed in Section 3.4.

Fig. 6. Flowsheet of the FCC process example. (Labels in the flowsheet: reactor, regenerator, oil riser, air riser; streams: feed $R_{tf}$, $T_{fp}$; air $R_{ai}$; catalyst recycle $R_{rc}$; product; flue gas; measured temperature $T_{ra}$.)

4.2.1. Identification

The new model is identified as described in Section 3.3. The nonlinear models are identified from a measured steady-state data set $\tilde{u}_{1,s}$, $\tilde{u}_{2,s}$, $\tilde{y}_s$ by solving

\[ \min_{\theta_N} \; (\tilde{y}_s - y_s)^T (\tilde{y}_s - y_s) \qquad (32a) \]

\[ \text{s.t.} \quad y_{s,i} = N(\theta_N, \tilde{u}_{1,s,i}, \tilde{u}_{2,s,i}), \quad i = 1, \ldots, \dim(y_s) \qquad (32b) \]

for the parameters $\theta_N$ of the nonlinear map. Terms of up to second order in each input are considered for the polynomial representation. The ANN is identified using a backpropagation algorithm based on the Levenberg-Marquardt method. The sparse grid is identified using the algorithm of Kahrs et al. [19]. Both of these algorithms use the same error criterion (32a). The data points used for identification of the nonlinear maps are taken on a


uniform grid of $\Delta R_{rc} = 0.05$ ton/min and $\Delta R_{ai} = 0.5$ Mlb/h. The linear models are identified independently by solving

\[ \min_{a'_j, b'_j} \; (\tilde{y} - y)^T (\tilde{y} - y) \qquad (33a) \]

\[ \text{s.t.} \quad y_k = \sum_{i=1}^{n_{a'_j}} a'_{j,i} \, y_{k-i} + \sum_{i=0}^{n_{b'_j}} b'_{j,i} \, \tilde{u}_{j,k-i} + c_{j,0}, \qquad (33b) \]

where the vector $y$ contains the system response $\{y_k\}$ after excitation with a PRBS signal in $\tilde{u}_j$. The parameters $a_j$ and $b_j$ are determined from $a'_j$ and $b'_j$ by normalizing the gain of the linear elements to one and performing the reformulations discussed in Section 3.1. The linear elements are of second order for $R_{rc}$ and of fourth order for $R_{ai}$. In this example, we use noise-free data for the sake of simplicity. As can be seen from the discussion above, the determination of the model parameters is performed by standard identification experiments. A broad body of literature treating the effects of noise on these experiments exists.

The RB model can incorporate arbitrary, independently identified nonlinear maps. Therefore, the same nonlinear map as in the HM model is used. The linear model needs to represent an average dynamic response of the system. The data set used for identification, therefore, has to contain an excitation such that the user deems the process output to be representative of the average dynamic response of the system. With this requirement, identifying the FCC process with the RB model obviously is more of an art than a science. In our case, we use uniformly distributed random signals with a switching probability of 0.5 and calculate the input sequence $\{v_{j,k}\}$ to the linear element using $N(\cdot)$. The linear element is of second order.

The KU model uses the same linear dynamic model as the HM model. The nonlinear element is identified independently using (32) by replacing $N(\cdot)$ by $N_1(\theta_{N_1}, u_1) + N_2(\theta_{N_2}, u_2)$ in (32b).

For the EJL model an iterative algorithm has been proposed by Eskinat et al. [11], which we applied to our example. We chose different model parameterizations of up to third-order linear elements and nonlinear maps of up to second order in each input. However, the algorithm did not converge for any of the chosen model orders, and the model could not be identified. This may be due to two reasons. First, our example process is more complex than the distillation column studied by Eskinat et al., which, for example, shows less severe input directionality. Further, the EJL model is only unique up to a similarity transformation: since all channels contain the same terms, they are interchangeable. During identification with an iterative procedure such as the one proposed by Eskinat et al., this may lead to convergence problems.

The identification algorithms for the RB model, the KU model and the new model are all noniterative with respect to the elements of the Hammerstein model, and computation times for identification are well below 5 s using MATLAB on a 1.5 GHz PC.

4.2.2. Prediction

To assess model accuracy, we use a test sequence of 5000 time intervals with 90 random steps in the input space. Figs. 7 and 8 each show system responses for part of this sequence. The model prediction errors are defined by the root mean square error

\[ \mathrm{RMSE} = \frac{1}{\sqrt{\dim(y)}} \, \| \tilde{y} - y \|_2, \qquad (34) \]

where $\tilde{y}$ is the vector of measurement data and $y$ the corresponding model prediction. We first compare the different block-structured models. Since the KU model cannot incorporate rigorous models, we compare the block-structured models using polynomial nonlinear maps in each case. The polynomials contain terms of up to the second power in each input. As can be
Fig. 7. Comparison of RB model, KU model, and new HM model structure with polynomial nonlinear maps containing up to the second power of each input. (Plot of $T_{ra}$ [°F] over time intervals 4250 to 4550; curves: process, RB model, KU model, new HM model.)

Fig. 8. Comparison of four nonlinear maps for the new HM model. (Plot of $T_{ra}$ [°F] over time intervals 4250 to 4550; curves: process, polynomial, rigorous, sparse grid, neural network.)


seen in Fig. 7, the RB model does not sufficiently capture the dynamic response of the process. The prediction error over the entire test sequence is 1.29 °F. The new HM model, in contrast, is much more suitable to identify process dynamics. Due to inaccurate modeling of the gain of the system, a prediction error of 0.91 °F remains. To demonstrate the difference between the two models more explicitly, we use the rigorous steady-state process model as the nonlinear element for both models. In this case, there is no plant-model mismatch at steady state. The resulting errors for the dynamic test sequence are $\mathrm{RMSE}_{HM}$ = 0.47 °F and $\mathrm{RMSE}_{RB}$ = 1.07 °F for the HM and RB model, respectively. In this case the new model reduces the prediction error by more than 50% compared to the RB model. While the KU model is capable of identifying the process dynamics, as can be seen in Fig. 7, the nonlinear coupling of inputs cannot be represented. This results in the large steady-state error visible in Fig. 7 between time intervals 4400 and 4500. In fact, the prediction error of 1.44 °F is even larger than that of the RB model.

For the new HM model we now compare the different nonlinear maps discussed above. Fig. 8 shows the same section of the test sequence as Fig. 7. For comparison, the prediction of the polynomial model is redrawn, and it becomes obvious that it is far inferior to the rigorous, sparse grid, or ANN representations. All three of these are, however, very close, and in fact the error incurred by the sparse-grid or neural-network approximations at steady state is small compared to the error incurred during transients by the block-structured approximation of the system dynamics (e.g., between time intervals 4500 and 4550). The respective errors are RMSE = 0.47 °F for the rigorous model and RMSE = 0.52 °F for both the sparse-grid and neural-network representations. The errors of all three models are about 50% lower than that of the polynomial representation.
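The reported error reductions can be checked with a few lines of arithmetic. The sketch below assumes the RMSE is the Euclidean norm of the prediction error divided by the square root of the number of samples; the function names are hypothetical, and the numerical values are the errors quoted in the text.

```python
import numpy as np

def rmse(y_meas, y_pred):
    """Root mean square error: ||y_meas - y_pred||_2 / sqrt(dim(y))."""
    y_meas = np.asarray(y_meas, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.linalg.norm(y_meas - y_pred) / np.sqrt(y_meas.size)

def relative_reduction(reference, value):
    """Fractional reduction of `value` relative to `reference`."""
    return (reference - value) / reference

# Errors reported in the text (in degrees Fahrenheit).
hm_vs_rb = relative_reduction(1.07, 0.47)          # rigorous map: HM vs RB
rigorous_vs_poly = relative_reduction(0.91, 0.47)  # HM: rigorous vs polynomial
approx_vs_poly = relative_reduction(0.91, 0.52)    # HM: sparse grid or ANN vs polynomial
```

These evaluate to roughly 56%, 48%, and 43%, respectively, which matches the statements of a reduction of more than 50% relative to the RB model and of about 50% relative to the polynomial map.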
The errors are due to the block-structured model as well as to the representation of the nonlinear map. The different nonlinear maps may be compared using the RMSE incurred in predicting the nonlinear gain shown in Fig. 2. The sparse grid and ANN result in errors of RMSE = 0.21 °F and RMSE = 0.31 °F, respectively, while the error increases to RMSE = 0.74 °F for the polynomial representation. Since the KU model further restricts the polynomial as discussed in Section 2.2, the error increases further to RMSE = 1.49 °F, which is seven times the error of the sparse grid.

4.2.3. Computation time

For control applications, computation times are of major interest, and we therefore compare the computational cost for the simulation of the test sequence used in the previous section. As expected, the models based on polynomial nonlinearities perform best with respect to computational performance. Regardless of the different block structure, they could all be solved in less than 0.6 s using MATLAB on a 1.5 GHz PC. The models based on rigorous nonlinear maps in turn are expected to perform

worst and in fact consume 475 s of CPU time. The models based on sparse grid or neural network approximations perform much better, as they could be solved in 4 and 2 s, respectively, with no significant difference in model fidelity. The competitive computation times of the latter two are preserved regardless of the internal complexity of the process, as they only depend on its input-output dimensionality.

5. Conclusions

A new Hammerstein model structure has been developed for multi-variate nonlinear processes with input directionality. This model for the first time allows the use of arbitrary, independently identified nonlinear maps with a linear model independently identified by standard SISO step or PRBS response experiments. The model structure is applied to the identification of a simulated chemical process exhibiting input-directional dynamics and a nonlinear coupling of input variables at the same time. For the case study the new model formulation is shown to be superior to all previously developed Hammerstein model structures. Input-directional dynamics are represented via identification of a linear multi-input model. The representation of the nonlinear steady-state behavior can be tailored to the desired application of the model. While a polynomial representation suffers from poor prediction capability and a rigorous model from poor computational performance, the ANN and the sparse grid representations prove to efficiently combine high model fidelity with low computational cost. In combination with a suitable representation of the nonlinearity, the use of the new model structure results in a reduction in prediction error of more than 50% compared to block-structured models reported in previous literature.

References
[1] H. Al-Duwaish, M.N. Karim, A new method for the identification of Hammerstein model, Automatica 33 (10) (1997) 1871-1875.
[2] R.M. Ansari, M.O. Tade, Constrained nonlinear multivariable control of a fluid catalytic cracking process, J. Proc. Contr. 10 (6) (2000) 539-555.
[3] E.-W. Bai, Decoupling the linear and nonlinear parts in Hammerstein model identification, Automatica 40 (4) (2004) 671-676.
[4] M. Boutayeb, M. Darouach, Recursive identification method for MISO Wiener-Hammerstein model, IEEE Trans. Autom. Contr. 40 (2) (1995) 287-297.
[5] M. Brendel, W. Marquardt, An algorithm for multivariate function estimation based on hierarchically refined sparse grids, Comput. Visual Sci., accepted for publication.
[6] H.-J. Bungartz, M. Griebel, Sparse grids, Acta Numer. 13 (2004) 147-269.
[7] F.H.I. Chang, R. Luus, A noniterative method for identification using Hammerstein model, IEEE Trans. Autom. Contr. 16 (5) (1971) 464-468.
[8] H.-W. Chen, Modeling and identification of parallel nonlinear systems: structural classification and parameter estimation methods, Proc. IEEE 83 (1) (1995) 39-66.
[9] G. Cybenko, Approximation by superpositions of a sigmoidal function, Math. Contr. Sign. Syst. 2 (1989) 303-314.
[10] M.M. Denn, Process Modeling, Pitman Publishing, Marshfield, MA, 1986.
[11] E. Eskinat, S.H. Johnson, W.L. Luyben, Use of Hammerstein models in identification of nonlinear systems, AIChE J. 37 (2) (1991) 255-268.
[12] P.G. Gallman, An iterative method for the identification of nonlinear systems using a Uryson model, IEEE Trans. Autom. Contr. 20 (6) (1975) 771-775.
[13] I. Goethals, K. Pelckmans, J.A.K. Suykens, B. De Moor, Subspace identification of Hammerstein systems using least squares support vector machines, IEEE Trans. Autom. Contr. 50 (10) (2005) 1509-1519.
[14] J.C. Gomez, E. Baeyens, Identification of block-oriented nonlinear systems using orthonormal basis, J. Proc. Contr. 14 (6) (2004) 685-697.
[15] A. Hammerstein, Nichtlineare Integralgleichungen nebst Anwendungen, Acta Math. 54 (1930) 117-176.
[16] G. Harnischmacher, W. Marquardt, Nonlinear model predictive control of multivariable processes using block-structured models, Contr. Eng. Pract., in press, doi: 10.1016/j.conengprac.2006.10.016.
[17] K.J. Hunt, M. Munih, N. Donaldson, M.D. Barr, Optimal control of ankle joint moment: toward unsupported standing in paraplegia, IEEE Trans. Autom. Contr. 43 (6) (1998) 819-832.
[18] C. Jia, S. Rohani, A. Jutan, FCC unit modeling, identification and model predictive control, a simulation study, Chem. Eng. Process. 42 (4) (2003) 311-325.
[19] O. Kahrs, M. Brendel, W. Marquardt, Incremental identification of NARX models by sparse grid approximation, in: Proceedings of the 16th IFAC World Congress 2005, Prague, 2005.
[20] M. Kortmann, H. Unbehauen, Identification methods for nonlinear MISO systems, in: Proceedings of the IFAC World Congress 1987, Munich, Germany, 1987, pp. 225-230.
[21] S. Lakshminarayanan, S.L. Shah, K. Nandakumar, Identification of Hammerstein models using multivariate statistical tools, Chem. Eng. Sci. 50 (22) (1995) 3599-3613.
[22] K.S. Narendra, P.G. Gallman, An iterative method for the identification of nonlinear systems using a Hammerstein model, IEEE Trans. Autom. Contr. 11 (3) (1966) 546-550.
[23] R.K. Pearson, M. Pottmann, Gray-box identification of block-oriented nonlinear models, J. Proc. Contr. 10 (4) (2000) 301-315.
[24] R.K. Pearson, Selecting nonlinear model structures for computer control, J. Proc. Contr. 13 (1) (2003) 1-26.
[25] D.K. Rollins, N. Bhandari, A.M. Bassili, G.M. Colver, S.-T. Chin, A continuous-time nonlinear dynamic predictive modelling method for Hammerstein processes, Ind. Eng. Chem. Res. 42 (4) (2003) 860-872.
[26] D.K. Rollins, N. Bhandari, Constrained MIMO dynamic discrete-time modeling exploiting optimal experimental design, J. Proc. Contr. 14 (6) (2004) 671-683.
[27] H.-T. Su, T.J. McAvoy, Integration of multilayer perceptron networks and linear dynamic models: A Hammerstein modeling approach, Ind. Eng. Chem. Res. 32 (9) (1993) 1927-1936.
[28] M. Verhaegen, D. Westwick, Identifying MIMO Hammerstein systems in the context of subspace model identification methods, Int. J. Contr. 63 (2) (1996) 331-349.
[29] G. Wahba, Spline Models for Observational Data, SIAM, 1990.
[30] X. Zhu, D.E. Seborg, Nonlinear predictive control based on Hammerstein models, in: Proceedings: Process Systems Engineering 1994, 1994, pp. 995-1000.
