
J. Proc. Cont. Vol. 5, No. 6, pp. 363-374, 1995
Butterworth-Heinemann
Copyright © 1995 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
0959-1524/95 $10.00 + 0.00
0959-1524(95)00009-7

An efficient method for on-line identification of steady state

Songling Cao and R. Russell Rhinehart*


Department of Chemical Engineering, Texas Tech University, Lubbock, TX
79409-3121, USA

Received 8 December 1994; revised 14 March 1995

A novel method for the on-line identification of steady state in noisy processes is developed. The method uses critical values of an F-like statistic, and its computational efficiency and robustness to process noise distribution and non-noise patterns provide advantages over existing methods. The distribution of the statistic is obtained through Monte Carlo simulations, and analytically derived shifts in the distribution due to process ramp changes and autocorrelations in the process data are shown to cross-check with simulations. Application is demonstrated on experimentally measured pH, temperature and pressure data.

Keywords: steady-state identification; automated; stochastic

Identification of steady state is an important task for satisfactory control of many processes. Steady-state models are widely used in the process control functions of model identification, optimization [1], fault detection [2], sensor analysis, data reconciliation [3], and control [4]. Since manufacturing and chemical processes are inherently nonstationary, selected model parameters must be adjusted frequently to keep the model true to the corresponding process. However, parameter adjustment of steady-state models should only be performed with nearly steady-state data; otherwise, in-process inventory changes will lead to model error.

By contrast, certain supervisory model-based control functions are best performed at non-steady-state conditions. One such function is the incremental adjustment of parameters of dynamic models. If a process is well controlled, the controlled variables remain constant for long periods of time. These are pseudo-steady-state periods during which the manipulated variables change in response to disturbances. In those periods, noise may dominate the dynamics of the model-relevant state variables. Whenever noise is used to adjust model parameters, the parameter values become statistically uncertain, and the model becomes invalid and functionally useless or degraded. Incremental adjustment of dynamic model parameters should be triggered by non-steady-state conditions.

Another potential application of steady-state identification is to determine the stopping criterion for iterative numerical methods. Many mathematical techniques such as nonlinear regression and optimization are iterative and require a stopping criterion. Instead of iterating through a fixed number of times, the procedure should be stopped when the objective function attains near steady-state values (with respect to iteration number).

Development of a practicable technique for steady-state identification will help realize the full benefits of on-line model-based process control techniques. There are several existing methods for steady-state identification, and herein the procedure will be developed for a process that generates sequential data. If a process is sampled at every fixed time interval, a time series or sequence of data will be generated. Here, the measured data will be viewed as representing the true process value with additive noise and disturbances, and the steady-state condition means that the true process value stays unchanged. Note that here the concept of steady state is less strict than 'stationary' or 'strictly stationary' in statistics. In statistics, 'stationary' requires not only the mean of time series data to be constant but also requires the distribution and autocorrelation, if any, to remain unchanged with time. By contrast, for deterministic process modelling purposes the steady-state condition does not require that the associated noise and disturbances be stationary. In chemical processes the amplitude and distribution of noise can change with flow rates, pH, composition, etc.; however, if the true process value does not change, it is considered to be at steady state.

Here noise is considered to consist of random independent perturbations added to the true process value.

* Author to whom correspondence should be addressed


The distribution of these perturbations is usually close to Gaussian for chemical process measurements. By contrast to noise, an autocorrelated disturbance is a perturbation added to the true value at each sampling, but the magnitude of any perturbation is correlated to previous values as well as having a random independent component. In this work, the autocorrelated disturbance is modelled as an nth-order autoregressive process driven by Gaussian distributed noise.

Noise can be attributed to such events as mechanical vibration, thermal motion, flow turbulence and spurious electronic signals which contaminate transmitted values. Autocorrelated disturbances can be attributed to such events as nonideal mixing and sensor dynamics. But assuredly, if the sampling interval were small enough, one would find that the influence of the above-mentioned noise effects would persist for two or more samplings. So the distinction between correlated and uncorrelated noise is subjective, and is dependent on whether the persistence of the event is long enough that the sampling interval detects autocorrelation or short enough that the perturbations appear independent.

Existing approaches

A direct approach to steady-state identification is to perform a linear regression over a data window and then perform a t-test on the regression slope. If the slope is significantly different from zero, the process is almost certainly not at steady state [5]. This is usually an off-line technique. On-line versions require considerable data storage, associated computational effort and user expertise. For instance, the user must define the data window to be longer than the autocorrelation persistence. Further, in the middle of a definite oscillation, where the linear regression slope was temporarily zero, the method would give a false reading. At each time interval, the whole data window must be updated and the linear regression must be reperformed. If the data window is long, there will be considerable computational effort, and recognition of changes would be delayed. If the window is short, noise will confound the analysis; and the window length which would balance these desires would change with changes in noise amplitude. There are no universal rules to choose the length of the data window, and the selection will be judgmental.

An alternative method [6] uses an F-test type statistic, a ratio of variances as measured on the same set of data by two different methods. The data in the most recent window are averaged and the variance is first conventionally calculated as the mean-square-deviation from the average. The variance can also be calculated from the mean of squared differences of successive data. If the time series is stationary (i.e. if the process is at steady state) then, ideally, the ratio of the variances will be unity. However, due to random noise the actual ratio of the variances will not be exactly unity; the ratio will be near unity. Alternatively, if the process is not at steady state the ratio will be unusually large (with respect to the distribution of ratio values which are expected at steady-state conditions).
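As a concrete illustration of this window-based statistic, the following is our own minimal Python sketch of the idea; the function name and the use of the statistics module are illustrative assumptions, not taken from Reference 6:

    import statistics

    def window_variance_ratio(window):
        # Variance about the window average (mean-square-deviation from the average).
        var_about_mean = statistics.variance(window)
        # Variance from the mean of squared differences of successive data; the
        # factor of 2 makes it an unbiased variance estimate for uncorrelated data.
        sum_sq_diff = sum((window[i] - window[i - 1]) ** 2 for i in range(1, len(window)))
        var_from_diffs = sum_sq_diff / (2.0 * (len(window) - 1))
        return var_about_mean / var_from_diffs   # near unity at steady state

A ratio well above unity would flag a non-steady state, at the cost of storing and reprocessing the whole window at each sampling, which motivates the criticisms that follow.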
The second method is a valid, but primitive, approach with several undesirable features. First, it requires user expertise; the user must choose the horizon to balance the desire for the identification to be insensitive to noise versus its insensitivity to long-past upsets. Second, a large number of data must be stored and manipulated at each sampling. Third, autocorrelation in the measured signal will affect the statistic. A process may be at steady state but may also have medium, short-lived transient fluctuations which last for a few sampling intervals. The presence of significant autocorrelation will always produce a 'not at steady-state' message.

Recently, there have been three industrial responses [7-9] to the question 'How can steady-state conditions be identified automatically?' posed in a periodical. One group [7] uses a moving average chart common to statistical process control (SPC). The technique plots a moving average and the upper and lower '3σ' control limits. When the moving average violates the control limits, an alarm indicates that a significant event has occurred. While this method is often used to trigger control, it cannot indicate steady state. For instance, continual oscillation about the mean may not make the not-at-steady-state moving average violate the control limit. When the noise level reduces, the control limits contract and earlier data may create a false violation alarm. Finally, proper SPC application [5,10,11] requires updating both the average and the variance of the moving window of the most recent 100 process measurements. This is computationally expensive.

Another method [8] compares the average calculated from a recent history to a 'standard' based on an earlier history. The t-statistic [5] is used to test whether the average is unchanged: the steady-state hypothesis. However, this method also suffers from some problems. First, the hypotheses of steady state and of an equal mean are quite different. If the process is oscillating or ramping and the data window happens to bracket the 'standard' mean, then the t-test could accept the equal mean hypothesis when, in fact, the process steady-state hypothesis should be rejected. Further, during any non-steady-state period the traditional root-mean-square (rms) technique for calculating the process variance produces a biased value. Since the process variance changes, the proper method for updating the variance cannot be the straightforward rms technique. Finally, storage and processing of all the data is a computational burden.

The third method reported by the practice [9] is to calculate the process measurement standard deviation over a moving window of recent data history. Presumably the rms method is used. Whenever the process is not at steady state the measured standard deviation will be larger than its steady-state value. Therefore, when the measured standard deviation is greater than some threshold value, the not-at-steady-state condition is triggered. Those authors appropriately note that 'success with this method relies on the ability to determine the
key unit variables, the process variables time period used for calculation, and the [threshold] standard deviation.' Further, when the process variance changes, the threshold values must change. Again, the storage and operation on the data window is a computational burden.

A new method for steady-state identification

The design of the new method is styled after the primitive F-test type of statistic. It is the ratio of variances, R, as measured on the same set of data by two different methods.

The primitive way of estimating variance would be:

s² = (1/(N − 1))·Σ_{i=1}^{N} (X_i − X̄_N)²   (1)

The modification (or simplification) begins with a conventional exponentially weighted moving average, or conventional first-order filter, of a process variable X_i. This requires little storage and is computationally fast. In algebraic notation:

X_{f,i} = λ₁·X_i + (1 − λ₁)·X_{f,i−1}   (2)

where 0 < λ₁ ≤ 1.

If the previous filtered value X_{f,i−1} is used to replace the sample mean X̄_N, a mean square deviation can be defined as:

ν² = E((X_i − X_{f,i−1})²)

and can be estimated by:

ν̂² = (1/(N − 1))·Σ_{i=2}^{N} (X_i − X_{f,i−1})²   (3)

Assuming that {X_i} is uncorrelated, using the previous value of X_f, X_{f,i−1}, prevents autocorrelation between X_i and X_{f,i−1}, and allows one to easily estimate σ² from ν². Define:

d_i = X_i − X_{f,i−1}   (4)

If the process is at a steady-state condition and there is no autocorrelation in the sequential measurements, then X_i and X_{f,i−1} are independent, and the variance on d is related to the variances on X and X_f [5]:

σ_d² = σ_X² + σ_{Xf}²   (5)

Further, for the exponentially weighted moving average, when {X_i} are independent and stationary, the variance on X_f from Equation (2) becomes [12]:

σ_{Xf}² = (λ₁/(2 − λ₁))·σ_X²   (6)

Equations (5) and (6) yield:

σ_d² = (2/(2 − λ₁))·σ_X² = ν²   (7)

from which the noise variance can be estimated if ν² is known:

σ_X² = ((2 − λ₁)/2)·ν²   (8)

However, Equation (3) is computationally expensive; so, use a filtered value instead of a traditional average:

ν²_{f,i} = λ₂·(X_i − X_{f,i−1})² + (1 − λ₂)·ν²_{f,i−1}   (9)

If the process is stationary:

E(ν²_{f,i}) = E((X_i − X_{f,i−1})²) = ν²

So, Equation (9) is an unbiased estimate of ν², and the variance of ν²_{f,i} is:

var(ν²_{f,i}) = (λ₂/(2 − λ₂))·var((X_i − X_{f,i−1})²)

which means that Equation (9) provides a computationally efficient, unbiased estimate of E((X_i − X_{f,i−1})²). Then the estimate of the noise variance from this first approach will be:

s²_{1,i} = ((2 − λ₁)/2)·ν²_{f,i}   (10)

Actually, since Equation (9) requires X_{f,i−1}, one would compute Equation (9) before Equation (2) to eliminate the need to store the previous 'average'.

Using this method, s²_{1,i} will be increased from its steady-state value by a recent shift in the mean. Such a measure could be used to trigger the not-at-steady-state condition [9]; however, the threshold is dependent on both the measurement units and the unknown process noise variance.

The second method to estimate the variance will use the mean squared differences of successive data. Define:

δ² = E((X_i − X_{i−1})²)   (11)

and δ² could be estimated by:

δ̂² = (1/(N − 1))·Σ_{i=2}^{N} (X_i − X_{i−1})²   (12)

However, Equation (12) is computationally expensive; so, use a filtered approach:
δ²_{f,i} = λ₃·(X_i − X_{i−1})² + (1 − λ₃)·δ²_{f,i−1}   (13)

Again, Equation (13) provides an unbiased estimate of δ².

It is easily shown that the second estimate of the noise variance would be:

s²_{2,i} = δ²_{f,i}/2   (14)

Taking the ratio of the two estimates of variance as determined by Equation (10) to Equation (14):

R_i = s²_{1,i}/s²_{2,i} = ((2 − λ₁)·ν²_{f,i})/δ²_{f,i}   (15)

Summarizing: use Equation (9) to calculate ν²_{f,i}, then use Equation (2) to calculate X_{f,i}, then use Equation (13) to calculate δ²_{f,i}, then use Equation (15) to calculate R_i. Each is a direct, no-logic, low-storage, low-operation calculation. In practice, it would be preferable to compare δ²_{f,i}·R_crit (where R_crit is the threshold value of R) to (2 − λ₁)·ν²_{f,i} to prevent the possibility of a divide by zero in Equation (15). For each observed variable, the method requires the direct, simple calculation of three filtered values. In total there are three variables to be stored, 10 multiplications, eight additions, and one comparison per observed variable.
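This recipe translates directly into a few lines of code. The following Python sketch is our own illustration of Equations (9), (2), (13) and (15) with the rearranged threshold comparison; the class name, variable names and default initialization are assumptions, not from the paper:

    class SteadyStateDetector:
        """R-statistic steady-state identifier (Equations 9, 2, 13 and 15)."""

        def __init__(self, x0, var0, lam1=0.2, lam2=0.1, lam3=0.1, r_crit=2.0):
            self.lam1, self.lam2, self.lam3 = lam1, lam2, lam3
            self.r_crit = r_crit
            self.xf = x0           # filtered value X_f, Equation (2)
            self.nu2f = var0       # filtered squared deviation from X_f, Equation (9)
            self.d2f = 2.0 * var0  # filtered squared successive difference, Equation (13)
            self.x_prev = x0

        def update(self, x):
            # Equation (9) uses the previous filtered value, so compute it first;
            # this eliminates the need to store the previous 'average' separately.
            self.nu2f = self.lam2 * (x - self.xf) ** 2 + (1.0 - self.lam2) * self.nu2f
            # Equation (2): first-order filter of the measurement.
            self.xf = self.lam1 * x + (1.0 - self.lam1) * self.xf
            # Equation (13): filtered squared difference of successive data.
            self.d2f = self.lam3 * (x - self.x_prev) ** 2 + (1.0 - self.lam3) * self.d2f
            self.x_prev = x
            # Equation (15) rearranged: R > R_crit <=> (2 - lam1)*nu2f > R_crit*d2f,
            # which avoids a possible divide by zero.
            return (2.0 - self.lam1) * self.nu2f > self.r_crit * self.d2f

Here update returns True when the 'not at steady state' condition is triggered; initializing x0 and var0 from roughly ten early samples, as described later for the simulations, is one reasonable choice.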
There are three possible process behaviours which affect the value of R [6]:

1. If the process data are at steady state (process mean is constant, additive noise is independent and identically distributed), then R will be near 1.
2. If the process data mean shifts, or if the noise is autocorrelated, then R will be greater than 1. When there is a shift in mean, both calculations of the variance will be influenced temporally. The first calculation will increase more and persist longer, so R will be greater than 1 for a period of time, and that is the way that the 'not at steady-state' condition can be identified.
3. If the sequentially sampled process data alternate between high and low extremes, then R will be less than 1. This would be very uncommon in chemical processes. Hence this work only tests whether R > R_crit.

If {X_i} are stationary and independent, and if the process is at steady state, there will be a probability density function of R (pdf(R)). Critical values for R, R_crit, can be calculated from the pdf(R) (actually from the cdf(R)). Once R becomes greater than some threshold value, R_crit, we can know at a certain confidence level that the process is not at steady state.

Probability density function of R

Theoretically, if {X_i} are stationary and independent, the pdf(R) is a function of λ₁, λ₂ and λ₃, and it also depends on the nature of the random variable {X_i}. But if R is used as a general purpose steady-state identifier, we desire that pdf(R) not be sensitive to the nature of X_i. Fortunately, for useful λ values it is not. Figure 1 gives the comparison of pdf(R)s for different distributions of X_i. For most of the cases, the pdf(R)s are almost identical in spite of the distribution of the noise (normal, uniform, gamma etc.). The pdf(R) of the exponential distribution is a little different, but its right tail is indistinguishable from those of the other distributions. As a result their critical values will be very close, and critical values based on the assumption of Gaussian distributed noise will adequately represent any process. This observation is empirical but not unexpected in light of the robustness of the similar F-statistic.

Figure 1 Probability density function for the R-statistic for a variety of distributions of process noise (λ₁ = 0.1, λ₂ = 0.1)

The pdf(R)s in Figure 1 and Figure 2 are calculated by simulations using computer-generated pseudo-random numbers. To generate a Gaussian distributed sequence, a pair of independent uniformly distributed [0,1] random variables r₁, r₂ are first generated from a
uniform random number generator (in this work the Borland C/C++ library function is used). Then a Gaussian distributed process value X_i with mean μ and standard deviation σ can be generated [13] as:

X_i = μ + √(−2·σ²·ln(r₁))·sin(2π·r₂)   (16)

Given the λ values and a time series of random numbers, the pdf(R) can be constructed as follows. First estimate the mean and variance of the random numbers by using the first ten numbers in the sequence. Use the estimated mean as the initial value of both X_f,i and X_{f,i−1}. Use the estimated variance as the initial value of ν²_{f,i}. Use twice the estimated variance as the initial value of δ²_{f,i}. Then, for each new random number X_i, use Equations (9), (2), (13) and (15) to calculate a new R_i and update X_{f,i}, X_{i−1}, ν²_{f,i} and δ²_{f,i}. In this work this procedure is repeated for one million samples after the first 10. Each time a new R value is generated it is put into a histogram which has 300 bins and ranges from 0 to 3.0 (bin width 0.01). The histogram data can later be used to give the pdf(R). The method would also be applied in this sequential manner to determine steady state.
this sequential manner to determine steady state.
mean of the process data. Because of the filtering nature
As an alternative technique to using sequential simu-
of R, at the point of change, the pdf(R) will not jump
lated data, the pdf(R) can also be generated from actual
from A to B. Rather it will move gradually from A to B,
process (heat exchanger and flow controller) data by the
and the lower the As, the slower the transition will be.
technique known as bootstrapping. In our bootstrap-
In Figures 3 a n d 4, the curves labelled 'C' are pdf(R)s
ping method, 100 flowrate data were sampled from the
when the process is on a ramp for 20 data points after
process and put into an array. During the pdf(R) gen-
an initial steady state. The curves labelled 'D' are
eration, this array of data is resampled randomly, one
pdf(R)s when the process is back to steady state for 20
at a time with replacement. The same procedure is used
data points after a long persisting ramp. It is obvious
to construct the new pdf(R) which is found to be nearly
that for smaller As, it will take a longer time for pdf(R)
identifical to pdf(R) of computer generated Gaussian
to get away from its previous process state. So, if
data.
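In code, the bootstrap differs from the Gaussian simulation above only in how each new sample is drawn; a minimal sketch, assuming an array data of 100 stored flowrate measurements:

    import random

    def bootstrap_sample(data):
        # Resample the stored process data randomly, one at a time with replacement.
        return random.choice(data)

    # Each iteration of the histogram loop above would then call
    # x = bootstrap_sample(data) instead of x = gauss(0.0, 1.0).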
Simulations also show that, for uncorrelated data, λ₁ does not have a significant effect on the pdf(R). Only λ₂ and λ₃ affect the pdf(R) significantly. Increasing λ₂ and λ₃ increases the variability of R, because λ₂ and λ₃ are the filter factors on the two variances. Figure 2 shows the pdf(R)s for some λ₁, λ₂ and λ₃ values.

Figure 2 Probability density function for the R-statistic for several choices of the filter parameter set (λ₁, λ₂, λ₃)

Selection of λ values

λ₁, λ₂ and λ₃ are all filter factors. Small filter factors can significantly reduce the noise influences on the estimates of the process variances, and the pdf(R) will be narrowed and 'centred' near unity. So, small λ values can make the pdf(R) of steady state and the pdf(R) of non-steady state split apart, and both Type I and Type II errors [5] can be reduced. But, dynamically, small filter factors can make the R-statistic lag far behind the present process state.

The idea of rapid tracking of the process can be depicted in the following example of a nonstationary process. Suppose there is a ramp on the mean of the process data. The slope of the ramp is s and the sampling time interval is T. Assume that the additive noise is Gaussian distributed with zero mean and standard deviation σ_n and is uncorrelated. Appendix A shows that for this type of process:

E(s²_{1,i})/E(s²_{2,i}) = ((2 − λ₁)·E(ν²_{f,i}))/E(δ²_{f,i}) = [((2 − λ₁)/2)·(sT/(λ₁·σ_n))² + 1] / [(1/2)·(sT/σ_n)² + 1]   (17)

This equation gives an idea of how the average R can be changed and the pdf(R) shifted by the ramp. Figures 3 and 4 show the pdf(R)s with (B) and without (A) a ramp on the means of the process signals. Note that the pdf(R)s of the A curves are both centred on 1, while the pdf(R)s of the B curves are centred on 3.5. It seems that for small λ values the two pdf(R)s are clearly distinct from each other, and Type I and Type II errors can be minimized.

Now consider that the process has been stationary for a long time, and then suddenly there is a ramp on the mean of the process data. Because of the filtering nature of R, at the point of change the pdf(R) will not jump from A to B. Rather, it will move gradually from A to B, and the lower the λs, the slower the transition will be. In Figures 3 and 4, the curves labelled 'C' are pdf(R)s when the process has been on a ramp for 20 data points after an initial steady state. The curves labelled 'D' are pdf(R)s when the process has been back at steady state for 20 data points after a long persisting ramp. It is obvious that for smaller λs it will take a longer time for the pdf(R) to get away from its previous process state. So, if smaller λ values are used, it will take a longer time for the steady-state identifier to detect any changes in the process.

Figure 3 Probability density function for the R-statistic for four cases: (A) during steady state; (B) during a long ramp disturbance; (D) 20 samples of steady state after a long ramp period; (C) 20 samples of ramp after a long steady-state period. All curves are with λ₁ = λ₂ = λ₃ = 0.1

Figure 4 Probability density function for the R-statistic for four cases: (A) during steady state; (B) during a long ramp disturbance; (D) 20 samples of steady state after a long ramp period; (C) 20 samples of ramp after a long steady-state period. All curves are with λ₁ = λ₂ = λ₃ = 0.01

The choice of λ values and R_crit should balance our desire to reduce the Type I errors (trigger a 'not at
steady state' when the process is at steady state) and the Type II errors (not trigger a 'not at steady state' response when the process is not at steady state) and the need to rapidly track the process. Small λ values make the steady-state pdf(R) and the not-at-steady-state pdf(R) distinct from each other (Figures 3 and 4). So, if the Type I error is fixed, small λ values usually can reduce the Type II error. On the other hand, large λ values mean less filtering and can track the process more closely. We suggest that λ₁ = 0.2, λ₂ = 0.1 and λ₃ = 0.1 are values which offer a useful compromise. Then R₀.₉₅ = 1.44, R₀.₉₇₅ = 1.56, R₀.₉₉ = 1.73 and R₀.₉₉₅ = 1.86 (R_α can be easily calculated from the cumulative density function, cdf(R), by linear interpolation). So R_crit = 2.0 will be a good choice.
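One way to extract such quantiles from the simulated histogram (the 300-bin hist built above) is linear interpolation on the empirical cdf(R); the helper below is our own sketch:

    def r_quantile(hist, alpha, bin_width=0.01):
        # Invert the empirical cdf(R) by linear interpolation within the bin
        # where the cumulative fraction first reaches alpha.
        total = float(sum(hist))
        cum = 0.0
        for i, count in enumerate(hist):
            if count and (cum + count) / total >= alpha:
                frac = (alpha * total - cum) / count
                return (i + frac) * bin_width
            cum += count
        return len(hist) * bin_width

    # e.g. r_crit = r_quantile(hist, 0.99) for a 1% Type I error rate at steady state.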
Sampling intervals

Autocorrelation usually increases both the average and the variability of the pdf(R). From a process analysis perspective, autocorrelations can be classified as either long term or short term. Again, the classification is based on the persistence of the effect relative to the desired process analysis. When the process is not at steady state there is long-term autocorrelation. For example, a change in the inlet flow rate of a liquid entering a heat exchanger will cause a dynamic response of the temperature of the liquid leaving the heat exchanger. The long-term autocorrelation caused by this kind of process dynamics will change the pdf(R) when a process moves from steady-state to non-steady-state conditions, and identifying such events is the objective of the new identification technique.

Short-term autocorrelation is extrinsic to the process phenomena one is interested in, and is due to uncertain events that occur and cannot be controlled or even predicted. For example, in a heat exchanger, if some of the tubes are partially plugged, the outlet temperatures from those tubes will be different from the others. Because of the non-ideal mixing of the exit fluid of each tube, packets of hot and cold fluid will exit the exchanger. The temperature sensor on the outlet of the heat exchanger may show short-term autocorrelation even though the process may actually, from the operating and control point of view, be at steady state.

Either short-term or long-term autocorrelation can change the pdf(R) and can result in false not-at-steady-state readings. Our solution to differentiate between long-term and short-term autocorrelation is to make the sampling interval long enough that the influence of short-term autocorrelation on the sampled data is negligible. The user must decide the time persistence of influences which are not to be 'watched' by the steady-state identifier. Practically, the autocorrelations of representative steady-state data could be calculated and the sampling interval selected to be long enough that the autocorrelation between successive sampled data is zero within confidence limits. This is a standard statistical process control procedure in choosing the sampling interval for control charts [10,11]. Compared to the linear regression method, the selection of the sampling interval is nonjudgmental, mechanical and independent of the noise level of the process.
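A sketch of that selection procedure, assuming a representative steady-state record xs; the 0.05 threshold echoes the ρ_P < 0.05 rule quoted in the Results section:

    def autocorrelation(xs, lag):
        # Sample autocorrelation of a steady-state record at the given lag.
        n = len(xs)
        m = sum(xs) / n
        c0 = sum((x - m) ** 2 for x in xs) / n
        ck = sum((xs[i] - m) * (xs[i + lag] - m) for i in range(n - lag)) / n
        return ck / c0

    def smallest_sampling_interval(xs, threshold=0.05):
        # Smallest P whose lag-P autocorrelation is negligible (bounded search).
        for p in range(1, len(xs) // 2):
            if abs(autocorrelation(xs, p)) <= threshold:
                return p
        return len(xs) // 2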
Figures 5, 6 and 7 show the pdf(R)s when the deviations are driven by a first-order autoregressive process [AR(1)]. When sampling every data point, the degree of autocorrelation between two successive sampled data is so strong (autocorrelation at lag 1, ρ₁ = 0.7) that the pdf of R is significantly different from the uncorrelated situation. If sampling every five data points, the degree of autocorrelation is less strong (ρ₁ = 0.168), but the pdf(R) is still quite different from uncorrelated data. However, when sampling every 10 data points, the autocorrelation becomes negligible (ρ₁ = 0.0282) and the pdf(R) is nearly identical to the uncorrelated case. Also note that in Figure 5 the average R for autocorrelated data is big enough to trigger a 'not-at-steady-state' message provided that a normal value of R_crit (2-3) is used. Similar results are also found for a second-order autoregressive process [AR(2)]. So, generally, the longer the sampling interval, the lower the
degree of autocorrelation, and the closer the pdf(R) is to the uncorrelated case.

Figure 5 Probability density function for the R-statistic. A comparison of correlated and uncorrelated data with a sampling interval of 1

Figure 6 Probability density function for the R-statistic. A comparison of correlated and uncorrelated data with a sampling interval of 5

Figure 7 Probability density function for the R-statistic. A comparison of correlated and uncorrelated data with a sampling interval of 10

For a first-order autoregressive process which is sampled every P data points, Appendix B gives:

E(s²_{2,i}) = (1/2)·E(δ²_{f,i}) = (1/2)·E((X_i − X_{i−P})²) = (1 − ρ_P)·σ_X²   (18)

and:

E(s²_{1,i}) = ((2 − λ₁)/2)·E(ν²_{f,i}) = ((2 − λ₁)/2)·E((X_i − X_{f,i−P})²) = ((1 − ρ_P)/(1 − (1 − λ₁)·ρ_P))·σ_X²   (19)

where ρ_P is the autocorrelation coefficient at lag P.
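A small numerical illustration of Equations (18) and (19), using the lag-1 autocorrelation of 0.7 quoted above (our own sketch; the function name is illustrative):

    def predicted_variance_means(rho_p, sigma_x2=1.0, lam1=0.2):
        e_s2 = (1.0 - rho_p) * sigma_x2                                  # Equation (18)
        e_s1 = (1.0 - rho_p) / (1.0 - (1.0 - lam1) * rho_p) * sigma_x2   # Equation (19)
        return e_s1, e_s2

    # For an AR(1) process with rho_1 = 0.7, sampling every 10 points gives
    # rho_10 = 0.7**10, about 0.028, and the predicted ratio is near unity:
    e_s1, e_s2 = predicted_variance_means(0.7 ** 10)
    print(e_s1 / e_s2)   # about 1.02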
For stable autoregressive processes, the bigger the step size P, the smaller the ρ_P. So, the above equations indicate that a big sampling interval could make the means of those estimates close to the true process variance; and, as a result, R, the ratio of those two estimates, will be centred towards unity and the pdf(R) will be close to that of the uncorrelated case. By adjusting the sampling interval, short-term autocorrelations can be differentiated from long-term autocorrelation, and short-term autocorrelations can be prevented from
fooling the steady-state identifier. Selecting the sampling interval to 'eliminate' the influence of autocorrelation is a standard statistical process control practice [10,11].

Results

The steady-state identifier has been tested with temperature data from a distillation column feed preheater [14] (Figure 8), pH measurements from a commercial in-line pH control system [15] (Figure 9) and tube fluid pressure data from a pilot-scale heat exchanger [16] (Figure 10). In all cases, the steady-state identifier agrees with a visual recognition of the system state. It is also observed that the steady-state identifier is insensitive to changes in the noise level. In Figures 9 and 10 there are dramatic changes (increases) of noise amplitude, but because the mean of the data is unchanged, the identifier indicates the system is still at steady state, without user interaction. P = 1, λ₁ = 0.2, λ₂ = λ₃ = 0.1 and R_crit = 2.0 are used in the testing. We found that these values of λ₁, λ₂, λ₃ and R_crit, with P chosen so that the autocorrelation coefficient ρ_P < 0.05, could balance our desire to reduce Type I and Type II errors and our desire for fast tracking of the process.

Figure 8 Demonstration of the steady-state identifier on distillation column feed temperature at the exit of an electrical heater. λ₁ = 0.2, λ₂ = λ₃ = 0.1, R_crit = 2.0

Figure 9 Demonstration of the steady-state identifier on the effluent pH value of a commercial in-line pH control system. λ₁ = 0.2, λ₂ = λ₃ = 0.1, R_crit = 2.0

Figure 10 Demonstration of the steady-state identifier on the cooling water pressure of a heat exchanger. λ₁ = 0.2, λ₂ = λ₃ = 0.1, R_crit = 2.0

The steady-state identifier also responds correctly to the dynamic changes of the systems. In Figure 8, most of the process changes are nearly step changes, and those changes trigger the steady-state identifier almost instantly. But when the process returns to steady state, because of the filtering nature of the identifier, it will take some time for R (which may have a high value after a step change) to go below the critical value and trigger a new 'steady-state' signal. The amount of delay primarily depends on the λ values, the nature of the process, and the trigger value. For most processes and useful λ values, the amount of delay is acceptable.

Conclusion

A new statistically-based method has been developed for automated identification of steady state. The method is computationally inexpensive when compared to conventional techniques. The R-statistic is dimensionless and independent of the measurement level. Because it is a ratio of estimated variances, it is also independent of the process variance. Simulations and actual data show that for recommended values of λ, critical values are also effectively independent of the magnitude and distribution of the noise. Tests on a variety of experimental processes show that the method agrees with visual recognition.

Acknowledgements

The authors wish to thank Dr B. 'Soundar' Ramchandran for his preliminary investigations; Mehul M. Desai, Prabu Murugan and Priyabrata Dutta for help in obtaining the experimental results; and Professor Kamal Chanda (Texas Tech University, Department of Math) for his guidance. The authors appreciate both the financial support and technical guidance from the following members of the Texas Tech Process Control and Optimization Consortium: Albemarle Corp. (formerly Ethyl Corp.); Amoco Oil Co.; Arco Exploration & Production; Conoco, Inc.; Diamond Shamrock; The Dow Chemical Co.; The Dynamic Matrix Control Corp.; Exxon Co., USA; Hyprotech, Inc.; Johnson Yokogawa Corp.; Phillips Petroleum Co.; and Setpoint, Inc.

References

1 Lin, T.D.V. Hydrocarb. Process. April 1993, 107
2 Keller, J.Y., Darouach, M. and Krzakala, G. Comput. Chem. Eng. 1994, 18(10), 1001
3 Albers, J.E. Hydrocarb. Process. March 1994, 65
4 Forbes, J.F. and Marlin, T.E. Ind. Eng. Chem. Res. 1994, 33, 1919
5 Bethea, R.M. and Rhinehart, R.R. 'Applied Engineering Statistics', Marcel Dekker, New York, NY, 1991
6 Crow, E.L., Davis, F.A. and Maxfield, M.W. 'Statistics Manual', Dover Publications, New York, NY, 1955, 63
7 Loar, J. in 'Control for the Process Industries', Putman Publications, Chicago, IL, Vol. VII, No. 11, November 1994, 62
8 Alekman, S.L. in 'Control for the Process Industries', Putman Publications, Chicago, IL, Vol. VII, No. 11, November 1994, 62
9 Jubien, G. and Bihary, G. in 'Control for the Process Industries', Putman Publications, Chicago, IL, Vol. VII, No. 11, November 1994, 64
10 'Manual on Presentation of Data and Control Chart Analysis', 6th edition, ASTM Manual Series MNL-7, American Society for Testing and Materials, Philadelphia, PA, 1991
11 Oakland, J.S. 'Statistical Process Control, A Practical Guide', John Wiley, New York, NY, 1986
12 Box, G.E.P. and Jenkins, G.M. 'Time Series Analysis: Forecasting and Control', Holden-Day, 1976
13 Edwards, P. and Hamson, M. 'Guide to Mathematical Modeling', CRC Press, Boca Raton, FL, 1990
14 Dutta, P. and Rhinehart, R.R. in 'Proceedings of the 1995 American Control Conference', Seattle, WA, June 1995, 1787
15 Mahuli, S.K., Rhinehart, R.R. and Riggs, J.B. J. Proc. Cont. 1992, 2(3)
16 Rhinehart, R.R. in 'Proceedings of the 1994 American Control Conference', Baltimore, MD, June 1994, 3122
17 Brockwell, P.J. and Davis, R.A. 'Time Series: Theory and Methods', Springer-Verlag, Berlin, 1987

Appendix A: Means of 'variances' for the process with a ramp

If a process has a long persisting ramp on its mean, the process data X_i can be expressed as:

X_i − X_0 = s·(t_i − t_0) + n_i   (A.1)

where s is the slope of the ramp, t_i is the time at the ith point, and n_i is the noise at the ith point. For simplicity and without loss of generality, treat X_i and t_i as deviations from X_0 and from t = t_0, so:

X_i = s·t_i + n_i

The filtered process value is then:
in obtaining the experimental results, and Professor The filtered process value is then:
On-line identification of steady state: S. Cao and R.R. Rhinehart 371

)(ni and Xfn,i-l, are independent and their means are


X f , i ~" (1 -- ~'l) "J~fj,i-I "b ~['l" ( S " t~ + n,.) (A.2)
all zero)
Inserting ~2r, from Equations (A.8) into (A.9)
Xf~ can be expressed as the sum of Xfm.~, the filtered
value of the mean of process data and Xt-..~, the filtered
value of the noise. Define: E(s2.) = 2 - ~1 $2T2 2 ,
2 ' (--~ + o-~) (A.10)
' 2 -X I

X~,,,i = (1 - 21)" X.tm.,-I + XI" s ' t i (A.3)


This value not only depends on the variance of noise,
but also depends on the slope and the filter factor 2t.
X.lh, i = ( l - / ~ l ) ' X j h , i _ l q- ~'1" /'/i (A.4) Large s (steep slope) and small Xt (large lag) can result
large values of s 2l,i"
Add above two equations together and compare to For the estimate s~ calculated from the filtered
Equation (A.2): squared-difference of successive data, if the process is
stationary:
X f,~ = X lm,i + X f,,, (A.5)
E(v},,) = E((X, - X,_,) 2)
Here X,~., is the non-noisy part of Xf,~, while ~,., is the
noisy part.
So:
Let t, = t~ + T and assume that Xjm,,, the non-noisy
part of Xt:,, has reached the steady slope s. That is:
E ( s ~ j ) = -~. E ( ( X i - X , _ , f ) = -~. E ( ( s T + n, - n, , ) ' )
Xfm,i = s T + z!(fm,i_l
(A.I1)
Substituting into Equation (A.3) yields:
Because n, and nj_l are independent and their means are
all zero:
Xjm,i = s " t i + (1 - 1 ) "s'Y (A.6) 1
E ( s 2 i ) = ~ " E ( ( X i - X , i) 2 )

Take the variance of Equation (A.4) and because Xj,,_~ = l(s=V: + a~ + G~)
and n, are independent: 2
s2T 2
0-7.,.,
~' :
(I -- 2 2
X I ) 0- fn,i-I
+Z] " 0-n
2 (A.7) - 2 + 0-n (A.12)

If the distribution of n,. is stationary, the distribution of The ratio of means of those two variances becomes:
Xs,., will also be stationary.
E(s~,,) _ (2 - & ) - E ( ( X i - Xrjq)2)

2
G fn,i = 0"2
Jn,i-I = 0-fn E(s~,,) E((X, - x,_,) 2)
(2 - s2T2
z,)-~- + 2a~
Substituting into Equation (A.7):
(A.13)
s 2 T 2 + 20-~

0-~" - = 7k, 0-"~ (A.8)


In dimensionless form:

If the process is stationary, from Equations (9) and (10):


2-X, E(v},i ) 2" +1
E(s~,) _ 2 =
2 -X I
E(s~,,) = i . E(v},,)
' -, +1
= 2 - & . E((Z, - Xf ~q)2) (A. 14)
2
= 2 - 2~ . E ( ( s . t i + n i - X f m , i _ 1 - Z f n i I) 2)
2 This is not a mean of R. This is division of two means,
= _ _ -. 2 X 1 E((n,. + s . T _ X f , , i q ) 2 ) but it provides a good estimate of the effect of a ramp
2 2~ in level on R. If a process is stationary, this value
2
2 - Z l . s-T should be unity (here we assume the only correlation
= --5- (57,+ E(4)+ E(X2
o,,,)) between successive data is the ramp, n, and n,+j are inde-
2 -- ,,Z 1 s2T 2 2 0-2fn) pendent). By looking at how the average R is different
.
= 2 ( +0-~ + from unity, we can estimate how far away a process is
(A9) from being stationary.
Table 1 A comparison of estimates (sT = 0.5, σ_n = 3.0)

λ₁     λ₂      λ₃       average s²₁,ᵢ / average s²₂,ᵢ   E(s²₁,ᵢ)/E(s²₂,ᵢ) by (A.14)   average R
0.1    0.1     0.1      3.580                           3.589                         3.953
0.1    0.05    0.05     3.574                           3.589                         3.799
0.1    0.01    0.01     3.580                           3.589                         3.641
0.1    0.005   0.005    3.578                           3.589                         3.618

A numerical comparison between the average R and the ratio of the mean s²₁,ᵢ to the mean s²₂,ᵢ is given in Table 1. This process is simulated by using computer-generated Gaussian distributed pseudo-random numbers as described earlier. Each time a new process value is generated, s²₁,ᵢ, s²₂,ᵢ and R are updated and put into three different histograms. This simulation process is repeated a million times, and upon completion the pdf(R), pdf(s²₁,ᵢ), pdf(s²₂,ᵢ), average R, average s²₁,ᵢ and average s²₂,ᵢ can all be calculated from the histograms. All columns in Table 1 are simulated experimental data except for the ratio of E(s²₁,ᵢ) and E(s²₂,ᵢ), which is calculated by Equation (A.14). Given the ramp, noise level and λ values, the ratio of the mean s²₁ to the mean s²₂ can be calculated and used as an estimate of the average R. The deviation of the average R from unity indicates the ease with which the ramp can be identified by the steady-state identifier. The first column of results verifies that the analytical solution for E(s²₁)/E(s²₂) is accurate.
E(sO/E(s2) is accurate.
So:
2
Appendix B: Means of 'variances' for the first- 0 "2X - - 0-a (B. I1)
1-0 2
order autoregressive process*
If the process is sampled every P data points (P>I), the
For the first-order autoregressive process: mean and variance of X; will stay the same.
Now, calculate the mean of squared difference of suc-
X i = (9. X i _ 1 + a i (B.1) cessive data. Because the mean of the difference of suc-
cessive data is zero, the mean of the squared difference
where at =/.t + n;, n; is zero mean noise, of successive data is equal to the variance of the differ-
ence of successive data.
0 < 0 < 1 for the process to be stable If sampling each data point, let:

We have also: d i : X i - Xi_ 1 (B.12)

Xi-1 = 0 " X i - 2 + ai-i (B.2) From Equation (B.1):

Substitute Equation (B.2) into (B.1) and do it recur- di = Xi - X;_l = 0" Xi_l + a, - X; i = (0 - 1)Xi i + a;
sively:
(B.13)
X i =0 p "X i p+ai+ai l'O+ai_2(~ 2 a; and X;, are independent, so:
+ ai-3" 0 3 + ' " + ai p+l" 0 p-3 (B.3)
var ( g ; - X i q ) = (0 - 1) 2 ' 0"2 + 0"a2 = (0 --
2
1)2. 0"X
2
So, if the process is sampled every p data points, it can + (1 - 02) • 0-x = (2 - 20) • 0-x (B.14)
still be described as a first order autoregressive process:
If sampling every P data points, Equation (B.11)
X i : Iffl).Xi_ p -b A t (B.4)
becomes:
Xi - Xi-p = ~ " X i - e + Ai - Xi e = ( ~ - 1)Xi-p + Ai
* For an excellent review of time series analysis, refer to References 12
and 17. (B. 15)
A_i and X_{i−P} are also independent, so:

var(X_i − X_{i−P}) = (Φ − 1)²·σ_X² + σ_A² = (Φ − 1)²·σ_X² + (1 − Φ²)·σ_X² = (2 − 2Φ)·σ_X²   (B.16)

That is:

(1/2)·E((X_i − X_{i−P})²) = (1 − Φ)·σ_X² = (1 − ρ_P)·σ_X²   (B.17)

For the estimate s²_{2,i} calculated from the filtered squared difference of successive data:

E(s²_{2,i}) = (1/2)·E(δ²_{f,i})

Because:

δ²_{f,i} = λ₃·(X_i − X_{i−1})² + (1 − λ₃)·δ²_{f,i−1}

if the process is stationary, E(δ²_{f,i}) = E(δ²_{f,i−1}).

The above two equations lead to:

E(δ²_{f,i}) = E((X_i − X_{i−1})²)

So:

E(s²_{2,i}) = (1/2)·E(δ²_{f,i}) = (1 − ρ_P)·σ_X²   (B.18)

This is the mean of the 'variance' calculated by the filtered squared differences of successive data. If Φ is small, which means the degree of autocorrelation is weak, or P (the sampling interval) is large, this number could be close to the true variance of the process.

Before we can estimate the mean of the 'variance' calculated by the filtered squared deviation from the filtered process value, we first look at the filtered process data X_{f,i}. From Equations (2) and (B.1):

X_{f,i} = (1 − λ₁)·X_{f,i−1} + λ₁·X_i = (1 − λ₁)·B·X_{f,i} + λ₁·X_i   (B.19)

X_i = φ·X_{i−1} + a_i = φ·B·X_i + a_i   (B.20)

where B is the backward shift operator.

From Equations (B.19) and (B.20):

X_{f,i} = (λ₁/(1 − (1 − λ₁)·B))·X_i = (λ₁/(1 − (1 − λ₁)·B))·(1/(1 − φ·B))·a_i   (B.21)

and

X_i − X_{f,i−1} = a_i/(1 − φ·B) − (λ₁·B/(1 − (1 − λ₁)·B))·(1/(1 − φ·B))·a_i   (B.22)

That is:

X_i − X_{f,i−1} = ((1 − B)/(1 − (1 − λ₁ + φ)·B + φ·(1 − λ₁)·B²))·a_i   (B.23)

The deviation from the previous filtered value, X_i − X_{f,i−1}, is a second-order autoregressive, first-order moving average process (ARMA(2,1)).

Let:

d_i = X_i − X_{f,i−1}   (B.24)

The above equation is equivalent to:

d_i = (1 − λ₁ + φ)·d_{i−1} − φ·(1 − λ₁)·d_{i−2} + a_i − a_{i−1}   (B.25)

The average of d_i is zero, so the mean of the squared deviation from the filtered value is equal to the variance of d_i, which can be calculated from the autocovariance function r₀ of d_i, the ARMA(2,1) process.

The calculation of r₀ for an ARMA(2,1) process is standard [17], and it is found that for d_i:

r₀ = (2/(2 − λ₁))·((1 − φ)/(1 − (1 − λ₁)·φ))·σ_X²

So:

((2 − λ₁)/2)·E((X_i − X_{f,i−1})²) = ((2 − λ₁)/2)·r₀ = ((1 − φ)/(1 − (1 − λ₁)·φ))·σ_X²   (B.26)

This is the mean of the 'variance' calculated by the squared deviation from the filtered value.

If a process is sampled every P data points, the analysis is identical; however, Φ replaces φ, so:

((2 − λ₁)/2)·E((X_i − X_{f,i−P})²) = ((1 − Φ)/(1 − (1 − λ₁)·Φ))·σ_X²

The mean of the estimate s²_{1,i} calculated by the filtered squared deviation from the filtered process value is:

E(s²_{1,i}) = ((2 − λ₁)/2)·E(ν²_{f,i})

Because:

ν²_{f,i} = λ₂·(X_i − X_{f,i−1})² + (1 − λ₂)·ν²_{f,i−1}

If the process is stationary:

E(ν²_{f,i}) = E(ν²_{f,i−1})
The above two equations lead to:

E(ν²_{f,i}) = E((X_i − X_{f,i−1})²)

So:

E(s²_{1,i}) = ((1 − φ)/(1 − (1 − λ₁)·φ))·σ_X²   (B.27)

For the first-order autoregressive process:

ρ_P = Φ   (B.28)

where ρ_P is the autocorrelation function at lag P and, by definition:

ρ_P = r_P/r₀

So, Equations (B.18) and (B.27) can be rewritten as:

E(s²_{2,i}) = (1 − ρ_P)·σ_X²   (B.29)

and:

E(s²_{1,i}) = ((2 − λ₁)/2)·E(ν²_{f,i}) = ((2 − λ₁)/2)·E((X_i − X_{f,i−P})²) = ((1 − ρ_P)/(1 − (1 − λ₁)·ρ_P))·σ_X²   (B.30)

Equation (B.29) is also true for any order of autoregressive process. For an nth-order stationary autoregressive process:

(1/2)·E((X_i − X_{i−P})²) = (1/2)·E(((X_i − E(X)) − (X_{i−P} − E(X)))²)
= (1/2)·(E((X_i − E(X))²) + E((X_{i−P} − E(X))²) − 2·E((X_i − E(X))·(X_{i−P} − E(X))))
= σ_X² − cov(X_i, X_{i−P}) = σ_X² − ρ_P·σ_X² = (1 − ρ_P)·σ_X²   (B.31)

Equation (B.30) is not valid for higher-order autoregressive processes, but the following example shows that Equation (B.30) can give a good approximation for a second-order autoregressive process.

An example second-order autoregressive process is described as:

X_i = 0.85·X_{i−1} − 0.15·X_{i−2} + a_i   (B.32)

where X_i is the process variable and a_i is independent Gaussian distributed with mean equal to 50 and standard deviation equal to 3. This process is simulated by using computer-generated Gaussian distributed pseudo-random numbers as described earlier. Each time a new process value is generated, s²_{1,i}, s²_{2,i} and R are updated and put into three different histograms. This simulation process is repeated a million times, and upon completion the pdf(R), pdf(s²_{1,i}), pdf(s²_{2,i}), average R, average s²_{1,i} and average s²_{2,i} can all be calculated from the histograms. In the simulation, λ₁ = λ₂ = λ₃ = 0.1 are used; the results are summarized in Table 2. (In this example the autocorrelation function ρ_P in Equations (B.29) and (B.30) was calculated analytically. In real processes, however, autocorrelation functions are usually calculated numerically.)

Table 2 A comparison of estimates

         average s²₁,ᵢ   E(s²₁,ᵢ) by (B.30)   average s²₂,ᵢ   E(s²₂,ᵢ) by (B.29)   average s²₁,ᵢ / average s²₂,ᵢ   average R
P = 1    16.854          15.813               5.036           5.029                3.347                           3.2814
P = 3    19.554          19.476               14.344          14.294               1.363                           1.4265
P = 5    20.051          20.050               18.171          18.097               1.103                           1.1570
P = 8    20.266          20.245               19.915          19.818               1.018                           1.0680

In Table 2 it is observed that the average of s²_{2,i} is extremely close to E(s²_{2,i}); the small difference is due to numerical error. Remember that the bin width used in the histograms is 0.01, which can place a limitation on the accuracy of the average of s²_{2,i}. The average of s²_{1,i} is generally very close to E(s²_{1,i}) calculated from Equation (B.30), especially when the sampling interval is large and the autocorrelation is weak. It is also observed that there is a strong relation between the ratio of the average of s²_{1,i} to the average of s²_{2,i} and the average of R. When the sampling interval is large, both the average of s²_{1,i} and the average of s²_{2,i} are close to the true process variance, the average of R will be close to 1, and the pdf(R) will be close to the pdf(R) for an uncorrelated process.

Equations (B.29) and (B.30) can be used to select the minimum sampling period that keeps autocorrelation from triggering the steady-state identifier. For a real process, the autocorrelation function can be calculated numerically when the process is stationary, as determined by visual inspection. Then, by Equations (B.29) and (B.30), E(s²_{1,i}) and E(s²_{2,i}) can be calculated for each different sampling interval.
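As a worked illustration of that procedure, the sketch below computes ρ_P for the AR(2) example of Equation (B.32) from the Yule-Walker recursion and then evaluates Equations (B.29) and (B.30); the function name and structure are our own, and the outputs can be compared with the analytic columns of Table 2:

    def ar2_predictions(phi1, phi2, sigma_a2, lam1, intervals):
        # Autocorrelations rho_k of a stationary AR(2) via the Yule-Walker recursion.
        rho = [1.0, phi1 / (1.0 - phi2)]
        for k in range(2, max(intervals) + 1):
            rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])
        # Stationary process variance of the AR(2).
        sigma_x2 = sigma_a2 / (1.0 - phi1 * rho[1] - phi2 * rho[2])
        results = {}
        for p in intervals:
            e_s2 = (1.0 - rho[p]) * sigma_x2                                   # Equation (B.29)
            e_s1 = (1.0 - rho[p]) / (1.0 - (1.0 - lam1) * rho[p]) * sigma_x2   # Equation (B.30)
            results[p] = (e_s1, e_s2, e_s1 / e_s2)
        return results

    # Equation (B.32): phi1 = 0.85, phi2 = -0.15, sigma_a = 3, lambda1 = 0.1.
    for p, (e1, e2, ratio) in ar2_predictions(0.85, -0.15, 9.0, 0.1, [1, 3, 5, 8]).items():
        print(p, round(e1, 3), round(e2, 3), round(ratio, 3))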
