
Fault Detection and Diagnosis

Janos Gertler
George Mason University
Fairfax, Virginia, USA

Abstract

The fundamental concepts and methods of fault detection and diagnosis are reviewed. Faults are defined and classified as additive or multiplicative. The model-free approach of alarm systems is described and critiqued. Residual generation, using the mathematical model of the plant, is introduced. The propagation of additive and multiplicative faults to the residuals is discussed, followed by a review of the effect of disturbances, noise and model errors. Enhanced residuals (structured and directional) are introduced. The main residual generation techniques are briefly described, including direct consistency relations, parity space and diagnostic observers. Principal component analysis and its application to fault detection and diagnosis are outlined. The article closes with some thoughts about future directions.

Keywords: fault detection; fault diagnosis; residual generation; consistency relations; parity space; diagnostic observers; principal component analysis.

Introduction

Faults are malfunctions of various elements of technical systems. Extreme cases of faults, called failures, are catastrophic breakdowns of those elements. The technical systems (the “plant”) we are concerned with range from complex production systems (chemical plants, oil refineries, power stations) through major transportation equipment (airplanes, ships) to consumer machines (automobiles, home-heating systems, etc.). The faults may affect various parts of the main technical system (motors, pumps, storage tanks, pipelines) or the devices interfacing the main technical system with the computers that provide control, monitoring and operator information. The latter include sensors (measuring devices) and actuators (devices acting on the process, such as valves).

The objective of fault detection is to determine and signal whether there is a fault anywhere in the system. Fault diagnosis is aimed at providing more specific information about the fault: fault isolation is to pinpoint the component(s) (sensors, actuators, or plant components) where the fault is located, while fault identification is to determine (estimate) the size of the fault and, in some cases, the time of its arrival. With the ubiquitous presence of the computer, fault detection and diagnosis (FDD) is, in general, a function of the computer interfaced to the plant.

The simplest approaches to FDD consist of comparing individual plant measurements to pre-set limits, without utilizing any knowledge of the plant model (limit checking or “alarm systems”). More sophisticated techniques rely on an explicit mathematical model of the plant. They compare plant measurements to estimates obtained, from other measurements, by the model; any discrepancy may be an indication of faults. Another class of techniques (generally but incorrectly called “data driven”), most notably Principal Component Analysis (PCA), includes the estimation of an implicit model from empirical plant data, and then uses this model in ways similar to the model-based methods. These approaches are described in more detail in the sections that follow.

Alarm Systems

Alarm systems rely on the comparison of individual plant measurements to their respective limits. The limits may be two- or one-sided (upper and lower limit or upper limit only), and may have one or two levels (preliminary and full alarm). Momentary comparisons may be extended to include trend checks. Alarm systems are relatively simple but suffer from two major shortcomings:

- They have very limited fault specificity. A variable exceeding its limit is not a fault but a symptom of faults. A single component fault may cause alarms on many variables, and a particular alarm may be due to various component faults.
- They have limited fault sensitivity. What is “normal” for a plant output variable depends on the value of the plant inputs. Such a relationship, however, cannot be considered without a plant model; therefore the alarm thresholds need to be set conservatively high.

Because of their simplicity, and in spite of the above shortcomings, alarm systems are widely used in industrial applications.
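The limit-checking logic of an alarm system can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the AlarmLimits structure, the check_limits function and all numerical limits are hypothetical and are not taken from any particular alarm system. Trend checks are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlarmLimits:
    """Hypothetical limit set for one measurement; None disables that limit."""
    high_alarm: float                      # full upper limit (always present)
    high_warning: Optional[float] = None   # preliminary upper limit
    low_warning: Optional[float] = None    # preliminary lower limit (two-sided case)
    low_alarm: Optional[float] = None      # full lower limit (two-sided case)


def check_limits(value: float, limits: AlarmLimits) -> str:
    """Classify one momentary reading as 'alarm', 'warning' or 'normal'."""
    if value > limits.high_alarm:
        return "alarm"
    if limits.low_alarm is not None and value < limits.low_alarm:
        return "alarm"
    if limits.high_warning is not None and value > limits.high_warning:
        return "warning"
    if limits.low_warning is not None and value < limits.low_warning:
        return "warning"
    return "normal"


# Made-up two-sided, two-level limits for a temperature measurement.
temperature_limits = AlarmLimits(high_alarm=90.0, high_warning=80.0,
                                 low_warning=20.0, low_alarm=10.0)
readings = [55.0, 82.5, 93.1, 15.0, 5.0]
statuses = [check_limits(r, temperature_limits) for r in readings]
```

The function only reports a symptom on a single variable. It has no way of telling which component caused the excursion, and its limits are fixed regardless of the plant inputs, which are exactly the two shortcomings listed above.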
Model-Based FDD Concepts

Model-based methods utilize an explicit mathematical model of the plant. Such a model is usually obtained from empirical plant data by system identification methods or, exceptionally, from a “first principles” understanding of the plant. Model building, though critical to the success of model-based FDD, is usually not considered part of the FDD effort. The models may be linear or nonlinear, static or dynamic, and continuous- or discrete-time. In FDD, linear discrete-time dynamic models are used most frequently.

The fundamental idea of model-based FDD is the comparison of measured plant outputs to their estimates, obtained, via the mathematical model, from measured or actuated plant inputs (Figure 1). Any discrepancy is (at least ideally) an indication that one or more faults are present in the system. Mathematically, the difference between the measured output yi(t) and its estimate yi^(t) is a (primary) residual (Willsky, 1976):

ei(t) = yi(t) - yi^(t)

In general, residuals are quantities that are zero in the absence of faults and nonzero in their presence.

Unfortunately, it is not only the faults that can make the residuals nonzero. Usually the plant is subject to disturbances (unmeasured deterministic inputs) and noise (unmeasured random inputs) (Figure 1). In addition, and most importantly, model-based FDD is subject to model errors (due either to initial inaccuracies in model building or to changes in the physical plant). The FDD algorithm should be designed, as much as possible, to be insensitive to noise and “robust” in the face of disturbances and model errors.

Additive and multiplicative faults. Depending on the way they appear in the system equations, faults may be additive or multiplicative. Additive faults are sensor and actuator biases, leaks in the plant, etc. Multiplicative faults are changes in the plant parameters. In the following input-output relationship, u(t) is the vector of observed (measured or commanded) plant inputs, y(t) is the vector of measured plant outputs and p(t) is the vector of additive faults; t is the discrete time. M(q, θ) and S(q, θ) are transfer function matrices in the shift operator q, and θ is the vector of plant parameters. Then:

y(t) = M(q, θ) u(t) + S(q, θ) p(t)

The (“primary”) residual vector e(t), in response to additive faults, is

e(t) = y(t) - M(q, θ) u(t) = S(q, θ) p(t)

Here the central part of the equation is the computational form of the residual, while the right-hand side is its fault-effect form. If there are multiplicative faults, then θ = θ0 + Δθ, where θ0 is the nominal parameter vector and Δθ is its change (the parametric fault); now the residual vector e(t) is, to first order in Δθ (Gertler, 1998),

e(t) = y(t) - M(q, θ0) u(t) ≈ Σj (∂M(q, θ)/∂θj) u(t) Δθj

where the sum runs over the elements θj of the parameter vector.

Enhanced residuals. To facilitate the isolation of faults, the primary residuals e(t) are subjected to some enhancement manipulation. The three widely used enhancement techniques are:

- Structured residuals, where each residual is selectively sensitive to a subset of faults, so that a particular fault produces a fault-specific set of zero/nonzero residuals (a fault code).
- Directional residuals, where the residual vector maintains a fault-specific direction in response to each particular fault.
- Diagonal residuals, where each residual responds only to a particular fault.

Residual generators take the input and output observations from the plant and generate enhanced residuals by one of the above schemes, utilizing the mathematical model of the plant (Figure 2).

Dealing with noise. Noise is practically unavoidable in physical systems. In FDD, two basic steps may be taken to reduce the effect of noise:

- Residual filtering. This can be achieved by basing decisions on moving averages of the residuals, by applying explicit low-pass filters to the residuals, or by designing the residual generators in such a way that they have built-in low-pass behavior.
- Statistical testing of the residuals. Structured residuals are tested individually; each scalar residual is then represented by a Boolean 1 or 0, depending on the outcome of the test. Directional residuals are tested as vectors against multivariable distributions. The test thresholds are determined either theoretically, using assumptions about the source noise, or empirically, based on measurements from fault-free operating conditions.
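As a minimal sketch of the primary residual and of the two noise-handling steps, consider a scalar first-order plant y(t) = a y(t-1) + b u(t-1) whose sensor develops an additive bias. Everything specific here is an assumption made for illustration: the model and its coefficients, the Gaussian white sensor noise with known standard deviation, the window length, and the 6-sigma threshold rule. The function names are hypothetical.

```python
import random
from collections import deque


def simulate_measurements(u, a, b, bias_start, bias, noise_std, rng):
    """Simulated plant output with sensor noise and an additive sensor bias fault."""
    y_true, measured = 0.0, []
    for t in range(len(u)):
        if t > 0:
            y_true = a * y_true + b * u[t - 1]
        fault = bias if t >= bias_start else 0.0      # additive fault p(t)
        measured.append(y_true + fault + rng.gauss(0.0, noise_std))
    return measured


def primary_residuals(u, y, a, b):
    """e(t) = y(t) - y^(t), where y^(t) is computed from the inputs via the model."""
    y_hat, residuals = 0.0, []
    for t in range(len(u)):
        if t > 0:
            y_hat = a * y_hat + b * u[t - 1]
        residuals.append(y[t] - y_hat)
    return residuals


def detect(residuals, window, threshold):
    """Moving-average filtering followed by a threshold test (Boolean decisions)."""
    recent, decisions = deque(maxlen=window), []
    for e in residuals:
        recent.append(e)
        full = len(recent) == window                  # no decision during warm-up
        decisions.append(full and abs(sum(recent) / window) > threshold)
    return decisions


a, b = 0.9, 0.1                  # assumed model coefficients
noise_std, window = 0.05, 5      # assumed noise level and filter length
u = [1.0] * 60                   # constant input
rng = random.Random(0)           # fixed seed for repeatability
y = simulate_measurements(u, a, b, bias_start=30, bias=0.5,
                          noise_std=noise_std, rng=rng)
e = primary_residuals(u, y, a, b)

# Theoretical threshold: an average of `window` independent noise samples has
# standard deviation noise_std / sqrt(window); 6 of those is a conservative limit.
threshold = 6.0 * noise_std / window ** 0.5
alarms = detect(e, window, threshold)
```

Because the simulated model is exact, the residual here equals the bias plus noise. The decision stays at 0 while the averaging window contains only fault-free samples and is 1 once the window is filled with biased samples; with a real plant, model errors and disturbances would also appear in e(t).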
Dealing with disturbances. Additive disturbances are unmeasurable inputs. If the disturbance-to-output transfer function (or an equivalent state-space representation) is known, then it is possible to design residuals that are completely decoupled from (insensitive to) those disturbances. However, the FDD algorithm has only a certain degree of “design freedom”, defined by the number of outputs in the physical system; disturbance decoupling competes for this freedom with fault isolation enhancement. If there are too many disturbances, or if their path to the outputs is unknown, then only approximate decoupling is possible, making FDD also approximate; such designs usually optimize some (H-infinity) performance index.

Dealing with model errors. Model errors are also unavoidable in most practical situations. This is the most serious obstacle in the application of model-based FDD techniques. In some very special cases, uncertainty of a particular plant parameter may be handled as a “multiplicative disturbance,” and residuals may be designed to be explicitly decoupled from it. In general, however, only approximate solutions are possible, reducing the residuals’ sensitivity to modeling errors at the expense of also reducing their sensitivity to faults. Design methods utilizing optimization techniques, mostly based on H-infinity or similar performance indices, are available in the literature (Edelmayer, Bokor and Keviczky, 1994).

Residual Generation Methods

For linear dynamic systems, provided an exact (non-approximate) solution is possible, there are three major techniques to design residual generators: (i) direct consistency (parity) relations; (ii) parity space; (iii) diagnostic observers. We will briefly introduce the three methods, for discrete-time plant models and additive faults. Note that though they look formally different, if designed for the same plant under the same design conditions, the three methods yield identical residuals (Gertler, 1991).

Direct consistency (parity) relations (Gertler, 1998). The input-output model of the plant is utilized directly in the design. The enhanced residuals are obtained from the primary residuals by a transformation W(q):

r(t) = W(q) e(t) = W(q) [y(t) - M(q) u(t)] = W(q) S(q) p(t)

The desired behavior of the residuals is specified as r(t) = Z(q) p(t), where the specification Z(q) contains the basic residual properties (structure or directions) plus the residual dynamics. The resulting design condition is W(q) S(q) = Z(q). If the S(q) matrix is square, which is usually the case (Gertler, 1998), then this can be solved for W(q) by direct inversion. The residual generator has to be causal and stable; this can always be achieved by the appropriate modification of the dynamics in Z(q).

Parity space (Chow and Willsky, 1984). This method, also known as the “Chow-Willsky scheme”, relies on the state-space description of the system:

x(t + 1) = A x(t) + B u(t) + E p(t)
y(t) = C x(t) + D u(t) + F p(t)

Stacking n consecutive output vectors y(t) (where n is the order of the model), and chain-substituting the state x(t), yields the equation

Y(t) = J x(t - n) + K U(t) + L P(t)

where Y(t), U(t) and P(t) are stacked vectors and J, K and L are hyper-matrices composed of the A, B, C, D, E, F matrices. Now

E*(t) = Y(t) - K U(t) = L P(t) + J x(t - n)

would be a stacked vector of primary residuals, were it not for the presence of the inaccessible initial state x(t - n). To obtain true residuals, a transformation ri(t) = wi E*(t) is necessary, with wi chosen so that wi J = 0. Any vector wi satisfying this orthogonality condition is a parity vector; together, the parity vectors span the parity space. Any parity vector yields a true residual ri(t); the vectors can be chosen so that a set of residuals possesses structured behavior.

Diagnostic observers. Various observer schemes have been extensively investigated as possible residual generator algorithms. The basic full-order Luenberger observer (assuming D = 0) is

x^(t + 1) = A x^(t) + B u(t) + K e(t)

where K is the observer gain matrix and

e(t) = y(t) - C x^(t)

is the innovation vector. If the observer is stable then, apart from the start-up transient of the observer, the innovation qualifies as the primary residual. The gain matrix K is the major design parameter; it is chosen to place the observer poles, thus achieving stability and desired dynamic behavior (e.g. noise suppression). The
remaining design freedom can be utilized to influence residual properties. The latter are further affected by the transformation r(t) = H e(t), where the H matrix is an additional design parameter. Diagnostic observers can be designed for both structured and directional residuals (Chen and Patton, 1999; White and Speyer, 1987). Other observer schemes, most notably the unknown input observer, have also been proposed (Frank and Wunnenberg, 1989). The detailed design procedures of diagnostic observers go beyond the scope of this article.

Principal Component Analysis

Principal Component Analysis is extensively used in the monitoring of complex plants with hundreds of variables because, by revealing linear relations among the variables, it significantly reduces the dimensionality of the plant model (Kresta, MacGregor and Marlin, 1991). The application of PCA for FDD involves two phases. In the training phase, an implicit plant model is created from empirical plant data. In the monitoring phase, this model is used for FDD.

Training data (measured inputs and outputs) are collected from the plant during fault-free operation. The covariance matrix of the data is formed and its eigenstructure obtained. Due to linear relations among the data, some of the eigenvalues will be zero (or near zero, in the presence of noise). The eigenvectors belonging to the nonzero eigenvalues span the data space, where the fault-free data lie, while those belonging to the zero eigenvalues span the residual space.

It is the residual space that is utilized for FDD. The projection of a measurement vector onto the residual space is the (primary) residual. A statistical test on its size leads to a detection decision (absence or presence of faults). A threshold test is necessary because noise also causes nonzero residuals. An analysis of the eigenvectors spanning the residual space shows how the various faults propagate to the primary residual. This allows for the design of residual manipulations yielding structured or directional residuals, just as in the FDD methods based on exact models (Gertler, Li, Huang and McAvoy, 1999).

The procedure as described above applies to sensor and actuator faults; inclusion of plant faults requires extra effort (and experiments). Also, PCA is primarily meant for static models. Its extension to discrete-time dynamic models is straightforward, but it increases the size of the model in proportion to the dynamic order of the model.

Summary and Future Directions

Fault Detection and Diagnosis is today a mature field of systems and control engineering. There is a very significant level of activity, as measured in published papers and conference contributions, but much of this (in the opinion of this author) merely refines earlier results. This applies particularly to the long-running quest to create “robust” FDD algorithms, especially in the face of model errors.

There are still open challenges in a couple of areas, most notably extensions to various nonlinear or parameter-varying problems. Another open and active area, of great practical importance, is FDD in networked control systems. What is really of the greatest interest, though, is the application of the wealth of available theoretical results and design methods to real-life problems; there has recently been some visible progress here, a most welcome development.

References

Chen, J. and Patton, R.J. (1999). Robust Model-Based Fault Diagnosis for Dynamic Systems. Kluwer, Boston, Dordrecht, Amsterdam.

Chow, E.J. and Willsky, A.S. (1984). Analytical redundancy and the design of robust failure detection systems. IEEE Transactions on Automatic Control, AC-29, 603-614.

Edelmayer, A., Bokor, J. and Keviczky, L. (1994). An H-infinity filtering approach to robust detection of failures in dynamic systems. 33rd IEEE Conf. on Decision and Control, Lake Buena Vista, FL.

Frank, P.M. and Wunnenberg, J. (1989). Robust fault diagnosis using unknown input observer schemes. In: Fault Diagnosis in Dynamic Systems (Ed.: R. Patton, P. Frank and R. Clark), Prentice Hall, Upper Saddle River, NJ.

Gertler, J. (1991). A survey of analytical redundancy methods in fault detection and isolation. Plenary paper, IFAC Safeprocess Symposium, Baden-Baden, Germany.

Gertler, J. (1998). Fault Detection and Diagnosis in Engineering Systems. Marcel Dekker, New York.

Gertler, J., Li, W., Huang, Y. and McAvoy, T. (1999). Isolation enhanced principal component analysis. AIChE Journal, 45, 323-334.
Isermann, R. (1984). Process fault detection based on modeling and estimation methods. Automatica, 20, 387-404.

Kresta, J.V., MacGregor, J.F. and Marlin, T.E. (1991). Multivariate statistical monitoring of processes. Canadian J. of Chemical Engineering, 69, 35-

White, J.E. and Speyer, J.L. (1987). Detection filter design: Spectral theory and algorithm. IEEE Transactions on Automatic Control, AC-32, 593-603.

Willsky, A.S. (1976). A survey of design methods for failure detection in dynamic systems. Automatica, 12, 601-611.

Figure 1. (Block diagram, not reproduced. Labels: inputs, outputs, PLANT, MODEL, a +/- comparison producing the primary residuals; faults, noise and disturbances.)

Figure 2. (Block diagram, not reproduced. Labels: inputs, outputs, PLANT, MODEL, RESIDUAL GENERATOR, residuals; faults, noise and disturbances.)
