
M. A. Sartori and P. J. Antsaklis, "Implementations of Learning Control Systems Using Neural Networks," IEEE Control Systems Magazine, Special Issue on Neural Networks in Control Systems, vol. 12, no. 2, pp. 49-57, April 1992.

Implementations of Learning Control Systems Using Neural Networks

Michael A. Sartori and Panos J. Antsaklis

An earlier version of this paper was presented at the 1991 IEEE International Conference on Intelligent Control Systems, Arlington, VA, August 13-15, 1991. Michael A. Sartori was with the Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556. He is now with the David Taylor Research Center, Code 1941, Bethesda, MD 20084. Panos J. Antsaklis is with the Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556. This work was partly supported by the Jet Propulsion Laboratory under Contract 957856 and the ALCOA Foundation under a Science Support Grant.

The systematic storage in neural networks of prior information to be used in the design of various control subsystems is investigated. Assuming that the prior information is available in a certain form (namely, input/output data points and specifications between the data points), a particular neural network and a corresponding parameter design method are introduced. The proposed neural network addresses the issue of effectively using prior information in the areas of dynamical system (plant and controller) modeling, fault detection and identification, information extraction, and control law scheduling.

Incorporating Prior Knowledge

In many practical control problems, there exists substantial prior information about the various subsystems of the control system. In modeling these subsystems via neural networks, it is desirable to incorporate this prior knowledge into a neural network. A particular neural network architecture and an associated design methodology are presented in this paper to accomplish this for certain types of prior knowledge. Using this procedure, prior knowledge is directly and systematically stored in the neural network with no training. This has the advantage of having the information in a neural network form, which can be quickly accessed once implemented in hardware. This neural network can be used as an initial approximation to the system's behavior. As discussed later, a better approximation can be developed, for example, by adding another neural network in parallel and using training procedures to better approximate the desired behavior.

With the proposed neural network architecture, the following problems are investigated: dynamical system (plant and controller) modeling, fault detection and identification, information extraction, and control law scheduling. In all of these implementations, it is assumed that a set of training points and certain specifications for the behavior between the training points are provided. With this training set, the neural network is designed to exactly satisfy the specifications for the interpolation between the operating points. The proposed neural network has two layers of weights with sigmoid nonlinearities for the hidden layer and linear functions for the output layer. The connection of the hidden layer to the output layer and the specific choices for the weights are unique aspects of this neural network design approach.

The neural network design procedure presented in this paper represents one possible approach for satisfying the relationship between the input/output pairs of the training set. Clearly, there exist other methods to (mathematically) describe the desired curve and solve the problem; for instance, polynomial or spline approximations can be used to represent the desired function. An advantage of using the neural network approach described here instead of one of these schemes is that the actual construction of the neural network in hardware would utilize the inherent parallelism of the neural network and hence result in a fast processing time. Compared to other neural network techniques, the design scheme here has several advantages. For one, there is no training required; only the design time needed to choose the appropriate function parameters (i.e., the specifications for the interpolation between the training pairs) is necessary. Also, exact control of the generalization between the training points is guaranteed via the design scheme. Furthermore, the number of layers and the number of neurons needed to correctly implement the desired function are known precisely.

In the next section, the neural network and the design of the neural network to approximate a desired function are presented. The neural network chosen here uses sigmoid nonlinearities, while an alternative approach using Gaussian nonlinearities was introduced in [1], [2] to address the same problem. Neural networks for plant and controller modeling, fault detection and identification, information extraction, and control law scheduling are then discussed, and three examples are supplied to illustrate the design scheme.

Neural Network Design

The neural network design procedure is presented first for the implementation of a single-input-multi-output function and second for the implementation of a multi-input-multi-output function.

Single-Input Neural Network

In all of the neural network implementations, the neural network is designed to approximate a function given a specific set of training patterns and a set of generalization specifications. It is first assumed that the function to be approximated is a single-input function. For the training patterns, let v(j) denote the jth input pattern for 1 ≤ j ≤ p, and let the p-dimensional vector v denote the vector of all input patterns arranged in ascending order, where v(j) > v(j-1) for 2 ≤ j ≤ p. Let d_i(j) denote the ith component of the jth desired output pattern for 1 ≤ i ≤ n and 1 ≤ j ≤ p, and let the n-dimensional vector d(j) for 1 ≤ j ≤ p and the p-by-n-dimensional matrix D denote the vector and the matrix of desired outputs, respectively. Thus, the p pairs (v(j), d(j)) are the given training patterns.

With u as a scalar, let φ(u) be the n-dimensional output of a function approximating the relationship described by the training set {v(j), d(j)} for 1 ≤ j ≤ p. Let φ_i(u) denote the ith component of the function's output for 1 ≤ i ≤ n. Three specifications are made on the function φ:

(i) If u = v(j), then φ(u) = d(j).

(ii) If u ∈ [v(j) - ε_j-, v(j) + ε_j+] and u ≠ v(j), then φ_i(u) ∈ [d_i(j) - γ_ij-, d_i(j) + γ_ij+].


(iii) If u ∈ [v(j), v(j+1)], then φ_i(u) ∈ [d_i(j), d_i(j+1)].

As an example, Fig. 1 illustrates one way in which these specifications can be viewed. The dots correspond to the training points, and according to specification (i), the approximating curve must pass through these points. The boxes surrounding the dots correspond to the boxes described in specification (ii), and the boxes between the dots correspond to the bounding of the interpolation curve per specification (iii). The approximating curve must pass correctly through both sets of boxes. Clearly, there exist numerous curves that satisfy these three specifications. With the neural network design scheme described here, a subset of the class of curves described by φ is achieved in which these three specifications are exactly satisfied.

Fig. 1. Illustration of curve specifications.

For the neural network, let the scalar u denote the neural network's input, and let z denote the neural network's n-dimensional output. For a specific input u, let z_i(u) for 1 ≤ i ≤ n and z(u) denote the output of the neural network. An individual output of the neural network is described by the weighted sum

    z_i(u) = Σ_{k=1}^{h+1} w_ik g_k(u)    (1)

where 1 ≤ i ≤ n, w_ik is a weight of the ith linear neuron in the output layer, and g_k is the difference between two sigmoid functions:

    g_k(u) = σ(u, c_{k-1}, s_{k-1}) − σ(u, c_k, s_k).    (2)

Let W denote the (h+1)-by-n-dimensional matrix of weights for the output layer. The output of the kth node in the neural network's hidden layer is described by (2), where 1 ≤ k ≤ h+1, σ is the sigmoid nonlinearity of the hidden layer neurons, c_k is the bias (or "center" of the sigmoid function), and s_k is the weight (or "slope" of the sigmoid function):

    σ(u, c, s) = 1 / (1 + e^(−s(u−c))).    (3)

For all u, let

    σ(u, c_0, s_0) = 1    (4)

and

    σ(u, c_{h+1}, s_{h+1}) = 0.    (5)

Let c denote the vector of centers, and let s denote the vector of slopes. With s_k = 1, the conventional sigmoid curve is achieved, but by allowing s_k to vary, a class of sigmoid functions is possible, which allows for greater flexibility in the design of the interpolation curve. As an example of the formation of a Gaussian-type curve, Fig. 2 shows the output of g_k(u) = σ(u, −2, 5) − σ(u, 1, 20), which can be viewed as a Gaussian-type curve centered at 0 with asymmetric sides. This asymmetry has advantages over the symmetrical Gaussian function used in the other neural network approach described in [1] and [2]; mainly, the sigmoid-formed function aptly handles nonequidistant training sets.

Fig. 2. The Gaussian-type function g_k(u) = σ(u, −2, 5) − σ(u, 1, 20).

Design for the Single-Input Neural Network

In designing the neural network, its parameters are chosen such that its output z is in the class of functions φ described previously. The proposed neural network design scheme is a selection process based on the specifications for the interpolation between the operating points and does not require training. With this approach, exact control of the neural network's generalization behavior is achieved, and the three specifications described above are satisfied precisely, as will be shown.

The number of hidden layer nodes is set equal to one less the number of training patterns, that is, h = p − 1. The center c_k for the kth node is set equal to the value midway between the kth and the (k+1)th input training points for 1 ≤ k ≤ h:

    c_k = (1/2)[v(k) + v(k+1)].    (6)

Since c_{k+1} ≥ c_k, g_k here approximates a Gaussian-type nonlinearity centered around v(k), and this allows for a localized effect at the output of the nodes g_k with the appropriate choice for each slope s_k.

To aid in satisfying specifications (ii) and (iii) for the output of the neural network, requirements relating the neural network's output to its parameters are made. First, the widths s are chosen such that (1) can be approximated by the following weighted sum when u ∈ [v(j), v(j+1)]:

    z_i(u) ≈ Σ_{k=j}^{j+1} w_ik g_k(u).    (7)

This implies that the tails of the outputs g_k(u) for k ≠ j and k ≠ j+1 are small compared to those for k = j and k = j+1 when u ∈ [v(j), v(j+1)].
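As a quick numerical check of this construction (a minimal sketch of our own, not code from the paper), the following Python fragment evaluates the sigmoid of (3) and the Gaussian-type difference of (2) for the Fig. 2 parameters:

    import numpy as np

    def sigmoid(u, c, s):
        # sigma(u, c, s) = 1 / (1 + exp(-s(u - c))), per (3); clipped for large slopes
        return 1.0 / (1.0 + np.exp(-np.clip(s * (u - c), -60.0, 60.0)))

    def bump(u, c_left, s_left, c_right, s_right):
        # g_k(u): difference of two sigmoids, per (2)
        return sigmoid(u, c_left, s_left) - sigmoid(u, c_right, s_right)

    u = np.linspace(-3.0, 2.0, 501)
    g = bump(u, -2.0, 5.0, 1.0, 20.0)   # the asymmetric Gaussian-type curve of Fig. 2
    print(round(g.max(), 3))            # ~1.0 on the plateau between -2 and 1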


The approximation of (7) also implies that when the node g_k has a larger response compared to the other nodes, the input is closest to the kth training pattern. In other words, for 1 ≤ k ≤ h+1 and for u ∈ [v(k) − Δ_{k−1}, v(k) + Δ_k], where Δ_k is defined such that g_k(v(k) + Δ_k) = g_{k+1}(v(k) + Δ_k),

    g_k(u) ≥ g_j(u) for all j ≠ k.    (8)

Via (8), the localized effect of the hidden layer neurons is shown. The output of the neural network passing through a box specified by specification (ii) assumes the general shape of the output of the kth hidden layer node if ε_k− ≤ Δ_{k−1} and ε_k+ ≤ Δ_k. To further aid in satisfying specifications (ii) and (iii), the slopes s are chosen such that for all u

    Σ_{k=1}^{h+1} g_k(u) = 1.    (9)

In conjunction with (7) and (8), (9) implies that for u ∈ [v(k), v(k+1)] the sum of the outputs of g_k(u) and g_{k+1}(u) is constant, and the node g_k contributes more to the sum when u is close to v(k) and less when it is closer to v(k+1). Thus, the choice of the slope s_k clearly affects the shape of the interpolation curve and the localization properties of the Gaussian-type nodes g_k.

With the centers and the slopes specified for the hidden layer, the weights for the linear output layer are chosen next. Let G denote the p-by-(h+1)-dimensional matrix of the outputs of the hidden layer nodes for each of the p operating points. Thus, the output layer weights are found by solving the following linear system of equations:

    GW = D.    (10)

Since G is square (h = p − 1) and nonsingular because of the particular choices for c and s, (10) has a unique solution W. Furthermore, due to the choice of s per (7) and (8), if the neural network's input u is exactly the kth operating point, the output of the kth hidden layer node g_k(u) is 1 while the outputs of the other hidden layer Gaussian-type nodes are 0. Hence, G can be assumed to be the identity matrix, and W can be approximated by D.

Using (7), (8), and (10) and the simplifying assumption that n = 1, it was shown in [1] that to satisfy specification (ii) for all operating points, the slope s_k needs to be chosen such that bounds (11) and (12), which involve the ratio (w_{k+1} − w_k)/(d(k) − γ_k− − w_k), are met for 1 ≤ k ≤ h. If d(k) = d(k+1), then specification (iii) requires z_i(u) = d(k) = d(k+1) for u ∈ [v(k), v(k+1)], and the solving of (11) and (12) for u = v(k+1) − ε_{(k+1)−} and u = v(k) + ε_{k+} is unnecessary.

Thus, the neural network parameters can be chosen to exactly satisfy the three specifications for the interpolation between the training points. With h = p − 1, the centers c are chosen per (6), and the weights W can be approximated with D. Using (11) and (12), limits for the widths s are found. Various selections for s can be made based on (11) and (12), and the corresponding W can be recomputed via (10) until a desirable curve is achieved. The resulting neural network describes a function which is within the class of functions described by φ(u) and exactly satisfies the specifications for the interpolation between the training points.

Due to the way in which the neural network parameters are chosen, the behavior of the neural network is expressed in (1)-(3). However, the neural network can be reconfigured in a more economical fashion, and (1) and (2) are equivalent to the following, where 1 ≤ i ≤ n and σ(u, c, s) is defined in (3):

    z_i(u) = w_i1 + Σ_{k=1}^{h} (w_{i(k+1)} − w_ik) σ(u, c_k, s_k).    (13)

Due to the choice of σ(u, c_0, s_0) in (4) and σ(u, c_{h+1}, s_{h+1}) in (5), when the neural network's input is outside the design domain, the neural network's output is equal to either the first or last desired output from the training set. In other words, if u < v(1), then z(u) = d(1), and if u > v(p), then z(u) = d(p). If instead it is desired that z(u) = 0 when u << v(1) or u >> v(p), the choice of σ(u, c_0, s_0) and σ(u, c_{h+1}, s_{h+1}) can be modified to accomplish this.
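To make the single-input procedure concrete, here is a small Python sketch (our own illustration with hypothetical function names; it follows (2)-(6) and (10) but is not the authors' code). Large slopes make G nearly the identity, as noted above.

    import numpy as np

    def sigmoid(u, c, s):
        return 1.0 / (1.0 + np.exp(-np.clip(s * (u - c), -60.0, 60.0)))

    def hidden_outputs(u, centers, slopes):
        # g_k(u) = sigma(u, c_{k-1}, s_{k-1}) - sigma(u, c_k, s_k), with the
        # boundary conventions (4) and (5) supplying the constant 1 and 0 ends
        u = np.atleast_1d(np.asarray(u, dtype=float))
        sig = [np.ones_like(u)]                                    # (4)
        sig += [sigmoid(u, c, s) for c, s in zip(centers, slopes)]
        sig.append(np.zeros_like(u))                               # (5)
        sig = np.stack(sig)
        return (sig[:-1] - sig[1:]).T     # q-by-(h+1) for q inputs

    def design_single_input(v, D, slopes):
        # v: p ascending input patterns; D: p-by-n desired outputs; slopes: h = p-1 values
        centers = 0.5 * (v[:-1] + v[1:])          # c_k midway between patterns, per (6)
        G = hidden_outputs(v, centers, slopes)    # square since h = p - 1
        W = np.linalg.solve(G, D)                 # GW = D, per (10)
        return centers, W

    def network(u, centers, slopes, W):
        return hidden_outputs(u, centers, slopes) @ W    # z(u), per (1)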
Multi-Input Neural Network

Given a specific set of training patterns and the three generalization specifications, the design of a neural network to approximate a multi-input-multi-output function is presented in this section. For the training patterns, let the m-dimensional vector v(j) denote the jth input pattern for 1 ≤ j ≤ p, and let the scalar v_r(j) be the rth component of v(j) for 1 ≤ r ≤ m. (Note that it is not required to arrange these in an order as with the single-input case.) Let the n-dimensional vector d(j) denote the jth output pattern for 1 ≤ j ≤ p, and let the scalar d_i(j) denote the ith component of the jth desired output pattern for 1 ≤ i ≤ n and 1 ≤ j ≤ p. Let the p-by-n-dimensional matrix D denote the matrix of desired outputs. Thus, the p pairs {v(j), d(j)} are the given training patterns.

For the m-dimensional vector u, let φ(u) be the n-dimensional output of a function approximating the relationship described by the training set {v(j), d(j)} for 1 ≤ j ≤ p. Let φ_i(u) denote the ith component of the function's output for 1 ≤ i ≤ n. Three specifications are made on the function φ:

(i) If u = v(j), then φ(u) = d(j).

(ii) If u_r ∈ [v_r(j) − ε_rj−, v_r(j) + ε_rj+] and u_r ≠ v_r(j), then φ_i(u) ∈ [d_i(j) − γ_ij−, d_i(j) + γ_ij+].

(iii) If u is "between" any two immediate training patterns, v(j) and v(k), then φ_i(u) is "between" the corresponding output patterns d_i(j) and d_i(k).

The meanings of all the specifications are the same as for the single-input case. However, descriptive language is used to state specification (iii) for the multi-input case since the indexing of the higher dimension patterns becomes complex and may not be needed, as is explained below.

For the neural network, let the m-dimensional vector u denote the neural network's input, and let z denote the neural network's n-dimensional output. For a specific input u, let z_i(u) for 1 ≤ i ≤ n and z(u) denote the output of the neural network. For the multi-input neural network, an individual output of the neural network is given by the following, where 1 ≤ i ≤ n:

    z_i(u) = Σ_{k_1=1}^{h_1+1} ... Σ_{k_m=1}^{h_m+1} w_{i k_1...k_m} Π_{r=1}^{m} g_{r k_r}(u_r).    (14)


For m = 1, (14) reduces to (1). The output of the (r, k_r)th hidden layer node is described by the following, where 1 ≤ k_r ≤ h_r + 1, 1 ≤ r ≤ m, and where σ(u, c, s) is defined in (3):

    g_{r k_r}(u_r) = σ(u_r, c_{r(k_r−1)}, s_{r(k_r−1)}) − σ(u_r, c_{r k_r}, s_{r k_r}).    (15)

For 1 ≤ r ≤ m and for all u_r, let

    σ(u_r, c_{r0}, s_{r0}) = 1    (16)

and

    σ(u_r, c_{r(h_r+1)}, s_{r(h_r+1)}) = 0.    (17)

Let c_r denote the h_r-dimensional vector of centers, and let s_r denote the h_r-dimensional vector of slopes for 1 ≤ r ≤ m. Since the sizes of c_r and s_r are not restricted to be the same for 1 ≤ r ≤ m, the formation of a matrix of centers and a matrix of slopes may not be possible. In (14), the outputs of the Gaussian-type nodes are multiplied together, which is unusual when compared to conventional neural networks. With the multiplications, the specification of the centers and the slopes for the sigmoid neurons corresponds to the specification of hyper-rectangles around the operating points. By appropriately choosing these parameters, the entire input space can be mapped to particular desired outputs of the neural network scheduler. However, a drawback is the added complexity of the neural network due to the required multiplications.

Design for the Multi-Input Neural Network

The selection of the multi-input neural network's centers, slopes, and weights is divided into the two cases of equidistant input patterns and nonequidistant input patterns, with the case of equidistant input patterns considered first. The term "equidistant" for the m-dimensional case refers to the input patterns being an equal distance apart in each dimension. Instead of denoting the input patterns as v(j) for 1 ≤ j ≤ p, the set of all equidistant input patterns is considered to be comprised of elements from vectors containing the values of all the input patterns. Let p_r denote the number of values for the rth dimension, and let v_r denote the p_r-dimensional vector of values of the input patterns for the rth dimension such that v_r(k_r + 1) ≥ v_r(k_r) for 1 ≤ r ≤ m and 1 ≤ k_r ≤ p_r. The counting of the p input patterns is not important; rather, their location in the input space is important. The number of hidden layer nodes in the rth dimension is set equal to one less the number of input patterns, that is, h_r = p_r − 1 for 1 ≤ r ≤ m. For 1 ≤ r ≤ m and for 1 ≤ k_r ≤ h_r, the center c_{r k_r} is set equal to the value halfway between the k_r th and the (k_r + 1)th input patterns in the rth dimension:

    c_{r k_r} = (1/2)[v_r(k_r) + v_r(k_r + 1)].    (18)

Since c_{r(k_r+1)} ≥ c_{r k_r}, the node g_{r k_r} approximates a Gaussian-type nonlinearity centered around v_r(k_r), and this allows for a localized effect in the shape of a hyper-rectangle at the output of the product node Π_{r=1}^{m} g_{r k_r}(u_r) with the appropriate choice for each slope s_{r k_r}.

In choosing the slopes to satisfy the specifications for the interpolation between the training patterns, the multidimensional equivalents of (7) to (9) need to be satisfied. For (7), this implies that for any input u the output is dependent only on the closest Gaussian-type nodes. For 1 ≤ k_r ≤ h_r + 1 and for u_r ∈ [v_r(k_r) − Δ_{r(k_r−1)}, v_r(k_r) + Δ_{r k_r}], where Δ_{r k_r} is defined such that g_{r k_r}(v_r(k_r) + Δ_{r k_r}) = g_{r(k_r+1)}(v_r(k_r) + Δ_{r k_r}), the multidimensional equivalent of (8) is

    g_{r k_r}(u_r) ≥ g_{r j}(u_r) for all j ≠ k_r.    (19)

Furthermore, the interpolation curve passing through a box described by specification (ii) assumes the general shape of the output of the corresponding node if ε_{r k_r −} ≤ Δ_{r(k_r−1)} and ε_{r k_r +} ≤ Δ_{r k_r}. For (9) and for all u_r, the multidimensional equivalent is

    Σ_{k_r=1}^{h_r+1} g_{r k_r}(u_r) = 1.    (20)

With the centers and slopes chosen to satisfy (19) and (20), which aid in satisfying specification (iii), the weights w_{i k_1...k_m} can be found. Forming the p-by-[(h_1+1)...(h_m+1)] matrix G from the outputs of the Gaussian-type nodes, (10) is solved for the unique [(h_1+1)...(h_m+1)]-by-n matrix W. Furthermore, (11) and (12) can be extended to the multidimensional case with the appropriate changes to the indices. Thus, a neural network can be designed to satisfy the specifications for its generalization behavior in the multi-input case.
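A sketch of the equidistant multi-input construction in the same spirit (again our own illustrative code, reusing hidden_outputs from the single-input sketch): per-dimension Gaussian-type outputs are multiplied across dimensions per (14), and the weights come from (10).

    import numpy as np
    from itertools import product
    # assumes hidden_outputs() from the single-input sketch above

    def multi_basis(U, centers, slopes):
        # U: q-by-m inputs; centers[r], slopes[r]: parameters for dimension r.
        # Returns the q-by-[(h1+1)...(hm+1)] matrix of products of g_{r k_r}(u_r), per (14).
        per_dim = [hidden_outputs(U[:, r], centers[r], slopes[r])
                   for r in range(U.shape[1])]
        cols = [np.prod([per_dim[r][:, k] for r, k in enumerate(ks)], axis=0)
                for ks in product(*[range(g.shape[1]) for g in per_dim])]
        return np.column_stack(cols)

    def design_multi_input(values, D, slopes):
        # values[r]: ascending pattern values in dimension r; the training grid is
        # their Cartesian product, and rows of D must follow the same ordering.
        centers = [0.5 * (v[:-1] + v[1:]) for v in values]   # per-dimension (18)
        grid = np.array(list(product(*values)))              # p = p1*...*pm points
        G = multi_basis(grid, centers, slopes)               # square on the grid
        W = np.linalg.solve(G, D)                            # (10)
        return centers, W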
If the input patterns are not equidistant, two possible design choices are considered here: adding extra pseudo-input patterns such that equidistance occurs, or locating the boundaries of the hyper-rectangles for each input pattern such that the entire input space is covered. The first possibility is more viable when most of the input patterns are equidistant and only a few extra patterns are needed. Once this is accomplished, the above procedure for choosing the centers for equidistant input patterns is followed for the set of input patterns and added pseudo-input patterns. When solving (10), extra pseudo-desired outputs are added such that W is a unique solution. These added outputs can be chosen to be the same as nearby ones or extrapolated values from surrounding ones.

For the second possibility, when the input patterns are not equidistant, hyper-rectangles are found for the given input patterns such that the entire input space is covered. Since the locations of the input patterns are no longer equidistant, (14) can be replaced by the following, where 1 ≤ i ≤ n:

    z_i(u) = Σ_{j=1}^{p} w_ij Π_{r=1}^{m} g_{r j}(u_r).    (21)

For low dimensional inputs, the selection of the hyper-rectangles may be performed manually. However, for higher dimensions, this may become impractical. In [3], this problem is addressed as the "CR_m" problem (or "CR_d" using their notation). The CR_m problem is known to have a high computational complexity, and Gonzalez presents algorithms for obtaining a suboptimal solution. Once the hyper-rectangles are found (either manually or algorithmically), the centers of the sigmoid neurons are chosen as the boundaries between them. The slopes are chosen to satisfy the specifications, and the output layer weights are found via (10). Thus, through a selection of the hyper-rectangles around the input patterns, the parameters for the neural network are chosen, and the specifications for the neural network's output are satisfied.


Neural Network Implementations

In this section, the use of the neural network architecture introduced in the preceding section is described for the following four problems: dynamical system modeling, fault detection and identification, information extraction, and control law scheduling.

Dynamical System Modeling

The neural network proposed in this paper can be used to model the plant's or controller's dynamics. In [4], known linear information is used to assist in the identification of the plant, yet prior nonlinear knowledge may also be used in the plant identification process. With the proposed neural network architecture, certain types of knowledge can be implemented as a neural network. Specifically, if the nonlinear information can be expressed within the framework of the training set and the three specifications for the interpolation between the training patterns, the design scheme proposed here can be used. Constructing a neural network to describe the known nonlinear behavior of the plant and using it as part of the plant identification process, a second neural network may be trained using learning algorithms to modify the plant's estimated dynamics. Such a scheme is shown in Fig. 3 and can be viewed as modeling the known dynamics with one neural network and training another neural network to learn the unknown dynamics. This general approach is discussed in [5]. By using a neural network to model the plant's known nonlinear behavior, it is anticipated that less time will be required to train the second neural network. In particular, the tens of thousands of back-propagation iterations that are normally required to train such a network may be reduced.

Fig. 3. Training a neural network for plant modeling.
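In outline, the Fig. 3 decomposition can be sketched as follows (an illustration under our own naming, not the authors' code): the designed network supplies the known map, a second trainable approximator is fitted to the residual, and their sum models the plant.

    import numpy as np

    def train_residual_model(x, y_plant, f_known, centers, width=0.5,
                             lr=0.1, epochs=200):
        # f_known: the no-training network designed from prior knowledge (fixed).
        # The trainable part is a toy radial-basis expansion standing in for the
        # second neural network of Fig. 3 (a hypothetical choice for brevity).
        theta = np.zeros(len(centers))
        Phi = np.exp(-((x[:, None] - centers[None, :]) / width) ** 2)
        for _ in range(epochs):
            resid = y_plant - f_known(x)        # what the prior model misses
            err = resid - Phi @ theta
            theta += lr * Phi.T @ err / len(x)  # gradient step on squared error
        return theta                            # full model: f_known(x) + Phi @ theta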


Further, by modeling the known nonlinear dynamics of the plant with a neural network, a total neural network plant identification structure is formed. This has the advantage of being able to be incorporated into the training of a neural network controller, which is one of the purposes of training a neural network plant estimator as described in [5] and as used in [4]. This controller training scheme is shown in Fig. 4. Since the desired plant input is unknown, the desired output of the neural network controller is unknown. However, the desired output of the plant is known. Substituting the actual plant with a neural network plant estimator, a multi-layer neural network may be trained with the back-propagation algorithm, or another gradient descent algorithm, to minimize a plant output error cost function. Via the chain rule, the derivative of the cost function with respect to the plant estimator's input is used in the computation of the derivative of the cost function with respect to the neural network controller's weights. Next, in reference to Fig. 4, this gradient is computed assuming a SISO plant for simplicity.

The cost function is a sum of the squares of the plant output errors for the p training patterns:

    F = Σ_{j=1}^{p} [y_r(j) − y_n(j)]².    (22)

To compute the weight change in the neural network controller, ∂F/∂u is required; differentiating (22) with respect to u yields (23) and (24).

Fig. 4. Training a neural network controller.


Computing ∂y_n(i)/∂u and assuming ∂y_r(i)/∂u is determined, the gradient ∂F/∂u is found. Relaxing the assumption of a SISO plant, the extension to the multi-input case is straightforward since each g_{r k_r} in (14) is dependent on only one input and not the entire input vector. Thus, the neural network architecture and design scheme proposed in this paper can be used both to implement a plant estimator and to train a neural network controller.
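The chain-rule computation can be sketched for a toy case (our own example: scalar signals, a two-parameter controller, and a stand-in differentiable plant estimator; none of these choices come from the paper):

    import numpy as np

    def plant_estimate(u):            # stand-in for the fixed plant estimator y_n(u)
        return np.tanh(u)

    def plant_estimate_du(u):         # its derivative dy_n/du, used by the chain rule
        return 1.0 - np.tanh(u) ** 2

    def train_controller(r, y_desired, w, lr=0.05, epochs=500):
        # controller u = w[0]*r + w[1]; F = sum of squared plant output errors, as in (22)
        for _ in range(epochs):
            u = w[0] * r + w[1]
            e = y_desired - plant_estimate(u)
            dF_du = -2.0 * e * plant_estimate_du(u)                 # dF/du via the chain rule
            grad = np.array([np.mean(dF_du * r), np.mean(dF_du)])   # dF/dw
            w = w - lr * grad
        return w

    r = np.linspace(-1.0, 1.0, 50)
    w = train_controller(r, 0.5 * r, np.zeros(2))   # learn to track y_desired = 0.5 r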
Fault Detection and Identification

The described neural network architecture may be used for fault detection and identification. As described in [1], [5], the neural network in this case operates outside the main loop and receives the appropriate signals from the plant: for instance, outputs, inputs, or environmental conditions. The neural network then responds by declaring either a fault condition, a no-fault condition, or, with the neural network design proposed here, a partial membership signal. If faults are known to occur for specific patterns, this information can be stored in the neural network by choosing the training set of the neural network to coordinate with the known faults. Valid fault regions for each fault can next be determined and stored using the centers c of the sigmoid nonlinearities. By choosing the slope of the sigmoid nonlinearities small, a "gray scale", or partial membership, of the fault region can be included in the fault detection system design. By choosing the slope of the sigmoid nonlinearities large, clearly definable fault and no-fault regions develop.

A drawback of this approach is that for multi-input failure detection systems, the valid fault regions are restricted to hyper-rectangles. If irregularly shaped fault regions defined by straight lines are desired, the method initially described in [6] and further developed in [1] may be employed. If it is desired that the neural network failure detection system detect, identify, and diagnose the failure, the scheme described in [7], which is illustrated using a JPL space antenna model, is one possible approach.
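A tiny numeric illustration of this slope choice (values are ours): near a fault boundary at c = 1, a small slope reports graded membership while a large slope gives a nearly crisp fault indicator.

    import numpy as np

    def sigmoid(u, c, s):
        return 1.0 / (1.0 + np.exp(-s * (u - c)))

    u = np.array([0.8, 0.95, 1.05, 1.2])
    print(sigmoid(u, 1.0, 2.0))     # small slope: gray scale, roughly 0.40 to 0.60
    print(sigmoid(u, 1.0, 100.0))   # large slope: essentially 0/1 membership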
Information Extraction

The neural network architecture described in this paper may also implement the general control function of numeric-to-symbolic conversion. As considered here, numeric-to-symbolic conversion is the task of processing numeric data into a form that can be used by the controller as information. One such example of numeric-to-symbolic conversion is fault detection and identification as discussed in the previous subsection. Here, the numeric data consists of the detection system's inputs: the plant's inputs and outputs, their derivatives, and the environmental conditions. The symbolic data is the occurrence of a fault or no-fault condition and the identification of the fault.

Another example, as discussed extensively in [6], is the task of converting numeric data to a form usable by a discrete event controller. The discrete event controller requires symbolic information describing the state of the plant. However, the output of the plant is in a numeric form, unusable by the discrete event controller. By describing the regions of interest to the discrete event control law, the boundaries of the regions are transformed into design specifications for the centers of the neural network's sigmoid functions. By changing the slopes of the sigmoid functions, membership in a region can be made specific (i.e., a large slope) or non-specific (i.e., a small slope) depending on the type of information required by the discrete event control law.

For a digital controller, the neural network architecture proposed here can perform the function of analog-to-digital (A/D) conversion. The centers of the sigmoid functions are chosen to correspond to the quantization regions. For a clear distinction between regions, the slopes are chosen large. This feedforward neural network design can be compared to the feedback neural network design of Tank and Hopfield in [8] and to its improved design described in [9]. In comparing the feedforward design to the feedback design, no time is required to wait for the output of the feedforward neural network to settle, as is required for the feedback neural network design. The problems with determining and designing the regions of attraction for the neural network's minima in the feedback design do not occur whatsoever with the feedforward design; the "regions of attraction" are known exactly and can be easily designed according to specifications. Comparing the number of neurons required for an n-level A/D converter, the feedback design requires n neurons, while the feedforward design requires 3n neurons. Next, an A/D converter is implemented with a neural network designed using the method proposed here.

Example 1

It is desired to convert analog signals in the range of 0 to 10 to the 11 discrete levels between 0 and 10. In addition, any signal less than 0 is mapped to 0, and any signal greater than 10 is mapped to 10. The discrete regions are divided midway between the integer values. The training set becomes the 11 matched pairs of the input and output integer values. Using the single-input neural network design procedure in this paper, h = 10, and from (6),

    c = [0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5]'.

The slopes for the hidden layer neurons are chosen large: s_k = 500 for 1 ≤ k ≤ 10. Solving (10), the output layer weights are determined. The output of the resulting neural network is shown in Fig. 5. The transitions between the regions are sharp due to the high gain chosen for the sigmoid nonlinearities.

Fig. 5. Neural network output of the A/D converter of Example 1.


Control Law Scheduling

A neural network may be used to implement a control law scheduler. In this capacity, the neural network can be viewed as a high level decision maker operating outside the conventional control loop to provide a higher degree of autonomy to the system [1], [5], [10]. Given a set of operating points and the associated set of parameter values for the control law, the neural network is designed to satisfy given specifications for the interpolation between the provided operating points. The operating points become the training set, and the specifications for the interpolation between the operating points become the three specifications discussed in the design of the neural network.

Table I
Initial Disturbances and Parameter Sets

Amplitude | β1    | β2    | L       | L̄
2.0       | 0.093 | 10.05 | 49000.0 | 119703.96
2.25      | 0.213 | 7.04  | 39200.0 | 131674.35
2.5       | 0.302 | 7.35  | 44046.1 | 145910.16
3.0       | 0.307 | 15.79 | 37325.7 | 183327.84
3.5       | 0.876 | 9.556 | 44046.1 | 175092.19
4.0       | 0.808 | 10.48 | 29114.0 | 203493.90
4.5       | 1.767 | 10.48 | 32025.4 | 203493.90
5.0       | 3.924 | 8.70  | 48450.7 | 140073.75
5.5       | 6.928 | 7.73  | 41567.5 | 138395.55
6.0       | 11.08 | 10.05 | 41567.5 | 138395.55
7.0       | 14.41 | 19.10 | 41567.5 | 110716.44

In its operation as a control law scheduler, the neural network is first supplied with information about the system and its environment; it then produces control law switching information for the controller. The neural network's inputs are the inputs and outputs of the plant together with the reference signal. The output of the neural network is the control law adjustment signal sent to the controller; in this paper, this signal represents parameter values for the controller. The neural network's inputs are not restricted to these signals; other signals such as the plant's states, derivatives, environmental conditions, or delayed values of any of these can be used as inputs to the neural network. Basically, any signal that may be designed into the operation of the scheduler can be used as an input to the neural network.
Furthermore, the plant and controller can operate in either continuous or discrete time. If the neural network is implemented in analog hardware, both the plant and the controller can operate in continuous time; in this case, the neural network may supply the parameter values to the controller as continuous variables. However, if the neural network is to be implemented in software, both the plant and the controller need to be discrete or discretized versions of continuous ones. In either case, the designing of the neural network as discussed in this paper remains unchanged.

Two examples, one for a single-input neural network and one for a multi-input neural network, are presented next to illustrate the design of a neural-network-implemented control law scheduler.
Example 2

In [11], a parameter learning method is presented and used to define the region of operation for an adaptive control system of a flexible space antenna. In one of the experiments described, an initial pulse disturbance is applied to the plant, and the adaptive controller is required to follow a zero-order reference model. The goal of the parameter learning system is to find and store values for the four adaptive controller parameters (β1, β2, L, and L̄) for varied amplitudes of a pulse disturbance (system initial conditions) such that a defined performance index based on the output of the plant is minimized. In Table I, the values found for the controller parameters for different pulse amplitudes are repeated. Using this table, the goal here is to construct a neural network scheduler such that a smooth interpolation is achieved between the 11 operating points using the design method of this paper.

For the neural network, the centers of the sigmoid neurons in the hidden layer are set halfway between the operating points according to (6):

    c = [2.125 2.375 2.75 3.25 3.75 4.25 4.75 5.25 5.75 6.5]'.

The slopes for the hidden layer neurons are chosen as s_k = 75 for 1 ≤ k ≤ 10. With (10), the unique weights W are found. Fig. 6 shows the output of the sigmoid neural network scheduler for adaptive controller parameter L along with the straight-line approximation between the operating points (dotted line). As can be seen, z_3(k) = d_3(k) for 1 ≤ k ≤ 11, and specification (i) is met. In the regions near the operating points, the adaptive controller parameter values specified by the neural network scheduler are close to those specified by Table I, satisfying specification (ii) for very thin and wide boxes. In regions between operating points, a swift yet smooth transition occurs between the operating points, and specification (iii) is met. Between 5.5 and 6.0, specification (iii) requires a straight line, and this is clearly satisfied. Thus, all specifications can be shown to be satisfied.

Fig. 6. Neural network output for L for Example 2.
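For instance, the scheduler output for the third parameter L can be assembled directly from Table I with the single-input design sketch given earlier (our code; it reproduces the centers quoted above):

    import numpy as np
    # assumes design_single_input() and network() from the single-input sketch

    amplitude = np.array([2.0, 2.25, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 7.0])
    L = np.array([49000.0, 39200.0, 44046.1, 37325.7, 44046.1, 29114.0,
                  32025.4, 48450.7, 41567.5, 41567.5, 41567.5])
    slopes = np.full(10, 75.0)             # s_k = 75
    centers, W = design_single_input(amplitude, L.reshape(-1, 1), slopes)
    print(centers)                         # [2.125 2.375 2.75 3.25 ... 6.5]
    print(network(amplitude, centers, slopes, W).ravel())  # matches L, per spec (i)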


Table II
Selected Flight Points for F100 Engine

Altitude (1K ft) | Mach Number | Controller Parameters
10 | 0.75 | 2
10 | 1.00 | 3
10 | 1.25 | 4
20 | 0.50 | 1
30 | 0.75 | 5
30 | 1.00 | 6

Fig. 7. Selected F100 flight points for Example 3.

Example 3

In [12], linear models of an F100 engine are developed for various flight points based on altitude and Mach number. In Table II, 6 of these flight points are listed with fictitious controller parameters, and in Fig. 7, these six are diagrammed. It is desired to design a neural network control law scheduler for the interpolation between these flight points using the architecture and design approach of this paper. Since the operating points are nonequidistant, it is decided to add 6 pseudo-operating points such that equidistance is achieved, and p = 12. Now,

    V = [10   10   10   10   20   20   20   20   30   30   30   30
         0.50 0.75 1.00 1.25 0.50 0.75 1.00 1.25 0.50 0.75 1.00 1.25]

and

    d = [1 2 3 4 1 1 1 1 1 5 6 6]'.

The resulting vectors of operating points are

    v_1 = [10 20 30]'

and

    v_2 = [0.50 0.75 1.00 1.25]',

where r = 1 corresponds to the altitude and r = 2 corresponds to the Mach number. Using (18), the vectors of centers are

    c_1 = [15 25]'

and

    c_2 = [0.625 0.875 1.125]'.

The slopes are chosen as s_{1 k_1} = 20 for 1 ≤ k_1 ≤ 2 and s_{2 k_2} = 400 for 1 ≤ k_2 ≤ 3. The weights W are found by solving (10), and the interpolation curve produced is shown in Fig. 8. The lowest corner corresponds to the point (8, 0.4), the leftmost corner to (32, 0.4), and the rightmost corner to (8, 1.35). Specifications (i) and (iii) are clearly satisfied, and specification (ii) is satisfied for small thin boxes in a rectangular region around the operating points.

Fig. 8. Neural network output for the first scheduler of Example 3.

Instead of adding extra operating points such that equidistance is achieved, it is decided next to hand pick the generalization regions for the interpolation curve, which are shown in Fig. 9 and are assigned values according to Table II. Only 4 sigmoids are needed to form the 6 regions, and the vectors of centers are

    c_1 = [20]

and

    c_2 = [0.625 0.875 1.125]',

where r = 1 corresponds to the altitude and r = 2 corresponds to the Mach number. By choosing s_{1k} = 20 and s_{2k} = 400 and by solving (10) for W, the interpolation curve formed is shown in Fig. 10 with the same corner coordinates as the previous two-dimensional plots. Specifications (i) and (iii) are clearly satisfied, and specification (ii) is satisfied for small thin boxes in rectangular regions around the operating points.

Fig. 9. Selected localized regions for the F100 flight points of Example 3.

Fig. 10. Neural network output for the second scheduler of Example 3.
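The first Example 3 scheduler fits the same pattern with the multi-input design sketch given earlier (our code; the 12 grid points include the pseudo-operating points, ordered as in V and d above):

    import numpy as np
    # assumes design_multi_input() and multi_basis() from the multi-input sketch

    values = [np.array([10.0, 20.0, 30.0]),          # altitude (1K ft)
              np.array([0.50, 0.75, 1.00, 1.25])]    # Mach number
    D = np.array([1, 2, 3, 4, 1, 1, 1, 1, 1, 5, 6, 6], dtype=float).reshape(-1, 1)
    slopes = [np.full(2, 20.0), np.full(3, 400.0)]   # s_1k = 20, s_2k = 400
    centers, W = design_multi_input(values, D, slopes)
    print(centers[0], centers[1])    # [15. 25.] and [0.625 0.875 1.125]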


tems," Ph.D. diss., Dept. Elec. Eng., Univ. Notre respectively His Ph D. dis-
Dame, Apr. 1991. sertation addressed the train-
ing of feedforward neural
121M.A. Sartori and P.J. Antsaklis, "Neural network
networks and their applica-
implementations for control scheduling," Tech.
tion to the higher level con-
Rep. #91-04-02, Dept. Elec. Eng., Univ. Notre
trol of systems. He worked
Dame, Apr. 1991.
for the McDonnell Douglas
[3] T.F. Gonzalez, "Covering a set of points with Electronics Company during
fixed size hypersquares and related problems, in the summers of 1986 and
Proc. 1990 Annuul Allerton Con$ Communicution, 1987 and for the McDonnell Douglas Missile Sys-
Control, and Computing, pp. 838-847, 1990. tems Company during the summer of 1989. During
the summer of 1991, he was a Post Doctorate Re-
[4] K.S. Narendra and K. Parthasarathy, "Identifi-
search Associate at the University of Notre Dame.
cation and control of dynamical systems using
He is now employed by the U.S. Navy's David
neural networks," IEEE Trans. Neural Networks,
Taylor Research Center. His research interests in-
vol. 1, no. I , pp. 4-27, Mar. 1991.
clude neural networks, image processing, and
[ 5 ] P.J. Antsaklis and M.A. Sartori, "Neural net- autonomous systems.
Fig. 10. Neural network output for the second works in control systems," Systems and Controls
scheduler of Example 3. Encyclopedia, Supplement I / , to be published.

dimensional plots. Specifications (i) and (iii) [6] Passino K.M., Sartori M.A., and P.J. Antsaklis,
"Neural computing for numeric to symbolic conver- Panos J. Antsaklis received
are clearly satisfied, and specification (ii) is the diploma in mechanical
satisfied for small thin boxes for rectangular sion in control systems," IEEE Control Syst. Mag..
Apr. 1989, pp. 4&52. and electrical engineering
regions around the operating points. from the National Technical
171 P.J. Antsaklis and M.A. Sartori, "Autonomous University of Athens,
Concluding Remarks control of large spacecraft using neural networks," Greece, in 1972,andtheM.S.
Final Rep., Jet Propulsion Laboratory Contract and Ph.D. degrees in electri-
A particular n e u r a l n e t w o r k a n d a 957856 Mod. 4, Nov. 1991. cal engineering from Brown
systematic design methodology are intro- 181 D.T. Tank and J.J. Hopfield, "Simple "neural" University, Providence, RI,
duced so that prior information about the optimization networks: an a/d converter, signal in 1974 and 1977,respectively. After holding facul-
system's behavior can b e directly and easily decision circuit, and a linear programming circuit", ty positions at Brown University, Rice University,
incorporated into the control design. The lEEE Trans. Ciruits Syst., vol. 33, no. 5, pp. 533- and Imperial College, University of London, he
541, May 1986. joined the University of Notre Dame where he is
four uses investigated f o r the proposed
currently Professor of Electrical Engineering. His
neural network architecture are dynamical [9] D.L. Gray, "Synthesis procedures for neural
networks," Master's Thesis, Dept. Elec. Computer research interests have been in multivariable system
system (plant and controller) modeling,
Eng., Univ. Notre Dame, Notre Dame, IN, July and control theory, primarily using the differential
fault detection and identification, informa-
1988. operator and fractional representations, more
tion extraction, and control law scheduling.
recently also in autonomous intelligent control sys-
Another approach to address this problem is [IO] P.J. Antsaklis, K.M. Passion, and S.J. Wang, tems, and in particular in discrete event systems and
presented in [13] using neural networks "An introduction to autonomous control systems," neural networks, in adaptive and learning control,
with Gaussian non!inearities and not sig- IEEE Control Syst. Mug., vol. 1 I , no. 4, pp. 5-13,
and in the reconfigurable control of systems. He is
moid nonlinearities. The two methods are June 1991.
currently an elected member of the Board of Gover-
compared in [ I 1, [ 2 ] ,and it is shown that the [ 1 I] M.D. PeekandP.J. Antsaklis, "Parameterleam- nors in the IEEE Control Systems Society and the
sigmoid neural network implementation has ing for performance adaptation," IEEE Control Syst. Group Leader of the Working Group on Control
certain advantages over the Gaussian neural Mug., vol. IO, no. 7, pp. 3-1 1, Dec. 1990. Systems in the Technical Committee on Intelligent
network implementation. In particular, the [ 121 R.D. Hackney and R.J. Miller, "Engine criteria Control. He was also the Program Chair of the 30th
sigmoid neural network easily implements and models for multivariable control system IEEE Conference on Decision and Control, which
training patterns that are nonequidistant, design," in Proc. 1977 h t . Forum on Altematives took place in the United Kingdom in December
which is a problem for the Gaussian neural ,for Multivariable Control, pp. 14-28, 1977. 1991. He has served as an Associate Editor of the
network approach. However, for the multi- IEEE Transactions on Automatic Control and as
11 31 M.A. Sartori and P.J. Antsaklis, "A Gaussian
input case, the sigmoid neural network has Chair of the Technical Committee on Theory of the
neural network implementation for control schedul-
an added complexity due to the multiplication ing," in Proc. 1991 IEEE In?. Symp. Intelligent IEEE Control Systems Society. He is an Editor of
in (14). Control, Aug. 13-15, 1991. the IEE Control Engineering Book Series, and an
Associate Editor of the IEEE Transactions on
References Neurul Netcvorks, having been founding Associate
Michael A. Sartori received the B.S., M.S., and Editor for Letters. He has also served as the Guest
[ I ] M.A. Sartori, "Feedforward neural networks and Ph.D. degrees in electrical engineering from the Editor for Neural Networks for IEEE Control Sys-
their application in the higher level control of sys- University of Notre Dame in 1987, 1989, and 1991, tems Maguzine. He is an IEEE Fellow.
