Downloaded 03/15/13 to 160.36.192.221. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
W. M. WONHAM†
1. Introduction. The object of this paper is to discuss a generalized
version of the matrix Riccati and matrix quadratic equations which arise
in problems of stochastic control and filtering. The properties obtained
include existence, uniqueness and asymptotic behavior, and contain as
special cases some (but not all) of the results reported in [1], [2]. We refer
in particular to [2] for a detailed review and bibliography of the "standard"
equation for the linear regulator problem.
The present generalization consists in the addition of a linear positive
operator to the linear terms of the standard Riccati equation and, in certain
instances, weakening of the usual hypothesis of complete observability
to observability of unstable modes (detectability).
The proofs given here are simple applications of Bellman's principle of
quasi-linearization ("approximation in policy space") [5] and of a known
monotone convergence property of symmetric matrices. In this way the
discussion becomes unified and straightforward. Applications to control
and filtering are indicated in §6.
2. Notation and summary. In the following, all vectors and matrices
have real elements except where otherwise stated. A, B, C, K are matrices
of dimension respectively $n \times n$, $n \times m$, $p \times n$, and $m \times n$; N, P, Q denote
symmetric matrices of dimension respectively $m \times m$, $n \times n$, and $n \times n$;
it will always be assumed that N is positive definite. $A'$ denotes the
transpose of A, and I is the identity matrix. Matrix functions of time which are
assumed as data are Lebesgue measurable and bounded in norm on every
finite subinterval of their domain of definition. In particular $N(t)^{-1}$ is so
bounded.
If P is positive (semi-)definite, we write $P > 0$ ($P \ge 0$); $P > Q$ means
$P - Q > 0$, etc. If P is symmetric, the Euclidean norm $|P|$ is the absolute
value of the numerically largest eigenvalue of P; thus $-|P|\,I \le P \le |P|\,I$.
$\Pi$ will denote a (possibly t-dependent) positive linear map of the class of
symmetric $n \times n$ matrices into itself; that is, $\Pi(t, P)$ is measurable in
$(t, P)$, linear in P, and $P \ge 0$ implies $\Pi(t, P) \ge 0$. In the case where A, B, K
* Received by the editors November 3, 1967, and in revised form March 8, 1968.
† National Aeronautics and Space Administration, Electronics Research Center,
Cambridge, Massachusetts 02139. This research was supported in part by the Air
Force Office of Scientific Research, Office of Aerospace Research, United States Air
Force, under AFOSR Grant AF-AFOSR-693-67, and in part by the National Science
Foundation, Engineering, under Grant GK-967.
(2.1) $\inf_K \left| \int_0^\infty e^{t(A-BK)'}\, \Pi(I)\, e^{t(A-BK)}\, dt \right| < 1$
will be important later. It expresses the fact that $\Pi$ is not too large.
Let A, B be constant. The controllability matrix of (A, B) is the $n \times mn$
matrix
$\Gamma(A, B) = [B, AB, \ldots, A^{n-1}B]$.
The pair (A, B) is controllable if the rank of $\Gamma$ is n. If (A, B) is controllable,
so is $(A - BK, B)$ for every matrix K.
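The rank test above can be tried on small examples. The following is a minimal sketch in plain Python, not part of the paper: the helper names (`matmul`, `controllability_matrix`, `rank`, `is_controllable`) are hypothetical, and exact rational arithmetic is assumed so that the rank is computed without rounding.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def controllability_matrix(A, B):
    """Return [B, AB, ..., A^(n-1) B] as a list of n rows with m*n columns."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    block = [[Fraction(v) for v in row] for row in B]
    blocks = []
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    # Concatenate the n blocks column-wise, row by row.
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]


def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r


def is_controllable(A, B):
    """(A, B) is controllable if the controllability matrix has rank n."""
    return rank(controllability_matrix(A, B)) == len(A)
```

For instance, with A = [[0, 1], [0, 0]] and B = [[0], [1]] the pair is controllable, and it remains so after the feedback K = [2, 3], for which A - BK = [[0, 1], [-2, -3]]; the diagonal pair A = [[1, 0], [0, -1]], B = [[0], [1]] is not controllable.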
The pair of constant matrices (A, B) is stabilizable if there exists a constant
matrix K such that $A - BK$ is stable (i.e., all its eigenvalues have
negative real parts). Let the minimum polynomial $\psi(\lambda)$ of A be factored as
$\psi(\lambda) = \psi^+(\lambda)\psi^-(\lambda)$, where all zeros of $\psi^+(\lambda)$ lie in the closed right-half
complex plane and all zeros of $\psi^-(\lambda)$ lie in the open left-half plane. It is well
known that n-space E can be written as a direct sum $E = E_A^+ \oplus E_A^-$,
where
$E_A^+ = \{x : \psi^+(A)x = 0\}, \quad E_A^- = \{x : \psi^-(A)x = 0\}.$
$E_A^+$ thus represents the "unstable modes" of A. It is known [3] that (A, B)
is stabilizable if and only if the range of $\Gamma(A, B)$ contains $E_A^+$.
Dual to the concept of controllability is that of observability: the pair of
constant matrices (C, A) is observable if $(A', C')$ is controllable. A weaker
but useful property is that (at least) the unstable modes of A be observable.
Precisely, (C, A) is detectable if $(A', C')$ is stabilizable.
Controllability and observability are well-known concepts (cf. [7]); stabilizability
is discussed in [3]; detectability in its present meaning originates
here.
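As an illustration of the gap between the two properties, consider the following small example, which is constructed here and is not taken from the paper. The pair below is detectable but not observable: the single unstable mode is seen by C, while the unobserved mode is already stable.

```latex
A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
C = \begin{pmatrix} 1 & 0 \end{pmatrix}, \qquad
[\,C',\ A'C'\,] = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}
\quad (\text{rank } 1 < 2).

% The unstable subspace of A' is spanned by (1, 0)', which lies in the
% range of [C', A'C'], so (A', C') is stabilizable. Explicitly, with R = (-k, 0):
A' + C'R = \begin{pmatrix} 1 - k & 0 \\ 0 & -1 \end{pmatrix},
\qquad \text{stable for every } k > 1 .
```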
Of primary interest will be the Riccati equation
(2.2a) $\dfrac{dP(t)}{dt} + A(t)'P(t) + P(t)A(t) + \Pi[t, P(t)] - P(t)B(t)N(t)^{-1}B(t)'P(t) + C(t)'C(t) = 0, \quad t_0 \le t \le T,$
subject to the terminal condition
(2.2b) $P(T) = P_T \ge 0.$
In the constant parameter case we also consider the quadratic equation
(2.3) $A'P + PA + \Pi(P) - PBN^{-1}B'P + C'C = 0.$
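To fix ideas, the scalar case can be worked by hand. The following is an illustrative sketch only: it assumes $n = m = p = 1$ and the particular positive map $\Pi(P) = \sigma^2 P$ with $\sigma^2 \ge 0$, and the lower-case symbols a, b, c, k, $\sigma$ are introduced here rather than taken from the paper.

```latex
% Scalar form of (2.3) with \Pi(P) = \sigma^2 P:
(2a + \sigma^2)\,P - \frac{b^2}{N}\,P^2 + c^2 = 0 .

% For b \neq 0 the nonnegative root is
P = \frac{N}{2b^2}\left[(2a + \sigma^2) + \sqrt{(2a + \sigma^2)^2 + \frac{4b^2c^2}{N}}\,\right],

% and the closed-loop coefficient a - b^2 P / N satisfies
2\left(a - \frac{b^2 P}{N}\right) + \sigma^2
  = -\sqrt{(2a + \sigma^2)^2 + \frac{4b^2c^2}{N}} \;\le\; 0 .

% Scalar form of condition (2.1): for any gain k with bk > a,
\int_0^\infty e^{2t(a - bk)}\,\sigma^2\,dt = \frac{\sigma^2}{2(bk - a)},
% whose infimum over k is 0 when b \neq 0, so (2.1) holds in that case.
```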
BN-B P
-
semidefinite
A
is stable.
A proof is given in §§3-5. Results for the quadratic equation (2.3) are
summarized in Theorem 4.1.
-
equation
where
- -
$\Psi(P, K) = \mathcal{L}(P, K) + C'C + K'NK.$
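Then (3.2), (3.6) and (3.7) yield
$\dfrac{dP_\nu(t)}{dt} + \Psi[P_\nu(t), K_{\nu+1}(t)] \le \dfrac{dP_\nu(t)}{dt} + \Psi[P_\nu(t), K_\nu(t)] = 0 = \dfrac{dP_{\nu+1}(t)}{dt} + \Psi[P_{\nu+1}(t), K_{\nu+1}(t)], \quad t_0 \le t \le T.$
Setting $Q(t) = P_\nu(t) - P_{\nu+1}(t)$, we have
$\dfrac{dQ(t)}{dt} + \mathcal{L}[Q(t), K_{\nu+1}(t)] + R(t) = 0$
for a suitable matrix $R(t) \ge 0$; and from this we obtain, as before, $Q(t) \ge 0$.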
It follows that for each $t \in [t_0, T]$ the sequence of nonnegative matrices
$\{P_\nu(t)\}$ is monotone nonincreasing and hence, by Lemma 3.1,
$P(t) = \lim_{\nu \to \infty} P_\nu(t)$
exists. Since
$|P_\nu(t)| \le \sup\{|P_1(s)| : t_0 \le s \le T\}, \quad \nu = 1, 2, \ldots,$
it follows by (3.7) that the sequence $\{|K_\nu(t)|\}$ is uniformly bounded.
Let $\Phi(t, s)$ be the fundamental matrix (cf. (3.3)) determined by A.
(3.8) $P_\nu(t) = \Phi(T, t)'P_T\,\Phi(T, t) + \int_t^T \Phi(s, t)'\{\Pi[s, P_\nu(s)] - K_\nu(s)'B(s)'P_\nu(s) - P_\nu(s)B(s)K_\nu(s) + C(s)'C(s) + K_\nu(s)'N(s)K_\nu(s)\}\,\Phi(s, t)\, ds.$
Applying the dominated convergence theorem to the integral in (3.8),
we conclude that (3.8) holds with $P_\nu$, $K_\nu$ replaced by P, K, where
(3.9) $K(t) = \lim_{\nu \to \infty} K_\nu(t) = N(t)^{-1}B(t)'P(t).$
Equations (3.8) (with $P_\nu = P$, $K_\nu = K$) and (3.9) are equivalent to
(3.1 a, b, c); hence existence of an absolutely continuous solution of
(2.2 a, b) is established.
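For orientation, the successive approximation used in this existence argument can be summarized as follows. This is a sketch inferred from (2.2 a, b), (3.7), (3.8) and (3.9) as they appear here, with the time argument suppressed; it is not a quotation of the paper's displayed equations.

```latex
% One sweep of the iteration, \nu = 1, 2, \ldots : a linear equation for P_\nu
% with the gain K_\nu held fixed, followed by a gain update.
\begin{aligned}
&\frac{dP_\nu}{dt} + (A - BK_\nu)'P_\nu + P_\nu(A - BK_\nu) + \Pi(t, P_\nu)
   + C'C + K_\nu' N K_\nu = 0, \qquad P_\nu(T) = P_T, \\
&K_{\nu+1} = N^{-1}B'P_\nu .
\end{aligned}

% The gain update minimizes the K-dependent terms, by completing the square:
K'NK - K'B'P - PBK
  = (K - N^{-1}B'P)'\,N\,(K - N^{-1}B'P) - PBN^{-1}B'P .
```

With $K = N^{-1}B'P$ the square vanishes and the linear equation reduces to (2.2a), which is why the limit of the iteration solves the Riccati equation.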
Uniqueness of the solution results from the fact that the function
,(P, N-BP)
2(P, K)
satisfies a uniform Lipschitz condition in P in every domain $t_0 \le t \le T$,
$|P| <$ const.
Assertions (i) and (ii) of Theorem 2.1 have now been proved.
The minimum property (iii) is proved by using (3.2) in the same manner
as before. Thus from the inequality
,I,[P(t), K(t)] _-<_ [P(t), K(t)]
together with (2.4 a, b) and (3.2), there follows
(3.11b) $Q(T) = 0.$
If (3.11 a, b) are written as an integral equation (cf. (3.5)) and solved,
as before, by successive approximation, we obtain a sequence $Q_\nu(t)$ such
that $Q_\nu(t) \to Q(t)$, $\nu \to \infty$. Since we can choose $Q_0(t) \equiv 0$ it is easily seen
that $Q(t) \ge 0$ for all t.
This completes the proof of Theorem 2.1 (iii).
4. Solution of the quadratic equation (2.3). In this section we assume
that the parameter matrices A, B, C, N and the operator $\Pi$ are independent
of t, and consider exclusively the quadratic equation (2.3).
THEOREM 4.1. If (A, B) is stabilizable and (C, A) is detectable, and if
$\Pi$ satisfies condition (2.1), then (2.3) has at least one solution $\bar P$ in the class
of positive semidefinite matrices. The matrix $A - BN^{-1}B'\bar P$ is stable. If in
addition (C, A) is observable, then $\bar P$ is unique and $\bar P > 0$.
For the proof we need three auxiliary results.
LEMMA 4.1. Let
$C'C + D'D = F'F$
and let G be an arbitrary matrix of suitable dimension.
(i) If (C, A) is observable, then $(F, A + GD)$ is observable.
(ii) If (C, A) is detectable, then $(F, A + GD)$ is detectable.
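Proof. Let $\mathcal{R}\{\cdot\}$ denote the range of a matrix and $\mathcal{N}(\cdot)$ the null space. It
is easily seen that $\mathcal{R}\{\Gamma(A' + E, F')\} = \mathcal{R}\{\Gamma(A', F')\}$ whenever $\mathcal{R}\{E\} \subset \mathcal{R}\{F'\}$.
Also, if $x'F'Fx = 0$, then $x'C'Cx = x'D'Dx = 0$, so that $\mathcal{N}(F) \subset \mathcal{N}(C) \cap \mathcal{N}(D)$;
taking orthogonal complements, we have
$\mathcal{R}\{C'\} + \mathcal{R}\{D'\} \subset \mathcal{R}\{F'\}.$
Thus $\mathcal{R}\{D'G'\} \subset \mathcal{R}\{F'\}$, so that
$\mathcal{R}\{\Gamma(A' + D'G', F')\} = \mathcal{R}\{\Gamma(A', F')\} \supset \mathcal{R}\{\Gamma(A', C')\},$
proving (i). For (ii), write $\tilde D = D'G'$ and let $A' + C'R$ be stable. Since
$\mathcal{R}\{C'R - \tilde D\} \subset \mathcal{R}\{F'\}$, a matrix S can be chosen such that
$A' + \tilde D + F'S = A' + C'R.$
The proof is complete.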
The following remark will be useful: if (C, A) is detectable, then either A
is stable or the matrix
$W_t(A, C) = \int_0^t e^{sA'}C'Ce^{sA}\, ds$
is unbounded on $0 \le t < \infty$. For if A is not stable, let $\lambda$ be an eigenvalue of
A with $\mathrm{Re}\,\lambda \ge 0$ and eigenvector $\xi$. If $\xi^*$ is the conjugate transpose of $\xi$,
$\xi^* W_t(A, C)\,\xi = \int_0^t e^{2s\,\mathrm{Re}\,\lambda}\,|C\xi|^2\, ds.$
Suppose the integral is bounded. Then $C\xi = 0$, i.e., $CA^{\nu-1}\xi = \lambda^{\nu-1}C\xi = 0$,
$\nu = 1, \ldots, n$, so that
$\mathrm{Re}\,\xi,\ \mathrm{Im}\,\xi \in \mathcal{N}\{\Gamma(A', C')'\}.$
If (C, A) is detectable, $\mathcal{R}\{\Gamma(A', C')\} \supset E_{A'}^+$, and taking orthogonal complements, we have
$\mathcal{N}\{\Gamma(A', C')'\} \subset (E_{A'}^+)^\perp = E_A^-.$
Thus $\mathrm{Re}\,\xi,\ \mathrm{Im}\,\xi \in E_A^+ \cap E_A^-$, namely $\xi = 0$, a contradiction.
LEMMA 4.2. Let (C, A) be detectable and suppose the equation
(4.1) $A'P + PA + \Pi(P) + C'C = 0$
has a solution $P \ge 0$. Then A is stable. Let 5(R) be defined by
Jo eH(R)e dr,
and purSe(R) 5 5- R 5R R 1,2, ....If C, A is ob-
servable, then the series
(4.2) 5(R)
y0
( 5)-(R)
converges for every symmetric n X n matrix R. In that case, the solution P of
(4.1) is unique and is given by
P ( )- e *aCCe
)
dt
() W"Ce d, O,
obgain P (P) + .
is bounded only if A is sgable. Since P
proved. To prove the second, leg
Hence
() 0, ghe firsg assergion is
in (4.a) and wrige ( go
converges. Since
-! v =< Iv
there follows 5v(V) --, 0, v
v
- .
_<_
Hence (4.7) yields
+ k---,-O
1,2, .-.,
as
(4.11) $\left| \int_0^\infty e^{t(A-BK)'}\, \Pi(P)\, e^{t(A-BK)}\, dt \right| \le \theta\,|P|,$
where $\theta \in (0, 1)$ is independent of P. Hence for this K the function f(K, P)
is a contraction mapping in P, and so (4.10) has a unique solution. For
later reference we note that the approximating sequence $\{P^{(\nu)}\}$, defined by
$P^{(1)} = 0, \quad P^{(\nu)} = f(K, P^{(\nu-1)}), \quad \nu = 2, 3, \ldots,$
is monotone nondecreasing.
We can now solve the pair of equations (4.8a, b). By assumption there
exists $K_1$ such that $A - BK_1$ is stable and (4.11) is true. Let $P_1$ be the
solution of $P = f(K_1, P)$ and define $K_2 = N^{-1}B'P_1$. Next solve the
equation
$P = f(K_2, P)$
by successive approximations. To see that this is possible observe that
(4.9) and Lemma 4.3 imply that $A - BK_2$ is stable; hence $f(K_2, P)$ is
defined. Now set
$P_2^{(1)} = 0, \quad P_2^{(\nu+1)} = f(K_2, P_2^{(\nu)}), \quad \nu = 1, 2, \ldots.$
As before it follows that $P_2^{(\nu)} \ge 0$, $\nu = 2, 3, \ldots$, and $\{P_2^{(\nu)}\}$ is nondecreasing
exists, and $0 \le P_2 \le P_1$.
Repeating this procedure we obtain sequences $\{K_\nu\}$, $\{P_\nu\}$ with
$K_{\nu+1} = N^{-1}B'P_\nu$, and $0 \le P_{\nu+1} \le P_\nu$. Then
$\bar P = \lim_{\nu \to \infty} P_\nu$
exists. If
$\bar K = \lim_{\nu \to \infty} K_\nu = N^{-1}B'\bar P,$
it is clear that $\bar K$, $\bar P$ satisfy (4.8a, b), and (4.10) shows that $\bar P \ge 0$.
Lemma 4.3 implies that $A - B\bar K$ is stable. If (C, A) is observable,
uniqueness of $\bar P$ in the class $P \ge 0$ is an immediate result of the minimum
property. Finally, Lemma 4.1 shows that $((C'C + \bar K'N\bar K)^{1/2}, A - B\bar K)$
is observable if (C, A) is observable; then it is clear from (4.10)
that $\bar P > 0$. Theorem 4.1 is proved.
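The construction used in this proof can be followed numerically in the scalar case. The sketch below is illustrative only and makes several assumptions not found in the paper: $n = m = p = 1$, the particular map $\Pi(P) = \sigma^2 P$ (passed as `sigma2`), and hypothetical names `policy_iteration`, `residual` and `n_weight` (standing for N). For a fixed gain k, the linear equation $2(a - bk)P + \sigma^2 P + c^2 + Nk^2 = 0$ is solved in closed form, and the gain is then updated by $k = bP/N$.

```python
def policy_iteration(a, b, c, n_weight, sigma2, k, steps=50):
    """Scalar sketch of the iteration K_{v+1} = N^{-1} B' P_v.

    Assumes Pi(P) = sigma2 * P. Returns the list of iterates P_1, P_2, ...
    """
    history = []
    for _ in range(steps):
        # Scalar form of (4.11): sigma2 / (2 (b k - a)) < 1.
        margin = 2.0 * (b * k - a) - sigma2
        if margin <= 0.0:
            raise ValueError("gain k violates the scalar form of (4.11)")
        # Solve the linear equation for P with the gain k held fixed.
        p = (c * c + n_weight * k * k) / margin
        history.append(p)
        # Gain update.
        k = b * p / n_weight
    return history


def residual(a, b, c, n_weight, sigma2, p):
    """Left side of the scalar quadratic equation (2.3) at P = p."""
    return 2.0 * a * p + sigma2 * p - (b * p) ** 2 / n_weight + c * c
```

With a = b = c = N = 1, sigma2 = 0.5 and starting gain k = 3, the first iterate is 10/3.5 and the iterates decrease toward the positive root of $P^2 - 2.5P - 1 = 0$, in line with the monotone behavior $0 \le P_{\nu+1} \le P_\nu$ asserted above.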
5. Proof of Theorem 2.1 (iv): asymptotic behavior of the solution. In
this section, we prove assertion (iv) of Theorem 2.1. As in §4, the parameter
matrices A, B, C, N and the operator $\Pi$ are independent of t.
LEMMA 5.1. Let (C, A) be observable, let $P_0 \ge 0$, and suppose P(t) satisfies
the differential equation
(5.1) $\dfrac{dP(t)}{dt} + A'P(t) + P(t)A + \Pi[P(t)] + C'C = 0, \quad t \le 0,$
AP + PA + II(P).
denote
_- _ <= O,
R(t) Q(s) ds,
and let/ be the (unique) solution of
(5.6) 2(/) + P0 0.
It will be shown that 0 =< R(t) / (t
so that R(t) nondecreasing as
is
dR(t____)
decreases.
). In fact, Q(t) >__ 0 by (3.5),
Integration of (5.4) yields
(5.7) 2[R(t)] P0 0.
dt
Setting F(t) [ R(t), we obtain from (5.6) and (5.7),
(5.8)
Since/
dt
.
dE(t) 2[F(t)]
-1 (p0) >= 0, it is clear from (5.8) (cf. (3.5)) that F (t) >_- 0,
lim R (t), --, ,
exists.
0,
Next, (5.7) shows that $dR/dt$ is bounded and then that $d^2R/dt^2$ is bounded.
Since
dR t)
R(-)= d,
it follows that Q(t) dR(t)/dt --) 0 as -->
Next, consider the Riccati equation
. The proof is complete.
(5.10)
+ -
minimum property (Theorem 2.1 (iii)), P (t) =< ta (t). It will be shown that
/(t) is bounded for suitable/. Now
(t) e-tZPo e
e-(t){lI[[(s)]-k CC-t- IN2}e-(-) ds.
We solve (5.10) by successive approximation, setting Po(t)
- 0. Then
(5.11)
where
fi,+t(t) <- ,I f o
+ e-(t-")XII[P,(s)]e-(-) ds, , O, 1 ,..,
Hence,
/3(t) lim P(t)
is a bounded function of t, and (i) follows.
To prove (ii) set $P_0 = 0$ and let $\Phi(t, s)$ denote the fundamental matrix
associated with $A - BK(t)$ (cf. (3.3)). Then
(5.12) $P(t) = \int_t^0 \Phi(s, t)'\{\Pi[P(s)] + C'C + K(s)'NK(s)\}\,\Phi(s, t)\, ds.$
P( + )
f +r
.(,+ ) {II[P()] + CC + R()N()R()la(,+ r) d
+r
( -, ){n[P()] + cc
-+- K(o- r)N(z)K(r r)}(r r, s) dz
=< (a, s){II[P(a + r)] + CC + K(z)N(o)K(r)}(r,s)
Writing Q(,) P(s) P(s q-- r) and using (5.12), we see that
-
for suitable constants a > 0,/
we obtain Q(s) O, or
P(s + )
> 0 (which my depend on s). Letting
P(s),
Substituting this result in (5.13) and comparing with (5.12), we conclude
that
s O.
P(t) P(t-), O, O,
and the proof is complete.
LEMMA 5.3. If (A, B) is stabilizable and (C, A) is detectable, if $\Pi$ satisfies
(2.1), and if $P_0 = 0$, then the solution P(t) of (5.9a, b, c) has the property
(5.15) $\lim_{t \to -\infty} P(t) = \bar P,$
where $\bar P$ is a positive semidefinite solution of (2.3).
Proof. By Lemmas 3.1 and 5.2, the limit in (5.15) exists and is positive
semidefinite. Since P(t) is bounded, (5.9a, b, c) show that the same is
true of $dP(t)/dt$ and $d^2P(t)/dt^2$; then convergence of the integral
(5.16) $\int_{-\infty}^0 \dfrac{dP(t)}{dt}\, dt$
shows that $dP/dt \to 0$ as $t \to -\infty$. The conclusion follows by inspection of
(2.3) and (5.9a, b, c).
(5.18) $P^*(t) \to \bar P$ as $t \to -\infty$.
On the other hand, if $P_*(t)$ denotes the solution of (5.9a) and (5.9b)
with $P_*(0) = 0$, then by Lemma 5.3,
(5.19) $P_*(t) \to \bar P$ as $t \to -\infty$.
Hence the desired result will follow if we show that
(5.20) $P(t) \ge P_*(t), \quad t \le 0.$
For this observe from (3.5) that
(5.21) P(t) >= ft (s, t)lII[P(s)] + CC + K(s)N(s)K(s)}(s, t) ds
Denote the right side of (5.21) by Q(t). It will be shown that $Q(t) \ge P_*(t)$.
Write $K_*(t) = N^{-1}B'P_*(t)$ and $\Phi_*(t, s)$ for the fundamental matrix
associated with $A - BK_*(t)$; and let $\tilde P(t)$ be the solution of (5.9a) with
K(t) as before and $\tilde P(0) = 0$. Then, by the minimum property,
(5.22) P.(t) <= [(t)
and
(5.23) [(t)
f (s, t)lII[P(s)] + CC + g(s)N(s)K(s)}(s, t) ds.
-
(5.25) P,(t)
Since the extreme terms of the inequality (5.25) both tend to $\bar P$ as $t \to -\infty$,
the desired result is established.
6. Applications.
6.1. Stochastic control. An equation of type (2.2a) arises in optimal
control of a linear system with state-dependent white noise and quadratic
cost (cf. [6], where time-invariant control was discussed, leading to (2.3)).
In this problem,
$[\Pi(t, P)]_{ij} = \mathrm{tr}\{G_i(t)'PG_j(t)\}, \quad i, j = 1, \ldots, n,$
for certain $G_i$. We mention that an obvious generalization to include