On A Matrix Riccati Equation of Stochastic Control

SIAM J. CONTROL
Vol. 6, No. 4, 1968
Printed in U.S.A.

ON A MATRIX RICCATI EQUATION OF STOCHASTIC CONTROL*

W. M. WONHAM†
1. Introduction. The object of this paper is to discuss a generalized
version of the matrix Riccati and matrix quadratic equations, which arise
in problems of stochastic control and filtering. The properties obtained
include existence, uniqueness and asymptotic behavior, and contain as
special cases some (but not all) of the results reported in [1], [2]. We refer
in particular to [2] for a detailed review and bibliography of the "standard"
equation for the linear regulator problem.
The present generalization consists in the addition of a linear positive
operator to the linear terms of the standard Riccati equation and, in certain
instances, weakening of the usual hypothesis of complete observability
to observability of unstable modes (detectability).
The proofs given here are simple applications of Bellman's principle of
quasi-linearization ("approximation in policy space") [5] and of a known
monotone convergence property of symmetric matrices. In this way the
discussion becomes unified and straightforward. Applications to control
and filtering are indicated in §6.
2. Notation and summary. In the following, all vectors and matrices
have real elements except where otherwise stated. $A$, $B$, $C$, $K$ are matrices
of dimension respectively $n \times n$, $n \times m$, $p \times n$, and $m \times n$; $N$, $P$, $Q$ denote
symmetric matrices of dimension respectively $m \times m$, $n \times n$, and $n \times n$;
it will always be assumed that $N$ is positive definite. $A'$ denotes the
transpose of $A$, and $I$ is the identity matrix. Matrix functions of time which are
assumed as data are Lebesgue measurable and bounded in norm on every
finite subinterval of their domain of definition. In particular $N(t)^{-1}$ is so
bounded.
If $P$ is positive (semi-)definite, we write $P > 0$ ($P \ge 0$); $P > Q$ means
$P - Q > 0$, etc. If $P$ is symmetric, the Euclidean norm $|P|$ is the absolute
value of the numerically largest eigenvalue of $P$; thus $-|P| I \le P \le |P| I$.
$\Pi$ will denote a (possibly $t$-dependent) positive linear map of the class of
symmetric $n \times n$ matrices into itself: that is, $\Pi(t, P)$ is measurable in
$(t, P)$, linear in $P$, and $P \ge 0$ implies $\Pi(t, P) \ge 0$. In the case where $A$, $B$, $K$
* Received by the editors November 3, 1967, and in revised form March 8, 1968.
† National Aeronautics and Space Administration, Electronics Research Center,
Cambridge, Massachusetts 02139. This research was supported in part by the Air
Force Office of Scientific Research, Office of Aerospace Research, United States Air
Force, under AFOSR Grant AF-AFOSR-693-67, and in part by the National Science
Foundation, Engineering, under Grant GK-967.
and $\Pi$ are independent of $t$, the condition
$$\inf_K \left| \int_0^\infty e^{t(A-BK)'}\, \Pi(I)\, e^{t(A-BK)}\, dt \right| < 1 \tag{2.1}$$
will be important later. It expresses the fact that $\Pi$ is not too large.
Let $A$, $B$ be constant. The controllability matrix of $(A, B)$ is the $n \times mn$
matrix
$$\Gamma(A, B) = [B, AB, \ldots, A^{n-1}B].$$
The pair $(A, B)$ is controllable if the rank of $\Gamma$ is $n$. If $(A, B)$ is controllable,
so is $(A - BK, B)$ for every matrix $K$.
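The rank test just described can be tried on small examples. The following is a minimal sketch in plain Python using exact rational arithmetic; the function names and the matrices `A1`, `B1`, `A2`, `B2` are hypothetical choices made for this illustration and do not come from the paper.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def controllability_matrix(A, B):
    """Return [B, AB, ..., A^(n-1) B] as a list of n rows of length n*m."""
    n = len(A)
    blocks, block = [], B
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    # Concatenate the i-th rows of all blocks side by side.
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

def rank(M):
    """Rank by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def is_controllable(A, B):
    """(A, B) is controllable when the controllability matrix has rank n."""
    return rank(controllability_matrix(A, B)) == len(A)

# Hypothetical examples: a double integrator, and a diagonal system
# whose second mode is not reached by the input.
A1, B1 = [[0, 1], [0, 0]], [[0], [1]]
A2, B2 = [[1, 0], [0, 2]], [[1], [0]]
print(is_controllable(A1, B1), is_controllable(A2, B2))
```

In the second example the unreachable mode has eigenvalue 2, so that pair is also not stabilizable in the sense defined in the text.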
The pair of constant matrices $(A, B)$ is stabilizable if there exists a constant
matrix $K$ such that $A - BK$ is stable (i.e., all its eigenvalues have
negative real parts). Let the minimum polynomial $\alpha(\lambda)$ of $A$ be factored as
$\alpha(\lambda) = \alpha^+(\lambda)\alpha^-(\lambda)$, where all zeros of $\alpha^+(\lambda)$ lie in the closed right-half
complex plane and all zeros of $\alpha^-(\lambda)$ lie in the open left-half plane. It is well
known that $n$-space $E^n$ can be written as a direct sum $E^n = E_A^+ \oplus E_A^-$,
where
$$E_A^+ = \{x : \alpha^+(A)x = 0\}, \qquad E_A^- = \{x : \alpha^-(A)x = 0\}.$$
$E_A^+$ thus represents the "unstable modes" of $A$. It is known [3] that $(A, B)$
is stabilizable if and only if the range of $\Gamma(A, B)$ contains $E_A^+$.
Dual to the concept of controllability is that of observability: the pair of
constant matrices $(C, A)$ is observable if $(A', C')$ is controllable. A weaker
but useful property is that (at least) the unstable modes of $A$ be observable.
Precisely, $(C, A)$ is detectable if $(A', C')$ is stabilizable.
Controllability and observability are well-known concepts (cf. [7]);
stabilizability is discussed in [3]; detectability in its present meaning originates
here.
Of primary interest will be the Riccati equation
$$\frac{dP(t)}{dt} + A(t)'P(t) + P(t)A(t) + \Pi[t, P(t)] - P(t)B(t)N(t)^{-1}B(t)'P(t) + C(t)'C(t) = 0, \qquad t_0 \le t \le T, \tag{2.2a}$$
subject to the terminal condition
$$P(T) = P_T \ge 0. \tag{2.2b}$$
In the constant parameter case we also consider the quadratic equation
$$A'P + PA + \Pi(P) - PBN^{-1}B'P + C'C = 0. \tag{2.3}$$
The main result is the following theorem.

THEOREM 2.1. There exists a matrix $P(t)$ with the following properties:
(i) $P$ is defined and absolutely continuous on $[t_0, T]$ and satisfies (2.2a)
(almost everywhere) and (2.2b).
(ii) $P(t) \ge 0$, $t_0 \le t \le T$, and $P(t)$ is the unique solution of (2.2a, b).
(iii) (Minimum property). Let $\tilde K(t)$ be an arbitrary (bounded measurable)
$m \times n$ matrix defined on $[t_0, T]$ and let $\tilde P(t)$ be the solution of the linear
equation
$$\frac{d\tilde P(t)}{dt} + [A(t) - B(t)\tilde K(t)]'\tilde P(t) + \tilde P(t)[A(t) - B(t)\tilde K(t)] + \Pi[t, \tilde P(t)] + C(t)'C(t) + \tilde K(t)'N(t)\tilde K(t) = 0, \tag{2.4a}$$
$$\tilde P(T) = P_T. \tag{2.4b}$$
If $P(t)$ is the solution of (2.2a, b), then $P(t) \le \tilde P(t)$, $t_0 \le t \le T$.
(iv) Let $A$, $B$, $C$, $N$ and $\Pi$ be independent of $t$ and consider (2.2) with
$t_0 = -\infty$, $T = 0$. If $(A, B)$ is stabilizable and $\Pi$ satisfies (2.1), then $P(t)$
is bounded on $(-\infty, 0]$. If in addition $(C, A)$ is observable, then
$$\bar P = \lim_{t \to -\infty} P(t)$$
exists and is positive definite. In that case $\bar P$ is the unique positive
semidefinite solution of (2.3), and the matrix
$$A - BN^{-1}B'\bar P$$
is stable.
A proof is given in §§3-5. Results for the quadratic equation (2.3) are
summarized in Theorem 4.1.
3. Existence, uniqueness and minimality.

LEMMA 3.1 (Monotone convergence). Let $P_\nu$, $\nu = 1, 2, \ldots$, be a sequence
of $n \times n$ symmetric matrices such that $P_\nu \le P_{\nu+1}$ and $P_\nu \le \tilde P$,
$\nu = 1, 2, \ldots$, for some $\tilde P$. Then $P = \lim P_\nu$, $\nu \to \infty$, exists and $P \le \tilde P$.
The lemma is a special case of a result for positive operators in Hilbert
space [4, p. 189]; the result holds also for a monotone decreasing sequence
which is bounded below.
We turn to a proof of assertions (i) and (ii) of Theorem 2.1. Let
$$\varphi(P, K) = (A - BK)'P + P(A - BK) + \Pi(t, P)$$
and
$$K(t) = N(t)^{-1}B(t)'P(t). \tag{3.1a}$$
Then (2.2a, b) become
$$\frac{dP(t)}{dt} + \varphi[P(t), K(t)] + C(t)'C(t) + K(t)'N(t)K(t) = 0, \qquad t_0 \le t \le T, \tag{3.1b}$$
$$P(T) = P_T. \tag{3.1c}$$
The key to solving (3.1a, b, c) is the observation that (3.1b) is linear in
$P$ and that the expression (3.1a) minimizes the left side of (3.1b), regarded
as a function of $K$ (cf. [5]). The latter statement results from the identity
$$(A - BK_0)'P + P(A - BK_0) + K_0'NK_0 = (A - BK)'P + P(A - BK) + K'NK - (K - K_0)'N(K - K_0), \tag{3.2}$$
where $K_0 = N^{-1}B'P$.
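Identity (3.2) is easy to spot-check numerically. The sketch below, in plain Python, evaluates both sides for one set of made-up matrices with $n = 2$, $m = 1$ and a symmetric $P$; the helper names and all numerical values are assumptions introduced only for this check.

```python
def T(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def mul(X, Y):
    """Matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y, sign=1):
    """Entrywise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def quad_form(A, B, N, P, K):
    """(A - BK)'P + P(A - BK) + K'NK, the K-dependent part of (3.1b)."""
    Acl = add(A, mul(B, K), -1)
    return add(add(mul(T(Acl), P), mul(P, Acl)), mul(mul(T(K), N), K))

# Hypothetical data (n = 2, m = 1); P must be symmetric for (3.2) to hold.
A = [[0.0, 1.0], [-2.0, 0.5]]
B = [[0.0], [1.0]]
N = [[2.0]]
P = [[3.0, 1.0], [1.0, 4.0]]
K = [[0.7, -1.3]]

K0 = [[v / N[0][0] for v in mul(T(B), P)[0]]]   # K0 = N^{-1} B'P, with N a 1x1 matrix
D = add(K, K0, -1)                              # K - K0
lhs = quad_form(A, B, N, P, K0)
rhs = add(quad_form(A, B, N, P, K), mul(mul(T(D), N), D), -1)
gap = max(abs(l - r) for rl, rr in zip(lhs, rhs) for l, r in zip(rl, rr))
print(gap)   # should be at rounding-error level
```

Since $(K - K_0)'N(K - K_0) \ge 0$, the same computation shows that the form evaluated at $K_0$ never exceeds its value at any other $K$, which is the minimizing property used in the proof.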
Now let $K(t)$ be arbitrary and let $\Phi(t, s)$ be the fundamental matrix
associated with the matrix $A(t) - B(t)K(t)$, that is, $\Phi$ is determined by
the equations
$$\frac{\partial \Phi(t, s)}{\partial t} = [A(t) - B(t)K(t)]\,\Phi(t, s), \qquad t_0 \le s,\ t \le T, \qquad \Phi(t, t) = I. \tag{3.3}$$
Recall that $\Phi(s, t) = \Phi(t, s)^{-1}$, whence
$$\frac{\partial \Phi(s, t)}{\partial t} = -\Phi(s, t)[A(t) - B(t)K(t)]. \tag{3.4}$$
It is then easily checked that (3.1b) and (3.1c) are equivalent to the
equation
$$P(t) = \Phi(T, t)'P_T\Phi(T, t) + \int_t^T \Phi(s, t)'\{\Pi[s, P(s)] + C(s)'C(s) + K(s)'N(s)K(s)\}\Phi(s, t)\,ds, \qquad t_0 \le t \le T. \tag{3.5}$$
The Volterra equation (3.5) has a unique integrable solution $P(t)$ which
can be found by successive approximation. Consider the approximating
sequence $\{P^{(\nu)};\ \nu = 1, 2, \ldots\}$ with $P^{(1)}(t) \equiv 0$. Recalling the positivity of
$\Pi$ we see that $P^{(\nu+1)}(t) \ge P^{(\nu)}(t)$ for all $t$; hence $P(t) = \lim P^{(\nu)}(t) \ge 0$.
We solve the simultaneous equations (3.1a) and (3.5) as follows.
Denote the right side of (3.5) by $\mathcal{T}(K, P, t)$. Choosing $K_1$ arbitrarily, define
$P_1$ to be the unique solution of
$$P_1(t) = \mathcal{T}(K_1, P_1, t), \qquad t_0 \le t \le T.$$
Having defined $K_1, \ldots, K_\nu$, let $P_\nu$ be the solution of
$$P_\nu(t) = \mathcal{T}(K_\nu, P_\nu, t), \qquad t_0 \le t \le T, \tag{3.6}$$
and define
$$K_{\nu+1}(t) = N(t)^{-1}B(t)'P_\nu(t). \tag{3.7}$$
From what was said previously, the matrices $K_\nu$ and $P_\nu$ are well-defined,
measurable and bounded on $[t_0, T]$. Next we exploit the minimum property
(3.2). For brevity let us write (3.1b) as
$$\frac{dP(t)}{dt} + \psi\{P(t), K(t)\} = 0,$$
where
$$\psi(P, K) = \varphi(P, K) + C'C + K'NK.$$
Then (3.2), (3.6) and (3.7) yield
$$\frac{dP_\nu(t)}{dt} + \psi\{P_\nu(t), K_{\nu+1}(t)\} \le \frac{dP_\nu(t)}{dt} + \psi\{P_\nu(t), K_\nu(t)\} = 0 = \frac{dP_{\nu+1}(t)}{dt} + \psi\{P_{\nu+1}(t), K_{\nu+1}(t)\}, \qquad t_0 \le t \le T.$$
Setting $Q(t) = P_\nu(t) - P_{\nu+1}(t)$, we have
$$\frac{dQ(t)}{dt} + \varphi[Q(t), K_{\nu+1}(t)] + R(t) = 0$$
for a suitable matrix $R(t) \ge 0$; and from this we obtain, as before, $Q(t) \ge 0$.
It follows that for each $t \in [t_0, T]$ the sequence of nonnegative matrices
$\{P_\nu(t)\}$ is monotone nonincreasing and hence, by Lemma 3.1,
$$P(t) = \lim_{\nu \to \infty} P_\nu(t)$$
exists. Since
$$|P_\nu(t)| \le \sup\{|P_1(s)| : t_0 \le s \le T\}, \qquad \nu = 1, 2, \ldots,$$
it follows by (3.7) that the sequence $\{|K_\nu(t)|\}$ is uniformly bounded.
Let $\Phi_A(t, s)$ be the fundamental matrix (cf. (3.3)) determined by $A$.
Then (3.6) is equivalent to
$$P_\nu(t) = \Phi_A(T, t)'P_T\Phi_A(T, t) + \int_t^T \Phi_A(s, t)'\{\Pi[s, P_\nu(s)] - K_\nu(s)'B(s)'P_\nu(s) - P_\nu(s)B(s)K_\nu(s) + C(s)'C(s) + K_\nu(s)'N(s)K_\nu(s)\}\Phi_A(s, t)\,ds. \tag{3.8}$$
Applying the dominated convergence theorem to the integral in (3.8),
we conclude that (3.8) holds with $P_\nu$, $K_\nu$ replaced by $P$, $K$, where
$$K(t) = \lim_{\nu \to \infty} K_\nu(t) = N(t)^{-1}B(t)'P(t). \tag{3.9}$$
Equations (3.8) (with $P_\nu = P$, $K_\nu = K$) and (3.9) are equivalent to
(3.1a, b, c); hence existence of an absolutely continuous solution of
(2.2a, b) is established.
Uniqueness of the solution results from the fact that the function
$$\psi(P, N^{-1}B'P)$$
satisfies a uniform Lipschitz condition in $P$ in every domain $t_0 \le t \le T$,
$|P| < \text{const}$.
Assertions (i) and (ii) of Theorem 2.1 have now been proved.
The minimum property (iii) is proved by using (3.2) in the same manner
as before. Thus from the inequality
$$\psi[P(t), K(t)] \le \psi[P(t), \tilde K(t)]$$
together with (2.4a, b) and (3.2), there follows
$$0 = \frac{d\tilde P(t)}{dt} + \psi[\tilde P(t), \tilde K(t)] \le \frac{dP(t)}{dt} + \psi[P(t), \tilde K(t)].$$
If $Q(t) = \tilde P(t) - P(t)$, then
$$\frac{dQ(t)}{dt} + \psi[\tilde P(t), \tilde K(t)] - \psi[P(t), \tilde K(t)] \le 0; \tag{3.10}$$
hence for a suitable matrix $R(t) \ge 0$,
$$\frac{dQ(t)}{dt} + \varphi[Q(t), \tilde K(t)] + R(t) = 0, \tag{3.11a}$$
$$Q(T) = 0. \tag{3.11b}$$
If (3.11a, b) are written as an integral equation (cf. (3.5)) and solved,
as before, by successive approximation, we obtain a sequence $Q_\nu(t)$ such
that $Q_\nu(t) \to Q(t)$, $\nu \to \infty$. Since we can choose $Q_0(t) \equiv 0$ it is easily seen
that $Q_\nu(t) \ge 0$ for all $t$, $\nu$.
This completes the proof of Theorem 2.1 (iii).
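The minimum property can be observed numerically in the scalar case $n = m = 1$, where $\Pi(P) = \pi P$ for a constant $\pi \ge 0$. The sketch below integrates (2.2a) and the linear equation (2.4a) backward from $T = 0$ with a classical Runge-Kutta step and compares the two solutions on the grid. All numerical values, the three test gains and the function names are hypothetical, and the integrator is only a convenient stand-in for an exact solution.

```python
def rk4_backward(f, p_T, T, t0, steps):
    """Integrate dp/dt = f(t, p) from t = T down to t = t0; returns [(t, p)], T first."""
    h = (t0 - T) / steps                     # negative step size
    t, p, out = T, p_T, [(T, p_T)]
    for _ in range(steps):
        k1 = f(t, p)
        k2 = f(t + h / 2, p + h * k1 / 2)
        k3 = f(t + h / 2, p + h * k2 / 2)
        k4 = f(t + h, p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        out.append((t, p))
    return out

# Hypothetical scalar data: A = a, B = b, C = c, Pi(P) = pi_ * P, terminal value p_T.
a, b, c, N, pi_, p_T = 0.5, 1.0, 1.0, 1.0, 0.3, 0.2

def riccati(t, p):
    """Scalar form of (2.2a), solved for dP/dt."""
    return -((2 * a + pi_) * p - (b * b / N) * p * p + c * c)

def linear(k):
    """Scalar form of (2.4a) for a given gain function k(t), solved for dP/dt."""
    return lambda t, p: -((2 * (a - b * k(t)) + pi_) * p + c * c + N * k(t) ** 2)

P = rk4_backward(riccati, p_T, 0.0, -3.0, 3000)
gains = [lambda t: 0.0, lambda t: 2.0, lambda t: 1.0 + 0.5 * t]
P_tilde = [rk4_backward(linear(k), p_T, 0.0, -3.0, 3000) for k in gains]

# P(t) <= P~(t) should hold at every grid point, up to integration error.
ok = all(p <= q + 1e-9 for run in P_tilde for (_, p), (_, q) in zip(P, run))
print(ok)
```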
4. Solution of the quadratic equation (2.3). In this section we assume
that the parameter matrices $A$, $B$, $C$, $N$ and the operator $\Pi$ are independent
of $t$, and consider exclusively the quadratic equation (2.3).
THEOREM 4.1. If $(A, B)$ is stabilizable and $(C, A)$ is detectable, and if
$\Pi$ satisfies condition (2.1), then (2.3) has at least one solution $\bar P$ in the class
of positive semidefinite matrices. The matrix $A - BN^{-1}B'\bar P$ is stable. If in
addition $(C, A)$ is observable, then $\bar P$ is unique and $\bar P > 0$.
For the proof we need three auxiliary results.
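LEMMA 4.1. Let
$$C'C + D'D = F'F$$
and let $G$ be an arbitrary matrix of suitable dimension.
(i) If $(C, A)$ is observable, then $(F, A + GD)$ is observable.
(ii) If $(C, A)$ is detectable, then $(F, A + GD)$ is detectable.
Proof. Let $\mathcal{R}\{\cdot\}$ denote the range of a matrix and $\mathcal{N}\{\cdot\}$ the null space. It
is easily seen that $\mathcal{R}\{\Gamma(A' + E, F')\} = \mathcal{R}\{\Gamma(A', F')\}$ whenever $\mathcal{R}\{E\} \subset \mathcal{R}\{F'\}$.
Also, if $x'F'Fx = 0$, then $x'C'Cx = x'D'Dx = 0$, so that $\mathcal{N}\{F\} \subset \mathcal{N}\{C\} \cap \mathcal{N}\{D\}$;
taking orthogonal complements, we have
$$\mathcal{R}\{C'\} + \mathcal{R}\{D'\} \subset \mathcal{R}\{F'\}.$$
Thus $\mathcal{R}\{D'G'\} \subset \mathcal{R}\{F'\}$, so that
$$\mathcal{R}\{\Gamma(A' + D'G', F')\} = \mathcal{R}\{\Gamma(A', F')\} \supset \mathcal{R}\{\Gamma(A', C')\},$$
proving (i). For (ii), write $\tilde D = D'G'$ and let $A' + C'R$ be stable. Since
$\mathcal{R}\{C'R - \tilde D\} \subset \mathcal{R}\{F'\}$, a matrix $S$ can be chosen such that
$$A' + \tilde D + F'S = A' + C'R.$$
The proof is complete.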
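The following remark will be useful: if $(C, A)$ is detectable, then either $A$
is stable or the matrix
$$W_t(A, C) = \int_0^t e^{sA'}C'Ce^{sA}\,ds$$
is unbounded on $0 \le t < \infty$. For if $A$ is not stable, let $\lambda$ be an eigenvalue of
$A$ with $\operatorname{Re} \lambda \ge 0$ and eigenvector $\xi$. If $\xi^*$ is the conjugate transpose of $\xi$,
$$\xi^* W_t(A, C)\,\xi = \int_0^t e^{2s \operatorname{Re} \lambda}\,|C\xi|^2\,ds.$$
Suppose the integral is bounded. Then $C\xi = 0$, i.e., $CA^{\nu-1}\xi = \lambda^{\nu-1}C\xi = 0$,
$\nu = 1, \ldots, n$, so that
$$\operatorname{Re} \xi,\ \operatorname{Im} \xi \in \mathcal{N}\{\Gamma(A', C')'\}.$$
If $(C, A)$ is detectable, $\mathcal{R}\{\Gamma(A', C')\} \supset E_{A'}^+$, and taking orthogonal
complements, we have
$$\mathcal{N}\{\Gamma(A', C')'\} \subset (E_{A'}^+)^\perp = E_A^-.$$
Thus, $\operatorname{Re} \xi,\ \operatorname{Im} \xi \in E_A^+ \cap E_A^-$, namely $\xi = 0$, a contradiction.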
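LEMMA 4.2. Let $(C, A)$ be detectable and suppose the equation
$$A'P + PA + \Pi(P) + C'C = 0 \tag{4.1}$$
has a solution $P \ge 0$. Then $A$ is stable. Let $\mathcal{T}(R)$ be defined by
$$\mathcal{T}(R) = \int_0^\infty e^{tA'}\,\Pi(R)\,e^{tA}\,dt,$$
and put $\mathcal{T}^\nu(R) = \mathcal{T}[\mathcal{T}^{\nu-1}(R)]$, $\mathcal{T}^0(R) = R$, $\nu = 1, 2, \ldots$. If $(C, A)$ is
observable, then the series
$$(\mathcal{I} - \mathcal{T})^{-1}(R) = \sum_{\nu=0}^\infty \mathcal{T}^\nu(R) \tag{4.2}$$
converges for every symmetric $n \times n$ matrix $R$. In that case, the solution $P$ of
(4.1) is unique and is given by
$$P = (\mathcal{I} - \mathcal{T})^{-1}\left(\int_0^\infty e^{tA'}C'Ce^{tA}\,dt\right).$$
Here $\mathcal{I}$ denotes the identity operator.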
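In the scalar case the operator of Lemma 4.2 reduces to multiplication by a number, which makes the series (4.2) a geometric series. The following sketch assumes $n = 1$, $A = a < 0$ and $\Pi(R) = \pi R$, so that the operator multiplies by $\theta = \pi / (-2a)$; these values and the function name are illustrative assumptions, not data from the paper.

```python
def neumann_solution(a, pi_, c, terms=200):
    """Partial sum of the series (4.2) applied to Q = c^2 / (-2a), for scalar data.

    Here the operator acts as multiplication by theta = pi_ / (-2a).
    """
    assert a < 0, "A must be stable"
    theta = pi_ / (-2 * a)
    q = c * c / (-2 * a)          # integral of e^{2at} c^2 over [0, infinity)
    total, term = 0.0, q
    for _ in range(terms):
        total += term
        term *= theta
    return total, theta

a, pi_, c = -1.0, 0.8, 1.5        # hypothetical values giving theta = 0.4 < 1
p, theta = neumann_solution(a, pi_, c)
residual = 2 * a * p + pi_ * p + c * c    # left side of (4.1) in the scalar case
print(theta, p, residual)
```

With $c \ne 0$, a nonnegative solution of the scalar form of (4.1) exists exactly when $\theta < 1$, which is also when the geometric series converges; this matches the statement of the lemma in this special case.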

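Proof. From (4.1) there results the identity
$$P = e^{tA'}Pe^{tA} + \int_0^t e^{sA'}[\Pi(P) + C'C]e^{sA}\,ds, \qquad t \ge 0. \tag{4.3}$$
Since $(C, A)$ is detectable, the integral
$$W_t(A, C) = \int_0^t e^{sA'}C'Ce^{sA}\,ds, \qquad t \ge 0,$$
is bounded only if $A$ is stable. Since $P \ge 0$ and $\Pi(P) \ge 0$, (4.3) shows that
this integral is bounded above by $P$, and the first assertion is proved. To
prove the second, let $t \to \infty$ in (4.3) and write
$$Q = \int_0^\infty e^{sA'}C'Ce^{sA}\,ds$$
to obtain $P = \mathcal{T}(P) + Q$. Hence
$$P = \mathcal{T}^{k+1}(P) + \sum_{\nu=0}^k \mathcal{T}^\nu(Q). \tag{4.4}$$
Since $P \ge 0$ there follows $\mathcal{T}^{k+1}(P) \ge 0$; thus the last written sum is dominated
by $P$, and the series of nonnegative matrices converges. Now suppose
$(C, A)$ is observable. Then $Q > 0$, and convergence of the series with
arbitrary $R \ge 0$ in place of $Q$ follows by linearity of $\mathcal{T}$ and the fact that
$R \le \rho Q$ for some $\rho < \infty$. Finally, every symmetric matrix $R$ can be written
in the form $R = R^+ - R^-$, where $R^+ \ge 0$, $R^- \ge 0$.¹ Since the operators
$\mathcal{T}^\nu$ are linear, convergence of (4.2) for arbitrary symmetric $R$ is established.
Uniqueness of $P$ follows by invertibility of $\mathcal{I} - \mathcal{T}$.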
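LEMMA 4.3 (Minimum property). Let $\bar P \ge 0$ satisfy (2.3). Let $Q \ge 0$
and suppose that for some matrix $J$,
$$(A - BJ)'Q + Q(A - BJ) + \Pi(Q) + C'C + J'NJ = 0. \tag{4.5}$$
If $(C, A)$ is detectable, then $A - BJ$ is stable, and if $(C, A)$ is observable,
then $\bar P \le Q$.
Proof. Let $C'C + J'NJ = F'F$. By Lemma 4.1 (with $D = N^{1/2}J$ and
$G = -BN^{-1/2}$), the pair $(F, A - BJ)$ is detectable. Applying Lemma
4.2, we conclude that $A - BJ$ is stable. Setting $Q - \bar P = V$ and using
(3.2) we obtain
$$(A - BJ)'V + V(A - BJ) + \Pi(V) + S = 0 \tag{4.6}$$
for some $S \ge 0$. Then
$$V = \mathcal{T}(V) + R, \tag{4.7}$$
where $\mathcal{T}$ is defined as in Lemma 4.2 (with $A - BJ$ in place of $A$), and
$$R = \int_0^\infty e^{\sigma(A - BJ)'}\,S\,e^{\sigma(A - BJ)}\,d\sigma.$$
Again, by Lemma 4.1, observability of
$(C, A)$ implies observability of $(F, A - BJ)$, and then Lemma 4.2 applied
to (4.5) shows that
$$\sum_{\nu=0}^\infty \mathcal{T}^\nu(I)$$
converges. Since
$$-|V|\,I \le V \le |V|\,I,$$
there follows $\mathcal{T}^\nu(V) \to 0$, $\nu \to \infty$. Hence (4.7) yields
$$V = \mathcal{T}^{k+1}(V) + \sum_{\nu=0}^k \mathcal{T}^\nu(R) \to \sum_{\nu=0}^\infty \mathcal{T}^\nu(R) \quad \text{as } k \to \infty.$$
Thus $V \ge 0$, and Lemma 4.3 is proved.

¹ Choose $T$ so that $T'RT = D$, where $D = \operatorname{diag}(d_1, \ldots, d_n)$. Define
$d_i^+ = \max(d_i, 0)$, $d_i^- = -\min(d_i, 0)$, $i = 1, \ldots, n$, $D^+ = \operatorname{diag}(d_i^+)$,
$D^- = \operatorname{diag}(d_i^-)$. Then $R^+ = TD^+T'$, $R^- = TD^-T'$.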

Turning to the proof of Theorem 4.1 we use, as before, quasi-linearization
and successive approximations. Equation (2.3) is equivalent to the
pair of equations
$$K = N^{-1}B'P, \tag{4.8a}$$
$$(A - BK)'P + P(A - BK) + \Pi(P) + C'C + K'NK = 0. \tag{4.8b}$$
If $K_0 = N^{-1}B'P$ and $K$ is arbitrary, (3.2) yields the inequality
$$(A - BK_0)'P + P(A - BK_0) + K_0'NK_0 \le (A - BK)'P + P(A - BK) + K'NK. \tag{4.9}$$
First we solve (4.8b) for a suitable fixed matrix $K$. If $A - BK$ is stable,
then (4.8b) is equivalent to
$$P = \int_0^\infty e^{t(A - BK)'}[\Pi(P) + C'C + K'NK]\,e^{t(A - BK)}\,dt. \tag{4.10}$$
Denote the right side of (4.10) by $f(K, P)$. Since
$$-|P|\,\Pi(I) \le \Pi(P) \le |P|\,\Pi(I),$$
condition (2.1) implies that for some $K$,
$$\left|\int_0^\infty e^{t(A - BK)'}\,\Pi(P)\,e^{t(A - BK)}\,dt\right| \le \theta\,|P|, \tag{4.11}$$
where $\theta \in (0, 1)$ is independent of $P$. Hence for this $K$ the function $f(K, P)$
is a contraction mapping in $P$, and so (4.10) has a unique solution. For
later reference we note that the approximating sequence $\{P^{(\nu)}\}$, defined by
$$P^{(1)} = 0, \qquad P^{(\nu)} = f(K, P^{(\nu-1)}), \qquad \nu = 2, 3, \ldots,$$
is monotone nondecreasing.
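The contraction argument for a fixed gain can be followed step by step in the scalar case. The sketch below assumes $n = m = 1$ and $\Pi(P) = \pi P$, for which the integral in (4.10) evaluates in closed form whenever $a - bk < 0$; the parameter values and names are hypothetical.

```python
def f(k, p, a, b, c, N, pi_):
    """Scalar form of the right side of (4.10) for a stabilizing gain k."""
    rate = 2 * (b * k - a)        # integral of e^{2t(a - bk)} over [0, inf) is 1 / rate
    assert rate > 0, "a - b*k must be negative"
    return (pi_ * p + c * c + N * k * k) / rate

# Hypothetical data: unstable open loop (a > 0), stabilized by k = 3.
a, b, c, N, pi_, k = 1.0, 1.0, 1.0, 1.0, 0.5, 3.0
theta = pi_ / (2 * (b * k - a))   # contraction constant of (4.11) in this case

seq = [0.0]                       # P^(1) = 0
for _ in range(60):
    seq.append(f(k, seq[-1], a, b, c, N, pi_))

limit = (c * c + N * k * k) / (2 * (b * k - a) - pi_)   # fixed point of P = f(k, P)
print(theta, seq[-1], limit)
```

Successive differences shrink by the factor $\theta$ at every step, so the iterates increase toward the fixed point, in line with the monotonicity noted in the text.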
We can now solve the pair of equations (4.8a, b). By assumption there
exists $K_1$ such that $A - BK_1$ is stable and (4.11) is true. Let $P_1$ be the
solution of $P = f(K_1, P)$ and define $K_2 = N^{-1}B'P_1$. Next solve the
equation
$$P = f(K_2, P)$$
by successive approximations. To see that this is possible observe that
(4.9) and Lemma 4.3 imply that $A - BK_2$ is stable; hence $f(K_2, P)$ is
defined. Now set
$$P_2^{(1)} = 0, \qquad P_2^{(\nu+1)} = f(K_2, P_2^{(\nu)}), \qquad \nu = 1, 2, \ldots.$$
As before it follows that $P_2^{(\nu)} \ge 0$, $\nu = 2, 3, \ldots$, and $\{P_2^{(\nu)}\}$ is nondecreasing.
We shall show that
$$P_2^{(\nu)} \le P_1, \qquad \nu = 1, 2, \ldots. \tag{4.12}$$
The inequality (4.9) (with $K_0 = K_2$ and $P = P_1$) implies
$$f(K_2, P_2^{(\nu)}) \le -\int_0^\infty e^{t(A - BK_2)'}[(A - BK_2)'P_1 + P_1(A - BK_2) + \Pi(P_1) - \Pi(P_2^{(\nu)})]\,e^{t(A - BK_2)}\,dt$$
$$= P_1 - \int_0^\infty e^{t(A - BK_2)'}\,\Pi(P_1 - P_2^{(\nu)})\,e^{t(A - BK_2)}\,dt.$$
Thus if $P_2^{(\nu)} \le P_1$ then $P_2^{(\nu+1)} = f(K_2, P_2^{(\nu)}) \le P_1$; since $P_2^{(1)} = 0$,
(4.12) is true. It follows by Lemma 3.1 that the limit
$$P_2 = \lim_{\nu \to \infty} P_2^{(\nu)}$$
exists, and $0 \le P_2 \le P_1$.
Repeating this procedure we obtain sequences $\{K_\nu\}$, $\{P_\nu\}$ with
$K_{\nu+1} = N^{-1}B'P_\nu$, and $0 \le P_{\nu+1} \le P_\nu$. Then
$$\bar P = \lim_{\nu \to \infty} P_\nu$$
exists. If
$$\bar K = \lim_{\nu \to \infty} K_\nu = N^{-1}B'\bar P,$$
it is clear that $\bar K$, $\bar P$ satisfy (4.8a, b), and (4.10) shows that $\bar P \ge 0$.
Lemma 4.3 implies that $A - B\bar K$ is stable. If $(C, A)$ is observable,
uniqueness of $\bar P$ in the class $P \ge 0$ is an immediate result of the minimum
property. Finally, Lemma 4.1 shows that $((C'C + \bar K'N\bar K)^{1/2}, A - B\bar K)$
is observable if $(C, A)$ is observable; then it is clear from (4.10)
that $\bar P > 0$. Theorem 4.1 is proved.
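The alternation between (4.8b) and (4.8a) used in this proof can be sketched for scalar data, where the linear equation (4.8b) is solved in closed form at each step. The code assumes $n = m = 1$ and $\Pi(P) = \pi P$; the starting gain, the parameter values and the function names are hypothetical, and the closed-form root is used only as a reference value for this example.

```python
import math

def solve_linear(k, a, b, c, N, pi_):
    """Solve the scalar form of (4.8b) for P with the gain k held fixed."""
    denom = 2 * (b * k - a) - pi_
    assert denom > 0, "gain must satisfy the scalar analogue of (4.11)"
    return (c * c + N * k * k) / denom

def quasilinearize(k1, a, b, c, N, pi_, iters=30):
    """Alternate P_nu = solution of (4.8b) with K_nu, then K_{nu+1} = N^{-1} B' P_nu."""
    k, ps = k1, []
    for _ in range(iters):
        p = solve_linear(k, a, b, c, N, pi_)
        ps.append(p)
        k = b * p / N
    return ps

a, b, c, N, pi_ = 1.0, 1.0, 1.0, 1.0, 0.5   # hypothetical data, unstable open loop
ps = quasilinearize(3.0, a, b, c, N, pi_)

# Positive root of the scalar form of (2.3): (2a + pi_) p - (b^2 / N) p^2 + c^2 = 0.
s = 2 * a + pi_
p_bar = N * (s + math.sqrt(s * s + 4 * b * b * c * c / N)) / (2 * b * b)
print(ps[:3], p_bar)
```

The printed iterates decrease toward the root, mirroring the inequality $0 \le P_{\nu+1} \le P_\nu$ obtained above.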
5. Proof of Theorem 2.1 (iv): asymptotic behavior of the solution. In
this section, we prove assertion (iv) of Theorem 2.1. As in §4, the parameter
matrices $A$, $B$, $C$, $N$ and the operator $\Pi$ are independent of $t$.
LEMMA 5.1. Let $(C, A)$ be observable, let $P_0 \ge 0$, and suppose $P(t)$ satisfies
the differential equation
$$\frac{dP(t)}{dt} + A'P(t) + P(t)A + \Pi[P(t)] + C'C = 0, \qquad t \le 0, \tag{5.1a}$$
$$P(0) = P_0. \tag{5.1b}$$
If there exists a constant matrix $P^* \ge 0$ such that
$$A'P^* + P^*A + \Pi(P^*) + C'C = 0, \tag{5.2}$$
then
$$P(t) \to P^* \quad \text{as } t \to -\infty. \tag{5.3}$$
Proof. By Lemma 4.2, $P^*$ is the unique solution of (5.2). Let $\mathcal{L}$ denote
the operator defined by
$$\mathcal{L}(P) = A'P + PA + \Pi(P).$$
From Lemma 4.2 we know that $\mathcal{L}$ (regarded as a linear transformation on
the $n \times n$ symmetric matrices) is nonsingular, and that $-\mathcal{L}^{-1}(Q) \ge 0$ if
$Q \ge 0$.
Since $\mathcal{L}$ is linear and independent of $t$, it is enough to consider the
homogeneous equation
$$\frac{dQ(t)}{dt} + \mathcal{L}[Q(t)] = 0, \qquad t \le 0, \qquad Q(0) = P_0, \tag{5.4}$$
and to show that
$$Q(t) \to 0 \quad \text{as } t \to -\infty. \tag{5.5}$$
For this let
$$R(t) = \int_t^0 Q(s)\,ds, \qquad t \le 0,$$
and let $\bar R$ be the (unique) solution of
$$\mathcal{L}(\bar R) + P_0 = 0. \tag{5.6}$$
It will be shown that $0 \le R(t) \le \bar R$ for $t \le 0$. In fact, $Q(t) \ge 0$ by (3.5),
so that $R(t)$ is nondecreasing as $t$ decreases. Integration of (5.4) yields
$$\frac{dR(t)}{dt} + \mathcal{L}[R(t)] + P_0 = 0. \tag{5.7}$$
Setting $F(t) = \bar R - R(t)$, we obtain from (5.6) and (5.7),
$$\frac{dF(t)}{dt} + \mathcal{L}[F(t)] = 0, \qquad F(0) = \bar R. \tag{5.8}$$
Since $\bar R = -\mathcal{L}^{-1}(P_0) \ge 0$, it is clear from (5.8) (cf. (3.5)) that $F(t) \ge 0$,
that is, $0 \le R(t) \le \bar R$, and
$$R(-\infty) = \lim_{t \to -\infty} R(t)$$
exists.
Next, (5.7) shows that $dR/dt$ is bounded and then that $d^2R/dt^2$ is bounded.
Since
$$R(-\infty) = -\int_{-\infty}^0 \frac{dR(t)}{dt}\,dt,$$
it follows that $Q(t) = -dR(t)/dt \to 0$ as $t \to -\infty$. The proof is complete.
Next, consider the Riccati equation
$$\frac{dP(t)}{dt} + [A - BK(t)]'P(t) + P(t)[A - BK(t)] + \Pi[P(t)] + C'C + K(t)'NK(t) = 0, \qquad t \le 0, \tag{5.9a}$$
$$K(t) = N^{-1}B'P(t), \tag{5.9b}$$
$$P(0) = P_0 \ge 0. \tag{5.9c}$$
From §3 we know that (5.9a, b, c) have a unique solution $P(t) \ge 0$.
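LEMMA 5.2. (i) If $(A, B)$ is stabilizable and if $\Pi$ satisfies (2.1), then the
solution $P(t)$ of (5.9a, b, c) is bounded on $(-\infty, 0]$.
(ii) If $P_0 = 0$, then $P(t)$ is monotone nondecreasing as $t$ decreases.
Proof. Let $\hat K$ be a constant matrix such that $\hat A = A - B\hat K$ is stable, and
let $\hat P(t)$ be the solution of (5.9a) and (5.9c) with $K(t) \equiv \hat K$. By the
minimum property (Theorem 2.1 (iii)), $P(t) \le \hat P(t)$. It will be shown that
$\hat P(t)$ is bounded for suitable $\hat K$. Now
$$\hat P(t) = e^{-t\hat A'}P_0e^{-t\hat A} + \int_t^0 e^{-(t-s)\hat A'}\{\Pi[\hat P(s)] + C'C + \hat K'N\hat K\}\,e^{-(t-s)\hat A}\,ds. \tag{5.10}$$
We solve (5.10) by successive approximation, setting $\hat P_0(t) \equiv 0$. Then
$$\hat P_{\nu+1}(t) \le \gamma I + \int_t^0 e^{-(t-s)\hat A'}\,\Pi[\hat P_\nu(s)]\,e^{-(t-s)\hat A}\,ds, \qquad \nu = 0, 1, \ldots, \tag{5.11}$$
where $\gamma$ is a constant bounding the terms of (5.10) that do not involve $\Pi$.
If $\hat K$ is chosen so that (4.11) holds, then (5.11) gives
$|\hat P_{\nu+1}(t)| \le \gamma + \theta \sup_s |\hat P_\nu(s)|$, so the iterates are uniformly bounded.
Hence,
$$\hat P(t) = \lim_{\nu \to \infty} \hat P_\nu(t)$$
is a bounded function of $t$, and (i) follows.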
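To prove (ii) set $P_0 = 0$ and let $\Phi(t, s)$ denote the fundamental matrix
associated with $A - BK(t)$ (cf. (3.3)). Then
$$P(t) = \int_t^0 \Phi(s, t)'\{\Pi[P(s)] + C'C + K(s)'NK(s)\}\Phi(s, t)\,ds. \tag{5.12}$$
Let $\tau \ge 0$ be fixed, and define
$$\tilde K(t) = K(t - \tau), \qquad t \le 0.$$
If $\tilde\Phi(t, s)$ is the fundamental matrix determined by $A - B\tilde K(t)$, then
clearly
$$\tilde\Phi(t, s) = \Phi(t - \tau, s - \tau).$$
Let $\tilde P(t)$ be the solution of (5.9a) with $K$ replaced by $\tilde K$ and $\tilde P(0) = 0$.
Again by the minimum property (Theorem 2.1 (iii)), $P(t) \le \tilde P(t)$, or
$$P(t) \le \int_t^0 \tilde\Phi(s, t)'\{\Pi[\tilde P(s)] + C'C + \tilde K(s)'N\tilde K(s)\}\tilde\Phi(s, t)\,ds$$
$$= \int_{t-\tau}^{-\tau} \tilde\Phi(s + \tau, t)'\{\Pi[\tilde P(s + \tau)] + C'C + \tilde K(s + \tau)'N\tilde K(s + \tau)\}\tilde\Phi(s + \tau, t)\,ds$$
$$\le \int_{t-\tau}^0 \Phi(s, t - \tau)'\{\Pi[\tilde P(s + \tau)] + C'C + K(s)'NK(s)\}\Phi(s, t - \tau)\,ds, \tag{5.13}$$
where we have set $\tilde P(s) = 0$ for $s \ge 0$. Now
$$\tilde P(s + \tau) = \int_{s+\tau}^0 \tilde\Phi(\sigma, s + \tau)'\{\Pi[\tilde P(\sigma)] + C'C + \tilde K(\sigma)'N\tilde K(\sigma)\}\tilde\Phi(\sigma, s + \tau)\,d\sigma$$
$$= \int_{s+\tau}^0 \Phi(\sigma - \tau, s)'\{\Pi[\tilde P(\sigma)] + C'C + K(\sigma - \tau)'NK(\sigma - \tau)\}\Phi(\sigma - \tau, s)\,d\sigma$$
$$\le \int_s^0 \Phi(\sigma, s)'\{\Pi[\tilde P(\sigma + \tau)] + C'C + K(\sigma)'NK(\sigma)\}\Phi(\sigma, s)\,d\sigma.$$
Writing $Q(s) = P(s) - \tilde P(s + \tau)$ and using (5.12), we see that
$$Q(s) \ge \int_s^0 \Phi(\sigma, s)'\,\Pi[Q(\sigma)]\,\Phi(\sigma, s)\,d\sigma. \tag{5.14}$$
Denote the right side of (5.14) by $\mathcal{S}Q(s)$. Defining $\mathcal{S}^\nu$ by iteration, there
results
$$Q(s) \ge \mathcal{S}^\nu Q(s)$$
and
$$|\mathcal{S}^\nu Q(s)| \le \frac{\alpha(\beta|s|)^\nu}{\nu!}, \qquad \nu = 1, 2, \ldots,$$
for suitable constants $\alpha > 0$, $\beta > 0$ (which may depend on $s$). Letting $\nu \to \infty$
we obtain $Q(s) \ge 0$, or
$$\tilde P(s + \tau) \le P(s), \qquad s \le 0.$$
Substituting this result in (5.13) and comparing with (5.12), we conclude
that
$$P(t) \le P(t - \tau), \qquad t \le 0, \quad \tau \ge 0,$$
and the proof is complete.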
LEMMA 5.3. If $(A, B)$ is stabilizable and $(C, A)$ is detectable, if $\Pi$ satisfies
(2.1), and if $P_0 = 0$, then the solution $P(t)$ of (5.9a, b, c) has the property
$$\lim_{t \to -\infty} P(t) = \bar P, \tag{5.15}$$
where $\bar P$ is a positive semidefinite solution of (2.3).
Proof. By Lemmas 3.1 and 5.2, the limit in (5.15) exists and is positive
semidefinite. Since $P(t)$ is bounded, (5.9a, b, c) show that the same is
true of $dP(t)/dt$ and $d^2P(t)/dt^2$; then convergence of the integral
$$\int_{-\infty}^0 \frac{dP(t)}{dt}\,dt \tag{5.16}$$
shows that $dP/dt \to 0$ as $t \to -\infty$. The conclusion follows by inspection of
(2.3) and (5.9a, b, c).
We turn to a proof of assertion (iv) of Theorem 2.1. Set $T = 0$. Boundedness
of $P(t)$ follows from Lemma 5.2. If $(C, A)$ is observable, then, by
Theorem 4.1, (2.3) has a unique solution $\bar P \ge 0$. Set $\bar K = N^{-1}B'\bar P$,
$\bar A = A - B\bar K$; Theorem 4.1 implies that $\bar A$ is stable. Denote by $P^*(t)$
the solution of (5.9a) and (5.9c) with $K(t) \equiv \bar K$. By the minimum property
(Theorem 2.1 (iii)), the solution $P(t)$ of (5.9a, b, c) satisfies
$$P(t) \le P^*(t), \qquad t \le 0, \tag{5.17}$$
and by Lemma 5.1 (with $A$ replaced by $\bar A$, $P^*$ by $\bar P$ and $C'C$ by
$C'C + \bar K'N\bar K$),
$$P^*(t) \to \bar P \quad \text{as } t \to -\infty. \tag{5.18}$$
On the other hand, if $P_*(t)$ denotes the solution of (5.9a) and (5.9b)
with $P_*(0) = 0$, then by Lemma 5.3,
$$P_*(t) \to \bar P \quad \text{as } t \to -\infty. \tag{5.19}$$
Hence the desired result will follow if we show that
$$P(t) \ge P_*(t), \qquad t \le 0. \tag{5.20}$$
For this observe from (3.5) that
$$P(t) \ge \int_t^0 \Phi(s, t)'\{\Pi[P(s)] + C'C + K(s)'NK(s)\}\Phi(s, t)\,ds. \tag{5.21}$$
Denote the right side of (5.21) by $Q(t)$. It will be shown that $Q(t) \ge P_*(t)$.
Write $K_*(t) = N^{-1}B'P_*(t)$ and $\Phi_*(t, s)$ for the fundamental matrix
associated with $A - BK_*(t)$; and let $\tilde P(t)$ be the solution of (5.9a) with
$K(t)$ as before and $\tilde P(0) = 0$. Then, by the minimum property,
$$P_*(t) \le \tilde P(t) \tag{5.22}$$
and
$$\tilde P(t) = \int_t^0 \Phi(s, t)'\{\Pi[\tilde P(s)] + C'C + K(s)'NK(s)\}\Phi(s, t)\,ds. \tag{5.23}$$
Then (5.21) and (5.23) yield
$$P(t) - \tilde P(t) \ge \int_t^0 \Phi(s, t)'\,\Pi[P(s) - \tilde P(s)]\,\Phi(s, t)\,ds$$
and this shows, as in the proof of Lemma 5.2 (ii), that
$$P(t) - \tilde P(t) \ge 0, \qquad t \le 0. \tag{5.24}$$
Inequalities (5.22) and (5.24) yield (5.20). Combining (5.17) and (5.20),
we have that
$$P_*(t) \le P(t) \le P^*(t), \qquad t \le 0. \tag{5.25}$$
Since the extreme terms of the inequality (5.25) both tend to $\bar P$ as $t \to -\infty$,
the desired result is established.
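The limiting behavior asserted in Theorem 2.1 (iv) can be watched in a scalar simulation. The sketch below integrates the scalar Riccati equation backward from $t = 0$ for two initial values, one at zero and one well above the equilibrium, and compares both end values with the positive root of the scalar quadratic equation. It assumes $n = m = 1$ and $\Pi(P) = \pi P$; the numerical data, the horizon and the Runge-Kutta integrator are choices made for this illustration only.

```python
import math

def riccati_backward(p0, a, b, c, N, pi_, horizon=10.0, steps=10000):
    """RK4 for dP/dt = -[(2a + pi_) P - (b^2 / N) P^2 + c^2], from t = 0 to t = -horizon."""
    def f(p):
        return -((2 * a + pi_) * p - (b * b / N) * p * p + c * c)
    h = -horizon / steps              # negative step: time runs backward
    p, path = p0, [p0]
    for _ in range(steps):
        k1 = f(p)
        k2 = f(p + h * k1 / 2)
        k3 = f(p + h * k2 / 2)
        k4 = f(p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        path.append(p)
    return path

a, b, c, N, pi_ = 1.0, 1.0, 1.0, 1.0, 0.5      # hypothetical scalar data
lower = riccati_backward(0.0, a, b, c, N, pi_)  # P(0) = 0
upper = riccati_backward(5.0, a, b, c, N, pi_)  # P(0) = 5

s = 2 * a + pi_
p_bar = N * (s + math.sqrt(s * s + 4 * b * b * c * c / N)) / (2 * b * b)
print(lower[-1], upper[-1], p_bar)
```

The run started from zero increases as $t$ decreases, as in Lemma 5.2 (ii), and both runs end near the same value.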
6. Applications.
6.1. Stochastic control. An equation of type (2.2a) arises in optimal
control of a linear system with state-dependent white noise and quadratic
cost (cf. [6], where time-invariant control was discussed, leading to (2.3)).
In this problem,
$$[\Pi(t, P)]_{ij} = \operatorname{tr}[G_i(t)'PG_j(t)], \qquad i, j = 1, \ldots, n,$$
for certain $G_i$. We mention that an obvious generalization to include
control-dependent white noise leads to (2.2a) with $\Pi$ replaced by
$$\Pi + PB(\Gamma + N)^{-1}\Gamma(\Gamma + N)^{-1}B'P. \tag{6.1}$$
In (6.1), $\Gamma = \Gamma(t, P)$ is a function of the same type as $\Pi$. This case can be
discussed in exactly the same way, if (3.1a) is replaced by
$$K = (\Gamma + N)^{-1}B'P.$$
6.2. Linear filtering. A well-known linear filtering scheme [7] leads to the
following equation for the covariance matrix:
$$\frac{dP}{dt} = AP + PA' + FF' - (PC' + FG')(GG')^{-1}(PC' + FG')', \qquad t_1 \le t \le t_2, \qquad P(t_1) = P_0 \ge 0. \tag{6.2}$$
It is clear that (6.2) is equivalent to (2.2a, b) after replacing in (6.2)
$t$, $t_1$, $t_2$ by $T + t_0 - t$, $t_0$, $T$, respectively, setting $\Pi = 0$, and redefining
matrices. Thus Theorem 2.1 shows that (6.2) uniquely determines the
covariance matrix. If the parameter matrices are constants, then the limit
property of Theorem 2.1 (iv) holds if $(A' - C'(GG')^{-1}GF', C')$ is stabilizable
and $(H', A' - C'(GG')^{-1}GF')$ is observable, where
$$FF' - FG'(GG')^{-1}GF' = HH'.$$
In particular this is true if $(C, A)$ is detectable, $(A, F)$ is controllable, and
$$FF' - FG'(GG')^{-1}GF' \ge \rho FF'$$
for some $\rho \in (0, 1]$. The latter result under strengthened hypotheses
was reported in [7, 13.33].
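The "redefining matrices" step for the filtering equation can be written out explicitly. The following is a sketch of one way to do it; the abbreviation $M$ and the hatted symbols are labels introduced here and are not notation from the paper.

```latex
% Sketch: reduction of (6.2) to the form (2.2a) with \Pi = 0.
% M := (GG')^{-1}; hatted symbols are labels introduced for this sketch.
\begin{align*}
\frac{dP}{dt} &= AP + PA' + FF' - (PC' + FG')\,M\,(CP + GF') \\
              &= (A - FG'MC)P + P(A - FG'MC)' + (FF' - FG'MGF') - PC'MCP .
\end{align*}
% Put \hat A = A' - C'MGF', \hat B = C', \hat N = GG', \hat C = H'
% with HH' = FF' - FG'MGF', and \hat P(t) = P(T + t_0 - t), where t_1 = t_0, t_2 = T.
% Time-dependent coefficients are evaluated at the reflected time T + t_0 - t.
% Since d\hat P/dt = -dP/dt at the reflected time,
\[
\frac{d\hat P}{dt} + \hat A'\hat P + \hat P\hat A
  - \hat P\hat B\hat N^{-1}\hat B'\hat P + \hat C'\hat C = 0,
\qquad \hat P(T) = P_0 .
\]
```

With these labels, the stabilizability and observability conditions quoted in the text are those of the pairs $(\hat A, \hat B)$ and $(\hat C, \hat A)$.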
REFERENCES
[1] J. E. POTTER, A matrix equation arising in statistical filter theory, Rep. RE-9,
Experimental Astronomy Laboratory, Massachusetts Institute of Tech-
nology, Cambridge, 1965.
[2] D. L. KLEINMAN, On the linear regulator problem and the matrix Riccati equation,
Rep. ESL-R-271, Electronic Systems Laboratory, Massachusetts Institute
of Technology, Cambridge, 1966.
[3] W. M. WONHAM, On pole assignment in multi-input controllable linear systems,
IEEE Trans. Automatic Control, AC-12 (1967), pp. 660-665.
[4] L. V. KANTOROVICH AND G. P. AKILOV, Functional Analysis in Normed Spaces,
Macmillan, New York, 1964.
[5] R. BELLMAN, Functional equations in the theory of dynamic programming,
positivity and quasilinearity, Proc. Nat. Acad. Sci. U.S.A., 41 (1955), pp. 743-746.
[6] W. M. WONHAM, Optimal stationary control of a linear system with state-dependent
noise, this Journal, 5 (1967), pp. 486-500.
[7] R. E. KALMAN, New methods in Wiener filtering theory, Proc. First Symposium
on Engineering Applications of Random Function Theory and Probability,
John Wiley, New York, 1963, pp. 270-388.
