The Moore—Penrose or
generalized inverse
1. Basic definitions
Equations of the form
$$Ax = b, \qquad A \in \mathbb{C}^{m \times n},\ x \in \mathbb{C}^{n},\ b \in \mathbb{C}^{m} \tag{1}$$
occur in many pure and applied problems. If $A \in \mathbb{C}^{n \times n}$ is invertible,
then the system of equations (1) is, in principle, easy to solve. The unique
solution is $x = A^{-1}b$. If $A$ is an arbitrary matrix in $\mathbb{C}^{m \times n}$, then it becomes
more difficult to solve (1). There may be none, one, or an infinite number
of solutions depending on whether $b \in R(A)$ and whether $n - \operatorname{rank}(A) > 0$.
One would like to be able to find a matrix (or matrices) $C$, such that
solutions of (1) are of the form $Cb$. But if $b \notin R(A)$, then (1) has no solution.
This will eventually require us to modify our concept of what a solution of
(1) is. However, as the applications will illustrate, this is not as unnatural
as it sounds. But for now we retain the standard definition of solution.
To motivate our first definition of the generalized inverse, consider the
functional equation
$$y = f(x), \qquad x \in \mathcal{D} \subset \mathbb{R}, \tag{2}$$
where $f$ is a real-valued function with domain $\mathcal{D}$. One procedure for solving
(2) is to restrict the domain of $f$ to a smaller set $\mathcal{D}'$ so that the restriction $f|_{\mathcal{D}'}$ is one to
one. Then an inverse function $f^{-1}$ from $R(f)$ to $\mathcal{D}'$ is defined by $f^{-1}(y) = x$
if $x \in \mathcal{D}'$ and $f(x) = y$. Thus $f^{-1}(y)$ is a solution of (2) for $y \in R(f)$. This is
how the arcsec, arcsin, and other inverse functions are normally defined.
The same procedure can be used in trying to solve equation (1). As
usual, we let $\mathcal{A}$ be the linear function from $\mathbb{C}^n$ into $\mathbb{C}^m$ defined by $\mathcal{A}x = Ax$
for $x \in \mathbb{C}^n$. To make $\mathcal{A}$ a one to one linear transformation it must be
restricted to a subspace complementary to $N(A)$. An obvious one is
$N(A)^\perp = R(A^*)$. This suggests the following definition of the generalized
inverse.
Definition 1.1.1 Functional definition of the generalized inverse If
THE MOORE-PENROSE OR GENERALIZED INVERSE 9
Downloaded 04/19/16 to 132.239.1.231. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
of $A$. Using (i) and the fact that $R(BC) \subseteq R(B)$ for matrices $B$ and $C$, we get
$R(A) = R(AA^\dagger A) \subseteq R(AA^\dagger) \subseteq R(A)$, so that $R(A) = R(AA^\dagger)$.
Thus $AA^\dagger = P_{R(A)}$ as desired. The proof that $A^\dagger A = P_{R(A^*)}$ is similar and is
left to the reader as an exercise. One way to show uniqueness is to show
that if $A^\dagger$ satisfies (a) and (b), or (i)-(iv), then it satisfies Definition 1.
Suppose then that $A^\dagger$ is a matrix satisfying (i)-(iv), (a), and (b). If $x \in R(A)^\perp$,
then by (a), $AA^\dagger x = 0$. Thus by (ii) $A^\dagger x = A^\dagger AA^\dagger x = A^\dagger 0 = 0$. If $x \in R(A)$,
then there exists $y \in R(A^*)$ such that $Ay = x$. But $A^\dagger x = A^\dagger Ay = y$. The last
equality follows by observing that taking the adjoint of both sides of
(i) gives $P_{R(A^\dagger)}A^* = A^*$ so that $R(A^*) \subseteq R(A^\dagger)$. But $y = (A|_{R(A^*)})^{-1}x$. Thus
$A^\dagger$ satisfies Definition 1. ∎
As this proof illustrates, equations (i) and (ii) are, in effect, cancellation
laws. While we cannot say that $AB = AC$ implies $B = C$, we can say that if
$A^\dagger AB = A^\dagger AC$ then $AB = AC$. This type of cancellation will frequently
appear in proofs and the exercises.
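The cancellation law just quoted needs only equation (i), $AA^\dagger A = A$. A short worked derivation, multiplying the hypothesis on the left by $A$:

```latex
\begin{align*}
A^\dagger A B = A^\dagger A C
  &\implies A\,(A^\dagger A B) = A\,(A^\dagger A C) \\
  &\implies (A A^\dagger A)\,B = (A A^\dagger A)\,C \\
  &\implies AB = AC .
\end{align*}
```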
For obvious reasons, the generalized inverse is often referred to as the
Moore-Penrose inverse. Note also that if $A \in \mathbb{C}^{n \times n}$ and $A$ is invertible,
then $A^{-1} = A^\dagger$ so that the generalized inverse lives up to its name.
C 1 01 = P
O O J
= [1 /2
R(A) _-P R(A')^ X satisfies AX = P R(A) and XA = P R(A•)' But
Fact 1.2.1 If $A \in \mathbb{C}^{m \times n}$, $B \in \mathbb{C}^{n \times p}$, then $(AB)^\dagger$ is not necessarily the same
as $B^\dagger A^\dagger$. Furthermore $(A^\dagger)^2$ is not necessarily equal to $(A^2)^\dagger$.
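Example 1.2.2 Let
$$A = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 0 & -2 \\ -1 & 1 & 1 \end{bmatrix}.$$
Then $A = BJB^{-1}$ where
$$B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix} \quad\text{and}\quad J = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$
The characteristic polynomial of $A$ and $J$ is $\lambda^2(\lambda - 2)$ with elementary divisors $\lambda^2$ and $\lambda - 2$. Also
$$J^\dagger = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1/2 \end{bmatrix}$$
and the characteristic polynomial of $J^\dagger$ is $\lambda^2(\lambda - 1/2)$ with elementary
divisors $\lambda^2$ and $\lambda - 1/2$. An easy computation gives
$$A^\dagger = \frac{1}{12}\begin{bmatrix} 1 & 2 & -1 \\ 6 & 0 & 6 \\ -1 & -2 & 1 \end{bmatrix}.$$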
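But $A^\dagger$ has characteristic polynomial $\lambda^2(\lambda - 1/6)$, so its non-zero eigenvalue $1/6$
differs from the non-zero eigenvalue $1/2$ of $J^\dagger$. Hence $A^\dagger$ and $J^\dagger$ are not similar,
even though $A$ and $J$ are.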
Thus, if $A$ and $C$ are similar, then about the only thing that one can
always say about $A^\dagger$ and $C^\dagger$ is that they have the same rank.
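The comparison of characteristic polynomials in Example 1.2.2 can be checked in exact rational arithmetic. The following is a minimal sketch, assuming the matrices $A$, $J$, $A^\dagger$ and $J^\dagger$ as given in the example; the helper name `char_poly_coeffs` is illustrative and does not come from the text.

```python
from fractions import Fraction as F

def char_poly_coeffs(X):
    """For a 3x3 matrix X return (t, s, d) such that
    det(lambda*I - X) = lambda^3 - t*lambda^2 + s*lambda - d."""
    t = X[0][0] + X[1][1] + X[2][2]                      # trace
    s = (X[0][0] * X[1][1] - X[0][1] * X[1][0]           # sum of the three
         + X[0][0] * X[2][2] - X[0][2] * X[2][0]         # principal 2x2 minors
         + X[1][1] * X[2][2] - X[1][2] * X[2][1])
    d = (X[0][0] * (X[1][1] * X[2][2] - X[1][2] * X[2][1])
         - X[0][1] * (X[1][0] * X[2][2] - X[1][2] * X[2][0])
         + X[0][2] * (X[1][0] * X[2][1] - X[1][1] * X[2][0]))  # determinant
    return t, s, d

# Matrices of Example 1.2.2 (assumed as printed there).
A = [[1, 1, -1], [2, 0, -2], [-1, 1, 1]]
J = [[0, 1, 0], [0, 0, 0], [0, 0, 2]]
A_dag = [[F(x, 12) for x in row] for row in ([1, 2, -1], [6, 0, 6], [-1, -2, 1])]
J_dag = [[F(0), F(0), F(0)], [F(1), F(0), F(0)], [F(0), F(0), F(1, 2)]]

for name, X in [("A", A), ("J", J), ("A_dag", A_dag), ("J_dag", J_dag)]:
    print(name, char_poly_coeffs(X))
```

Similar matrices must share all three coefficients, so a difference in any coefficient between `A_dag` and `J_dag` is enough to show that the two generalized inverses are not similar.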
A type of inverse that behaves better with respect to similarity is
discussed in Chapter VII. Since the generalized inverse does not have all
the properties of the inverse, it becomes important to know what properties
it does have and which identities it does satisfy. There are, of course, an
arbitrarily large number of true statements about generalized inverses. The
next theorem lists some of the more basic properties.
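Proposition 1.2.1
$$\begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{bmatrix}^\dagger = \begin{bmatrix} A_1^\dagger & & 0 \\ & \ddots & \\ 0 & & A_m^\dagger \end{bmatrix}.$$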
The proof of Proposition 1 is left as an exercise.
As the proof of Theorem 1 illustrates, it is frequently helpful to know the
ranges and null spaces of expressions involving $A^\dagger$. For ease of reference
we now list several of these basic geometric properties. (P8) and (P9) have
been done already. The rest are left as an exercise.
3. Computation of $A^\dagger$
In learning any type of mathematics, the working out of examples is useful,
if not essential. The calculation of $A^\dagger$ from $A$ can be difficult, and will be
discussed more fully later. For the present we will give two methods which
will enable the reader to begin to calculate the generalized inverses of
small matrices. The first method is worth knowing because using it should
help give a feeling of what $A^\dagger$ is. The method consists of 'constructing' $A^\dagger$
according to Definition 1.
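Example 1.3.1 Let
$$A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 2 & 2 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}.$$
Then $R(A^*)$ is spanned by
$$\left\{ [1, 1, 2]^T,\ [0, 2, 2]^T,\ [1, 0, 1]^T,\ [1, 0, 1]^T \right\}.$$
A basis of $R(A^*)$ obtained from this spanning set is
$$\left\{ [1, 0, 1]^T,\ [0, 1, 1]^T \right\}.$$
Now $A[1, 0, 1]^T = [3, 2, 2, 2]^T$ and $A[0, 1, 1]^T = [3, 4, 1, 1]^T$. Thus
$A^\dagger[3, 2, 2, 2]^T = [1, 0, 1]^T$ and $A^\dagger[3, 4, 1, 1]^T = [0, 1, 1]^T$.
We now must calculate a basis for $R(A)^\perp = N(A^*)$.
Solving the system $A^*x = 0$ we get that
$$x = x_3\,[-1, 1/2, 1, 0]^T + x_4\,[-1, 1/2, 0, 1]^T$$
where $x_3, x_4 \in \mathbb{C}$. Then $A^\dagger[-1, 1/2, 1, 0]^T = A^\dagger[-1, 1/2, 0, 1]^T = 0 \in \mathbb{C}^3$. Combining all of this
gives
$$A^\dagger \begin{bmatrix} 3 & 3 & -1 & -1 \\ 2 & 4 & 1/2 & 1/2 \\ 2 & 1 & 1 & 0 \\ 2 & 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix}$$
or
$$A^\dagger = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 3 & 3 & -1 & -1 \\ 2 & 4 & 1/2 & 1/2 \\ 2 & 1 & 1 & 0 \\ 2 & 1 & 0 & 1 \end{bmatrix}^{-1}.$$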
The indicated inverse can always be taken since its columns form a basis
for $R(A) \oplus R(A)^\perp$ and hence are a linearly independent set of four vectors
in $\mathbb{C}^4$.
Below is a formal statement of the method described in Example 1.
Theorem 1.3.1 Let $A \in \mathbb{C}^{m \times n}$ have rank $r$. If $\{v_1, v_2, \dots, v_r\}$ is a basis for
$R(A^*)$ and $\{w_1, w_2, \dots, w_{m-r}\}$ is a basis for $N(A^*)$, then
$$A^\dagger = [\, v_1 \mid v_2 \mid \cdots \mid v_r \mid 0 \mid \cdots \mid 0 \,]\,[\, Av_1 \mid Av_2 \mid \cdots \mid Av_r \mid w_1 \mid w_2 \mid \cdots \mid w_{m-r} \,]^{-1}.$$
Proof. By Definition 1,
$$A^\dagger\,[\, Av_1 \mid \cdots \mid Av_r \mid w_1 \mid \cdots \mid w_{m-r} \,] = [\, v_1 \mid \cdots \mid v_r \mid 0 \mid \cdots \mid 0 \,].$$
Furthermore, $\{Av_1, Av_2, \dots, Av_r\}$ must be a basis for $R(A)$. Since $R(A)^\perp =
N(A^*)$, it follows that the matrix $[\, Av_1 \mid \cdots \mid Av_r \mid w_1 \mid \cdots \mid w_{m-r} \,]$ must be
non-singular. The desired result is now immediate. ∎
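The construction in Theorem 1.3.1 can be sketched in a few lines of exact arithmetic. This is an illustration under stated assumptions, not a robust implementation: it handles real matrices only (so the adjoint is the transpose), it takes the two bases as inputs without checking them, and the names `pinv_from_bases`, `inverse`, `matmul` and `transpose` are hypothetical helpers introduced here. The data are those of Example 1.3.1.

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse(M):
    """Gauss-Jordan inverse in exact arithmetic; raises ValueError if M is singular."""
    n = len(M)
    aug = [[F(x) for x in row] + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if p is None:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def pinv_from_bases(A, vs, ws):
    """[v_1|...|v_r|0|...|0] [Av_1|...|Av_r|w_1|...|w_k]^{-1} for a real matrix A.
    vs: basis of R(A^T); ws: basis of N(A^T). Both are assumed correct, not checked."""
    n = len(A[0])
    images = [[sum(a * x for a, x in zip(row, v)) for row in A] for v in vs]  # the A v_i
    left = transpose([list(v) for v in vs] + [[0] * n for _ in ws])
    right = transpose(images + [list(w) for w in ws])
    return matmul(left, inverse(right))

# Data of Example 1.3.1.
A = [[1, 1, 2], [0, 2, 2], [1, 0, 1], [1, 0, 1]]
vs = [[1, 0, 1], [0, 1, 1]]
ws = [[-1, F(1, 2), 1, 0], [-1, F(1, 2), 0, 1]]
A_dag = pinv_from_bases(A, vs, ws)
```

Using `Fraction` keeps every entry exact, so the result can be compared against the Penrose equations with `==` rather than a floating-point tolerance.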
14 GENERALIZED INVERSES OF LINEAR TRANSFORMATIONS
Proof. Notice that $B^*B$ and $CC^*$ are rank $r$ matrices in $\mathbb{C}^{r \times r}$ so that it
makes sense to take their inverses. Let $X = C^*(CC^*)^{-1}(B^*B)^{-1}B^*$. We will
show $X$ satisfies Definition 3. This choice is made on the grounds that the
more complicated an expression is, the more difficult it becomes geometrically
to work with it, and Definition 3 is algebraic. Now $AX =
BCC^*(CC^*)^{-1}(B^*B)^{-1}B^* = B(B^*B)^{-1}B^*$, so $(AX)^* = AX$. Also $XA =
C^*(CC^*)^{-1}(B^*B)^{-1}B^*BC = C^*(CC^*)^{-1}C$, so $(XA)^* = XA$. Thus (iii) and
(iv) hold. To check (i) and (ii) use $XA = C^*(CC^*)^{-1}C$ to get that $A(XA) =
BC(C^*(CC^*)^{-1}C) = BC = A$. And $(XA)X = C^*(CC^*)^{-1}CC^*(CC^*)^{-1}(B^*B)^{-1}B^*
= C^*(CC^*)^{-1}(B^*B)^{-1}B^* = X$.
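With $B = [1, 2]^T$ and $C = [1, 1, 2]$, so that $A = BC$:
Then $B^*B = [5]$, $CC^* = [6]$. Thus
$$A^\dagger = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} [1, 2]\,\frac{1}{30} = \frac{1}{30}\begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 2 & 4 \end{bmatrix}.$$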
Example 1 is typical as the next result shows.
Definition 1.3.1 A matrix $E \in \mathbb{C}^{m \times n}$ which has rank $r$ is said to be in row
echelon form if $E$ is of the form
$$E = \begin{bmatrix} C \\ 0_{(m-r) \times n} \end{bmatrix} \tag{1}$$
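$$E = \begin{bmatrix} 1 & 2 & 0 & 3 & 5 & 0 & 1 \\ 0 & 0 & 1 & -2 & 4 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \tag{2}$$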
is in row echelon form. Below we state some facts about the row echelon
form, the proofs of which may be found in [65].
For $A \in \mathbb{C}^{m \times n}$ such that $\operatorname{rank}(A) = r$:
(E1) $A$ can always be row reduced to row echelon form by elementary
row operations (i.e. there always exists a non-singular matrix $P \in \mathbb{C}^{m \times m}$
such that $PA = E_A$ where $E_A$ is in row echelon form).
(E2) For a given $A$, the row echelon form $E_A$ obtained by row reducing
$A$ is unique.
(E3) If $E_A$ is the row echelon form for $A$ and the unit vectors in $E_A$
appear in columns $i_1, i_2, \dots,$ and $i_r$, then the corresponding columns of $A$
are a basis for $R(A)$. This particular basis is called the set of distinguished
columns of $A$. The remaining columns are called the undistinguished columns
of $A$. (For example, if $A$ is a matrix such that its row echelon form is given
by (2) then the first, third, and sixth columns of $A$ are the distinguished
columns.)
(E4) If $E_A$ is the row echelon form (1) for $A$, then $N(A) = N(E_A) = N(C)$.
(E5) If (1) is the row echelon form for $A$, and if $B \in \mathbb{C}^{m \times r}$ is the matrix
made up of the distinguished columns of $A$ (in the same order as they are
in $A$), then $A = BC$ where $C$ is obtained from the row echelon form. This
is a full rank factorization such as was described in Proposition 1.
Very closely related to the row echelon form is the Hermite echelon
form. However, the Hermite echelon form is defined only for square
matrices.
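matrix and use the fact that $[A \mid 0]^\dagger = \begin{bmatrix} A^\dagger \\ 0^* \end{bmatrix}$ and $\begin{bmatrix} A \\ 0 \end{bmatrix}^\dagger = [A^\dagger \mid 0^*]$.
Algorithm 1.3.2 To obtain the full rank factorization and the generalized
inverse for any $A \in \mathbb{C}^{m \times n}$.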
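The individual steps of the algorithm are not reproduced above, so the following is only a sketch of one way to carry it out, assuming it proceeds as facts (E1)-(E5) and the formula $A^\dagger = C^*(CC^*)^{-1}(B^*B)^{-1}B^*$ suggest: row reduce $A$, read off the distinguished columns as $B$ and the non-zero rows of the echelon form as $C$, then substitute. It is restricted to real, non-zero matrices (adjoint taken as transpose), and all function names are hypothetical.

```python
from fractions import Fraction as F

def rref(M):
    """Row reduce M so that pivot columns are unit vectors; returns (E, pivot_columns)."""
    E = [[F(x) for x in row] for row in M]
    pivots, r = [], 0
    for c in range(len(E[0])):
        p = next((i for i in range(r, len(E)) if E[i][c] != 0), None)
        if p is None:
            continue
        E[r], E[p] = E[p], E[r]
        piv = E[r][c]
        E[r] = [x / piv for x in E[r]]
        for i in range(len(E)):
            if i != r and E[i][c] != 0:
                f = E[i][c]
                E[i] = [x - f * y for x, y in zip(E[i], E[r])]
        pivots.append(c)
        r += 1
        if r == len(E):
            break
    return E, pivots

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse(M):
    """Invert a non-singular square matrix by row reducing [M | I]."""
    n = len(M)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    E, pivots = rref(aug)
    if pivots != list(range(n)):
        raise ValueError("matrix is singular")
    return [row[n:] for row in E]

def full_rank_pinv(A):
    """Return (B, C, X) with A = B C and X = C^T (C C^T)^{-1} (B^T B)^{-1} B^T.
    Assumes A is real and non-zero."""
    E, pivots = rref(A)
    B = [[row[c] for c in pivots] for row in A]   # distinguished columns of A
    C = E[:len(pivots)]                           # non-zero rows of the echelon form
    Bt, Ct = transpose(B), transpose(C)
    X = matmul(matmul(Ct, inverse(matmul(C, Ct))),
               matmul(inverse(matmul(Bt, B)), Bt))
    return B, C, X

A = [[1, 1, 2], [2, 2, 4]]
B, C, A_dag = full_rank_pinv(A)
```

Reusing `rref` for both the factorization and the two small inversions keeps the sketch short; for this rank one matrix the two inverted matrices are the $1 \times 1$ matrices $[6]$ and $[5]$.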
40 — 20
—R=9 19 14
[-46 — 4
0 50
—26
— 90
18 .
—44 342
(VII) Thus
1 4 —1 —27'
AT =LR
=45 20 —10 25 —45
0 0 0 45
1 6
(CC*) and (B*B) - ' we get (CC*) -1 = 3 and (B*B) -1 =
129 323
[— 1 10]
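(V) Substituting the results of steps (II), (III) and (IV) into the formula
for $A^\dagger$ gives
$$A^\dagger = C^*(CC^*)^{-1}(B^*B)^{-1}B^* = \frac{1}{1161}\begin{bmatrix} 27 & 6 & 3 & 6 \\ 54 & 12 & 6 & 12 \\ 207 & -40 & -20 & -40 \\ 288 & -22 & -11 & -22 \\ -333 & 98 & 49 & 98 \end{bmatrix}.$$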
[Fig. 1.1: diagram of bands labelled $B$, $A$, $AP_{R(B)}$ and $P_{R(A^*)}B$ joining $\mathbb{C}^p$, $\mathbb{C}^n$ and $\mathbb{C}^m$]
shaded band represents the one to one mapping of one subspace onto
another. It is assumed that $A \in \mathbb{C}^{m \times n}$ and $B \in \mathbb{C}^{n \times p}$. In the figure: $(a, c) =
R(B^*)$, $(a', c') = R(B)$, $(b', d') = R(A^*)$, and $(b'', d'') = R(A)$. The total shaded
band from $\mathbb{C}^p$ to $\mathbb{C}^n$ represents the action of $B$ (or $\mathcal{B}$). The part that is
shaded more darkly represents $P_{R(A^*)}B = A^\dagger AB$. The total shaded area
from $\mathbb{C}^n$ to $\mathbb{C}^m$ is $A$. The darker portion is $AP_{R(B)} = ABB^\dagger$. The one to one
portion of the mapping $AB$ from $\mathbb{C}^p$ to $\mathbb{C}^m$ may be viewed as the v-shaped
dark band from $\mathbb{C}^p$ to $\mathbb{C}^m$. Thus to 'undo' $AB$, one would trace the dark
band backwards. That is, $(AB)^\dagger = (P_{R(A^*)}B)^\dagger(AP_{R(B)})^\dagger$.
There are only two things wrong with this conjecture. For one,
$$AB = A(A^\dagger A)(BB^\dagger)B \tag{1}$$
and not $AB = A(BB^\dagger)(A^\dagger A)B$. Secondly, the drawing lies a bit in that the
interval $(b', c')$ is actually standing for two slightly skewed subspaces and
not just one. However, the factorization (1) does not help and Fig. 1.1 does
seem to portray what is happening. We are led then to try and prove the
following theorem due to Cline and Greville.
Theorem 1.4.1 If $A \in \mathbb{C}^{m \times n}$ and $B \in \mathbb{C}^{n \times p}$, then $(AB)^\dagger = (P_{R(A^*)}B)^\dagger(AP_{R(B)})^\dagger$.
Proof. Assuming that one has stumbled across this formula and is
wondering if it is correct, the most reasonable thing to do is see if the
formula for $(AB)^\dagger$ satisfies any of the definitions of the generalized inverse.
Let $X = (P_{R(A^*)}B)^\dagger(AP_{R(B)})^\dagger = (A^\dagger AB)^\dagger(ABB^\dagger)^\dagger$. We will proceed as follows.
We will assume that $X$ satisfies condition (i) of the Penrose definition.
We will then try and manipulate condition (i) to get a formula that can be
verified independently of condition (i). If our manipulations are reversible,
we will have shown condition (i) holds. Suppose then that $ABXAB = AB$,
or equivalently,
$$AB(A^\dagger AB)^\dagger(ABB^\dagger)^\dagger AB = AB. \tag{2}$$
Multiply (2) on the left by $A^\dagger$ and on the right by $B^\dagger$. Then (2) becomes
$$A^\dagger AB(A^\dagger AB)^\dagger(ABB^\dagger)^\dagger ABB^\dagger = A^\dagger ABB^\dagger, \tag{3}$$
Part (i) (or (ii)) of Corollary 1 is, of course, valid if $A$ (or $B$) is invertible.
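Corollary 1.4.2 Suppose that $A \in \mathbb{C}^{m \times n}$, $B \in \mathbb{C}^{n \times p}$, and $\operatorname{rank}(A) = \operatorname{rank}(B)
= n$. Then $(AB)^\dagger = B^\dagger A^\dagger$ and $A^\dagger = (A^*A)^{-1}A^*$ while $B^\dagger = B^*(BB^*)^{-1}$.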
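One way to see why the reverse order law holds under these rank hypotheses is sketched below. It assumes the full rank factorization formula $X^\dagger = C^*(CC^*)^{-1}(B^*B)^{-1}B^*$ for $X = BC$ from Section 3, applied with $A$ (full column rank) in the role of the left factor and $B$ (full row rank) in the role of the right factor, so that $A^*A$ and $BB^*$ are invertible $n \times n$ matrices:

```latex
\begin{align*}
(AB)^\dagger &= B^*(BB^*)^{-1}(A^*A)^{-1}A^* \\
             &= \bigl[\,B^*(BB^*)^{-1}\bigr]\,\bigl[(A^*A)^{-1}A^*\bigr] \\
             &= B^\dagger A^\dagger .
\end{align*}
```

The bracketed factors are the generalized inverses of $B$ and $A$ obtained from the same formula with the trivial factorizations $B = I_n B$ and $A = A I_n$.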
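Example 1.4.1 Let
$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}$$
where $a, b, c \in \mathbb{C}$. It is easy to see that
$$A^\dagger = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \quad\text{and}\quad B^\dagger = \begin{bmatrix} a^\dagger & 0 & 0 \\ 0 & b^\dagger & 0 \\ 0 & 0 & c^\dagger \end{bmatrix}.$$
Now
$$AB = \begin{bmatrix} 0 & b & 0 \\ 0 & 0 & c \\ 0 & 0 & 0 \end{bmatrix} \quad\text{and}\quad (AB)^\dagger = \begin{bmatrix} 0 & 0 & 0 \\ b^\dagger & 0 & 0 \\ 0 & c^\dagger & 0 \end{bmatrix}.$$
On the other hand
$$B^\dagger A^\dagger = \begin{bmatrix} 0 & 0 & 0 \\ b^\dagger & 0 & 0 \\ 0 & c^\dagger & 0 \end{bmatrix}$$
so that $(AB)^\dagger = B^\dagger A^\dagger$. Notice that
$$BA = \begin{bmatrix} 0 & a & 0 \\ 0 & 0 & b \\ 0 & 0 & 0 \end{bmatrix}.$$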
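By varying the values of $a$, $b$ and $c$ one can see that $(AB)^\dagger = B^\dagger A^\dagger$ is
possible without
(i) $AB = BA$ ($a \neq b$)
(ii) $\operatorname{rank}(A) = \operatorname{rank}(B)$ ($a = b = 0$, $c = 1$)
(iii) $\operatorname{rank}(AB) = \operatorname{rank}(BA)$ ($a = b \neq 0$, $c = 0$).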
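The claims of Example 1.4.1 can be spot-checked for particular real values of $a$, $b$, $c$. The sketch below tests whether $B^\dagger A^\dagger$ satisfies the four Penrose equations (i)-(iv) for $AB$, and whether $AB = BA$, for three sample parameter choices. The helper names and the sample values $(1, 2, 3)$ are assumptions made for illustration; the other two triples are the ones named in items (ii) and (iii).

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def is_generalized_inverse(A, X):
    """Check the four Penrose equations (i)-(iv) for real matrices A and X."""
    AX, XA = matmul(A, X), matmul(X, A)
    return (matmul(AX, A) == A and matmul(XA, X) == X
            and transpose(AX) == AX and transpose(XA) == XA)

def scalar_dagger(z):
    """Generalized inverse of a 1x1 matrix: 1/z if z != 0, else 0."""
    return F(0) if z == 0 else 1 / F(z)

def diag(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]

A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
A_dag = transpose(A)

report = []
for a, b, c in [(1, 2, 3), (0, 0, 1), (1, 1, 0)]:
    B = diag(a, b, c)
    B_dag = diag(scalar_dagger(a), scalar_dagger(b), scalar_dagger(c))
    reverse_order_holds = is_generalized_inverse(matmul(A, B), matmul(B_dag, A_dag))
    commute = matmul(A, B) == matmul(B, A)
    report.append((a, b, c, reverse_order_holds, commute))

for row in report:
    print(row)
```

Because the generalized inverse is unique, confirming the four equations for the candidate $B^\dagger A^\dagger$ is equivalent to confirming $(AB)^\dagger = B^\dagger A^\dagger$ for that parameter choice.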
The list can be continued, but the point has been made. The question
remains, when is $(AB)^\dagger = B^\dagger A^\dagger$? Consider Example 1 again. What was
it about that $A$ and $B$ that made $(AB)^\dagger = B^\dagger A^\dagger$? The only thing that $A$
and $B$ seem to have in common are some invariant subspaces. The
subspaces $R(A)$, $R(A^*)$, $N(A)$, and $N(A^*)$ are all invariant for both $A$ and $B$.
A statement about invariant subspaces is also a statement about projectors.
A possible method of attack has suggested itself. We will assume that
$(AB)^\dagger = B^\dagger A^\dagger$. From this we will try to derive statements about projectors.
In Example 1, $A^*A$ and $B$ were simultaneously diagonalizable, so we
should see if $A^*A$ works in. Finally, we should check to see if our conditions
are necessary as well as sufficient.
Assume then that $A \in \mathbb{C}^{m \times n}$, $B \in \mathbb{C}^{n \times p}$, and
$$(AB)^\dagger = B^\dagger A^\dagger. \tag{10}$$
Theorem 1 gives another formula for $(AB)^\dagger$. Substitute that into (10) to get
$(A^\dagger AB)^\dagger(ABB^\dagger)^\dagger = B^\dagger A^\dagger$. To change this into a projector equation,
r
hermitian so that $[BB^*, A^\dagger A] = 0$ and $[BB^\dagger, A^\dagger A] = 0$. However,
0 0
$A^*ABB^\dagger =$ 2 0 which is not hermitian so that $(AB)^\dagger \neq B^\dagger A^\dagger$.
oo1 0
5. Exercises
1. Each of the following is an alternative set of equations whose unique
solution X is the generalized inverse of A. For each definition show
that it is equivalent to one of the three given in the text.
(a) $AX = P_{R(A)}$, $N(X^*) = N(A)$.
(b) $AX = P_{R(A)}$, $XA = P_{R(A^*)}$, $XAX = X$.
(c) $XAA^* = A^*$, $XX^*A^* = X$.
(d) $XAx = x$ if $x \in R(A^*)$, and $Xx = 0$ if $x \in N(A^*)$.
(e) $XA = P_{R(A^*)}$, $N(X) = N(A^*)$.
Comment: Many of these have appeared as definitions or theorems in the
literature. Notice the connection between Example 2.1, Exercise 1(b), and
Exercise 3 below.
2. Derive a set of conditions equivalent to those given in Definition 1.2 or
Definition 1.3. Show they are equivalent. Can you derive others?
3. Suppose that $A \in \mathbb{C}^{m \times n}$. Prove that a matrix $X$ satisfies $AX = P_{R(A)}$,
$XA = P_{R(A^*)}$ if and only if $X$ satisfies (i), (iii), and (iv) of Definition 1.3.
Such an $X$ is called a (1, 3, 4)-inverse of $A$ and will be discussed later.
Observe that it cannot be unique if it is not $A^\dagger$ since trivially $A^\dagger$ is
also a (1, 3, 4)-inverse.
4. Calculate $A^\dagger$ from Theorem 3.1 when
$$A = \begin{bmatrix} 1 & 2 & 1 \\ 1 & 2 & 1 \end{bmatrix}$$
and when
$$A = \begin{bmatrix} 1 & 3 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}.$$
Hint: For the second matrix see Example 6.
5. Show that if $\operatorname{rank}(A) = 1$, then $A^\dagger = \frac{1}{k}A^*$ where $k = \sum_{i=1}^{m}\sum_{j=1}^{n} |a_{ij}|^2$.
6. If $A \in \mathbb{C}^{m \times n}$ and $\operatorname{rank}(A) = n$, notice that in Theorem 3.2, $C$ may be
chosen as an especially simple invertible matrix. Derive the formula
for $A^\dagger$ under the assumption that $\operatorname{rank}(A) = n$. Do the same for the case
when $\operatorname{rank}(A) = m$.
1 0 12
I.
1-1
7. Let A =I 0 0
0 -1 0
0 1 Calculate $A^\dagger$ from Definition 1.1.
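are equivalent.
(i) $(AB)^\dagger = B^\dagger A^\dagger$.
(ii) $A^\dagger ABB^*A^*ABB^\dagger = BB^*A^*A$.
(iii) $A^\dagger AB = B(AB)^\dagger AB$ and $BB^\dagger A^* = A^*AB(AB)^\dagger$.
(iv) $(A^\dagger ABB^\dagger)^* = (A^\dagger ABB^\dagger)^\dagger$ and the two matrices $ABB^\dagger A^\dagger$ and
$B^\dagger A^\dagger AB$ are both hermitian.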
18. Show that if $[A, P_{R(B)}] = 0$, $[A^\dagger, P_{R(B)}] = 0$, $[B, P_{R(A^*)}] = 0$, and
$[B^\dagger, P_{R(A^*)}] = 0$, then $(AB)^\dagger = B^\dagger A^\dagger$.
19. Prove that if $A^* = A^\dagger$ and $B^* = B^\dagger$ and if any of the conditions of
Exercise 18 hold, then $(AB)^\dagger = B^\dagger A^\dagger = B^*A^*$.
20. Prove that if $A^* = A^\dagger$ and the third and fourth conditions of Exercise
18 hold, then $(AB)^\dagger = B^\dagger A^\dagger = B^\dagger A^*$.
21. Prove that if $B^* = B^\dagger$ and the first and second conditions of Exercise 18
hold, then $(AB)^\dagger = B^\dagger A^\dagger$.
22. Prove that if $[A^*A, BB^*] = 0$, then $(AB)^\dagger = B^\dagger A^\dagger$.
23. Verify that the product of two hermitian matrices A and B is hermitian
if and only if [A, B] = 0.
24. Suppose that $P$, $Q$ are hermitian projectors. Show that $PQ$ is a
projector if and only if $[P, Q] = 0$.
25. Assuming (ii)-(v) of Theorem 8, show that $(AP_{R(B)})^\dagger = P_{R(B)}A^\dagger$
without directly using the fact that $(AB)^\dagger = B^\dagger A^\dagger$.
26. Write an expression for $(PAQ)^\dagger$ when $P$ and $Q$ are non-singular.
27. Derive necessary and sufficient conditions on $P$ and $A$, $P$ non-singular,
for $(P^{-1}AP)^\dagger$ to equal $P^{-1}A^\dagger P$.
28. Prove that if $A \in \mathbb{C}^{m \times n}$ and the entries of $A$ are rational numbers, then
the entries of $A^\dagger$ are rational.