
TOPICS IN MATRIX ANALYSIS


Dennis Iligan Merino
A dissertation submitted to The Johns Hopkins University
in conformity with the requirements for the degree of
Doctor of Philosophy
Baltimore, Maryland
March 21, 2008

Abstract
We study properties of coninvolutory matrices (EĒ = I), and derive a
canonical form for them under similarity and under unitary consimilarity.
We show that any complex matrix has a coninvolutory dilation and we
characterize the minimum size of a coninvolutory dilation of a square
matrix. We characterize those A ∈ Mm,n (the m-by-n complex matrices) that
can be factored as A = RE with R real and E coninvolutory, and we discuss
the uniqueness of this factorization when A is square and nonsingular. Let
A, C ∈ Mm,n and B, D ∈ Mn,m. The pair (A, B) is contragrediently
equivalent to the pair (C, D) if there are nonsingular complex matrices
X ∈ Mm and Y ∈ Mn such that XAY^{-1} = C and YBX^{-1} = D. We develop a
complete set of invariants and an explicit canonical form for this
equivalence relation. We consider a generalization of the transpose
operator to a general linear operator φ : Mn → Mn that satisfies the
following properties: for every A, B ∈ Mn, (a) φ preserves the spectrum
of A, (b) φ(φ(A)) = A, and (c) φ(AB) = φ(B)φ(A). We show that (A, φ(A))
is contragrediently equivalent to (B, φ(B)) if and only if A = X1BX2 for
some X1, X2 ∈ Mn such that Xiφ(Xi) = I. We also consider a generalized
polar factorization A = XY, where X, Y ∈ Mn satisfy Xφ(X) = I and
Y = φ(Y). We study the nilpotent parts of the Jordan canonical forms of
the products AB and BA and present a canonical form for complex
orthogonal equivalence. We characterize the linear operators on Mm,n that
preserve complex orthogonal equivalence and the linear operators on
Mn(IF) that preserve unitary t-congruence when IF = IR and IF = C.

Dedication


Contents

Preface                                                                v

1 A Real-Coninvolutory Analog of the Polar Decomposition               1
  1.1 Coninvolutory Matrices                                           2
      1.1.1 Real Similarity                                            4
      1.1.2 Canonical Form                                             5
      1.1.3 Singular Values                                            9
      1.1.4 Coninvolutory Dilation                                    13
  1.2 The RE Decomposition                                            15
      1.2.1 The Nonsingular Case: Existence                           16
      1.2.2 The Nonsingular Case: Uniqueness                          16
      1.2.3 The Singular Case                                         18
      1.2.4 More Factorizations Involving Coninvolutories             20

2 The Contragredient Equivalence Relation                             21
  2.1 Introduction                                                    22
  2.2 A Canonical Form for the Contragredient Equivalence Relation    23
  2.3 Complex Orthogonal Equivalence and the QS Decomposition         40
  2.4 The S Polar Decomposition                                       42
      2.4.1 The Nonsingular Case                                      44
      2.4.2 The General Case                                          47
      2.4.3 The Symmetric Case                                        49
      2.4.4 The Skew-Symmetric Case                                   50
  2.5 AĀ and ĀA                                                       50
  2.6 The Nilpotent Parts of AB and BA                                52
  2.7 A Sufficient Condition for Existence of a Square Root           55
  2.8 A Canonical Form for Complex Orthogonal Equivalence             60

3 Linear Operators Preserving Orthogonal Equivalence on Matrices      68
  3.1 Introduction and Statement of Result                            69
  3.2 Preliminary Results                                             70
  3.3 A Rank-Preserving Property of T^{-1}                            74
  3.4 Proof of the Main Theorem                                       83

4 Linear Operators Preserving Unitary t-Congruence on Matrices        85
  4.1 Introduction and Statement of Results                           86
  4.2 Preliminaries                                                   89
  4.3 The Complex Case                                                92
  4.4 The Real Case                                                  104

A A New Look at the Jordan Canonical Form                            115

Bibliography                                                         120

Vita                                                                 123

Preface
I came to America because it was a cool thing to do. And I was expected
to do so. It was the natural order of things back at the University of the
Philippines. You graduate with honors, you teach for a year or two, and
then you go to your Uncle Sam. I arrived at The Johns Hopkins University
without a clear objective (even though my application letter would tell you
otherwise). I did not know what to do. I guess I was lucky in that every
course I took was a palatable dish tasted for the first time. I was also given
choices I did not know I had.
In my second semester, I took Dr. Roger A. Horn's class in Matrix Analysis.
One thing I noticed was that he was far from ordinary. He was (and
still is) a performer. An artist. And our classroom was his stage. I looked
forward to, and was swept away by, each and every performance. I knew
then what I wanted to do.
The notation in this work is standard, as in [HJ1] (except for the transpose
operator used in Chapter 4), and is explained at the beginning of each chapter.
Each chapter addresses a different topic and has its own set of notations and
definitions. Each chapter can be read independently of the others.
In Chapter 1, we study properties of coninvolutory matrices (EĒ = I),
and derive a canonical form under similarity as well as a canonical form
under unitary consimilarity for them. We show that any complex matrix has
a coninvolutory dilation and we characterize the minimum size of a
coninvolutory dilation of a square matrix. We characterize those A ∈ Mm,n
(the m-by-n complex matrices) that can be factored as A = RE with R real
and E coninvolutory, and we discuss the uniqueness of this factorization
when A is square and nonsingular.
Let A, C ∈ Mm,n and B, D ∈ Mn,m. The pair (A, B) is contragrediently
equivalent to the pair (C, D) if there are nonsingular complex matrices
X ∈ Mm and Y ∈ Mn such that XAY^{-1} = C and YBX^{-1} = D. In Chapter 2, we
develop a complete set of invariants and an explicit canonical form for this
equivalence relation. We show that (A, A^T) is contragrediently equivalent
to (C, C^T) if and only if there are complex orthogonal matrices P and Q
such that C = PAQ. Using this result, we show that the following are
equivalent for a given A ∈ Mn: (a) A = QS for some complex orthogonal Q
and some complex symmetric S; (b) A^T A is similar to AA^T; (c) (A, A^T) is
contragrediently equivalent to (A^T, A); (d) A = Q1 A^T Q2 for some complex
orthogonal Q1, Q2; (e) A = P A^T P for some complex orthogonal P. We
then consider a generalization of the transpose operator to a general linear
operator φ : Mn → Mn that satisfies the following properties: for every
A, B ∈ Mn, (a) φ preserves the spectrum of A, (b) φ(φ(A)) = A, and (c)
φ(AB) = φ(B)φ(A). We show that (A, φ(A)) is contragrediently equivalent to
(B, φ(B)) if and only if A = X1BX2 for some nonsingular X1, X2 ∈ Mn that
satisfy X1^{-1} = φ(X1) and X2^{-1} = φ(X2). We also consider a factorization
of the form A = XY, where X ∈ Mn is nonsingular and X^{-1} = φ(X),
and Y ∈ Mn satisfies Y = φ(Y). We also discuss the operators A ↦ AĀ
and A ↦ ĀA. We then study the nilpotent parts of the Jordan canonical
forms of the products AB and BA. We use the canonical form for the
contragredient equivalence relation to give a new proof of a result due to
Flanders concerning the relative sizes of the nilpotent Jordan blocks of AB
and BA. We present a sufficient condition for the existence of square roots
of AB and BA and close the chapter with a canonical form for the complex
orthogonal equivalence relation (A ↦ Q1AQ2, where Q1 ∈ Mm and Q2 ∈ Mn
are complex orthogonal matrices).
In Chapter 3, we characterize the linear operators on Mm,n that preserve
complex orthogonal equivalence, that is, T : Mm,n → Mm,n is linear and has
the property that T(A) is orthogonally equivalent to T(B) whenever A and
B are orthogonally equivalent.
In Chapter 4, we answer the question of Horn, Hong, and Li in [HHL]
concerning the characterization of linear operators on Mn(IF) that preserve
unitary t-congruence (A ↦ UAU^t, with unitary U) in the cases IF = IR and
IF = C. The result in the real case is a rather pleasant surprise.
In the Appendix, we present a new proof of the Jordan canonical form
theorem that incorporates ideas developed in Chapter 2.
I wish to thank Dr. Roger A. Horn and Dr. Chi-Kwong Li for their help
and guidance; and Dr. Irving Kaplansky for his inspiring basis-free methods.

Chapter 1
A Real-Coninvolutory Analog
of the Polar Decomposition

The classical polar decomposition A = PU, where P is positive semidefinite
and U is unitary, can be viewed as a generalization to matrices of the
polar representation of a complex number z = re^{iθ}, where r, θ ∈ IR.
Another, perhaps more natural, generalization would be A = R1 e^{iR2},
where R1 and R2 are matrices with real entries. We study this factorization
and related ones. We also investigate the matrices that can be written as
E = e^{iR} for some real matrix R.
We denote the set of m-by-n complex matrices by Mm,n, and write Mn ≡ Mn,n.
The set of m-by-n matrices with real entries is denoted by Mm,n(IR),
and we write Mn(IR) ≡ Mn,n(IR). A matrix E ∈ Mn is said to be coninvolutory
if EĒ = I, that is, E is nonsingular and Ē = E^{-1}. We denote the set
of coninvolutory matrices in Mn by Cn. For A, B ∈ Mn, A is similar to B
if there exists a nonsingular matrix S ∈ Mn such that A = SBS^{-1}, and we
write A ∼ B; A and B are said to be consimilar if there exists a nonsingular
S ∈ Mn such that A = SBS̄^{-1}. Given a scalar λ ∈ C, the n-by-n upper
triangular Jordan block corresponding to λ (having λ on the main diagonal,
1 on the superdiagonal, and all other entries 0) is denoted by Jn(λ).
Topological and metric properties of subsets of Mm,n (compact, bounded,
etc.) are always with respect to the topology generated by some (any) norm
on Mm,n.
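The defining identity EĒ = I is easy to probe numerically. The following sketch (our own illustration, not part of the original text; conjugation is entrywise, not the conjugate transpose) checks one simple coninvolutory matrix in NumPy:

```python
import numpy as np

# Illustrative check: for a real matrix R, E = e^{iR} is coninvolutory,
# i.e. E @ conj(E) = I.  Here R is nilpotent (R @ R = 0), so the
# exponential series terminates and e^{iR} = I + iR.
R = np.array([[0.0, 2.0], [0.0, 0.0]])   # real nilpotent matrix
E = np.eye(2) + 1j * R                   # e^{iR} = I + iR since R^2 = 0
assert np.allclose(E @ np.conj(E), np.eye(2))   # E is coninvolutory
```

This is the n = 2 instance of Proposition 1.1.2(h) below.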

1.1 Coninvolutory Matrices

The following can be verified easily.


Proposition 1.1.1 Let A ∈ Mn. Any two of the following conditions imply the third:
(1) A is unitary.
(2) A is symmetric.
(3) A is coninvolutory.
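A quick numerical sanity check of Proposition 1.1.1 (our illustration, with an example matrix of our choosing) confirms all three properties at once for U = (1/√2)[1, i; i, 1]:

```python
import numpy as np

# Example for Proposition 1.1.1: this U is unitary and symmetric,
# hence (by the proposition) also coninvolutory.
U = (1 / np.sqrt(2)) * np.array([[1.0, 1.0j], [1.0j, 1.0]])
assert np.allclose(U @ np.conj(U).T, np.eye(2))   # (1) unitary
assert np.allclose(U, U.T)                        # (2) symmetric
assert np.allclose(U @ np.conj(U), np.eye(2))     # (3) coninvolutory
```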
Proposition 1.1.2 Let n be a given positive integer. Then

(a) Cn is closed under consimilarity. In particular, Cn is closed under real
similarity (A ↦ RAR^{-1} for a real nonsingular R) and consimilarity by a
coninvolutory matrix (A ↦ EAE for a coninvolutory E, since Ē^{-1} = E).

(b) Cn is closed under general similarity if and only if n = 1.

(c) Cn is closed.

(d) Cn is bounded if and only if n = 1.

(e) Cn is closed under products if and only if n = 1. If E1, E2 ∈ Cn, then
E1E2 ∈ Cn if and only if E1 commutes with E2.

(f) |det E| = 1 for every E ∈ Cn.

(g) Cn ⊕ Cm ⊆ Cn+m for any positive integer m. In particular, E ⊕ Im is
coninvolutory for any E ∈ Cn and any positive integer m.

(h) e^{iR} is coninvolutory for any R ∈ Mn(IR).
Proof. If E ∈ Cn and S ∈ Mn is nonsingular, then

    (SES̄^{-1}) conj(SES̄^{-1}) = SES̄^{-1}S̄ĒS^{-1} = SEĒS^{-1} = I.

Hence, SES̄^{-1} ∈ Cn, which proves (a). For (b), the example

    E = [1, 0; 0, -1] ⊕ I_{n-2} ∈ Cn  and  S = [1, i; 0, 1] ⊕ I_{n-2}

shows that

    SES^{-1} = [1, -2i; 0, -1] ⊕ I_{n-2}

need not be coninvolutory if n ≥ 2.

Suppose that {Ej} is a sequence of coninvolutories converging to E. Then
I = lim (EjĒj) = (lim Ej)(lim Ēj) = EĒ as j → ∞, so E is coninvolutory
and Cn is closed.

For any t ∈ IR, one checks that

    [1, it; 0, 1] ⊕ I_{n-2}

is coninvolutory, so Cn is not bounded for n ≥ 2. If n = 1, C1 = {z =
e^{iθ} : θ ∈ IR} is compact and is closed under similarity and products.
Now

    [1, -1; 0, -1] [1, 1; 0, -1] = [1, 2; 0, 1] ∉ C2.

Therefore, a product of two coninvolutory matrices is not necessarily
coninvolutory. If E1, E2 ∈ Cn and E1E2 ∈ Cn, then E2^{-1}E1^{-1} =
(E1E2)^{-1} = conj(E1E2) = Ē1Ē2 = E1^{-1}E2^{-1}, so E1 commutes with E2.
Conversely, if E1 and E2 are commuting coninvolutories, then
(E1E2) conj(E1E2) = E2E1Ē1Ē2 = E2Ē2 = I.
The assertions in (f), (g), and (h) are easily verified.
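The failure of closure under products in Proposition 1.1.2(e) can be seen concretely. In this sketch (our own illustration) both factors are real involutions, hence coninvolutory, but they do not commute and their product is not coninvolutory:

```python
import numpy as np

# Two real involutory (hence coninvolutory) matrices whose product
# is not coninvolutory, illustrating Proposition 1.1.2(e).
E1 = np.array([[1.0, -1.0], [0.0, -1.0]])
E2 = np.array([[1.0, 1.0], [0.0, -1.0]])
for E in (E1, E2):
    assert np.allclose(E @ np.conj(E), np.eye(2))     # coninvolutory
P = E1 @ E2                                           # [[1, 2], [0, 1]]
assert not np.allclose(P @ np.conj(P), np.eye(2))     # product is not
assert not np.allclose(E1 @ E2, E2 @ E1)              # factors don't commute
```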

1.1.1 Real Similarity

In Theorem (69) of [K2], Kaplansky used the QS decomposition of a nonsingular
matrix (A = QS with Q orthogonal and S symmetric) to prove that
there exists an orthogonal matrix Q such that A = QBQ^T if and only if there
exists a nonsingular S such that A = SBS^{-1} and A^T = SB^T S^{-1}. An
analogous result about real similarity may be obtained by following the steps
in Kaplansky's proof and using the RE decomposition (see Section 1.2). Here,
we present a proof that does not make use of the RE decomposition; later,
we shall use this result in our proof of the RE decomposition.
Theorem 1.1.3 Let A, B ∈ Mn be given. There exists a nonsingular real
R ∈ Mn such that A = RBR^{-1} if and only if there exists a nonsingular
S ∈ Mn such that A = SBS^{-1} and Ā = SB̄S^{-1}.

Proof. The forward implication is easily verified. For the converse, the
conditions are equivalent to A = SBS^{-1} and A = S̄BS̄^{-1}. Hence,
AS = SB and AS̄ = S̄B. Let S = R1 + iR2 with R1, R2 ∈ Mn(IR). Then
R1 = (S + S̄)/2 and R2 = (S − S̄)/2i, so A(αR1 + βR2) = (αR1 + βR2)B
for all α, β ∈ IR. Let p(z) ≡ det(R1 + zR2). Then p(z) is a polynomial
of degree at most n and p(z) ≢ 0 since p(i) = det S ≠ 0. Hence, p(β0) ≠ 0
for some β0 ∈ IR. Thus, R ≡ R1 + β0R2 ∈ Mn(IR) is nonsingular,
AR = RB, and A = RBR^{-1}.
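The determinant argument in this proof can be exercised numerically. The sketch below (an illustration of ours, with randomly generated matrices) hides a real similarity inside a complex S and recovers a real intertwiner of the form R1 + t0 R2:

```python
import numpy as np

# Illustration of the proof of Theorem 1.1.3: from a complex S with
# AS = SB and A conj(S) = conj(S) B, some real combination R1 + t0*R2
# of its real and imaginary parts is nonsingular and intertwines A and B.
rng = np.random.default_rng(1)
R = rng.standard_normal((3, 3))              # hidden real similarity
B = rng.standard_normal((3, 3))
A = R @ B @ np.linalg.inv(R)
S = (1 + 2j) * R                  # complex S with AS = SB, A conj(S) = conj(S) B
R1, R2 = S.real, S.imag
# p(t) = det(R1 + t*R2) is not identically zero because p(i) = det(S) != 0,
# so some real t0 makes R1 + t0*R2 nonsingular.
for t0 in np.linspace(0.0, 1.0, 11):
    Rt = R1 + t0 * R2
    if abs(np.linalg.det(Rt)) > 1e-8:
        assert np.allclose(A @ Rt, Rt @ B)   # real similarity recovered
        break
```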
Corollary (6.4.18) in [HJ2] uses the QS decomposition to prove the following:
Suppose that A = SBS^{-1} and suppose there exists a polynomial
p(t) such that A^T = p(A) and B^T = p(B). Then there exists an orthogonal
Q such that A = QBQ^T. The following is an analog of this result for real
similarity.

Corollary 1.1.4 Let A, B ∈ Mn be given, suppose that A = SBS^{-1}, and
suppose there exists a polynomial p(t) such that Ā = p(A) and B̄ = p(B).
Then there exists a nonsingular real R ∈ Mn such that A = RBR^{-1}.

Proof. This follows directly from the theorem since q(A) = Sq(B)S^{-1} for
any polynomial q(t).
It is known that two symmetric matrices are similar if and only if they
are orthogonally similar and two Hermitian (even normal) matrices are similar if and only if they are unitarily similar. An analogous result holds for
coninvolutory matrices.
Corollary 1.1.5 Two coninvolutory matrices are similar if and only if they
are real similar (that is, the similarity can be achieved via a nonsingular real
matrix).
Proof. If A and B are coninvolutory and similar, say A = SBS^{-1}, we have
Ā = A^{-1} = SB^{-1}S^{-1} = SB̄S^{-1}. Theorem 1.1.3 ensures that A =
RBR^{-1} for some nonsingular real R. Conversely, if two matrices are
real similar, then they are similar.

Corollary 1.1.4 remains true if the polynomial p(t) is replaced by any
primary matrix function (see Chapter 6 of [HJ2]). Corollary 1.1.5 then follows
from Corollary 1.1.4 by taking p(t) = t^{-1}.

1.1.2 Canonical Form

Definition 1.1.6 Let k be a given positive integer and let A, B ∈ Mk. We
define

    E2k(A, B) ≡ (1/2) [A + B, -i(A - B); i(A - B), A + B].    (1.1.1)

For 0 ≠ λ ∈ C, we set E2k(λ) ≡ E2k(Jk(λ), Jk(λ̄)^{-1}).

Direct computations (and a simple induction for (c)) verify the following
facts and functional equations involving the matrices E2k(A, B).

Lemma 1.1.7 Let A, B, C, D ∈ Mk be given.

(a) E2k(A, B)E2k(C, D) = E2k(AC, BD).

(b) E2k(A, B) conj(E2k(C, D)) = E2k(AD̄, BC̄).

(c) f(E2k(A, B)) = E2k(f(A), f(B)) for any entire analytic function f(z).

(d) conj(E2k(A, B))^{-1} = E2k(B̄^{-1}, Ā^{-1}) whenever A and B are nonsingular.

(e) E2k(A, B) is coninvolutory if and only if AB̄ = I.

(f) {E2k(λ) : 0 ≠ λ ∈ C} is an unbounded commuting family of coninvolutory matrices.

(g) E2k(A, B) = U(A ⊕ B)U*, where

    U ≡ (1/√2) [Ik, iIk; iIk, Ik]

is unitary, symmetric, and coninvolutory.

(h) (1/i) E2k(Jk(μ), -Jk(μ̄)) is a real matrix for every μ ∈ C.

Thus, the eigenvalues of E2k(A, B) are the union of the eigenvalues of A
and B. In particular, for λ ≠ 0 the coninvolutory matrix E2k(λ) is unitarily
similar to Jk(λ) ⊕ Jk(λ̄)^{-1} and the eigenvalues of E2k(λ) are λ and 1/λ̄,
each of multiplicity k.
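The eigenvalue pairing just described can be checked numerically. This sketch (our illustration; the helper names are ours) builds E2k(λ) with the sign convention that makes E2k(A, B) unitarily similar to A ⊕ B, and verifies that it is coninvolutory with eigenvalues λ and 1/λ̄:

```python
import numpy as np

def jordan(lam, k):
    # Upper triangular k-by-k Jordan block J_k(lam)
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

def E2k(A, B):
    # E_2k(A, B) = (1/2) [A+B, -i(A-B); i(A-B), A+B]
    return 0.5 * np.block([[A + B, -1j * (A - B)],
                           [1j * (A - B), A + B]])

lam, k = 2 + 1j, 2
E = E2k(jordan(lam, k), np.linalg.inv(jordan(np.conj(lam), k)))
assert np.allclose(E @ np.conj(E), np.eye(2 * k))        # coninvolutory
vals = np.sort_complex(np.linalg.eigvals(E))
expect = np.sort_complex(np.array([lam, lam] + [1 / np.conj(lam)] * 2))
assert np.allclose(vals, expect, atol=1e-5)              # {lam, 1/conj(lam)}
```

The loose tolerance reflects the numerical sensitivity of eigenvalues of defective (Jordan-block) matrices.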
Theorem 1.1.8 Let 0 ≠ λ ∈ C be given and let k be a given positive integer.
Then E2k(λ) = e^{iR}, where R ∈ M2k(IR) is a polynomial in E2k(λ) and R is
similar to (1/i)E2k(Jk(μ), -Jk(μ̄)), with μ ≡ ln λ (principal branch).

Proof. Let μ ≡ ln λ (principal branch) and notice that Jk(λ) ∼ e^{Jk(μ)} and
Jk(λ̄)^{-1} ∼ e^{-Jk(μ̄)}. Lemma 1.1.7(h) ensures that

    B ≡ (1/i) E2k(Jk(μ), -Jk(μ̄)) ∈ M2k(IR).

Thus,

    E2k(λ) ∼ Jk(λ) ⊕ Jk(λ̄)^{-1}
           ∼ e^{Jk(μ)} ⊕ e^{-Jk(μ̄)}
           ∼ E2k(e^{Jk(μ)}, e^{-Jk(μ̄)})
           = e^{E2k(Jk(μ), -Jk(μ̄))}
           = e^{iB}

by Lemma 1.1.7(g, c). Since E2k(λ) and e^{iB} are similar coninvolutory
matrices, Corollary 1.1.5 ensures that they are real similar, so
there is some nonsingular C ∈ M2k(IR) such that E2k(λ) = Ce^{iB}C^{-1} =
e^{iCBC^{-1}} = e^{iR} with R ≡ CBC^{-1} ∈ M2k(IR). Since the spectrum
of iR ∼ Jk(μ) ⊕ (-Jk(μ̄)) is confined to the horizontal line {z ∈ C :
Im z = arg λ}, basic facts about primary matrix functions (see Example
(6.2.16) in [HJ2]) ensure that there is a polynomial p(t) (which
may depend on λ) such that iR = p(E2k(λ)).
Proposition 1.1.9 (a) Let E ∈ Cn be given, and suppose Jk(λ) is a Jordan
block of E with multiplicity l. Then Jk(1/λ̄) is a Jordan block of E with
multiplicity l.

(b) Let λ be a given nonzero complex number and let k and l be given positive
integers. Then there exists an E ∈ C2kl such that Jk(λ) is a Jordan block of
E with multiplicity l; if k = 1 and |λ| = 1, there is such an E ∈ Cl.

Proof. Since Ē = E^{-1}, if Jk(λ) is a Jordan block of E with multiplicity l,
then E must have l Jordan blocks similar to conj(Jk(λ))^{-1}, that is, equal
to Jk(1/λ̄). For (b), we may take E = E2k(λ) ⊕ ⋯ ⊕ E2k(λ) (l copies); if
k = 1 and |λ| = 1, we may take E = λI ∈ Ml.

If E ∈ Cn and if λ is an eigenvalue of E with |λ| ≠ 1, Proposition 1.1.9(a)
says that the Jordan structure of E associated with the eigenvalue λ is paired
with the Jordan structure of E associated with 1/λ̄: each Jk(λ) is paired
with a Jk(1/λ̄). However, λ = 1/λ̄ when |λ| = 1 (λ = e^{iθ} with θ ∈ IR), so
there need be no pairing in this case, and the corresponding Jordan blocks
are of the form Jk(e^{iθ}), which are similar to e^{iJk(θ)}. These observations
and Corollary 1.1.5 give the following canonical forms for a coninvolutory
matrix.
Theorem 1.1.10 Let E ∈ Cn. There are integers r, s ≥ 0, positive integers
l1, …, lr, k1, …, ks such that l1 + ⋯ + lr + 2(k1 + ⋯ + ks) = n, real numbers
θ1, …, θr, and complex (possibly real) numbers λ1, …, λs with all |λi| > 1 such
that

(a) E is similar to

    Jl1(e^{iθ1}) ⊕ ⋯ ⊕ Jlr(e^{iθr}) ⊕ [Jk1(λ1) ⊕ Jk1(1/λ̄1)] ⊕ ⋯ ⊕ [Jks(λs) ⊕ Jks(1/λ̄s)]

(b) E is real similar to the coninvolutory matrix

    e^{iJl1(θ1)} ⊕ ⋯ ⊕ e^{iJlr(θr)} ⊕ E2k1(λ1) ⊕ ⋯ ⊕ E2ks(λs)

(c) E is real similar to the coninvolutory matrix

    e^{i[Jl1(θ1) ⊕ ⋯ ⊕ Jlr(θr) ⊕ R1 ⊕ ⋯ ⊕ Rs]}

where each Rj ≡ (1/i)E2kj(Jkj(μj), -Jkj(μ̄j)) ∈ M2kj(IR) and μj ≡ ln λj
(principal branch) for j = 1, …, s.

If C ∈ Mn(IR) gives the real similarity asserted in Theorem 1.1.10(c), we
have E = e^{iR}, where R ≡ C[Jl1(θ1) ⊕ ⋯ ⊕ Jlr(θr) ⊕ R1 ⊕ ⋯ ⊕ Rs]C^{-1}.
Since the spectrum of R is confined to an open horizontal strip containing the
real axis and having height 2π, Example (6.2.16) in [HJ2] ensures that there
is a polynomial p(t) (which may depend on E) such that R = p(E). We can
also write E = (e^{iR/2})², where e^{iR/2} is coninvolutory and R/2 is also a
polynomial in E. These observations give the following result, which was
proved in a different way in Corollary (6.4.22) of [HJ2].

Theorem 1.1.11 Let E ∈ Mn be given. The following are equivalent:

(a) EĒ = I, that is, E is coninvolutory.

(b) E = e^{iS}, where S ∈ Mn(IR) is real.

(c) E = F², where F ∈ Mn is coninvolutory.

For a coninvolutory E, the real matrix S in (b) and the coninvolutory matrix
F in (c) may be taken to be polynomials in E.
For a given E ∈ Cn, the real matrix S in Theorem 1.1.11(b) and the
coninvolutory matrix F in Theorem 1.1.11(c) are not uniquely determined.
However, the general theory of primary matrix functions ensures that there
is a unique real S with spectrum in the half-open strip {z ∈ C : -π <
Re z ≤ π} such that E = e^{iS}, and there is a unique coninvolutory F with
spectrum in the wedge {z ∈ C : -π/2 < arg z ≤ π/2} such that E = F² (see
Theorem (6.2.9) of [HJ2]). For another approach to this uniqueness result,
see Theorem (2.3) of [DBS].
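For a concrete instance of Theorem 1.1.11(b), consider the coninvolutory matrix E = [1, 2i; 0, 1]. Because E − I is nilpotent, the logarithm of E can be written down by hand, and the real matrix S with E = e^{iS} is explicit. (This worked example is our own illustration.)

```python
import numpy as np

# E = [1, 2i; 0, 1] is coninvolutory; E = I + N with N nilpotent, so
# log E = N and S = N / i = [0, 2; 0, 0] is real, giving E = e^{iS}.
E = np.array([[1.0, 2.0j], [0.0, 1.0]])
assert np.allclose(E @ np.conj(E), np.eye(2))   # coninvolutory
N = E - np.eye(2)                               # log E, since N @ N = 0
S = N / 1j
assert np.allclose(S.imag, 0)                   # S is real
assert np.allclose(np.eye(2) + 1j * S, E)       # e^{iS} = I + iS = E
```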

1.1.3 Singular Values

Definition 1.1.12 Let A ∈ Mn be nonsingular and let ‖·‖ be a given matrix
norm. The condition number of A with respect to ‖·‖, denoted κ(A, ‖·‖),
is given by

    κ(A, ‖·‖) ≡ ‖A‖ ‖A^{-1}‖.

Proposition 1.1.13 (a) If σ > 0 is a singular value of a given E ∈ Cn with
multiplicity k, then so is 1/σ.

(b) If E ∈ Cn, then the singular values of E and E^{-1} are the same.

(c) If E ∈ Cn and if ‖·‖ is any unitarily invariant norm, then
κ(E, ‖·‖) = ‖E‖².

(d) If σ > 0 is given and if k is a given positive integer, then there exists
a coninvolutory matrix E such that σ is a singular value of E with
multiplicity k.

Proof. The first three assertions follow directly from the singular value
decomposition of E and the equation E^{-1} = Ē. If σ > 0 and σ ≠ 1,
then

    E2(σ) = U [σ, 0; 0, 1/σ] U*

has singular values σ and 1/σ. Hence, we may take E = E2(σ) ⊕ ⋯ ⊕
E2(σ) (k copies). If σ = 1, then we may take E = Ik.
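The reciprocal pairing of singular values in Proposition 1.1.13(a) is easy to observe numerically; the following check is our own illustration:

```python
import numpy as np

# Singular values of a coninvolutory matrix come in pairs (sigma, 1/sigma).
E = np.array([[1.0, 5.0j], [0.0, 1.0]])
assert np.allclose(E @ np.conj(E), np.eye(2))        # coninvolutory
s = np.linalg.svd(E, compute_uv=False)
assert np.allclose(np.sort(s), np.sort(1.0 / s))     # {sigma, 1/sigma}
```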
Let E ∈ Cn. Proposition 1.1.13(a) shows that the singular value
decomposition of E has the form E = UΣV*, where U and V are unitary and

    Σ = [σ1I_{n1} ⊕ σ1^{-1}I_{n1}] ⊕ ⋯ ⊕ [σkI_{nk} ⊕ σk^{-1}I_{nk}] ⊕ I_{n-2j}    (1.1.2)

with σ1 > ⋯ > σk > 1 and j = n1 + ⋯ + nk. Partition the unitary matrix

    V^T U = [X11, X12, …, X1l; X21, X22, …, X2l; ⋯ ; Xl1, Xl2, …, Xll]

conformally to Σ, with X11, X22 ∈ M_{n1}, …, X_{l-2,l-2}, X_{l-1,l-1} ∈ M_{nk},
Xll ∈ M_{n-2j}, and l = 2k + 1. Since UΣV* = E = Ē^{-1} = V̄Σ^{-1}U^T, we
have

    (V^T U)Σ = Σ^{-1}U^T V = Σ^{-1}(V^T U)^T,

and hence

the block identity

    dq Xpq = dp^{-1} Xqp^T    (1.1.3)

holds for all p and q, where (d1, d2, …, dl) = (σ1, σ1^{-1}, σ2, σ2^{-1}, …, σk,
σk^{-1}, 1) lists the scalar diagonal blocks of Σ.

Let ‖·‖F denote the Frobenius norm (the square root of the sum of squares
of the moduli of the entries). Since all the columns and rows of V^T U are
unit vectors, we have

    ‖[X11 X12 ⋯ X1l]‖F² = n1 = ‖[X11^T X21^T ⋯ Xl1^T]‖F².    (1.1.4)

Equating blocks with p = 1 in (1.1.3) gives X21^T = X12 and X11^T = σ1²X11,
…, X_{l-2,1}^T = σ1σkX_{1,l-2}, X_{l-1,1}^T = (σ1/σk)X_{1,l-1}, Xl1^T = σ1X1l.
Since all of the coefficients σ1², …, σ1σk, σ1/σk, and σ1 are greater than
one, substituting these identities in (1.1.4) shows that X1i = Xi1 = 0 for
all i ≠ 2, and hence

    ‖X12‖F² = n1 = ‖X21^T‖F² = ‖X21‖F².

Again using the fact that the rows of V^T U are unit vectors, we also have

    n1 = ‖[X21 X22 ⋯ X2l]‖F²,

and hence X2i = 0 for all i ≥ 2. Similarly, Xi2 = 0 for all i ≥ 2. Thus

    V^T U = [0, X12, 0; X12^T, 0, 0; 0, 0, V1],

where V1 is unitary and satisfies V1Σ1 = Σ1^{-1}V1^T with Σ1 ≡ [σ2I_{n2} ⊕
σ2^{-1}I_{n2}] ⊕ ⋯ ⊕ [σkI_{nk} ⊕ σk^{-1}I_{nk}] ⊕ I_{n-2j}.

Iteration of this argument shows that

    X ≡ V^T U = [0, X1; X1^T, 0] ⊕ ⋯ ⊕ [0, Xk; Xk^T, 0] ⊕ X_{k+1},    (1.1.5)

where each Xi is unitary and X_{k+1} ∈ M_{n-2j} is unitary and symmetric.
Thus, we can write E as E = UΣV* = V̄(XΣ)V*, where Σ and X have the
special structure exhibited in (1.1.2) and (1.1.5).
Let

    Θi ≡ [0, σi^{-1}I_{ni}; σiI_{ni}, 0]  and  Yi ≡ [Xi, 0; 0, Xi^T]  for 1 ≤ i ≤ k.

Then the i-th diagonal block of XΣ is

    [0, Xi; Xi^T, 0] [σiI_{ni}, 0; 0, σi^{-1}I_{ni}] = [0, σi^{-1}Xi; σiXi^T, 0] = YiΘi = ΘiYi^T.

Let Zi be a unitary and polynomial square root of Yi. Then ZiΘi = ΘiZi^T for
1 ≤ i ≤ k (see Problem 27 in Section (4.4) of [HJ1]). Since X_{k+1} is symmetric
and unitary, it has a symmetric and unitary square root, say Z_{k+1}. Let Z ≡
Z1 ⊕ ⋯ ⊕ Zk ⊕ Z_{k+1} and Θ ≡ Θ1 ⊕ ⋯ ⊕ Θk ⊕ I_{n-2j}. Then XΣ = Z²Θ = ZΘZ^T.
Hence, E = V̄(XΣ)V* = V̄ZΘZ^TV* = (V̄Z)Θ(V̄Z)^T. We state this result
in the following theorem; for alternative approaches to this result, see (98)
in [A] and Theorem (3.7) of [HH2].
Theorem 1.1.14 Let E ∈ Cn be given. Then E = UΣU^T, where U ∈ Mn is
unitary,

    Σ = [0, σ1^{-1}I_{n1}; σ1I_{n1}, 0] ⊕ ⋯ ⊕ [0, σk^{-1}I_{nk}; σkI_{nk}, 0] ⊕ I_{n-2j},    (1.1.6)

σ1 > ⋯ > σk > 1, and j = n1 + ⋯ + nk. The positive numbers σ1, 1/σ1, …, σk, 1/σk,
and 1 are the distinct singular values of E with multiplicities n1, n1, …, nk,
nk, and n − 2j, respectively. Conversely, UΣU^T ∈ Cn whenever U ∈ Mn is
unitary and Σ has the form (1.1.6).
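The converse direction of Theorem 1.1.14 is straightforward to verify numerically; this sketch (our illustration, with n1 = 2, k = 1, σ1 = 3, and n = 5) builds a random unitary U and a Σ of the form (1.1.6):

```python
import numpy as np

# U Sigma U^T is coninvolutory whenever U is unitary and Sigma has the
# block form (1.1.6): [0, (1/sigma) I; sigma I, 0] plus a trailing identity.
rng = np.random.default_rng(4)
sigma = 3.0
Z = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
U, _ = np.linalg.qr(Z)                       # random unitary matrix
I2 = np.eye(2)
Z22, Z21, Z12 = np.zeros((2, 2)), np.zeros((2, 1)), np.zeros((1, 2))
Sigma = np.block([[Z22, (1 / sigma) * I2, Z21],
                  [sigma * I2, Z22, Z21],
                  [Z12, Z12, np.eye(1)]])
E = U @ Sigma @ U.T
assert np.allclose(E @ np.conj(E), np.eye(5))    # E is coninvolutory
```

Note that Sigma @ Sigma = I, which is exactly what the converse needs.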
It is natural to ask to what extent Proposition 1.1.13(b) characterizes Cn .

Corollary 1.1.15 Let a nonsingular A ∈ Mn be given. The following are
equivalent:

(a) A and A^{-1} have the same singular values, that is, A and A^{-1} are
unitarily equivalent.

(b) A = EW for some E ∈ Cn and unitary W ∈ Mn.

(c) A = VE for some E ∈ Cn and unitary V ∈ Mn.

(d) A = VEW for some E ∈ Cn and unitary V, W ∈ Mn.

Proof. If A and A^{-1} have the same singular values, then A = V1ΣV2, where
Σ has the form (1.1.6) and V1, V2 ∈ Mn are unitary. The converse
assertion in Theorem 1.1.14 ensures that A = (V1ΣV1^T)(V̄1V2) is a
factorization of the form asserted in (b). If A = EW as in (b), then
A = W̄(W^T EW) is a factorization of the form asserted in (c). If
A = VE as in (c), then one may take W = I in (d). If A = VEW as
in (d), then A^{-1} = W*E^{-1}V*. Thus, the singular values of A and A^{-1}
are those of E and E^{-1}, which by Proposition 1.1.13(b) are the same.
Corollary (4.6.12) of [HJ1] guarantees that a positive definite A ∈ Mn and
a Hermitian or symmetric B ∈ Mn can be simultaneously diagonalized by
an appropriate nonsingular congruence. The following is an analog of these
results in the coninvolutory case.

Theorem 1.1.16 Let A ∈ Mn be positive definite and let E ∈ Mn be
coninvolutory. Then there exists a nonsingular S ∈ Mn such that SAS* = I and
SES̄^{-1} = Σ has the form (1.1.6).

Proof. Let P be the unique positive definite square root of A. Notice that
P^{-1}EP̄ ∈ Cn, hence Theorem 1.1.14 ensures that there exists a unitary
U and a Σ of the form (1.1.6) such that P^{-1}EP̄ = UΣU^T. Let S ≡
U*P^{-1}, so that S̄^{-1} = P̄Ū. Then SAS* = I and SES̄^{-1} = Σ, as
desired.

1.1.4 Coninvolutory Dilation

Definition 1.1.17 Let A ∈ Mn be given, and let k be a given positive
integer. Then B ∈ Mn+k is a dilation of A if A is a leading principal
submatrix of B.
Thompson and Kuo [TK] characterized those A ∈ Mn with a dilation
B ∈ Mn+k that has some special property such as doubly stochastic, unitary,
or complex orthogonal. For example, Thompson and Kuo found that every
A ∈ Mn has a complex orthogonal dilation B ∈ Mn+k, and the least integer
k for which this is possible is k = rank(I − AA^T). When does A have a
coninvolutory dilation?

Lemma 1.1.18 Let A ∈ Mn and let k1 be a positive integer. Suppose that
A has a coninvolutory dilation of size n + k1. Then A has a coninvolutory
dilation of size n + k for each k ≥ k1. Moreover, k1 ≥ k0(A) ≡ rank(I − AĀ).
Proof. Suppose that

    B = [A, X1; X2, Y] ∈ C_{n+k1}.

Then B ⊕ I_{k-k1} ∈ C_{n+k} is a coninvolutory dilation of A for all k ≥ k1.
Notice that X1, X2^T ∈ M_{n,k1} and AĀ + X1X̄2 = In, so X1X̄2 = In − AĀ
and k0(A) ≡ rank(In − AĀ) = rank(X1X̄2) ≤ rank X1 ≤ k1.

Definition 1.1.19 For A ∈ Mn, we define

    γ(A) ≡ [A, i(I + A); i(I − A), A].

Lemma 1.1.20 Let A ∈ Mn(IR). Then γ(A) is a coninvolutory dilation of
A of size 2n. Moreover, Jn(±1) has a coninvolutory dilation of size 2n − 1.

Proof. A direct computation shows that γ(A) ∈ C2n. If A = Jn(1), form B
from γ(Jn(1)) by deleting its last row and column; if A = Jn(−1), form
B from γ(Jn(−1)) by deleting its (n + 1)-st row and column. In both
cases, one checks that B ∈ C_{2n-1} is a coninvolutory dilation of A.
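Definition 1.1.19 and the first claim of Lemma 1.1.20 can be verified directly; the following check is our own illustration with a random real A:

```python
import numpy as np

# For real A, gamma(A) = [A, i(I+A); i(I-A), A] is coninvolutory and
# contains A as its leading principal submatrix (a 2n-by-2n dilation).
def gamma(A):
    n = A.shape[0]
    I = np.eye(n)
    return np.block([[A + 0j, 1j * (I + A)],
                     [1j * (I - A), A + 0j]])

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
G = gamma(A)
assert np.allclose(G @ np.conj(G), np.eye(6))   # coninvolutory
assert np.allclose(G[:3, :3], A)                # leading principal submatrix
```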

Lemma 1.1.21 Suppose that A1 ∈ Mn1 has a coninvolutory dilation of size
n1 + k1, and suppose that A2 ∈ Mn2 has a coninvolutory dilation of size
n2 + k2. Then A1 ⊕ A2 has a coninvolutory dilation of size n1 + n2 + k1 + k2.

Proof. Let

    B1 = [A1, X1; Y1, Z1] ∈ C_{n1+k1}

and let

    B2 = [A2, X2; Y2, Z2] ∈ C_{n2+k2}.

An explicit computation now shows that

    [A1, 0, X1, 0; 0, A2, 0, X2; Y1, 0, Z1, 0; 0, Y2, 0, Z2] ∈ C_{n1+n2+k1+k2}

is a coninvolutory dilation of A1 ⊕ A2.
Theorem 1.1.22 Let A ∈ Mn and let k be a given nonnegative integer.
Then A has a coninvolutory dilation of size n + k if and only if k ≥ rank(I −
AĀ).

Proof. Let A ∈ Mn be given and let k0 ≡ rank(I − AĀ). Theorem (4.9)
of [HH1] guarantees that A = SRS̄^{-1} for some nonsingular S ∈ Mn
and R ∈ Mn(IR). Now, R is real similar to its real Jordan canonical
form JR(A) ≡ Cn1(a1, b1) ⊕ ⋯ ⊕ Cni(ai, bi) ⊕ Jm1(μ1) ⊕ ⋯ ⊕ Jmj(μj) ∈
Mn(IR), where {Cnt(at, bt)} are the real Jordan blocks containing the
non-real eigenvalues of A and each μt ∈ IR (Theorem (3.4.5) in [HJ1]).
Hence, A is consimilar to JR(A), say A = XJR(A)X̄^{-1}. If JR(A) has
a coninvolutory dilation B of size n + k, then since Cn+k is invariant
under consimilarity,

    [X ⊕ I] B [X̄ ⊕ I]^{-1} = [XJR(A)X̄^{-1}, ⋆; ⋆, ⋆] = [A, ⋆; ⋆, ⋆]

is a coninvolutory dilation of A of size n + k. Lemma 1.1.20 guarantees
that Cnt(at, bt) has a coninvolutory dilation of size 2nt. Moreover,
Jmt(μt) has a coninvolutory dilation of size 2mt, and if μt = ±1
then Jmt(μt) has a coninvolutory dilation of size 2mt − 1. Thus,
Lemma 1.1.21 ensures that JR(A) has a coninvolutory dilation of size
2n − l, where l is the number of Jordan blocks in JR(A) with μt = ±1.
Now, n − l = rank(I − [JR(A)]²) = rank(I − X[JR(A)]²X^{-1}) = rank(I −
AĀ) = k0. Therefore, A has a coninvolutory dilation of size n + k0.
Lemma 1.1.18 guarantees that A has a coninvolutory dilation of size
n + k if and only if k ≥ k0.
In [TK] several power dilation problems are considered: when does a given
square matrix A have a dilation

    B = [A, ⋆; ⋆, ⋆]

in a given class such that

    B^k = [A^k, ⋆; ⋆, ⋆]  for k = 1, …, N?

Theorem 1.1.22 solves the power dilation problem for N = 1 in the class of
coninvolutory matrices; the situation for N > 1 is an open problem.

1.2 The RE Decomposition
For any nonsingular A ∈ Mn, one checks that Ā^{-1}A is coninvolutory, and it
is known (Lemma (4.6.9) in [HJ1]) that every coninvolutory matrix arises in
this way. Theorem 1.1.11 therefore ensures that there is a coninvolutory E
such that E² = Ā^{-1}A, and there exists such an E that is a polynomial in
Ā^{-1}A. An immediate consequence of this observation is the following RE
decomposition for square nonsingular complex matrices, which is an elaboration
of Theorem (6.4.23) in [HJ2].

1.2.1 The Nonsingular Case: Existence

Theorem 1.2.1 Let A ∈ Mn be nonsingular. There exists a coninvolutory
E0 ∈ Mn such that E0² = Ā^{-1}A and E0 is a polynomial in Ā^{-1}A. If E
is a given coninvolutory matrix such that E² = Ā^{-1}A, then R(A, E) ≡ AĒ
has real entries and A = R(A, E)E. Finally, A commutes with Ā if and
only if there exist R, E ∈ Mn such that A = RE, R has real entries, E is
coninvolutory, and R commutes with E.

Proof. The first assertion has been dealt with. For the second, suppose
EĒ = I and E² = Ā^{-1}A, and compute

    conj(AĒ) = ĀE = Ā(E²)Ē = Ā(Ā^{-1}A)Ē = AĒ.

Thus, R(A, E) ≡ AĒ has real entries and A = AĒE = R(A, E)E. If
A = RE with R real, E coninvolutory, and RE = ER, then AĀ =
RER̄Ē = RERĒ = R²EĒ = R² = R²ĒE = RĒRE = R̄ĒRE = ĀA, so A commutes
with Ā. Conversely, if A commutes with Ā, then A commutes with Ā^{-1},
and hence with Ā^{-1}A. Let E be a coninvolutory matrix such that
E² = Ā^{-1}A and E is a polynomial in Ā^{-1}A, and let R(A, E) ≡ AĒ.
Then A commutes with E, R(A, E) has real entries, A = R(A, E)E,
and ER(A, E) = EAĒ = AEĒ = A = AĒE = R(A, E)E.

Notice that a given A ∈ Mn commutes with Ā if and only if AĀ has
real entries. Although AĀ need not have real entries if n > 1, AĀ is always
similar to a square of a real matrix (see Corollary (4.10) of [HH1]).
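A small worked instance of Theorem 1.2.1 (the example matrices are our own): for A = [2, i; 0, 1] a coninvolutory square root of Ā^{-1}A can be written down by hand, and the resulting factor R is real:

```python
import numpy as np

# RE decomposition of A = [2, i; 0, 1]: here G = conj(A)^{-1} A = [1, i; 0, 1],
# E = [1, i/2; 0, 1] satisfies E^2 = G and E @ conj(E) = I, and R = A conj(E)
# turns out to be the real matrix diag(2, 1), so A = R E.
A = np.array([[2.0, 1.0j], [0.0, 1.0]])
G = np.linalg.inv(np.conj(A)) @ A
E = np.array([[1.0, 0.5j], [0.0, 1.0]])
assert np.allclose(E @ E, G)                    # E^2 = conj(A)^{-1} A
assert np.allclose(E @ np.conj(E), np.eye(2))   # E is coninvolutory
R = A @ np.conj(E)
assert np.allclose(R.imag, 0)                   # R has real entries
assert np.allclose(R @ E, A)                    # A = R E
```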

1.2.2 The Nonsingular Case: Uniqueness

The factors in the classical polar decomposition A = PU of a nonsingular
matrix are always uniquely determined. How well-determined are the factors
in an RE decomposition? One can always write A = RE = (−R)(−E). More
generally, if R1 is a real matrix such that R1² = I and R1 commutes with E,
then A = RE = (RR1)(R1E), RR1 is real, and (R1E) conj(R1E) = R1ER1Ē =
R1²EĒ = I, so R1E is coninvolutory. The following results show that this is
the only non-uniqueness in the factors of a nonsingular RE decomposition.
Lemma 1.2.2 Let $G \in M_n$ be a given coninvolutory matrix, and let $E_0$ be
a given coninvolutory matrix such that $E_0^2 = G$ and $E_0$ is a polynomial in
$G$. If $R \in M_n$ has real entries, $R^2 = I$, and $R$ commutes with $E_0$, then
$F \equiv RE_0$ is coninvolutory and $F^2 = G$. Conversely, if $F$ is a coninvolutory
matrix such that $F^2 = G$, then $R(F,E_0) \equiv F\bar{E}_0$ is a real matrix such that
$F = R(F,E_0)E_0$, $R(F,E_0)^2 = I$, and $R(F,E_0)$ commutes with $E_0$.

Proof: The forward assertion has already been established. For the converse,
let $F$ be coninvolutory and satisfy $F^2 = G = \bar{F}^{-1}F$. Then
Theorem 1.2.1 ensures that $R(F,E_0) \equiv F\bar{E}_0$ has real entries and $F =
R(F,E_0)E_0$. Since $E_0$ is a polynomial in $G = F^2$, both $E_0^{-1} = \bar{E}_0$ and
$R(F,E_0)$ are polynomials in $F$. It follows that both $F$ and $R(F,E_0)$
commute with $E_0$. Thus, $R(F,E_0)^2 = F\bar{E}_0F\bar{E}_0 = F^2(\bar{E}_0)^2 = G\bar{G} = I$,
as desired.
Theorem 1.2.3 Let $A \in M_n$ be nonsingular, let $E_0 \in M_n$ be a given
coninvolutory matrix such that $E_0^2 = \bar{A}^{-1}A$ and $E_0$ is a polynomial in $\bar{A}^{-1}A$,
and let $R(A,E_0) \equiv A\bar{E}_0$. Then $R(A,E_0)$ has real entries and $A = R(A,E_0)E_0$.
If $R \in M_n$ has real entries, $R^2 = I$, and $RE_0 = E_0R$, then $RE_0$ is coninvolutory,
$R(A,E_0)R$ is real, and $A = (R(A,E_0)R)(RE_0)$. Suppose $E_1$,
$R_1 \in M_n$ are given with $E_1$ coninvolutory, $R_1$ real, and $A = R_1E_1$, and
define $R(E_1,E_0) \equiv E_1\bar{E}_0$. Then $R(E_1,E_0)$ is real and commutes with $E_0$,
$R(E_1,E_0)^2 = I$, $R_1 = R(A,E_0)R(E_1,E_0)$, and $E_1 = R(E_1,E_0)E_0$.

Proof: Theorem 1.2.1 ensures that $R(A,E_0)$ is real and $A = R(A,E_0)E_0$.
Moreover, Lemma 1.2.2 ensures that $RE_0$ is coninvolutory and $A =
(R(A,E_0)R)(RE_0)$ is an RE decomposition if $R^2 = I$ and $R$ is real and
commutes with $E_0$. For the last assertions, apply Lemma 1.2.2 to the
coninvolutory matrix $E_1$, for which $E_1^2 = \bar{A}^{-1}A = E_0^2$, and conclude
that $R(E_1,E_0) \equiv E_1\bar{E}_0$ is real, $E_1 = R(E_1,E_0)E_0$, $R(E_1,E_0)$ commutes
with $E_0$, and $R(E_1,E_0)^2 = I$. Then
$$R_1 = A\bar{E}_1 = A\overline{R(E_1,E_0)E_0} = A\bar{E}_0R(E_1,E_0) = R(A,E_0)R(E_1,E_0).$$

If $E_0$ is a given coninvolutory polynomial square root of $\bar{A}^{-1}A$, we see
that the set of all pairs $(R,E)$ with $R$ real, $E$ coninvolutory, and $A = RE$
is precisely $\{(A\bar{E}_0R,\ RE_0) : R$ is real, $R^2 = I$, and $R$ commutes with $E_0\}$.
Depending on the Jordan canonical form of $\mathrm{Log}\,E_0$ (see Theorem (6.4.20)
in [HJ2]), there can be many $R$'s that satisfy these conditions. If we use
Theorem 1.1.11(b) to write $E_0 = e^{iS}$ with $S$ real, and if $S = TCT^{-1}$ is its
real Jordan form (see Theorem (3.4.5) in [HJ1]) with $T$ and $C$ both real and
$C = C_{n_1} \oplus \cdots \oplus C_{n_k}$, then $E_0 = Te^{iC}T^{-1} \equiv T(E_{n_1} \oplus \cdots \oplus E_{n_k})T^{-1}$. Any
matrix of the form $R = T(\pm I_{n_1} \oplus \cdots \oplus \pm I_{n_k})T^{-1}$ is real, commutes with $E_0$,
and satisfies $R^2 = I$.
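A small numerical sketch of this non-uniqueness (added here; the diagonal matrices are illustrative assumptions, not from the text): a diagonal unimodular $E_0$ is coninvolutory, and any real sign matrix $R_1$ commuting with it produces a second RE factorization of the same $A$.

```python
import numpy as np

# E0 = diag(e^{i t1}, e^{i t2}) is coninvolutory since E0 conj(E0) = I.
t1, t2 = 0.7, -1.3
E0 = np.diag(np.exp(1j * np.array([t1, t2])))
R = np.diag([2.0, 5.0])          # a real factor
A = R @ E0                       # one RE decomposition of A

R1 = np.diag([1.0, -1.0])        # real, R1 @ R1 = I, commutes with E0
R_new, E_new = R @ R1, R1 @ E0   # the alternative pair (R R1, R1 E0)

assert np.allclose(E0 @ np.conj(E0), np.eye(2))        # E0 coninvolutory
assert np.allclose(E_new @ np.conj(E_new), np.eye(2))  # so is R1 E0
assert np.allclose(R_new @ E_new, A)                   # same product A
```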

1.2.3 The Singular Case

Every complex matrix, singular or nonsingular, square or not, has a classical
polar decomposition. Does every complex matrix have an RE decomposition?
Notice that if $A = RE$, then $\bar{A} = R\bar{E}$ and $A = \bar{A}E^2$, so $A$ and $\bar{A}$ necessarily
have the same range in this case. However,
$$A_1 \equiv \begin{bmatrix} 1 & i \\ i & -1 \end{bmatrix}$$
and $\bar{A}_1$ have different ranges, so $A_1$ does not have an RE decomposition. The
example
$$A_2 \equiv \begin{bmatrix} 1 & i \\ 0 & 0 \end{bmatrix}$$
has the same range as $\bar{A}_2$, and
$$A_2 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix}$$
is an RE decomposition. Notice that $A_2^T$ and $\bar{A}_2^T$ have different ranges, so
$A_2^T$ does not have an RE decomposition. Thus, it is possible for one, but not
both, of the matrices $A$ and $A^T$ to have an RE decomposition. We now show
that equality of the ranges of $A$ and $\bar{A}$ is sufficient as well as necessary for $A$
to have an RE decomposition.
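A quick numerical check of the two examples (a sketch added here, not in the original): $A$ and $\bar{A}$ have the same range exactly when stacking $\bar{A}$ next to $A$ does not increase the rank.

```python
import numpy as np

def same_range_as_conjugate(A):
    # Range(A) = Range(conj(A)) iff rank([A conj(A)]) == rank(A).
    return (np.linalg.matrix_rank(np.hstack([A, np.conj(A)]))
            == np.linalg.matrix_rank(A))

A1 = np.array([[1.0, 1.0j], [1.0j, -1.0]])   # no RE decomposition
A2 = np.array([[1.0, 1.0j], [0.0, 0.0]])     # has an RE decomposition

assert not same_range_as_conjugate(A1)
assert same_range_as_conjugate(A2)

# The explicit factorization A2 = R E from the text:
R = np.array([[1.0, 0.0], [0.0, 0.0]])
E = np.array([[1.0, 1.0j], [0.0, 1.0]])
assert np.allclose(E @ np.conj(E), np.eye(2))   # E is coninvolutory
assert np.allclose(R @ E, A2)
```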
Theorem 1.2.4 Let $A \in M_{m,n}$ be given. Then there exist a real $R \in M_{m,n}$
and a coninvolutory $E \in M_n$ such that $A = RE$ if and only if $A$ and $\bar{A}$ have
the same range, that is, if and only if there exists a nonsingular $S \in M_n$ such
that $\bar{A} = AS$.

Proof: The necessity of the condition has already been shown, so we only
need to show that it is sufficient.
We first show that the asserted equivalent conditions are both invariant
under right multiplication by a nonsingular matrix.
(i) If $A = RE$, then $AS = RES = R(ES)$ and $ES$ is nonsingular.
Hence $ES = R_1E_1$, with $R_1$ real and $E_1$ coninvolutory, and $AS =
R(R_1E_1) = (RR_1)E_1$ and $RR_1$ is real.
(ii) If $\bar{A} = AS$ and $A_1 = AX$, where $X$ is nonsingular, then $\bar{A}_1 =
\bar{A}\bar{X} = AS\bar{X} = AX(X^{-1}S\bar{X}) = A_1(X^{-1}S\bar{X})$ and $X^{-1}S\bar{X}$ is
nonsingular.
Choose a permutation matrix $P \in M_n$ so that $AP = [A_1 \ \ A_2]$ and
$A_1 \in M_{m,r}$ has full rank. Then $A_2 = A_1B$ for some $B \in M_{r,n-r}$ and
$$AP = [A_1 \ \ 0]\begin{bmatrix} I_r & B \\ 0 & I_{n-r} \end{bmatrix}.$$
Hence, without loss of generality we may assume that $A = [A_1 \ \ 0]$ with
$A_1$ having full rank. Then
$$[\bar{A}_1 \ \ 0] = \bar{A} = AS = [A_1 \ \ 0]\begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} = [A_1S_{11} \ \ A_1S_{12}].$$
Hence, $A_1S_{12} = 0$ and $\bar{A}_1 = A_1S_{11}$, so $A_1 = \bar{A}_1\bar{S}_{11} = A_1S_{11}\bar{S}_{11}$, or
$A_1(I_r - S_{11}\bar{S}_{11}) = 0$. Since $A_1$ has full rank, we conclude that $S_{12} = 0$ and
$I_r = S_{11}\bar{S}_{11}$. Thus, $S_{11}$ is coninvolutory (as is $\bar{S}_{11}$) and Theorem 1.1.11
ensures that $\bar{S}_{11} = E^2$ for some coninvolutory $E$. Taking $R \equiv A_1\bar{E}$, we
have $\bar{R} = \bar{A}_1E = (A_1S_{11})E = A_1\bar{E}^2E = A_1\bar{E} = R$, so $R$ is real and
$A_1 = RE$. Finally, we have
$$A = [A_1 \ \ 0] = [RE \ \ 0] = [R \ \ 0]\begin{bmatrix} E & 0 \\ 0 & I_{n-r} \end{bmatrix}$$
and $A$ has an RE decomposition.

1.2.4 More Factorizations Involving Coninvolutories

The following result may be thought of as an analog of the singular value
decomposition.

Theorem 1.2.5 Every $A \in M_{m,n}$ can be written as $A = E_1RE_2$ where
$E_1 \in M_m$ and $E_2 \in M_n$ are coninvolutory, and $R \in M_{m,n}$ has real entries.

Proof: The singular value decomposition ensures that there exist nonsingular
matrices $X \in M_m$ and $Y \in M_n$ so that $A = XRY$, where $R \in M_{m,n}$
has real entries. Write $X = E_1R_1$ and $Y = R_2E_2$, with each $E_i$ coninvolutory
and each $R_i$ real; then $A = E_1(R_1RR_2)E_2$.

Corollary 1.2.6 Let $A \in M_{m,n}$. Then there exists a coninvolutory $X \in M_m$
such that $XA$ and $\overline{XA}$ have the same range.

Proof: Write $A = E_1RE_2$, as in Theorem 1.2.5, and take $X = \bar{E}_1$, so that
$XA = RE_2$.

Every square complex matrix is known to be consimilar to a real matrix
(Theorem 4.9 of [HH1]), but we can now show that the consimilarity may be
taken to be coninvolutory. As a consequence, the coninvolutory factors $E_1$
and $E_2$ in Theorem 1.2.5 may be taken to be equal when $A$ is square.

Theorem 1.2.7 Every $A \in M_n$ can be written as $A = ERE = ER\bar{E}^{-1}$,
where $E \in M_n$ is coninvolutory and $R \in M_n(\mathbb{R})$.

Proof: Write $A = SR_0\bar{S}^{-1}$ with a nonsingular $S$ and real $R_0$. Write $S =
ER_1$, where $E$ is coninvolutory and $R_1$ is real. Then $A = ERE$, where
$R \equiv R_1R_0R_1^{-1}$ is real.

Chapter 2
The Contragredient Equivalence Relation

2.1 Introduction

We denote the set of m-by-n matrices by $M_{m,n}$ and write $M_n \equiv M_{n,n}$. Given
a scalar $\lambda \in \mathbb{C}$, the n-by-n upper triangular Jordan block corresponding to $\lambda$
is denoted by $J_n(\lambda)$. We say that $A, B \in M_{m,n}$ are equivalent if there are
nonsingular matrices $X \in M_m$ and $Y \in M_n$ such that $A = XBY$.

Definition 2.1.1 Let $A, C \in M_{m,n}$ and $B, D \in M_{n,m}$. We say that $(A,B)$
is contragrediently equivalent to $(C,D)$, and we write $(A,B) \sim (C,D)$, if
there are nonsingular $X \in M_m$ and $Y \in M_n$ such that $XAY^{-1} = C$ and
$YBX^{-1} = D$.

It can be useful to know that contragredient equivalence of two pairs of
matrices can be expressed as a block diagonal similarity of two block matrices:
$$\begin{bmatrix} X & 0 \\ 0 & Y \end{bmatrix}\begin{bmatrix} 0 & A \\ B & 0 \end{bmatrix}\begin{bmatrix} X & 0 \\ 0 & Y \end{bmatrix}^{-1} = \begin{bmatrix} 0 & C \\ D & 0 \end{bmatrix}.$$
It is easy to verify that contragredient equivalence is an equivalence relation
on $M_{m,n} \times M_{n,m}$. In the special case $n = m$ and $B = D = I$, notice that
$(A, I) \sim (C, I)$ if and only if $A$ is similar to $C$.
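The block-diagonal-similarity formulation can be checked numerically; here is a small sketch (added for illustration; the shifted random matrices are an assumption used to get generically nonsingular $X$ and $Y$).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, m))
# Diagonal shifts make X and Y generically nonsingular.
X = rng.standard_normal((m, m)) + 3 * np.eye(m)
Y = rng.standard_normal((n, n)) + 3 * np.eye(n)

C = X @ A @ np.linalg.inv(Y)
D = Y @ B @ np.linalg.inv(X)

Z = np.zeros
blockXY = np.block([[X, Z((m, n))], [Z((n, m)), Y]])
lhs = blockXY @ np.block([[Z((m, m)), A], [B, Z((n, n))]]) @ np.linalg.inv(blockXY)
rhs = np.block([[Z((m, m)), C], [D, Z((n, n))]])
assert np.allclose(lhs, rhs)   # (X + Y) conjugates [0 A; B 0] to [0 C; D 0]
```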
Contragredient equivalence is a natural notion when one studies products
of matrices. If $(A,B) \sim (C,D)$, then
(1) $CD = (XAY^{-1})(YBX^{-1}) = X(AB)X^{-1}$,
(2) $DC = (YBX^{-1})(XAY^{-1}) = Y(BA)Y^{-1}$,
(3) $D(CD)^k = YB(AB)^kX^{-1}$ for all $k = 0, 1, 2, \ldots$, and
(4) $C(DC)^k = XA(BA)^kY^{-1}$ for all $k = 0, 1, 2, \ldots$.
Hence, $AB$ is similar to $CD$, $BA$ is similar to $DC$, $D(CD)^k$ is equivalent
to $B(AB)^k$ and $C(DC)^k$ is equivalent to $A(BA)^k$ for all $k = 0, 1, 2, \ldots$. In
particular, the following rank identities hold for all integers $k \geq 0$:
(1') rank $(AB)^k$ = rank $(CD)^k$
(2') rank $(BA)^k$ = rank $(DC)^k$
(3') rank $B(AB)^k$ = rank $D(CD)^k$
(4') rank $A(BA)^k$ = rank $C(DC)^k$.
Conversely, we shall show that these rank identities, together with either
of the similarity conditions (1) or (2), are sufficient for the contragredient
equivalence of $(A,B)$ and $(C,D)$. We obtain a canonical form for this relation
and use it to study a variety of matrix factorizations involving complex
orthogonal factors.
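The rank identities can be verified directly on a random contragredient pair; the following sketch (an addition, with an assumed random construction) checks (1) and (1')-(4') for small $k$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 3
A = rng.standard_normal((m, n)); B = rng.standard_normal((n, m))
X = rng.standard_normal((m, m)) + 3 * np.eye(m)   # generically nonsingular
Y = rng.standard_normal((n, n)) + 3 * np.eye(n)
C = X @ A @ np.linalg.inv(Y); D = Y @ B @ np.linalg.inv(X)

mpow, mr = np.linalg.matrix_power, np.linalg.matrix_rank

# (1): CD = X (AB) X^{-1}, so AB and CD share a characteristic polynomial.
assert np.allclose(np.poly(A @ B), np.poly(C @ D))

# (1')-(4') for the first few k.
for k in range(3):
    assert mr(mpow(A @ B, k)) == mr(mpow(C @ D, k))
    assert mr(mpow(B @ A, k)) == mr(mpow(D @ C, k))
    assert mr(B @ mpow(A @ B, k)) == mr(D @ mpow(C @ D, k))
    assert mr(A @ mpow(B @ A, k)) == mr(C @ mpow(D @ C, k))
```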

2.2 A Canonical Form for the Contragredient Equivalence Relation

For a given $A \in M_{m,n}$ and $B \in M_{n,m}$, it is our goal to find a canonical pair
$\hat{A} \in M_{m,n}$ and $\hat{B} \in M_{n,m}$ such that $(A,B) \sim (\hat{A},\hat{B})$. Our first step is to
describe a reduction that shows there are only two essential cases to consider:
$AB$ nonsingular and $AB$ nilpotent.
Lemma 2.2.1 Let positive integers $k, n$ be given with $k < n$. Let $Y_1 \in M_{n,k}$
and $P \in M_{k,n}$ be given with $PY_1$ nonsingular. Then there exists a $Y_2 \in
M_{n,n-k}$ such that $PY_2 = 0$ and $[Y_1 \ \ Y_2] \in M_n$ is nonsingular.

Proof: Since $P$ has full (row) rank, its null space in $\mathbb{C}^n$ has dimension $n-k$;
let $Y_2 \in M_{n,n-k}$ be a matrix whose columns form a basis for that null space.
Then $PY_2 = 0$ and $Y_2$ has full (column) rank. Let $\xi \in \mathbb{C}^k$ and
$\eta \in \mathbb{C}^{n-k}$ and suppose
$$[Y_1 \ \ Y_2]\begin{bmatrix} \xi \\ \eta \end{bmatrix} = Y_1\xi + Y_2\eta = 0.$$
Then $0 = PY_1\xi + PY_2\eta = PY_1\xi$, so $\xi = 0$ since $PY_1$ is nonsingular.
Thus, $Y_2\eta = 0$ and $\eta = 0$ since $Y_2$ has full column rank. We conclude
that $[Y_1 \ \ Y_2] \in M_n$ is nonsingular, as desired.
Lemma 2.2.2 Let positive integers $m, n$ be given with $m \geq n$, let $A \in M_{m,n}$
and $B \in M_{n,m}$ be given, and let $k \equiv$ rank $(AB)^m$. Suppose the Jordan
canonical form of $AB$ is $J(AB) \oplus N$, where $J(AB) \in M_k$ is nonsingular if
$k \geq 1$ or is absent if $k = 0$, and $N \in M_{m-k}$ is nilpotent if $k < m$ or is absent
if $k = n = m$.
(a) If $k = 0$, then $AB$ is nilpotent.
(b) If $1 \leq k < n$, there exist nonsingular $X \in M_m$ and $Y \in M_n$ such that
$$XAY^{-1} = \begin{bmatrix} I_k & 0 \\ 0 & \tilde{A} \end{bmatrix} \quad\text{and}\quad YBX^{-1} = \begin{bmatrix} J(AB) & 0 \\ 0 & \tilde{B} \end{bmatrix}$$
and $\tilde{A}\tilde{B} \in M_{m-k}$ is nilpotent.
(c) If $k = n < m$, there exist nonsingular $X \in M_m$ and $Y \in M_n$ such that
$$XAY^{-1} = \begin{bmatrix} I_k \\ 0 \end{bmatrix} \quad\text{and}\quad YBX^{-1} = [J(AB) \ \ 0].$$
(d) If $k = n = m$, there exist nonsingular $X, Y \in M_n$ such that
$XAY^{-1} = I_m$ and $YBX^{-1} = J(AB)$.
Proof: Let $k =$ rank $(AB)^m$ and let $X \in M_m$ be a nonsingular matrix that
reduces $AB$ to Jordan canonical form, that is,
$$XABX^{-1} = \begin{bmatrix} J(AB) & 0 \\ 0 & N \end{bmatrix},$$
where $J(AB) \in M_k$ is nonsingular and $N \in M_{m-k}$ is nilpotent if $k < m$
and is absent if $k = n = m$. If $k = 0$, then $AB$ is nilpotent and we
have case (a).
Suppose $k \geq 1$. Partition $XA$ as
$$XA = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix} \quad\text{with } C_1 \in M_{k,n} \text{ and } C_2 \in M_{m-k,n} \tag{2.2.1}$$
and partition $BX^{-1}$ as
$$BX^{-1} = [D_1 \ \ D_2] \quad\text{with } D_1 \in M_{n,k} \text{ and } D_2 \in M_{n,m-k}. \tag{2.2.2}$$
Compute
$$\begin{bmatrix} J(AB) & 0 \\ 0 & N \end{bmatrix} = XABX^{-1} = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}[D_1 \ \ D_2] = \begin{bmatrix} C_1D_1 & C_1D_2 \\ C_2D_1 & C_2D_2 \end{bmatrix}.$$
Thus, $C_1D_2 = 0$, $C_2D_1 = 0$, and $C_1D_1 = J(AB)$ is nonsingular, as is
$D_1^TC_1^T$.
Now suppose $1 \leq k < n$. Two applications of Lemma 2.2.1 (with
$P = D_1^T$ and $P = C_1$, respectively) ensure that there are $C_3 \in M_{n-k,n}$
and $D_3 \in M_{n,n-k}$ such that
$$Y \equiv \begin{bmatrix} C_1 \\ C_3 \end{bmatrix} \in M_n \quad\text{and}\quad F \equiv [D_1 \ \ D_3] \in M_n$$
are nonsingular and satisfy $C_1D_3 = 0$ and $D_1^TC_3^T = 0$, that is, $C_3D_1 = 0$.
Notice that
$$YF = \begin{bmatrix} C_1 \\ C_3 \end{bmatrix}[D_1 \ \ D_3] = \begin{bmatrix} C_1D_1 & C_1D_3 \\ C_3D_1 & C_3D_3 \end{bmatrix} = \begin{bmatrix} J(AB) & 0 \\ 0 & C_3D_3 \end{bmatrix}$$
is nonsingular, so $C_3D_3$ is nonsingular. We also have
$$XAF = \begin{bmatrix} C_1 \\ C_2 \end{bmatrix}[D_1 \ \ D_3] = \begin{bmatrix} C_1D_1 & C_1D_3 \\ C_2D_1 & C_2D_3 \end{bmatrix} = \begin{bmatrix} J(AB) & 0 \\ 0 & C_2D_3 \end{bmatrix}.$$
Now compute
$$XAY^{-1} = XAFF^{-1}Y^{-1} = (XAF)(YF)^{-1} = \begin{bmatrix} J(AB) & 0 \\ 0 & C_2D_3 \end{bmatrix}\begin{bmatrix} J(AB)^{-1} & 0 \\ 0 & (C_3D_3)^{-1} \end{bmatrix} = \begin{bmatrix} I_k & 0 \\ 0 & \tilde{A} \end{bmatrix}$$
and
$$YBX^{-1} = Y(BX^{-1}) = \begin{bmatrix} C_1 \\ C_3 \end{bmatrix}[D_1 \ \ D_2] = \begin{bmatrix} C_1D_1 & C_1D_2 \\ C_3D_1 & C_3D_2 \end{bmatrix} = \begin{bmatrix} J(AB) & 0 \\ 0 & \tilde{B} \end{bmatrix},$$
where we set $\tilde{A} \equiv C_2D_3(C_3D_3)^{-1}$ and $\tilde{B} \equiv C_3D_2$. Since
$$\begin{bmatrix} J(AB) & 0 \\ 0 & N \end{bmatrix} = XABX^{-1} = (XAY^{-1})(YBX^{-1}) = \begin{bmatrix} I_k & 0 \\ 0 & \tilde{A} \end{bmatrix}\begin{bmatrix} J(AB) & 0 \\ 0 & \tilde{B} \end{bmatrix} = \begin{bmatrix} J(AB) & 0 \\ 0 & \tilde{A}\tilde{B} \end{bmatrix},$$
we have $\tilde{A}\tilde{B} = N$, as desired for case (b).
If $k = n < m$, then $C_1, D_1 \in M_n$ in (2.2.1) and (2.2.2) and the identity
$C_1D_1 = J(AB)$ shows that $C_1$ and $D_1$ are nonsingular. Since $C_1D_2 = 0$
and $C_2D_1 = 0$, it follows that $D_2 = 0$ and $C_2 = 0$. If we set $Y \equiv C_1$
and $F \equiv D_1$, we have $YF = C_1D_1 = J(AB)$ and
$$XAF = \begin{bmatrix} C_1 \\ 0 \end{bmatrix}D_1 = \begin{bmatrix} C_1D_1 \\ 0 \end{bmatrix} = \begin{bmatrix} J(AB) \\ 0 \end{bmatrix}.$$
Thus,
$$XAY^{-1} = (XAF)(YF)^{-1} = \begin{bmatrix} J(AB) \\ 0 \end{bmatrix}J(AB)^{-1} = \begin{bmatrix} I_k \\ 0 \end{bmatrix}$$
and
$$YBX^{-1} = C_1[D_1 \ \ D_2] = [C_1D_1 \ \ C_1D_2] = [J(AB) \ \ 0],$$
which is the assertion in (c).
Finally, suppose $k = n = m$. Then $XA = C_1$ and $BX^{-1} = D_1$ are
nonsingular. For $Y \equiv C_1$, we have $XAY^{-1} = I_m$ and $YBX^{-1} =
C_1D_1 = J(AB)$, which is case (d).
For a given $A \in M_{m,n}$ and $B \in M_{n,m}$, the four cases (a)-(d) in Lemma
2.2.2 are exhaustive and mutually exclusive. The special forms achieved by
the contragredient equivalences in (c) and (d) are the canonical forms we
seek, but more work is required to achieve a canonical form in (a). Once
that is achieved, however, the special form of the reduction in (b) shows that
a canonical form in this case can be expressed as a direct sum of the canonical
forms for cases (a) and (d): If
$$XAY^{-1} = \begin{bmatrix} I_k & 0 \\ 0 & \tilde{A} \end{bmatrix} \quad\text{and}\quad YBX^{-1} = \begin{bmatrix} J(AB) & 0 \\ 0 & \tilde{B} \end{bmatrix}$$
and if $X_1 \equiv I_k \oplus F_1 \in M_m$ and $Y_1 \equiv I_k \oplus G_1 \in M_n$ are nonsingular, then
$$(X_1X)A(Y_1Y)^{-1} = \begin{bmatrix} I_k & 0 \\ 0 & F_1\tilde{A}G_1^{-1} \end{bmatrix}
\quad\text{and}\quad
(Y_1Y)B(X_1X)^{-1} = \begin{bmatrix} J(AB) & 0 \\ 0 & G_1\tilde{B}F_1^{-1} \end{bmatrix}$$
and $(F_1\tilde{A}G_1^{-1})(G_1\tilde{B}F_1^{-1}) = F_1(\tilde{A}\tilde{B})F_1^{-1}$ is nilpotent. Thus, if $F_1$ and $G_1$ are
chosen to put $(\tilde{A},\tilde{B})$ into canonical form, we shall have achieved a canonical
form for $(A,B)$.
In order to achieve a canonical form for $(A,B)$ under contragredient
equivalence, we see that there are only two essential cases: $AB$ is nonsingular or
$AB$ is nilpotent. Before attacking the second case, it is convenient to
summarize what we have learned about the first.

Lemma 2.2.3 Let $m, n$ be given positive integers with $m \geq n$, and let
$A, C \in M_{m,n}$ and $B, D \in M_{n,m}$ be given. Suppose rank $AB$ = rank $(AB)^m$ =
rank $CD$ = rank $(CD)^m$ = $n$. Then $(A,B) \sim (C,D)$ if and only if $AB$ is
similar to $CD$.

Proof: The forward implication is immediate (and does not require the rank
conditions). For the converse, the hypotheses ensure that the respective
Jordan canonical forms of $AB$ and $CD$ are $J(AB) \oplus 0_{m-n}$ and $J(CD) \oplus
0_{m-n}$, and one may arrange the Jordan blocks in the nonsingular factor
$J(CD) \in M_n$ so that $J(AB) = J(CD)$. Inspection of the canonical
forms in Lemma 2.2.2 (c) and (d) now shows that $(A,B) \sim (C,D)$.

With the nonsingular case disposed of, we now begin a discussion of the
nilpotent case. Let $m \geq n$, $A \in M_{m,n}$, and $B \in M_{n,m}$ be given and suppose
$AB$ (and hence also $BA$) is nilpotent. Then among all the nonzero finite
alternating products of the form
Type I: $\cdots ABA$ (ending in $A$)
Type II: $\cdots BAB$ (ending in $B$)
there is a longest one (possibly two, one of each type). Suppose that a longest
one is of Type I. There are two possibilities, depending on the parity of a
longest product.
Case 1. Odd parity, $k \geq 1$: $A(BA)^{k-1} \neq 0$, $(BA)^k = 0$.
Since we assumed that a longest product was of Type I, we must
have $(AB)^k = 0$ in this case.
Case 2. Even parity, $k \geq 2$: $(BA)^{k-1} \neq 0$, $A(BA)^{k-1} = 0$.
Again, since a longest nonzero product was assumed to be of Type I,
we must have $B(AB)^{k-1} = 0$.
We now consider these two cases in turn.
Type I, Case 1. If $k = 1$, we set $(AB)^{k-1} \equiv I_m$, even if $AB = 0$. Let $x \in \mathbb{C}^n$
and $y \in \mathbb{C}^m$ be such that $y^*A(BA)^{k-1}x \neq 0$, and define
$$Z_1 \equiv [x \ \ BAx \ \ \ldots \ \ (BA)^{k-1}x] \in M_{n,k},$$
$$Y_1 \equiv AZ_1 = [Ax \ \ A(BA)x \ \ \ldots \ \ A(BA)^{k-1}x] \in M_{m,k},$$
and
$$P \equiv \begin{bmatrix} y^* \\ y^*AB \\ \vdots \\ y^*(AB)^{k-1} \end{bmatrix} \in M_{k,m}
\quad\text{and}\quad
Q \equiv PA = \begin{bmatrix} y^*A \\ y^*(AB)A \\ \vdots \\ y^*(AB)^{k-1}A \end{bmatrix} \in M_{k,n}.$$
Then $PY_1 = QZ_1 \in M_k$: its $(i,j)$ entry is $y^*A(BA)^{i+j-2}x$, which vanishes
when $i + j > k + 1$ and equals $y^*A(BA)^{k-1}x$ when $i + j = k + 1$. Thus $PY_1$
is anti-triangular with constant nonzero anti-diagonal, so it is nonsingular since
$y^*A(BA)^{k-1}x \neq 0$. Moreover, we have the following identities:
(a) $AZ_1 = Y_1$
(b) $PA = Q$
(c) $BY_1 = Z_1J_k^T(0)$
(d) $QB = J_k(0)P$.    (2.2.3)
Suppose $k < n$. Then Lemma 2.2.1 ensures that there exist $Y_2 \in M_{m,m-k}$
and $Z_2 \in M_{n,n-k}$ such that
(a) $Y \equiv [Y_1 \ \ Y_2] \in M_m$ is nonsingular, and
(b) $PY_2 = 0$;
(c) $Z \equiv [Z_1 \ \ Z_2] \in M_n$ is nonsingular, and
(d) $QZ_2 = 0$.    (2.2.4)
Set $C \equiv Y^{-1}AZ_2$ and partition
$$C = \begin{bmatrix} C_1 \\ A_2 \end{bmatrix} \quad\text{with } C_1 \in M_{k,n-k} \text{ and } A_2 \in M_{m-k,n-k}.$$
Then
$$AZ_2 = YC = [Y_1 \ \ Y_2]\begin{bmatrix} C_1 \\ A_2 \end{bmatrix} = Y_1C_1 + Y_2A_2. \tag{2.2.5}$$
Now, use (2.2.5) and the identities (2.2.4b), (2.2.3b) and (2.2.4d) to compute
$$PAZ_2 = P(AZ_2) = PY_1C_1 + (PY_2)A_2 = (PY_1)C_1$$
and
$$PAZ_2 = (PA)Z_2 = QZ_2 = 0.$$
Since $PY_1$ is nonsingular, we conclude that $C_1 = 0$. Thus, (2.2.5) simplifies
to
$$AZ_2 = Y_2A_2 \tag{2.2.6}$$
and we can use (2.2.3a) and (2.2.6) to compute
$$A[Z_1 \ \ Z_2] = [AZ_1 \ \ AZ_2] = [Y_1 \ \ Y_2]\begin{bmatrix} I_k & 0 \\ 0 & A_2 \end{bmatrix}$$
so that
$$Y^{-1}AZ = \begin{bmatrix} I_k & 0 \\ 0 & A_2 \end{bmatrix}. \tag{2.2.7}$$
Now, set $D \equiv Z^{-1}BY_2$ and partition
$$D = \begin{bmatrix} D_1 \\ B_2 \end{bmatrix} \quad\text{with } D_1 \in M_{k,m-k} \text{ and } B_2 \in M_{n-k,m-k}.$$
Again, notice that
$$BY_2 = ZD = [Z_1 \ \ Z_2]\begin{bmatrix} D_1 \\ B_2 \end{bmatrix} = Z_1D_1 + Z_2B_2. \tag{2.2.8}$$
Now use (2.2.8) and the identities (2.2.4d), (2.2.3d) and (2.2.4b) to compute
$$QBY_2 = Q(BY_2) = QZ_1D_1 + (QZ_2)B_2 = (QZ_1)D_1$$
and
$$QBY_2 = (QB)Y_2 = J_k(0)(PY_2) = 0.$$
Since $QZ_1$ is nonsingular, we conclude that $D_1 = 0$. Hence, (2.2.8) simplifies
to
$$BY_2 = Z_2B_2, \tag{2.2.9}$$
and we can use (2.2.3c) and (2.2.9) to compute
$$B[Y_1 \ \ Y_2] = [BY_1 \ \ BY_2] = [Z_1 \ \ Z_2]\begin{bmatrix} J_k^T(0) & 0 \\ 0 & B_2 \end{bmatrix}$$
and we have
$$Z^{-1}BY = \begin{bmatrix} J_k^T(0) & 0 \\ 0 & B_2 \end{bmatrix}. \tag{2.2.10}$$
In this case, notice that $A_2 \in M_{m-k,n-k}$ and $B_2 \in M_{n-k,m-k}$. Moreover,
$(A_2B_2)^k = 0$ and $(B_2A_2)^k = 0$.

If $k = n < m$, then $Z_2$ is absent in (2.2.4c) and $Z = Z_1$ is nonsingular; $Q$
is also nonsingular since $QZ_1$ is nonsingular. The identities (2.2.3), (2.2.4a),
and (2.2.4b) still hold. Now,
$$AZ = Y_1 = [Y_1 \ \ Y_2]\begin{bmatrix} I_k \\ 0 \end{bmatrix} = Y\begin{bmatrix} I_k \\ 0 \end{bmatrix}$$
so that (2.2.7) reduces to
$$Y^{-1}AZ = \begin{bmatrix} I_k \\ 0 \end{bmatrix}. \tag{2.2.11}$$
Using (2.2.3d) and (2.2.4b), we have
$$QBY_2 = (QB)Y_2 = J_k(0)(PY_2) = 0.$$
Since $Q$ is nonsingular, we have $BY_2 = 0$. Hence,
$$BY = B[Y_1 \ \ Y_2] = [BY_1 \ \ BY_2] = [ZJ_k^T(0) \ \ 0] = Z[J_k^T(0) \ \ 0]$$
and (2.2.10) becomes
$$Z^{-1}BY = [J_k^T(0) \ \ 0]. \tag{2.2.12}$$
If $k = n = m$, then $Y_2$ and $Z_2$ are absent; $Y = Y_1$, $Z = Z_1$, $Q$ and $P \in M_n$
are all nonsingular. The identities (2.2.3) still hold and
$$Y^{-1}AZ = I_k \quad\text{and}\quad Z^{-1}BY = J_k^T(0). \tag{2.2.13}$$
Notice that if $k = n \leq m$, then we may take $A_2 = 0$ in (2.2.7) and $B_2 = 0$
in (2.2.10).
Type I, Case 2. We proceed as in Case 1. Let $x, y \in \mathbb{C}^n$ be such
that $y^*(BA)^{k-1}x \neq 0$. In this case, we have $k \geq 2$, $A(BA)^{k-1} = 0$, and
$B(AB)^{k-1} = 0$. Let
$$Z_1 \equiv [x \ \ BAx \ \ \ldots \ \ (BA)^{k-2}x \ \ (BA)^{k-1}x] \in M_{n,k},$$
$$Y_1 \equiv [Ax \ \ A(BA)x \ \ \ldots \ \ A(BA)^{k-2}x] \in M_{m,k-1},$$
and
$$P \equiv \begin{bmatrix} y^*B \\ y^*BAB \\ \vdots \\ y^*B(AB)^{k-2} \end{bmatrix} \in M_{k-1,m}
\quad\text{and}\quad
Q \equiv \begin{bmatrix} y^* \\ y^*BA \\ \vdots \\ y^*(BA)^{k-2} \\ y^*(BA)^{k-1} \end{bmatrix} \in M_{k,n}.$$
We define
$$H_k \equiv [I_{k-1} \ \ 0] \in M_{k-1,k} \quad\text{and}\quad K_k \equiv [0 \ \ I_{k-1}] \in M_{k-1,k}. \tag{2.2.14}$$
Calculations similar to those in Case 1 show that $PY_1 \in M_{k-1}$ and $QZ_1 \in M_k$
are nonsingular. Moreover, we have the following identities:
(a) $AZ_1 = Y_1H_k$
(b) $PA = K_kQ$
(c) $BY_1 = Z_1K_k^T$
(d) $QB = H_k^TP$.    (2.2.15)

Suppose $k < n$. Then Lemma 2.2.1 guarantees that there exist a $Y_2 \in
M_{m,m-k+1}$ and a $Z_2 \in M_{n,n-k}$ such that
(a) $Y \equiv [Y_1 \ \ Y_2] \in M_m$ is nonsingular, and
(b) $PY_2 = 0$;
(c) $Z \equiv [Z_1 \ \ Z_2] \in M_n$ is nonsingular, and
(d) $QZ_2 = 0$.    (2.2.16)
As in Case 1, we set $C \equiv Y^{-1}AZ_2$. Similar calculations yield
$$C = \begin{bmatrix} 0 \\ A_2 \end{bmatrix} \quad\text{with } A_2 \in M_{m-k+1,n-k},$$
and
$$AZ_2 = Y_2A_2. \tag{2.2.17}$$
Using the identities (2.2.15a) and (2.2.17) we have
$$A[Z_1 \ \ Z_2] = [AZ_1 \ \ AZ_2] = [Y_1H_k \ \ Y_2A_2] = [Y_1 \ \ Y_2]\begin{bmatrix} H_k & 0 \\ 0 & A_2 \end{bmatrix}$$
so that
$$Y^{-1}AZ = \begin{bmatrix} H_k & 0 \\ 0 & A_2 \end{bmatrix}. \tag{2.2.18}$$
Again, as in Case 1, set $D \equiv Z^{-1}BY_2$ and use similar calculations to
obtain
$$D = \begin{bmatrix} 0 \\ B_2 \end{bmatrix} \quad\text{with } B_2 \in M_{n-k,m-k+1}$$
so that
$$BY_2 = Z_2B_2. \tag{2.2.19}$$
Using the identities (2.2.15c) and (2.2.19), we have
$$B[Y_1 \ \ Y_2] = [BY_1 \ \ BY_2] = [Z_1K_k^T \ \ Z_2B_2] = [Z_1 \ \ Z_2]\begin{bmatrix} K_k^T & 0 \\ 0 & B_2 \end{bmatrix}$$
and hence
$$Z^{-1}BY = \begin{bmatrix} K_k^T & 0 \\ 0 & B_2 \end{bmatrix}. \tag{2.2.20}$$
In this case, notice that $A_2 \in M_{m-k+1,n-k}$ and $B_2 \in M_{n-k,m-k+1}$. Moreover,
$B_2(A_2B_2)^{k-1} = 0$ and $A_2(B_2A_2)^{k-1} = 0$.
If $k = n \leq m$, then $Z_2$ is absent and $Z = Z_1 \in M_n$ is nonsingular. Since
$QZ_1$ is nonsingular, $Q$ is also nonsingular. The identities (2.2.15), (2.2.16a),
and (2.2.16b) still hold. Hence,
$$AZ = AZ_1 = Y_1H_k = [Y_1 \ \ Y_2]\begin{bmatrix} H_k \\ 0 \end{bmatrix}$$
so that (2.2.18) becomes
$$Y^{-1}AZ = \begin{bmatrix} H_k \\ 0 \end{bmatrix}. \tag{2.2.21}$$
Using (2.2.15d) and (2.2.16b), we have
$$QBY_2 = (QB)Y_2 = H_k^T(PY_2) = 0.$$
Hence, $BY_2 = 0$ since $Q$ is nonsingular. Using this and (2.2.15c), we have
$$BY = B[Y_1 \ \ Y_2] = [BY_1 \ \ BY_2] = [Z_1K_k^T \ \ 0] = Z_1[K_k^T \ \ 0]$$
so that (2.2.20) becomes
$$Z^{-1}BY = [K_k^T \ \ 0]. \tag{2.2.22}$$
Again, notice that if $k = n \leq m$, then we may take $A_2 = 0$ in (2.2.18)
and $B_2 = 0$ in (2.2.20).
If a longest nonzero alternating product is of Type II, then again there are
two possibilities, depending on the parity of a longest product. Our analysis
of both cases holds with $A$ and $B$ interchanged. In particular, we have the
following results:
Type II, Case 1. Odd parity, $k \geq 1$: $B(AB)^{k-1} \neq 0$, $(AB)^k = 0$ and
$(BA)^k = 0$. There exist nonsingular $Y \in M_m$ and $Z \in M_n$ such that
(a) if $k < n$,
$$Z^{-1}BY = \begin{bmatrix} I_k & 0 \\ 0 & B_2 \end{bmatrix} \quad\text{for some } B_2 \in M_{n-k,m-k} \tag{2.2.23}$$
and
$$Y^{-1}AZ = \begin{bmatrix} J_k^T(0) & 0 \\ 0 & A_2 \end{bmatrix} \quad\text{for some } A_2 \in M_{m-k,n-k}; \tag{2.2.24}$$
(b) if $k = n \leq m$, then we may take $B_2 = 0$ in (2.2.23) and $A_2 = 0$ in
(2.2.24).
Type II, Case 2. Even parity, $k \geq 2$: $(AB)^{k-1} \neq 0$, $B(AB)^{k-1} = 0$ and
$A(BA)^{k-1} = 0$. There exist nonsingular $Y \in M_m$ and $Z \in M_n$ such that
(a) if $k < n$, or if $k = n < m$,
$$Z^{-1}BY = \begin{bmatrix} H_k & 0 \\ 0 & B_2 \end{bmatrix} \quad\text{for some } B_2 \in M_{n-k+1,m-k} \tag{2.2.25}$$
and
$$Y^{-1}AZ = \begin{bmatrix} K_k^T & 0 \\ 0 & A_2 \end{bmatrix} \quad\text{for some } A_2 \in M_{m-k,n-k+1}; \tag{2.2.26}$$
(b) if $k = n \leq m$, then we may take $B_2 = 0$ in (2.2.25) and $A_2 = 0$ in
(2.2.26).
In all four possible combinations of types and cases, our analysis can be
applied again to the matrices A2 and B2 . Iteration of this process leads to
the following result.
Theorem 2.2.4 Let positive integers $m, n$ be given with $m \geq n$, let $A \in
M_{m,n}$ and $B \in M_{n,m}$ be given, and let $k \equiv$ rank $(AB)^m$. Then there exist
nonsingular $X \in M_m$ and $Y \in M_n$ such that
$$XAY^{-1} = J_A \oplus A_1 \oplus \cdots \oplus A_p \oplus 0 \tag{2.2.27}$$
and
$$YBX^{-1} = J_B \oplus B_1 \oplus \cdots \oplus B_p \oplus 0, \tag{2.2.28}$$
where the final zero blocks (possibly absent) absorb any remaining rows and
columns, and
(a) $J_A, J_B \in M_k$ are nonsingular and $J_AJ_B$ is similar to $J(AB)$, the
nonsingular part of the Jordan canonical form of $AB$,
(b) $A_i \in M_{m_i,n_i}$ and $B_i \in M_{n_i,m_i}$ for all $i = 1, \ldots, p$,
(c) $\max\{m_i, n_i\} \geq \max\{m_{i+1}, n_{i+1}\}$ for all $i = 1, \ldots, p-1$,
(d) $|m_i - n_i| \leq 1$ for all $i = 1, \ldots, p$,
(e) each $(A_i, B_i) \in \{(I_{m_i}, J_{m_i}^T(0)),\ (J_{m_i}^T(0), I_{m_i}),\ (H_{m_i}, K_{m_i}^T),\ (K_{m_i}^T, H_{m_i})\}$,
(f) $H_j \equiv [I_{j-1} \ \ 0] \in M_{j-1,j}$ and $K_j \equiv [0 \ \ I_{j-1}] \in M_{j-1,j}$, and
(g) $H_jK_j^T = J_{j-1}^T(0)$ and $K_j^TH_j = J_j^T(0)$.
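The structural identities in (f)-(g) are easy to confirm numerically; the following sketch (added here) builds $H_j$, $K_j$, and the transposed Jordan block and checks both products for $j = 4$.

```python
import numpy as np

def H(j):   # H_j = [I_{j-1} 0] in M_{j-1,j}
    return np.hstack([np.eye(j - 1), np.zeros((j - 1, 1))])

def K(j):   # K_j = [0 I_{j-1}] in M_{j-1,j}
    return np.hstack([np.zeros((j - 1, 1)), np.eye(j - 1)])

def JT(j):  # J_j(0)^T: ones on the subdiagonal (lower shift)
    return np.eye(j, k=-1)

j = 4
assert np.allclose(H(j) @ K(j).T, JT(j - 1))   # H_j K_j^T = J_{j-1}^T(0)
assert np.allclose(K(j).T @ H(j), JT(j))       # K_j^T H_j = J_j^T(0)
```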

When $m = n$, $A = I_n$, and $B \in M_n$ is nilpotent, the canonical contragredient
equivalence guaranteed by Theorem 2.2.4 gives $XY^{-1} = XAY^{-1} = I_n$
(so that $X = Y$), and $XBY^{-1} = YBY^{-1}$ reduces to the Jordan canonical
form of $B$. This result, together with Theorem (2.4.8) of [HJ1], leads to a
proof of the Jordan canonical form of a square matrix. For a more direct
exposition of this idea, see the Appendix.
Definition 2.2.5 Let positive integers $m, n$ be given with $m \geq n$, and let
$A \in M_{m,n}$ and $B \in M_{n,m}$ be given. Define
$$\Delta(A,B) \equiv \{\text{rank } A,\ \text{rank } BA,\ \text{rank } ABA,\ \ldots,\ \text{rank } (AB)^{m-1}A,\ \text{rank } (BA)^m,$$
$$\text{rank } B,\ \text{rank } AB,\ \text{rank } BAB,\ \ldots,\ \text{rank } (BA)^{m-1}B,\ \text{rank } (AB)^m\}.$$
The reason for introducing $\Delta(A,B)$ is that the parts of the canonical form
in Theorem 2.2.4 that contribute to the nilpotent part of $AB$ are completely
determined by the sequence $\Delta(A,B)$.
Consider first what happens if A and B consist of only one canonical
block. For each choice of the possible pairs of canonical blocks, Table 1 gives
the ranks of the indicated products.

                          (Type, Case)
                   (I, 1)        (I, 2)      (II, 1)       (II, 2)
A                  $I_k$         $H_k$       $J_k^T(0)$    $K_k^T$
B                  $J_k^T(0)$    $K_k^T$     $I_k$         $H_k$
rank $(AB)^k$          0             0           0             0
rank $(BA)^k$          0             0           0             0
rank $A(BA)^{k-1}$     1             0           0             0
rank $B(AB)^{k-1}$     0             0           1             0
rank $(AB)^{k-1}$      1             0           1             1
rank $(BA)^{k-1}$      1             1           1             0

Table 1: Ranks of Products by Type and Case
Suppose $AB$ is nilpotent and we are given the sequence $\Delta(A,B)$. How can
we use $\Delta(A,B)$ to determine the $A_i$ in (2.2.27) and the $B_i$ in (2.2.28)? Now
consider the general case of nilpotent $AB$. Let $k$ be the smallest integer such
that $(AB)^k = 0$ and $(BA)^k = 0$. Then $l_1 \equiv$ rank $A(BA)^{k-1}$ is the number of
$I_k$'s in the canonical form for $A$, and hence is also the number of $J_k^T(0)$'s in
the canonical form for $B$, since only (Type I, Case 1) has a nonzero entry in
the row corresponding to $A(BA)^{k-1}$. In particular, $l_1$ of the $A_i$'s are $I_k$ and
each of the $l_1$ corresponding $B_i$'s are $J_k^T(0)$. Similarly, $l_2 \equiv$ rank $B(AB)^{k-1}$
is the number of $A_i$'s equal to $J_k^T(0)$, and also the number of corresponding
$B_i$'s equal to $I_k$ (Type II, Case 1). Notice that the row corresponding to
$(AB)^{k-1}$ has three nonzero entries: (Type I, Case 1), (Type II, Case 1), and
(Type II, Case 2). Hence, if $l_3 \equiv$ rank $(AB)^{k-1}$, then $l_3 - l_1 - l_2$ of the $A_i$'s
are $K_k^T$ and the same number of $B_i$'s are $H_k$, corresponding to (Type II,
Case 2). Similarly, if $l_4 \equiv$ rank $(BA)^{k-1}$, then $l_4 - l_1 - l_2$ of the $A_i$'s are
$H_k$ and the same number of corresponding $B_i$'s are $K_k^T$ (Type I, Case 2).
If $k > 1$, the minimality of $k$ implies that $(AB)^{k-1} \neq 0$ or $(BA)^{k-1} \neq 0$.
Hence, one of $l_1, l_2, l_3, l_4$ must be a positive integer. The only case in which
$l_1 = l_2 = l_3 = l_4 = 0$ is when $k = 1$, $A = 0$, and $B = 0$.
Let $A_1$ be a matrix of the form (2.2.27) having the first $l_1$ of the $A_i$'s
equal to $I_k$, the next $l_2$ of the $A_i$'s equal to $J_k^T(0)$, the next $l_3 - l_1 - l_2$ of
the $A_i$'s equal to $K_k^T$, and $l_4 - l_1 - l_2$ of the $A_i$'s equal to $H_k$. Let $B_1$ be
a corresponding matrix of the form (2.2.28), that is, the first $l_1$ of the $B_i$'s
are $J_k^T(0)$, the next $l_2$ of the $B_i$'s are $I_k$, the next $l_3 - l_1 - l_2$ of the $B_i$'s
are $H_k$, and the next $l_4 - l_1 - l_2$ of the $B_i$'s are $K_k^T$. Subtracting the entries
of the sequence $\Delta(A_1,B_1)$ from corresponding entries of $\Delta(A,B)$ gives a new
sequence of ranks corresponding to the remaining part of the canonical forms,
whereupon we can repeat the foregoing construction.
The following procedure summarizes a way of determining a canonical
form of $(A,B)$ under contragredient equivalence when $AB$ is nilpotent:
- Compute the sequence $\Delta(A,B)$ and examine it to determine the smallest
integer $k$ such that $(AB)^k = 0$ and $(BA)^k = 0$.
- Let $A_1$ have the form (2.2.27), with $l_1$ of the $A_i$'s equal to $I_k$, $l_2$ of the
$A_i$'s equal to $J_k^T(0)$, $l_3 - l_1 - l_2$ of the $A_i$'s equal to $K_k^T$, and $l_4 - l_1 - l_2$ of
the $A_i$'s equal to $H_k$. Let $B_1$ be the corresponding matrix of the form
(2.2.28). Compute $\Delta(A_1,B_1)$.
- Replace $\Delta(A,B)$ by the sequence formed by the differences of corresponding
entries of the sequences $\Delta(A,B)$ and $\Delta(A_1,B_1)$.
- Repeat the preceding three steps until all the entries in the sequence
are 0.
The following result is immediate.
Lemma 2.2.6 Let integers $m, n$ be given with $m \geq n$, and suppose $A, C \in
M_{m,n}$ and $B, D \in M_{n,m}$. Suppose that $AB$ is nilpotent. Then $(A,B) \sim (C,D)$
if and only if $\Delta(A,B) = \Delta(C,D)$.

Corollary 2.2.7 Let integers $m, n$ be given and suppose $A, C \in M_{m,n}$ and
$B, D \in M_{n,m}$. Then $(A,B) \sim (C,D)$ if and only if the following conditions
are satisfied:
(1) $AB$ is similar to $CD$, and
(2) for all $l = 0, 1, \ldots, \max\{m,n\}$,
(a) rank $(BA)^l$ = rank $(DC)^l$,
(b) rank $A(BA)^l$ = rank $C(DC)^l$, and
(c) rank $B(AB)^l$ = rank $D(CD)^l$.
Proof: The forward implication is easily verified and was noted in the
Introduction. For the converse, we may assume without loss of generality
that $m \geq n$. Theorem 2.2.4 guarantees that there exist nonsingular
$X_1, X_2 \in M_m$, $Y_1, Y_2 \in M_n$, and a nonnegative integer $k =$
rank $(AB)^m$ = rank $(CD)^m$ such that
$$X_1AY_1^{-1} = \begin{bmatrix} A_1 & 0 \\ 0 & \tilde{A} \end{bmatrix} \quad\text{and}\quad Y_1BX_1^{-1} = \begin{bmatrix} B_1 & 0 \\ 0 & \tilde{B} \end{bmatrix}$$
and
$$X_2CY_2^{-1} = \begin{bmatrix} C_1 & 0 \\ 0 & \tilde{C} \end{bmatrix} \quad\text{and}\quad Y_2DX_2^{-1} = \begin{bmatrix} D_1 & 0 \\ 0 & \tilde{D} \end{bmatrix},$$
where $A_1, B_1, C_1, D_1 \in M_k$ are nonsingular and $\tilde{A}\tilde{B}$ and $\tilde{C}\tilde{D}$ are
nilpotent. Since $AB$ is similar to $CD$, the nonsingular parts of their
respective Jordan canonical forms are the same. Therefore, $A_1B_1$ is similar
to $C_1D_1$ and hence Lemma 2.2.3 guarantees that $(A_1,B_1) \sim (C_1,D_1)$.
Moreover, (1) also ensures that rank $(AB)^l$ = rank $(CD)^l$ for all
integers $l = 0, 1, \ldots, m$. These identities and the rank identities (2) ensure
that $\Delta(\tilde{A},\tilde{B}) = \Delta(\tilde{C},\tilde{D})$, so $(\tilde{A},\tilde{B}) \sim (\tilde{C},\tilde{D})$ by Lemma 2.2.6,
and hence $(A,B) \sim (C,D)$.

Corollary 2.2.8 Let positive integers $m, n$ be given with $m \geq n$, and let
$A \in M_{m,n}$ and $B \in M_{n,m}$ be given. Then $(A,B) \sim (C,D)$, where
$$C = J_A \oplus A_{\tau(1)} \oplus \cdots \oplus A_{\tau(p)} \oplus 0 \tag{2.2.29}$$
and
$$D = J_B \oplus B_{\tau(1)} \oplus \cdots \oplus B_{\tau(p)} \oplus 0, \tag{2.2.30}$$
$J_A, J_B, A_i, B_i$, $i = 1, \ldots, p$, are as in Theorem 2.2.4, and $\tau$ is any
permutation of $\{1, \ldots, p\}$.

Proof: Check that the conditions of Corollary 2.2.7 are satisfied.
Definition 2.2.9 For $A \in M_{m,n}$ and $B \in M_{n,m}$, we define
$$\varphi(A,B) \equiv \begin{bmatrix} 0 & A \\ B & 0 \end{bmatrix}.$$
Let $k \geq 0$ be a given integer. Then $\varphi(A,B)^{2k} = (AB)^k \oplus (BA)^k$ and
$\varphi(A,B)^{2k+1} = \varphi(A(BA)^k, B(AB)^k)$. Hence,
$$\text{rank }\varphi(A,B)^{2k} = \text{rank }(AB)^k + \text{rank }(BA)^k \tag{2.2.31}$$
and
$$\text{rank }\varphi(A,B)^{2k+1} = \text{rank }A(BA)^k + \text{rank }B(AB)^k. \tag{2.2.32}$$
If $AB$ is similar to $CD$, $\varphi(A,B)$ is similar to $\varphi(C,D)$, and rank $A(BA)^l$ =
rank $C(DC)^l$ for all integers $l \geq 0$, then the conditions of Corollary 2.2.7 are
satisfied, and hence $(A,B) \sim (C,D)$. Conversely, all of these conditions are
satisfied if $(A,B) \sim (C,D)$.
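The power and rank formulas (2.2.31)-(2.2.32) are exact matrix identities and can be spot-checked numerically; the following sketch (an addition, with assumed random inputs) verifies them for small $k$.

```python
import numpy as np

def phi(A, B):
    m, n = A.shape
    return np.block([[np.zeros((m, m)), A], [B, np.zeros((n, n))]])

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 2)); B = rng.standard_normal((2, 3))
P = phi(A, B)
mpow, mr = np.linalg.matrix_power, np.linalg.matrix_rank

for k in range(3):
    # (2.2.31): rank phi^{2k} = rank (AB)^k + rank (BA)^k
    assert mr(mpow(P, 2 * k)) == mr(mpow(A @ B, k)) + mr(mpow(B @ A, k))
    # (2.2.32): rank phi^{2k+1} = rank A(BA)^k + rank B(AB)^k
    assert mr(mpow(P, 2 * k + 1)) == mr(A @ mpow(B @ A, k)) + mr(B @ mpow(A @ B, k))
```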

Corollary 2.2.10 Let integers $m, n$ be given and suppose $A, C \in M_{m,n}$ and
$B, D \in M_{n,m}$. Then $(A,B) \sim (C,D)$ if and only if the following conditions
are satisfied:
(1) $AB$ is similar to $CD$,
(2) $\varphi(A,B)$ is similar to $\varphi(C,D)$, and
(3) rank $A(BA)^l$ = rank $C(DC)^l$ for all $l = 0, 1, \ldots, \max\{m,n\}$.

2.3 Complex Orthogonal Equivalence and the QS Decomposition

The classical QS decomposition (the classical algebraic polar decomposition) is
the fact that any nonsingular $A \in M_n$ can be written as $A = QS$, where $Q \in
M_n$ is complex orthogonal ($Q^{-1} = Q^T$) and $S \in M_n$ is complex symmetric
($S = S^T$) (see Theorem (3) on p. 6 of [G] and Theorem (6.4.16) of [HJ2]).
However, not every singular $A \in M_n$ has a QS decomposition. If $A = QS$,
then $A^TA = S^2 = Q^TAA^TQ$, so that $AA^T$ is similar to $A^TA$. For the example
$$A \equiv \begin{bmatrix} 1 & i \\ 0 & 0 \end{bmatrix} \in M_2, \quad AA^T = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad A^TA = \begin{bmatrix} 1 & i \\ i & -1 \end{bmatrix},$$
$AA^T$ is not similar to $A^TA$, so this $A$ cannot have a QS decomposition. In
[K1], it is shown that similarity of $AA^T$ and $A^TA$ is sufficient for $A$ to have a
QS decomposition. Here, we use what we have learned about contragredient
equivalence to give a different approach to this result.
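The obstruction in the example is easy to see numerically (a sketch added here): $AA^T$ is the zero matrix while $A^TA$ has rank one, and similar matrices must have equal rank.

```python
import numpy as np

A = np.array([[1.0, 1.0j], [0.0, 0.0]])
AAT = A @ A.T           # note: plain transpose, not conjugate transpose
ATA = A.T @ A

assert np.allclose(AAT, np.zeros((2, 2)))   # A A^T = 0
assert np.linalg.matrix_rank(ATA) == 1      # but A^T A has rank 1
# Similar matrices have equal rank, so A A^T is not similar to A^T A,
# and this A admits no QS decomposition.
```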
Lemma 2.3.1 Let $A, C \in M_{m,n}$ with $m \geq n$. Then the following are
equivalent:
(1) $(A, A^T) \sim (C, C^T)$.
(2) $(A, A^T) \sim (C, C^T)$ via orthogonal matrices, that is, there exist
orthogonal $Q_1 \in M_m$ and $Q_2 \in M_n$ such that $C = Q_1AQ_2$.
(3) $AA^T$ is similar to $CC^T$ and
$$\begin{bmatrix} 0 & A \\ A^T & 0 \end{bmatrix} \quad\text{is similar to}\quad \begin{bmatrix} 0 & C \\ C^T & 0 \end{bmatrix}.$$

Proof: Suppose $(A,A^T) \sim (C,C^T)$. Then there exist nonsingular $X \in M_m$
and $Y \in M_n$ such that $C = XAY^{-1}$ and $C^T = YA^TX^{-1}$. Hence,
$XAY^{-1} = C = (C^T)^T = (X^{-1})^TAY^T$, so $X^TXA = AY^TY$. From
this it follows that $p(X^TX)A = Ap(Y^TY)$ for any polynomial $p(t)$.
Let $p(t)$ be a polynomial that interpolates the principal branch of $\sqrt{t}$
and its first $m-1$ derivatives on the joint spectra of $X^TX$ and $Y^TY$.
Then Corollary (6.2.12) of [HJ2] guarantees that $S_1 \equiv p(X^TX)$ and
$S_2 \equiv p(Y^TY)$ are symmetric matrices such that $S_1^2 = X^TX$, $S_2^2 = Y^TY$,
and $S_1A = AS_2$. One checks that $Q_1 \equiv XS_1^{-1}$ and $Q_2 \equiv YS_2^{-1}$ are
orthogonal, $X = Q_1S_1$, and $Y = Q_2S_2$. Now compute
$$C = XAY^{-1} = Q_1(S_1A)S_2^{-1}Q_2^{-1} = Q_1(AS_2)S_2^{-1}Q_2^T = Q_1AQ_2^T.$$
Since $C^T = (Q_1AQ_2^T)^T = Q_2A^TQ_1^T$ as well, we have shown that (1)
implies (2). One checks easily that (2) implies (3). To show that (3)
implies (1), suppose that $AA^T$ is similar to $CC^T$ and $\varphi(A,A^T)$ is
similar to $\varphi(C,C^T)$ (see Definition 2.2.9). Then rank $\varphi(A,A^T)^{2k+1} =$
rank $\varphi(C,C^T)^{2k+1}$ for every integer $k \geq 0$. Since rank $X$ = rank $X^T$
for any $X \in M_{m,n}$, we also have
$$\text{rank }\varphi(B,B^T)^{2k+1} = \text{rank } B(B^TB)^k + \text{rank } B^T(BB^T)^k
= \text{rank } B(B^TB)^k + \text{rank } (B(B^TB)^k)^T = 2\,\text{rank } B(B^TB)^k$$
for any $B \in M_{m,n}$ and any integer $k \geq 0$. Thus, we must also have
rank $A(A^TA)^k$ = rank $C(C^TC)^k$ for all $k = 0, 1, 2, \ldots$.
Corollary 2.2.10 guarantees that $(A,A^T) \sim (C,C^T)$, and our proof is
complete.
Theorem 2.3.2 Let A Mn be given. The following are equivalent:
(1) (A, AT ) (AT , A).
(2) AT A is similar to AAT .
(3) A = Q1 AT Q2 for some orthogonal Q1 , Q2 Mn .
(4) A = QS for some orthogonal Q and symmetric S in Mn .

42
(5) A = QAT Q for some orthogonal Q Mn .
Proof Notice that for any A, B ∈ M_n, (A, B) = (I_n, I_n)(B, A)(I_n, I_n) and (I_n, I_n)^{-1} = (I_n, I_n); we conclude that (A, B) is similar to (B, A). In particular, (A, A^T) is similar to (A^T, A). Lemma 2.3.1 now guarantees the equivalence of (1), (2) and (3). Now suppose A = Q_1 A^T Q_2 for some orthogonal Q_1, Q_2 ∈ M_n. Then

B ≡ AQ_1^T = Q_1 A^T Q_2 Q_1^T = (AQ_1^T)^T(Q_2 Q_1^T) = B^T Q,

where Q ≡ Q_2 Q_1^T. Now, BQ^T = B^T = Q^T B, so Q^T = Q^{-1} commutes with B, and hence Q commutes with B as well. Since QB = BQ implies QB^T = B^T Q, Q also commutes with B^T. Let R be any polynomial square root of Q, so that R commutes with B and with B^T. Moreover, B = B^T Q = B^T R^2. Thus, BR^{-1} = B^T R = RB^T. Hence, (BR^{-1})^2 = (BR^{-1})(RB^T) = BB^T, so BB^T has a square root. Theorem (4) of [CH2] (see also Problem 23 in Section (6.3) of [HJ2]) guarantees that BB^T has a symmetric square root S that is similar to BR^{-1}, say S = Y(BR^{-1})Y^{-1}. Thus, S^2 = Y(BB^T)Y^{-1} and

(S, S) = Z(BR^{-1}, BR^{-1})Z^{-1}
       = Z(BR^{-1}, RB^T)Z^{-1}
       = Z_1(B, B^T)Z_1^{-1},

where Z ≡ Y ⊕ Y and Z_1 ≡ Z(I_n ⊕ R). Hence, S^2 is similar to BB^T and (S, S) is similar to (B, B^T). Lemma 2.3.1 guarantees that AQ_1^T = B = Q_3 S Q_4 for some orthogonal Q_3, Q_4 ∈ M_n. Thus, A = Q_3 S Q_4 Q_1 = QS_1, where Q ≡ Q_3 Q_4 Q_1 is orthogonal and S_1 ≡ Q_1^T Q_4^T S Q_4 Q_1 is symmetric, so we have shown that (3) implies (4). Now suppose A = QS for some orthogonal Q and symmetric S ∈ M_n. Then S = Q^T A and A^T = SQ^T = Q^T AQ^T, so that A = QA^T Q and (4) implies (5). Since (5) trivially implies (3), our proof is complete.
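The implication (4) ⟹ (2) of Theorem 2.3.2 is easy to check numerically. In the sketch below (numpy and scipy assumed; the particular random construction is illustrative only), a complex orthogonal Q is generated as the exponential of a complex skew-symmetric matrix, so that A = QS gives AA^T = QS^2Q^{-1}, which is similar to A^TA = S^2.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 4

# A complex orthogonal Q (Q Q^T = I): exponential of a complex skew-symmetric K.
W = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
K = W - W.T                      # K^T = -K
Q = expm(K)                      # Q^T = expm(K^T) = expm(-K) = Q^{-1}

# A complex symmetric S.
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = M + M.T                      # S^T = S

A = Q @ S                        # a QS decomposition by construction

# A A^T = Q S^2 Q^{-1} and A^T A = S^2 share eigenvalues, as Theorem 2.3.2(2) asserts.
ev1 = np.sort_complex(np.linalg.eigvals(A @ A.T))
ev2 = np.sort_complex(np.linalg.eigvals(A.T @ A))
```

Generically the eigenvalue lists agree to machine precision, which is consistent with (but of course does not prove) the similarity in (2).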
2.4 The φ_S Polar Decomposition

In the preceding section we studied a factorization related to the transpose operator φ: M_n → M_n given by φ(A) ≡ A^T: we characterized those A ∈ M_n that could be written as A = XY, where X ∈ M_n is nonsingular and X^{-1} = φ(X), and Y ∈ M_n satisfies Y = φ(Y). In this section we consider a factorization of the same form for a general linear operator φ: M_n → M_n that shares some basic properties of the transpose operator.
Definition 2.4.1 Let S_n^+ denote the set of nonsingular symmetric (A^T = A) matrices in M_n, let S_n^- denote the set of all nonsingular skew-symmetric (A^T = -A) matrices in M_n, and set S_n ≡ S_n^+ ∪ S_n^-. The spectrum of A ∈ M_n, the set of eigenvalues of A, will be denoted by σ(A).

Since the rank of a skew-symmetric matrix is even, S_n^- is empty if n is odd.
Lemma 2.4.2 Let φ: M_n → M_n be a given linear operator. There exists an S ∈ S_n such that φ(X) = SX^T S^{-1} for all X ∈ M_n if and only if the following three conditions hold for all A, B ∈ M_n:
(1) σ(φ(A)) = σ(A),
(2) φ(φ(A)) = A, and
(3) φ(AB) = φ(B)φ(A).

Proof The forward implication can be verified easily. Conversely, suppose φ: M_n → M_n is a linear operator satisfying the three given conditions. Under assumption (1), results about linear preservers (see Theorem (4.5.7) of [HJ2]) guarantee that there exists a nonsingular S ∈ M_n such that either φ(X) = SXS^{-1} for all X ∈ M_n or φ(X) = SX^T S^{-1} for all X ∈ M_n. Since φ(AB) = φ(B)φ(A) for any A, B ∈ M_n, it must be that φ(X) = SX^T S^{-1}. For all A ∈ M_n, we have A = φ(φ(A)) = φ(SA^T S^{-1}) = S(SA^T S^{-1})^T S^{-1} = SS^{-T}A S^T S^{-1} = (SS^{-T})A(SS^{-T})^{-1}. Hence, SS^{-T} = αI for some scalar α, so S = αS^T. Thus, S = α(αS^T)^T = α^2 S. Since S is nonsingular, α = ±1, that is, either S = S^T (S ∈ S_n^+) or S = -S^T (S ∈ S_n^-). Hence, S ∈ S_n, as asserted.
Definition 2.4.3 Let S ∈ S_n be given. We define the linear operator φ_S: M_n → M_n by

φ_S(A) ≡ SA^T S^{-1} for all A ∈ M_n.

For a given S ∈ S_n, we say that A is φ_S symmetric if φ_S(A) = A; A is φ_S skew-symmetric if φ_S(A) = -A; A is φ_S orthogonal if Aφ_S(A) = I (that is, A is nonsingular and A^{-1} = φ_S(A)). Finally, we say that A has a φ_S polar decomposition if A = XY for some φ_S orthogonal X and φ_S symmetric Y.

If S = I, then φ_S is ordinary transposition and the φ_S polar decomposition is the QS decomposition (algebraic polar decomposition). If S = [s_{ij}] ∈ S_n has s_{i,n-i+1} = 1 for i = 1, . . . , n and all other s_{ij} = 0, then φ_S is anti-transposition (see [Ho]).
The following assertions are easily verified.

Lemma 2.4.4 Let S ∈ S_n be given, and define φ_S as in Definition 2.4.3. Then
(1) φ_S(A^{-1}) = φ_S(A)^{-1} for all nonsingular A ∈ M_n.
(2) Aφ_S(A) and φ_S(A)A are φ_S symmetric for all A ∈ M_n.
(3) p(φ_S(A)) = φ_S(p(A)) for all A ∈ M_n and all polynomials p(t).
(4) If A ∈ M_n is φ_S symmetric, then p(A) is φ_S symmetric for every polynomial p(t).
(5) If A, B ∈ M_n are φ_S orthogonal, then AB is φ_S orthogonal.
(6) Suppose that S ∈ S_n^+ and let S_1 ∈ S_n^+ be such that S_1^2 = S. Then S_1AS_1^{-1} is φ_S symmetric if and only if A is symmetric.

It is natural to ask if Theorem 2.3.2 can be generalized to the φ_S polar decomposition. In particular, is it true that a given A ∈ M_n has a φ_S polar decomposition if and only if Aφ_S(A) is similar to φ_S(A)A?
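The operator of Definition 2.4.3 and the properties just listed are easy to probe numerically. In the sketch below (numpy assumed; phi and spec are hypothetical helper names), φ_S is implemented directly from its defining formula.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

def phi(S, A):
    """phi_S(A) = S A^T S^{-1}, as in Definition 2.4.3."""
    return S @ A.T @ np.linalg.inv(S)

# A nonsingular complex symmetric S, i.e. an element of S_n^+ (generically nonsingular).
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = M + M.T

A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Sorted spectrum, for comparing sets of eigenvalues.
spec = lambda X: np.sort_complex(np.linalg.eigvals(X))
```

The checks below exercise conditions (1)-(3) of Lemma 2.4.2 and property (2) of Lemma 2.4.4: φ_S is an involution, reverses products, preserves spectra, and Aφ_S(A) is φ_S symmetric.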
2.4.1 The Nonsingular Case

The following is a generalization of the QS decomposition in the nonsingular case.
Theorem 2.4.5 (Existence) Let S ∈ S_n and a nonsingular A ∈ M_n be given. Then
(a) There exists a φ_S symmetric Y_0 ∈ M_n such that Y_0^2 = φ_S(A)A and Y_0 is a polynomial in φ_S(A)A.
(b) If Y is a given φ_S symmetric matrix such that Y^2 = φ_S(A)A, then X(A, Y) ≡ AY^{-1} is φ_S orthogonal and A = X(A, Y)Y.
(c) A commutes with φ_S(A) if and only if there exist commuting X and Y ∈ M_n such that A = XY, X is φ_S orthogonal, and Y is φ_S symmetric.

Proof We may take Y_0 to be any polynomial square root of φ_S(A)A. For the second assertion, suppose Y^2 = φ_S(A)A. Then

I = Y^{-1}Y^2Y^{-1} = Y^{-1}φ_S(A)AY^{-1} = φ_S(AY^{-1})(AY^{-1}),

so X(A, Y) ≡ AY^{-1} is φ_S orthogonal and A = AY^{-1}Y = X(A, Y)Y.

If A = XY with X φ_S orthogonal and Y φ_S symmetric, and if X and Y commute, then Aφ_S(A) = (XY)φ_S(XY) = XYφ_S(Y)φ_S(X) = XY^2φ_S(X) = Y^2 = φ_S(A)A. Conversely, if A commutes with φ_S(A), then A commutes with Y_0, which is a polynomial in φ_S(A)A. It follows that Y_0 commutes with X(A, Y_0) ≡ AY_0^{-1}. But X(A, Y_0) is φ_S orthogonal by our second assertion, and A = X(A, Y_0)Y_0.
Theorem 2.4.6 (Uniqueness) Let S ∈ S_n and a nonsingular A ∈ M_n be given. Let Y_0 ∈ M_n be φ_S symmetric and suppose Y_0^2 = φ_S(A)A and Y_0 is a polynomial in φ_S(A)A. Let X(A, Y_0) ≡ AY_0^{-1}, so A = X(A, Y_0)Y_0 is a φ_S polar decomposition of A. Let X, Y ∈ M_n be given. Then X is φ_S orthogonal, Y is φ_S symmetric, and A = XY if and only if there exists a φ_S orthogonal X_1 ∈ M_n such that X_1 commutes with Y_0, X_1^2 = I, X = X(A, Y_0)X_1, and Y = X_1Y_0.

Proof For the forward implication, let X_1 ≡ Y Y_0^{-1}. All the assertions follow from Theorem 2.4.5 and the observation that Y_0 is a polynomial in φ_S(A)A = Y^2, which ensures that Y (and hence X_1) commutes with Y_0. Conversely, under the stated assumptions we have XY = X(A, Y_0)X_1^2Y_0 = X(A, Y_0)Y_0 = A and φ_S(Y) = φ_S(X_1Y_0) = φ_S(Y_0)φ_S(X_1) = Y_0X_1^{-1} = Y_0X_1 = X_1Y_0 = Y, so Y is φ_S symmetric. Moreover,

φ_S(X)X = φ_S(X(A, Y_0)X_1)X(A, Y_0)X_1
        = φ_S(X_1)φ_S(X(A, Y_0))X(A, Y_0)X_1
        = φ_S(X_1)X_1 = I,

so X is φ_S orthogonal.
As an immediate application of the φ_S polar decomposition, we have the following generalizations of the well-known fact that two complex symmetric matrices are similar if and only if they are complex orthogonally similar; see Section 1.1.1 for analogs of the following three results involving real similarity.

Theorem 2.4.7 Let A, B ∈ M_n and S ∈ S_n be given. There exists a φ_S orthogonal X ∈ M_n such that A = XBX^{-1} if and only if there exists a nonsingular Z ∈ M_n such that A = ZBZ^{-1} and φ_S(A) = Zφ_S(B)Z^{-1}.

Proof The forward implication is easily verified. For the converse, we have

ZBZ^{-1} = A = φ_S(φ_S(A)) = φ_S(Zφ_S(B)Z^{-1}) = φ_S(Z)^{-1}Bφ_S(Z),

so (φ_S(Z)Z)B = B(φ_S(Z)Z). Theorem 2.4.5 guarantees that there exists a φ_S symmetric Y ∈ M_n that is a polynomial in φ_S(Z)Z, as well as a φ_S orthogonal X ∈ M_n such that Z = XY. Since φ_S(Z)Z commutes with B, Y also commutes with B. Thus, A = ZBZ^{-1} = (XY)B(Y^{-1}X^{-1}) = XBX^{-1}.
The following immediate consequences of Theorem 2.4.7 generalize Corollaries (6.4.18-19) of [HJ2], which correspond to S = I.

Corollary 2.4.8 Let A, B ∈ M_n and S ∈ S_n be given, and suppose there exists a polynomial p(t) such that φ_S(A) = p(A) and φ_S(B) = p(B). Then A is similar to B if and only if there exists a φ_S orthogonal X ∈ M_n such that A = XBX^{-1}.

Corollary 2.4.9 Let S ∈ S_n be given and suppose both A, B ∈ M_n are φ_S symmetric, both are φ_S skew-symmetric, or both are φ_S orthogonal. Then A is similar to B if and only if A is φ_S orthogonally similar to B.
2.4.2 The General Case

Let S ∈ S_n and A ∈ M_n be given. If A = XY is a φ_S polar decomposition of A, then Aφ_S(A) = (XY)φ_S(XY) = XY^2φ_S(X) = XY^2X^{-1} is similar to φ_S(A)A = Y^2. Is this condition sufficient for the existence of a φ_S polar decomposition of A? Using properties enumerated in Lemma 2.4.4, one can follow the same argument used to prove Lemma 2.3.1 and obtain the following generalization.
Lemma 2.4.10 Let S ∈ S_n and A, B ∈ M_n be given. The following are equivalent:
(1) (A, φ_S(A)) ∼ (B, φ_S(B)).
(2) (A, φ_S(A)) ∼ (B, φ_S(B)) via φ_S orthogonal matrices, that is, there exist φ_S orthogonal X_1, X_2 ∈ M_n such that A = X_1BX_2.
(3) Aφ_S(A) is similar to Bφ_S(B) and the block matrix

[0, A; φ_S(A), 0]  is similar to  [0, B; φ_S(B), 0].

Since for any C, D ∈ M_n, (C, D) is always similar to (D, C) (see Definition 2.2.9), the following is an immediate consequence of Lemma 2.4.10.
Theorem 2.4.11 Let S ∈ S_n and A ∈ M_n be given. There exist φ_S orthogonal X_1, X_2 ∈ M_n such that A = X_1φ_S(A)X_2 if and only if Aφ_S(A) is similar to φ_S(A)A.

Let A ∈ M_n and S ∈ S_n be given. If there are φ_S orthogonal matrices X_1, X_2 ∈ M_n such that A = X_1φ_S(A)X_2, let

B ≡ Aφ_S(X_1) and X ≡ X_2φ_S(X_1)

and follow the argument in the proof of Theorem 2.3.2 to show that:
(a) B = φ_S(B)X and X is φ_S orthogonal.
(b) X commutes with B, and hence also with φ_S(B).

Suppose R is a given square root of X that commutes with B. For example, R may be taken to be any polynomial square root of X.
(c) Bφ_S(B) = (BR^{-1})^2.
(d) R commutes with φ_S(B) because Rφ_S(B) = RBX^{-1} = BRX^{-1} = φ_S(B)XRX^{-1} = φ_S(B)R. It follows that BR^{-1} = Rφ_S(B).

Suppose W ∈ M_n is a given φ_S symmetric matrix that is similar to BR^{-1}.
(e) W^2 is similar to Bφ_S(B) and (W, φ_S(W)) = (W, W) is similar to (BR^{-1}, BR^{-1}) = (BR^{-1}, Rφ_S(B)), which in turn is similar to (B, φ_S(B)).

Lemma 2.4.10 now ensures that there are φ_S orthogonal matrices Z_1, Z_2 ∈ M_n such that B = Z_1WZ_2, and hence A = [Z_1Z_2X_1][φ_S(Z_2X_1)W(Z_2X_1)] is a φ_S polar decomposition. Conversely, if A = ZY for Z, Y ∈ M_n such that Y = φ_S(Y) and Zφ_S(Z) = I, and if we take X_1 = X_2 = Z and R = I, then A = X_1φ_S(A)X_2, R^2 = X_2φ_S(X_1) = I, and Aφ_S(X_1)R^{-1} = ZYZ^{-1} is φ_S symmetric. We summarize our conclusions in the following lemma.
Lemma 2.4.12 Let S ∈ S_n and A ∈ M_n be given. There exist a φ_S orthogonal X ∈ M_n and a φ_S symmetric Y ∈ M_n such that A = XY if and only if there exist φ_S orthogonal X_1, X_2 and a nonsingular R such that
(1) A = X_1φ_S(A)X_2,
(2) R^2 = X_2φ_S(X_1) and R commutes with Aφ_S(X_1), and
(3) Aφ_S(X_1)R^{-1} is similar to a φ_S symmetric matrix.

Notice that 2.4.12(2) is always attainable for some nonsingular R, since one may take R to be any polynomial square root of X_2φ_S(X_1). Moreover, Theorem 2.4.11 ensures that 2.4.12(1) holds if and only if Aφ_S(A) is similar to φ_S(A)A. However, it is not clear that among the choices of X_1, X_2, and R that satisfy 2.4.12(1, 2) there is one for which 2.4.12(3) holds. If S ∈ S_n^+, we will show that any n-by-n matrix is similar to a φ_S symmetric matrix, so that 2.4.12(3) is satisfied for any choice of X_1, X_2, and R that satisfy 2.4.12(1, 2). In this case the question asked at the beginning of this section has an affirmative answer. However, for each even n and each S ∈ S_n^- there is some A ∈ M_n such that Aφ_S(A) is similar to φ_S(A)A but A does not have a φ_S polar decomposition.
2.4.3 The Symmetric Case

Lemma 2.4.13 Let S ∈ S_n^+ and A ∈ M_n be given. There exists a φ_S symmetric Y ∈ M_n that is similar to A.

Proof Let B ∈ M_n be symmetric and similar to A, let S_1 be any symmetric square root of S, and let C ≡ S_1BS_1^{-1}. Then C is similar to A and φ_S(C) = SC^TS^{-1} = S(S_1BS_1^{-1})^TS^{-1} = S_1^2(S_1^{-1}BS_1)S_1^{-2} = S_1BS_1^{-1} = C, so that C is φ_S symmetric.
Theorem 2.4.14 Let S ∈ S_n^+ and a φ_S symmetric Y ∈ M_n be given, and suppose Y = A^2 for some A ∈ M_n. Then there exists a φ_S symmetric Y_1 that satisfies Y_1^2 = Y and Y_1 is similar to A.

Proof Lemma 2.4.13 guarantees that there exists a φ_S symmetric B that is similar to A. Then B^2 (which is φ_S symmetric by Lemma 2.4.4(4)) is similar to A^2 = Y. Corollary 2.4.9 guarantees that there exists a φ_S orthogonal X ∈ M_n such that Y = XB^2X^{-1} = (XBX^{-1})^2. Then XBX^{-1} = XBφ_S(X) is φ_S symmetric and is similar to A.
Lemmata 2.4.12, 2.4.10 and 2.4.13 now imply the following result about the φ_S polar decomposition in the symmetric case, which is a generalization of Theorem 2.3.2.

Theorem 2.4.15 Let S ∈ S_n^+ and A ∈ M_n be given. The following are equivalent:
(1) (A, φ_S(A)) ∼ (φ_S(A), A).
(2) Aφ_S(A) is similar to φ_S(A)A.
(3) A = X_1φ_S(A)X_2 for some φ_S orthogonal X_1, X_2 ∈ M_n.
(4) A = XY for some φ_S orthogonal X ∈ M_n and φ_S symmetric Y ∈ M_n.
(5) A = Xφ_S(A)X for some φ_S orthogonal X ∈ M_n.
2.4.4 The Skew-Symmetric Case

Let an even integer n ≥ 2, S ∈ S_n^-, and a φ_S symmetric Y ∈ M_n be given. Then Y = φ_S(Y) = SY^TS^{-1}, so YS = SY^T = -S^TY^T = -(YS)^T. The skew-symmetric matrix YS, and hence also Y itself, must have an even rank. Thus, if n is even, no A ∈ M_n with odd rank can have a φ_S polar decomposition.
Let an even integer n ≥ 2 and S ∈ S_n^- be given. Let X ∈ M_n be nonsingular and such that

XSX^T = [0, 1; -1, 0] ⊕ S_2,

where S_2 ∈ S_{n-2}^- if n > 2, or S_2 is absent if n = 2. Consider B ≡ J_2(0) ⊕ 0 ∈ M_n, and set A ≡ X^{-1}BX. One computes

Aφ_S(A) = (X^{-1}BX)S(X^{-1}BX)^TS^{-1}
        = X^{-1}B(XSX^T)B^TX^{-T}S^{-1}(X^{-1}X)
        = X^{-1}B[(XSX^T)B^T(XSX^T)^{-1}]X
        = -X^{-1}(J_2(0) ⊕ 0_{n-2})(J_2(0) ⊕ 0_{n-2})X
        = 0.

Similarly, φ_S(A)A = 0. Hence, Aφ_S(A) is similar (in fact, equal) to φ_S(A)A, but rank A = 1, so A does not have a φ_S polar decomposition.

It is an open question to characterize the even-rank matrices with even dimensions that have a φ_S polar decomposition for a given S ∈ S_n^-.
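The 2-by-2 instance of this construction (with X = I, so A = J_2(0)) can be checked directly; numpy is assumed, and left/right are just hypothetical variable names for the two products.

```python
import numpy as np

S = np.array([[0., 1.], [-1., 0.]])          # skew-symmetric and nonsingular: S is in S_2^-
Sinv = np.linalg.inv(S)
phi = lambda A: S @ A.T @ Sinv               # phi_S of Definition 2.4.3

A = np.array([[0., 1.], [0., 0.]])           # J_2(0); here X = I in the construction above

left = A @ phi(A)                            # A phi_S(A)
right = phi(A) @ A                           # phi_S(A) A
```

Both products vanish, yet rank A = 1 is odd, so by the even-rank obstruction A has no φ_S polar decomposition.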
2.5 Ā and A∗

It is natural to ask what the analogs of Lemma 2.3.1 are for the mappings A → Ā or A → A∗. Although these mappings satisfy some of the conditions of Lemma 2.4.2, they are not linear and do not preserve the spectrum. For given A, C ∈ M_{m,n}, when is (A, A∗) ∼ (C, C∗)? If m = n, when is (A, Ā) ∼ (C, C̄)? The first case is immediate, and we present it without proof.
Theorem 2.5.1 Let A, C ∈ M_{m,n} be given. Then the following are equivalent:
(1) (A, A∗) ∼ (C, C∗).
(2) AA∗ is similar to CC∗.
(3) A and C have the same singular values, including multiplicities.
(4) (A, A∗) ∼ (C, C∗) via unitary matrices, that is, A = UCV for some unitary U ∈ M_m and V ∈ M_n.
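The implication (3) ⟹ (4) of Theorem 2.5.1 is constructive: matching singular value decompositions of A and C produce the unitary factors. A numerical sketch (numpy assumed; C is built here only as an example of a matrix with the same singular values as A):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 5

A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
Ua, sv, Vha = np.linalg.svd(A)               # A = Ua Sigma Vha

# C: any matrix with the same singular values as A.
P = np.linalg.qr(rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m)))[0]
Q = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))[0]
Sigma = np.zeros((m, n), dtype=complex)
Sigma[:m, :m] = np.diag(sv)
C = P @ Sigma @ Q

# From C = Uc Sigma Vhc: A = (Ua Uc^*) C (Vhc^* Vha), with both factors unitary.
Uc, _, Vhc = np.linalg.svd(C)
U = Ua @ Uc.conj().T
V = Vhc.conj().T @ Vha
```

Since we use C's own SVD, the identity A = UCV holds regardless of any nonuniqueness in the decompositions.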
Recall that A, C ∈ M_n are consimilar if A = SCS̄^{-1} for some nonsingular S ∈ M_n. We now apply the analysis of Lemma 2.3.1 to the conjugate mapping.

Theorem 2.5.2 Let A, C ∈ M_n be given. Then (A, Ā) ∼ (C, C̄) if and only if there exists a nonsingular S ∈ M_n such that A = SCS̄^{-1}, that is, if and only if A is consimilar to C.
Proof Suppose (A, Ā) ∼ (C, C̄). Then there exist nonsingular X, Y ∈ M_n such that A = XCY^{-1} and Ā = YC̄X^{-1}. Taking conjugates in the second identity gives A = ȲCX̄^{-1}, so XCY^{-1} = ȲCX̄^{-1}. It follows that ZC = CZ̄^{-1}, where Z ≡ Ȳ^{-1}X. Now, Theorem (6.4.20) of [HJ2] guarantees that there is a primary matrix function log Z such that log(Z̄^{-1}) = -(log Z)‾ and e^{log Z} = Z. If we set F ≡ e^{(1/2) log Z}, then F^2 = Z and F̄^{-1} = e^{-(1/2)(log Z)‾} = e^{(1/2) log(Z̄^{-1})}, so there is a single polynomial p(t) such that F = p(Z) and F̄^{-1} = p(Z̄^{-1}). Thus, FC = p(Z)C = Cp(Z̄^{-1}) = CF̄^{-1}, and so A = XCY^{-1} = Ȳ(Ȳ^{-1}X)CY^{-1} = ȲZCY^{-1} = ȲF^2CY^{-1} = SCS̄^{-1}, where S ≡ ȲF. Therefore, A and C are consimilar. The converse is easily verified.

Corollary 2.2.7 and Theorem 2.5.2 now prove the following, which is Theorem (4.1) in [HH1].
Corollary 2.5.3 Let A, C ∈ M_n be given. Then A is consimilar to C if and only if the following two conditions are satisfied:
(1) AĀ is similar to CC̄, and
(2) rank (AĀ)^kA = rank (CC̄)^kC for all k = 0, 1, . . . , n.
2.6 The Nilpotent Parts of AB and BA

Let integers m, n be given with m ≥ n, and let A ∈ M_{m,n} and B ∈ M_{n,m} be given. It is known (Theorem (1.3.20) of [HJ1]) that the nonsingular Jordan blocks in the Jordan canonical forms of AB and BA are the same; this also follows from Lemma 2.2.2. What can be said about the nilpotent Jordan blocks of AB and BA? The statement of the following result is due to Flanders [F] (see [T] for a different approach); we give a new proof based on the canonical form in Theorem 2.2.4.
Theorem 2.6.1 Let positive integers m, n be given with m ≥ n, let A ∈ M_{m,n} and B ∈ M_{n,m} be given, and let k ≡ rank (AB)^m. There exist integers m_1 ≥ · · · ≥ m_p ≥ 1 and n_1 ≥ · · · ≥ n_p ≥ 1, and nonsingular X ∈ M_m and Y ∈ M_n such that

X(AB)X^{-1} = J(AB) ⊕ J_{m_1}(0) ⊕ · · · ⊕ J_{m_p}(0) ⊕ 0

and

Y(BA)Y^{-1} = J(AB) ⊕ J_{n_1}(0) ⊕ · · · ⊕ J_{n_p}(0) ⊕ 0,

where J(AB) ∈ M_k is the nonsingular part of the Jordan canonical form of AB, and |m_i - n_i| ≤ 1 for each i = 1, . . . , p. Conversely, let integers m_1 ≥ · · · ≥ m_p ≥ 1 and n_1 ≥ · · · ≥ n_p ≥ 1 be given, and set m ≡ m_1 + · · · + m_p and n ≡ n_1 + · · · + n_p. Suppose that |m_i - n_i| ≤ 1 for each i = 1, . . . , p. Then there exist A ∈ M_{m,n} and B ∈ M_{n,m} such that

AB is similar to J_{m_1}(0) ⊕ · · · ⊕ J_{m_p}(0)

and

BA is similar to J_{n_1}(0) ⊕ · · · ⊕ J_{n_p}(0).
Proof Theorem 2.2.4 ensures that there is an integer p ≥ 0 and nonsingular X ∈ M_m and Y ∈ M_n such that

XAY^{-1} = J_A ⊕ A_1 ⊕ · · · ⊕ A_p ⊕ 0  and  YBX^{-1} = J_B ⊕ B_1 ⊕ · · · ⊕ B_p ⊕ 0,

where J_A, J_B ∈ M_k are nonsingular; each

(A_i, B_i) ∈ {(I_{m_i}, J_{m_i}^T(0)), (J_{m_i}^T(0), I_{m_i}), (H_{m_i}, K_{m_i}^T), (K_{m_i}^T, H_{m_i})},

and m_1 ≥ · · · ≥ m_p ≥ 1. Hence, X(AB)X^{-1} = J_AJ_B ⊕ A_1B_1 ⊕ · · · ⊕ A_pB_p ⊕ 0, and Y(BA)Y^{-1} = J_BJ_A ⊕ B_1A_1 ⊕ · · · ⊕ B_pA_p ⊕ 0. Since J_A and J_B ∈ M_k are nonsingular, J_AJ_B is similar to J_BJ_A and both products are similar to J(AB). For all i such that A_i ∈ {I_{m_1}, J_{m_1}^T(0)}, we have A_iB_i = B_iA_i = J_{m_1}^T(0); if A_i = H_{m_1}, then A_iB_i = J_{m_1-1}^T(0) and B_iA_i = J_{m_1}^T(0); and if A_i = K_{m_1}^T, then A_iB_i = J_{m_1}^T(0) and B_iA_i = J_{m_1-1}^T(0). Notice that all of the sizes of the corresponding nilpotent Jordan blocks of AB and BA just enumerated differ by at most one. Repetition of this enumeration for m_2, . . . , m_p gives the asserted result.

For the converse, we look at the three possibilities for each i = 1, . . . , p: If m_i = n_i, set A_i = I_{m_i} and B_i = J_{n_i}(0); if m_i = n_i + 1, set A_i = K_{m_i}^T and B_i = H_{m_i}; and if m_i = n_i - 1, set A_i = H_{n_i} and B_i = K_{n_i}^T. Then

A ≡ A_1 ⊕ · · · ⊕ A_p  and  B ≡ B_1 ⊕ · · · ⊕ B_p

have the asserted properties.
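The smallest nontrivial instance of Flanders' theorem, with p = 1 and (m_1, n_1) = (1, 2), can be written down explicitly; numpy is assumed.

```python
import numpy as np

# A in M_{1,2} and B in M_{2,1}: the nilpotent Jordan structures of AB and BA
# differ by one in size, the maximum allowed by Theorem 2.6.1.
A = np.array([[1., 0.]])
B = np.array([[0.], [1.]])

AB = A @ B        # the 1-by-1 zero matrix, i.e. J_1(0)
BA = B @ A        # rank 1 with (BA)^2 = 0, hence similar to J_2(0)
```

So AB contributes a nilpotent block of size 1 while BA contributes one of size 2, consistent with |m_1 - n_1| ≤ 1.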
We can also use the results in Section 2.2 to give a different proof for Theorem (3) in [F] (see also [PM]).

Theorem 2.6.2 Let A ∈ M_{m,n}, B ∈ M_{n,m}, and a nilpotent N ∈ M_m be given. Suppose that NA = 0. Then the Jordan canonical forms of AB and AB + N have the same nonsingular parts, and these two matrices have the same set of eigenvalues.
Proof Lemma 2.2.2 guarantees that there exist nonsingular X ∈ M_m and Y ∈ M_n such that

XAY^{-1} = [I_k, 0; 0, Ã]  and  YBX^{-1} = [J(AB), 0; 0, B̃],

where J(AB) ∈ M_k is the nonsingular part of the Jordan canonical form of AB and ÃB̃ ∈ M_{m-k} is nilpotent. Partition XNX^{-1} as

XNX^{-1} = [N_{11}, N_{12}; N_{21}, N_{22}], where N_{11} ∈ M_k and N_{22} ∈ M_{m-k}.

Notice that

0 = NA = X^{-1}(XNX^{-1})(XAY^{-1})Y, and (XNX^{-1})(XAY^{-1}) = [N_{11}, N_{12}Ã; N_{21}, N_{22}Ã],

so that N_{11} = 0, N_{21} = 0, N_{12}Ã = 0 and N_{22}Ã = 0. Thus,

XNX^{-1} = [0, N_{12}; 0, N_{22}],

and the nilpotence of N_{22} follows from that of N. Note that N_{22}^{m-k} = (ÃB̃)^{m-k} = 0, so any product of the form (ÃB̃)^r N_{22}^t vanishes when max{r, t} ≥ m - k. Since N_{22}Ã = 0, we have

(ÃB̃ + N_{22})^l = (ÃB̃)^l + (ÃB̃)^{l-1}N_{22} + · · · + (ÃB̃)N_{22}^{l-1} + N_{22}^l

for every l = 1, 2, . . .; for l = 2(m-k), each of these summands vanishes. Thus ÃB̃ + N_{22} is nilpotent. Since the spectra of J(AB) and ÃB̃ + N_{22} are disjoint, it follows (see Problem (10) in Section (2.4) of [HJ1]) that

X(AB + N)X^{-1} = [J(AB), N_{12}; 0, ÃB̃ + N_{22}]   (2.6.33)

is similar to J(AB) ⊕ (ÃB̃ + N_{22}), whose nonsingular part is J(AB), as asserted. The assertion about the spectra of AB and AB + N is apparent from (2.6.33).
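A minimal numerical illustration of Theorem 2.6.2 (numpy assumed; the matrices below are chosen only so that N is nilpotent and NA = 0):

```python
import numpy as np

A = np.array([[1., 0.], [0., 0.]])
B = np.array([[1., 0.], [0., 0.]])
N = np.array([[0., 1.], [0., 0.]])   # nilpotent, and N A = 0

AB = A @ B                           # diag(1, 0)
pert = AB + N                        # [1, 1; 0, 0]

ev1 = np.sort(np.linalg.eigvals(AB).real)
ev2 = np.sort(np.linalg.eigvals(pert).real)
```

Both AB and AB + N have spectrum {0, 1}, and their nonsingular parts coincide (the single eigenvalue 1), as the theorem asserts.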
2.7 A Sufficient Condition for Existence of a Square Root

It is known that for any A ∈ M_n, AĀ is similar to the square of a real matrix (Corollary (4.10) of [HH1]), and we have seen that AA^T has a square root whenever AA^T is similar to A^TA (since A = QS and AA^T = (QSQ^T)^2 in this case). Our goal in this section is to show how the canonical form for contragredient equivalence leads to a sufficient condition for existence of a square root that encompasses both of these observations.
Lemma 2.7.1 Let an integer n > 1, A ∈ M_{n-1,n}, and B ∈ M_{n,n-1} be given. Suppose that BA is similar to J_n(0). Then
(1) AB is similar to J_{n-1}(0), and
(2) rank A(BA)^k = rank B(AB)^k = n - k - 1 for k = 0, . . . , n - 1, and rank A(BA)^k = rank B(AB)^k = 0 for k ≥ n.

Proof The first assertion is an immediate consequence of Flanders Theorem 2.6.1, but it is easy to give a direct proof: Since (BA)n = 0
and (AB)n+1 = A(BA)n B = 0, we see that AB is nilpotent, so
rank BA < n 1. But n 2 = rank (BA)2 rank AB < n 1,
so rank AB = n 2, which ensures that AB is similar to Jn1 (0).
Using (1), we have
n k 1 = n (k + 1) =
=

=
and
nk1 =
=

rank (BA)k+1
rank B(AB)k A
rank B(AB)k
rank (AB)k
(n 1) k = n k 1

rank (BA)k+1
rank BA(BA)k
rank A(BA)k
rank (AB)k A
rank (AB)k
n k 1,

56
so rank B(AB)k = rank A(BA)k = n k 1 for k = 0, 1, . . . , n 1.
Since (AB)n1 = 0 and (BA)n = 0, rank B(AB)k = rank A(BA)k = 0
for all k n.
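The rank profile in Lemma 2.7.1(2) can be checked on the explicit pair A = [I_{n-1} 0] and B with ones on its subdiagonal, for which BA is J_n(0)^T and hence similar to J_n(0); numpy is assumed.

```python
import numpy as np

n = 5
A = np.eye(n - 1, n)           # A in M_{n-1,n}: [I_{n-1} 0]
B = np.eye(n, n - 1, k=-1)     # B in M_{n,n-1}: ones on the subdiagonal

# BA is the n-by-n subdiagonal shift (similar to J_n(0)); AB is the
# (n-1)-by-(n-1) subdiagonal shift (similar to J_{n-1}(0)).
ranks_A = [np.linalg.matrix_rank(A @ np.linalg.matrix_power(B @ A, k))
           for k in range(n + 1)]
ranks_B = [np.linalg.matrix_rank(B @ np.linalg.matrix_power(A @ B, k))
           for k in range(n + 1)]
```

Both rank sequences equal n - k - 1 for k = 0, ..., n - 1 and then 0, matching the lemma.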
Let m ≥ n be given, and let A ∈ M_{m,n} and B ∈ M_{n,m} be given. Corollary 2.2.8 guarantees that there exist nonsingular X ∈ M_m and Y ∈ M_n such that

XAY^{-1} = J_A ⊕ A_1 ⊕ A_2 ⊕ 0  and  YBX^{-1} = J_B ⊕ B_1 ⊕ B_2 ⊕ 0,

where J_A, J_B ∈ M_k are nonsingular and J_AJ_B is similar to the nonsingular part of the Jordan canonical form of AB; A_1 and B_1 contain all canonical blocks of the form I_{m_i} and J_{ℓ_i}^T(0), that is,

A_1 ≡ [I_{m_1} ⊕ J_{ℓ_1}^T(0)] ⊕ · · · ⊕ [I_{m_t} ⊕ J_{ℓ_t}^T(0)]

and

B_1 ≡ [J_{m_1}^T(0) ⊕ I_{ℓ_1}] ⊕ · · · ⊕ [J_{m_t}^T(0) ⊕ I_{ℓ_t}]

with m_1 ≥ · · · ≥ m_t ≥ 0 and ℓ_1 ≥ · · · ≥ ℓ_t ≥ 0 (if all m_i = ℓ_i = 0, then A_1 and B_1 are absent); and A_2 and B_2 contain all the canonical blocks of the form H_{n_j} and K_{ℓ_j}^T, that is,

A_2 ≡ [H_{n_1}, 0; 0, K_{ℓ_1}^T] ⊕ · · · ⊕ [H_{n_s}, 0; 0, K_{ℓ_s}^T]

and

B_2 ≡ [K_{n_1}^T, 0; 0, H_{ℓ_1}] ⊕ · · · ⊕ [K_{n_s}^T, 0; 0, H_{ℓ_s}]

with n_1 ≥ · · · ≥ n_s ≥ 0 and ℓ_1 ≥ · · · ≥ ℓ_s ≥ 0 (if all n_j = ℓ_j = 0, then A_2 and B_2 are absent).

Since H_p ∈ M_{p-1,p} and K_p^T ∈ M_{p,p-1} satisfy H_pK_p^T = J_{p-1}^T(0), it follows from Lemma 2.7.1(2) (or from inspection of the respective products) that rank H_p(K_p^TH_p)^k = rank K_p^T(H_pK_p^T)^k for all k ≥ 0. Thus, we always have

rank A_2(B_2A_2)^k = rank B_2(A_2B_2)^k for all k = 0, 1, 2, . . .

without any assumptions on A or B.
If we now assume that

rank A(BA)^k = rank B(AB)^k for all k = 0, 1, 2, . . . ,

then it follows that

rank A_1(B_1A_1)^k = rank B_1(A_1B_1)^k for all k = 0, 1, 2, . . . .   (2.7.34)

If ℓ_1 > m_1, take k = ℓ_1 - 1 ≥ m_1, so J_{ℓ_1}^T(0)^k ≠ 0, J_{ℓ_i}^T(0)^{k+1} = 0 and J_{m_i}^T(0)^k = 0 for all i = 1, . . . , t, and hence

A_1(B_1A_1)^k = [J_{m_1}^T(0)^k ⊕ J_{ℓ_1}^T(0)^{k+1}] ⊕ · · · ⊕ [J_{m_t}^T(0)^k ⊕ J_{ℓ_t}^T(0)^{k+1}] = 0

and

B_1(A_1B_1)^k = [J_{m_1}^T(0)^{k+1} ⊕ J_{ℓ_1}^T(0)^k] ⊕ · · · ⊕ [J_{m_t}^T(0)^{k+1} ⊕ J_{ℓ_t}^T(0)^k]
             = [0 ⊕ J_{ℓ_1}^T(0)^k] ⊕ · · · ⊕ [0 ⊕ J_{ℓ_t}^T(0)^k] ≠ 0.

Since this contradicts (2.7.34), we must have ℓ_1 ≤ m_1. A symmetric argument, taking k = m_1 - 1, shows that m_1 ≤ ℓ_1. Hence m_1 = ℓ_1. Repetition of this argument shows that m_i = ℓ_i for all i = 2, . . . , t. Conversely, if ℓ_i = m_i for all i, then rank A(BA)^k = rank B(AB)^k for all k = 0, 1, 2, . . .. We summarize what we have learned in the following result, which also plays a key role in identifying a canonical form under orthogonal equivalence.
Lemma 2.7.2 Let positive integers m, n be given with m ≥ n, and let A ∈ M_{m,n} and B ∈ M_{n,m} be given. There exist nonsingular X ∈ M_m and Y ∈ M_n such that

XAY^{-1} = J_A ⊕ A_1 ⊕ A_2 ⊕ 0  and  YBX^{-1} = J_B ⊕ B_1 ⊕ B_2 ⊕ 0,

where J_A, J_B ∈ M_k are nonsingular and J_AJ_B is similar to the nonsingular part of the Jordan canonical form of AB,

A_1 = [I_{m_1} ⊕ J_{m_1}^T(0)] ⊕ · · · ⊕ [I_{m_t} ⊕ J_{m_t}^T(0)],   (2.7.35)

B_1 = [J_{m_1}^T(0) ⊕ I_{m_1}] ⊕ · · · ⊕ [J_{m_t}^T(0) ⊕ I_{m_t}],   (2.7.36)

A_2 = [H_{n_1}, 0; 0, K_{ℓ_1}^T] ⊕ · · · ⊕ [H_{n_s}, 0; 0, K_{ℓ_s}^T]

and

B_2 = [K_{n_1}^T, 0; 0, H_{ℓ_1}] ⊕ · · · ⊕ [K_{n_s}^T, 0; 0, H_{ℓ_s}]

if and only if rank A(BA)^k = rank B(AB)^k for all k = 0, 1, 2, . . ..
Notice that A_1B_1 = B_1A_1. Thus, we always have

rank (A_1B_1)^k = rank (B_1A_1)^k for all k = 1, 2, 3, . . .

without any assumptions on A or B. If we now assume that

rank (AB)^k = rank (BA)^k for all k = 1, 2, 3, . . . ,

then it follows that

rank (A_2B_2)^k = rank (B_2A_2)^k for all k = 1, 2, 3, . . . .

An argument similar to the proof of Lemma 2.7.2 now shows that n_j = ℓ_j for j = 1, . . . , s. Conversely, if n_j = ℓ_j for each j, then rank (AB)^k = rank (BA)^k for all k = 1, 2, 3, . . .. Thus, we have the following complement to Lemma 2.7.2.
Lemma 2.7.3 Let positive integers m, n be given with m ≥ n, and let A ∈ M_{m,n} and B ∈ M_{n,m} be given. There exist nonsingular X ∈ M_m and Y ∈ M_n such that

XAY^{-1} = J_A ⊕ A_1 ⊕ A_2 ⊕ 0  and  YBX^{-1} = J_B ⊕ B_1 ⊕ B_2 ⊕ 0,

where J_A, J_B ∈ M_k are nonsingular and J_AJ_B is similar to the nonsingular part of the Jordan canonical form of AB,

A_1 = [I_{m_1} ⊕ J_{ℓ_1}^T(0)] ⊕ · · · ⊕ [I_{m_t} ⊕ J_{ℓ_t}^T(0)],

B_1 = [J_{m_1}^T(0) ⊕ I_{ℓ_1}] ⊕ · · · ⊕ [J_{m_t}^T(0) ⊕ I_{ℓ_t}],

A_2 = [H_{n_1}, 0; 0, K_{n_1}^T] ⊕ · · · ⊕ [H_{n_s}, 0; 0, K_{n_s}^T]   (2.7.37)

and

B_2 = [K_{n_1}^T, 0; 0, H_{n_1}] ⊕ · · · ⊕ [K_{n_s}^T, 0; 0, H_{n_s}]   (2.7.38)

if and only if rank (AB)^k = rank (BA)^k for all k = 1, 2, 3, . . ..
Theorem 2.7.4 Let A ∈ M_{m,n} and B ∈ M_{n,m} be given. Suppose that for every integer k ≥ 0, we have
(1) rank A(BA)^k = rank B(AB)^k, and
(2) rank (BA)^{k+1} = rank (AB)^{k+1}.
Then AB and BA have square roots.

Proof Suppose m ≥ n. Then AB is similar to J(AB) ⊕ A_1B_1 ⊕ A_2B_2 ⊕ 0, in which J(AB) (if present) is nonsingular and hence has a square root. Lemma 2.7.2 guarantees that A_1B_1 = [J_{m_1}^T(0) ⊕ J_{m_1}^T(0)] ⊕ · · · ⊕ [J_{m_t}^T(0) ⊕ J_{m_t}^T(0)], and hence has a square root. Lemma 2.7.3 guarantees that A_2B_2 = [J_{n_1-1}^T(0) ⊕ J_{n_1}^T(0)] ⊕ · · · ⊕ [J_{n_s-1}^T(0) ⊕ J_{n_s}^T(0)], so that A_2B_2 also has a square root. It follows that AB has a square root. Similarly, BA has a square root.
The conditions of Theorem 2.7.4, though sufficient, are not necessary for the existence of a square root of AB. The example A_1 ≡ I_{2k} and B_1 ≡ J_k(0) ⊕ J_k(0) shows that condition (1) need not be satisfied (although (2) is). The example

A_2 ≡ [0, 0; 1, 0; 0, 0; 0, 1]  and  B_2 ≡ [1, 0, 0, 0; 0, 0, 1, 0]

shows that condition (2) need not be satisfied (although (1) is). Moreover, the example

A ≡ [A_1, 0; 0, A_2]  and  B ≡ [B_1, 0; 0, B_2]

shows that neither condition need be satisfied.
Corollary 2.7.5 Let A ∈ M_n be given.
(1) AĀ has a square root.
(2) Let S ∈ S_n be given. If φ_S(A)A is similar to Aφ_S(A), then φ_S(A)A has a square root. In particular, if A^TA is similar to AA^T, then A^TA has a square root.

Proof Since rank X = rank X̄ = rank φ_S(X) for any X ∈ M_n and any S ∈ S_n, one checks that the conditions of Theorem 2.7.4 are satisfied in each case.

The sufficient condition in Corollary 2.7.5 is not necessary. The example

A ≡ [1, 0; i, 0]

shows that A^TA can have a square root without being similar to AA^T.
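This example is easy to verify directly; numpy is assumed.

```python
import numpy as np

A = np.array([[1., 0.], [1j, 0.]])

AtA = A.T @ A        # the zero matrix, which trivially has square root 0
AAt = A @ A.T        # [1, i; i, -1]: rank 1 and (AAt)^2 = 0, similar to J_2(0)
```

Here A^TA = 0 has a square root, but AA^T is a nonzero nilpotent matrix, so A^TA and AA^T are not similar.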
2.8 A Canonical Form for Complex Orthogonal Equivalence

For a given A ∈ M_{m,n}, what standard form can be achieved by Q_1AQ_2 for complex orthogonal Q_1 ∈ M_m and Q_2 ∈ M_n? In our search for a standard form, we are guided by the following facts and observations:

- If m = n and A = QS is a QS decomposition of A, then Q^TA = S is symmetric and AA^T is orthogonally similar to S^2, so we seek a standard form that is as much as possible like a symmetric matrix whose square is similar to AA^T.

- Lemma 2.3.1 ensures that Q_1AQ_2 = C if and only if (A, A^T) ∼ (C, C^T), so we may approach our standard form via a sequence of contragredient equivalences applied to the pair (A, A^T).
- rank A(A^TA)^k = rank (A(A^TA)^k)^T = rank A^T(AA^T)^k for all k = 0, 1, 2, . . ., so Lemma 2.7.2 ensures that

(A, A^T) ∼ ([J_A ⊕ A_1 ⊕ A_2 ⊕ 0], [J_{A^T} ⊕ B_1 ⊕ B_2 ⊕ 0])

in which J_AJ_{A^T} is similar to the nonsingular part of the Jordan canonical form of AA^T and all the direct summands in

A_1 = [I_{m_1} ⊕ J_{m_1}^T(0)] ⊕ · · · ⊕ [I_{m_t} ⊕ J_{m_t}^T(0)]

and

B_1 = [J_{m_1}^T(0) ⊕ I_{m_1}] ⊕ · · · ⊕ [J_{m_t}^T(0) ⊕ I_{m_t}],

m_1 ≥ · · · ≥ m_t ≥ 1, are paired in sizes. It may not be possible to write

A_2 = [H_{r_1}, 0; 0, K_{s_1}^T] ⊕ · · · ⊕ [H_{r_q}, 0; 0, K_{s_q}^T]   (2.8.39)

and

B_2 = [K_{r_1}^T, 0; 0, H_{s_1}] ⊕ · · · ⊕ [K_{r_q}^T, 0; 0, H_{s_q}]   (2.8.40)

in such a way that all r_i = s_i; the blocks K_k and H_k are defined in (2.2.14) and satisfy the identities in Theorem 2.2.4(g). But we may select the blocks (if any) for which the sizes can be paired, and may write

A_2 = [H_{j_1}, 0; 0, K_{j_1}^T] ⊕ · · · ⊕ [H_{j_s}, 0; 0, K_{j_s}^T] ⊕ A_3   (2.8.41)

and

B_2 = [K_{j_1}^T, 0; 0, H_{j_1}] ⊕ · · · ⊕ [K_{j_s}^T, 0; 0, H_{j_s}] ⊕ B_3,   (2.8.42)

in which

A_3 = [H_{n_1}, 0; 0, K_{ℓ_1}^T] ⊕ · · · ⊕ [H_{n_p}, 0; 0, K_{ℓ_p}^T]

and

B_3 = [K_{n_1}^T, 0; 0, H_{ℓ_1}] ⊕ · · · ⊕ [K_{n_p}^T, 0; 0, H_{ℓ_p}]

and n_i ≠ ℓ_k for all i, k = 1, . . . , p.
The preceding comments motivate the following analyses of standard forms for

(a) (J_A, J_{A^T}),
(b) ([I_k ⊕ J_k^T(0)], [J_k^T(0) ⊕ I_k]),
(c) ([H_k, 0; 0, K_k^T], [K_k^T, 0; 0, H_k]), and
(d) ([H_k, 0; 0, K_l^T], [K_k^T, 0; 0, H_l]) with k ≠ l.
Proposition 2.8.1 Let A, B ∈ M_k be nonsingular. Then there exists a symmetric S ∈ M_k such that (A, B) ∼ (S, S).

Proof Let S_1 ∈ M_k be a symmetric matrix that is similar to AB. Let S be any polynomial square root of S_1. Then S is symmetric and S^2 = S_1 is similar to AB. Lemma 2.2.2 guarantees that (A, B) ∼ (S, S).
Definition 2.8.2 Let λ ∈ C and a positive integer k be given. Then

S_k(λ) ≡ λI_k + (1/2)(J_k(0) + J_k(0)^T) + (i/2)(F_kJ_k(0) - J_k(0)F_k) ∈ M_k,

where F_k ∈ M_k denotes the reversal matrix, with ones on the antidiagonal and zeros elsewhere.

It is known that the symmetric matrix S_k(λ) is similar to the Jordan block J_k(λ) (see pp. 207-209 of [HJ1]).
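Whatever normalization is used for the entries of S_k(λ), a symmetric matrix similar to J_k(λ) can always be produced by conjugating J_k(λ) by a symmetric square root T of the reversal matrix F: then T^TT = T^2 = F and FJ_k(λ)F^{-1} = J_k(λ)^T, which forces TJ_k(λ)T^{-1} to be symmetric. The sketch below (numpy assumed; sym_jordan is a hypothetical helper, and its entries need not match Definition 2.8.2 exactly) verifies the symmetry and the Jordan structure for k = 5.

```python
import numpy as np

def sym_jordan(lam, k):
    """A symmetric matrix similar to J_k(lam): conjugate J_k(lam) by a
    symmetric square root T of the reversal matrix F (T^2 = T^T T = F)."""
    J = lam * np.eye(k) + np.eye(k, k=1)      # the Jordan block J_k(lam)
    F = np.fliplr(np.eye(k))                  # reversal matrix
    a = (1 - 1j) / 2
    T = a * (np.eye(k) + 1j * F)              # T is symmetric and T^2 = F
    return T @ J @ np.linalg.inv(T)

k, lam = 5, 0.0
S = sym_jordan(lam, k)
# For a single nilpotent Jordan block, rank S^j = k - j for j = 0, ..., k.
ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(S, j), tol=1e-8)
         for j in range(k + 1)]
```

The rank profile rank S^j = k - j certifies that S is similar to a single Jordan block J_k(0), independently of the particular symmetric representative chosen.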
Proposition 2.8.3 For a given positive integer k,

([I_k ⊕ J_k^T(0)], [J_k^T(0) ⊕ I_k]) ∼ (S_{2k}(0), S_{2k}(0)).

Proof Write A_k ≡ I_k ⊕ J_k^T(0) and B_k ≡ J_k^T(0) ⊕ I_k. Notice that A_kB_k = B_kA_k = J_k^T(0) ⊕ J_k^T(0) is similar to J_{2k}(0)^2, which is similar to S_{2k}(0)^2. One checks that

rank B_k(A_kB_k)^l = rank A_k(B_kA_k)^l = rank S_{2k}(0)^{2l+1} = rank [J_k(0)^{l+1} ⊕ J_k(0)^l] = 2k - 2l - 1

for all l = 0, 1, 2, . . . (with negative values interpreted as 0). Corollary 2.2.7 now guarantees that (A_k, B_k) ∼ (S_{2k}(0), S_{2k}(0)).
Proposition 2.8.4 For a given positive integer k,

([H_k, 0; 0, K_k^T], [K_k^T, 0; 0, H_k]) ∼ (S_{2k-1}(0), S_{2k-1}(0)).

Proof Let

A_k ≡ [H_k, 0; 0, K_k^T]  and  B_k ≡ [K_k^T, 0; 0, H_k].

Notice that A_kB_k = J_{k-1}^T(0) ⊕ J_k^T(0) and B_kA_k = J_k^T(0) ⊕ J_{k-1}^T(0), so that A_kB_k and B_kA_k are similar to J_{2k-1}(0)^2, which is similar to S_{2k-1}(0)^2. One checks that

rank B_k(A_kB_k)^l = rank A_k(B_kA_k)^l = rank S_{2k-1}(0)^{2l+1} = rank [H_k(J_k^T(0))^l ⊕ K_k^T(J_{k-1}(0))^l] = 2k - 2l - 2

for all l = 0, 1, 2, . . . (with negative values interpreted as 0). Corollary 2.2.7 now guarantees that (A_k, B_k) ∼ (S_{2k-1}(0), S_{2k-1}(0)).
We have now analysed all of the basic blocks except those in (2.8.41) and (2.8.42). If A_3 and B_3 are present, there is no hope of finding a symmetric S such that (A_3, B_3) ∼ (S, S), because there must be at least one k for which rank A_3(B_3A_3)^k ≠ rank B_3(A_3B_3)^k. Notice that it is precisely the presence of A_3 and B_3 that is the obstruction to writing A = QS when m = n. The next two results will permit us to handle this remaining case.
Lemma 2.8.5 Let k be a given integer with 1 ≤ k < n. There exists a C ∈ M_{k-1,n} such that C^TC = S_k(0) ⊕ 0.

Proof The symmetric n-by-n matrix S_k(0) ⊕ 0 ∈ M_n has a singular value decomposition S_k(0) ⊕ 0 = U^TΣ^2U (see Corollary (4.4.4) of [HJ1]), where U ∈ M_n is unitary and Σ ≡ diag(σ_1, . . . , σ_{k-1}, 0, . . . , 0) ∈ M_n with σ_1 ≥ · · · ≥ σ_{k-1} > 0. Now let Σ_1 ≡ diag(σ_1, . . . , σ_{k-1}) ∈ M_{k-1} and set C ≡ [Σ_1 0]U ∈ M_{k-1,n}. Then C^TC = U^TΣ^2U = S_k(0) ⊕ 0, as desired.
Proposition 2.8.6 Let k be a given positive integer. There exists a C ∈ M_{k-1,k} such that C^TC = S_k(0) and (C, C^T) ∼ (H_k, K_k^T).

Proof Use Lemma 2.8.5 to construct a C ∈ M_{k-1,k} such that C^TC = S_k(0), which is similar to K_k^TH_k = J_k^T(0). Lemma 2.7.1 ensures that CC^T and H_kK_k^T are similar to J_{k-1}(0), and that

rank C(C^TC)^l = rank C^T(CC^T)^l = rank H_k(K_k^TH_k)^l = rank K_k^T(H_kK_k^T)^l = k - l - 1

for all l = 0, 1, 2, . . . (with negative values interpreted as 0). Corollary 2.2.7 guarantees that (H_k, K_k^T) ∼ (C, C^T).
Assembling the preceding results, we can now state a canonical form under orthogonal equivalence.

Theorem 2.8.7 Let positive integers m, n be given with m ≥ n, and let A ∈ M_{m,n} be given. There exist orthogonal Q_1 ∈ M_m and Q_2 ∈ M_n such that

Q_1AQ_2 = [S, 0; 0, C]   (2.8.43)

where S ≡ S_1 ⊕ S_2 ⊕ S_3 is symmetric,

(a1) S_1 is symmetric and S_1^2 is similar to the nonsingular part of the Jordan canonical form of AA^T,
(a2) S_2 ≡ S_{2m_1}(0) ⊕ · · · ⊕ S_{2m_t}(0),
(a3) S_3 ≡ S_{2j_1-1}(0) ⊕ · · · ⊕ S_{2j_s-1}(0),

and

C ≡ C_1 ⊕ D_1 ⊕ · · · ⊕ C_p ⊕ D_p ⊕ 0,

in which

(b1) C_i ∈ M_{n_i-1,n_i} has rank n_i - 1, C_i^TC_i = S_{n_i}(0), and C_iC_i^T is similar to S_{n_i-1}(0) for each i,
(b2) D_j ∈ M_{ℓ_j,ℓ_j-1} has rank ℓ_j - 1, D_jD_j^T = S_{ℓ_j}(0), and D_j^TD_j is similar to S_{ℓ_j-1}(0) for each j, and
(b3) n_i ≠ ℓ_j for all i and j.

Proof One checks that (A_3, B_3) ∼ (C, C^T), so that (A, A^T) ∼ (R, R^T), where

R ≡ [S, 0; 0, C].

Lemma 2.3.1 guarantees that there exist orthogonal Q_1 ∈ M_m and Q_2 ∈ M_n such that R = Q_1AQ_2.
Lemma 2.8.8 Let positive integers m, n be given with m ≥ n, and let A ∈ M_{m,n} be given. Let orthogonal Q_1 ∈ M_m and Q_2 ∈ M_n be such that Q_1AQ_2 has the form (2.8.43). Then

(1) A^TA is diagonalizable if and only if the following four conditions hold:
(1a) S_1 is diagonalizable and may be taken to be diagonal,
(1b) S_2 = S_2(0) ⊕ · · · ⊕ S_2(0), or is absent,
(1c) S_3 = 0, or is absent, and
(1d) C may have some D_j ∈ M_{2,1} but C does not have any C_i.

(2) rank A = rank A^TA if and only if the following three conditions hold:
(2a) S_2 is absent,
(2b) S_3 = 0, or is absent, and
(2c) C may have some C_i but C does not have any D_j.

(3) rank (A^TA)^k = rank (AA^T)^k for all k = 1, 2, 3, . . . if and only if C = 0.

Proof To show (1), notice that A^TA is diagonalizable if and only if the nonsingular part of the Jordan canonical form of A^TA is diagonalizable and the singular part of the Jordan canonical form of A^TA is equal to 0. These conditions are equivalent to conditions (1a)-(1d).

Suppose rank A = rank A^TA. Then rank S + rank C = rank S^2 + rank C^TC. Since rank S ≥ rank S^2 and rank C ≥ rank C^TC, we must have

rank S = rank S^2 and rank C = rank C^TC.

It follows that S_2 is absent, and S_3 is either absent or is equal to 0. Now, rank C_i = rank C_i^TC_i = n_i - 1 for each i and rank D_j = ℓ_j - 1 > ℓ_j - 2 = rank D_j^TD_j for each j. Hence, C may contain some C_i but not D_j. The converse can be verified easily.

To prove (3), notice that rank X = rank X^T for all X ∈ M_{m,n}. Hence, Lemmata 2.7.2 and 2.7.3 ensure that rank (A^TA)^k = rank (AA^T)^k for all k = 1, 2, 3, . . . if and only if A_3 = 0 in (2.8.41) and B_3 = 0 in (2.8.42). This condition is equivalent to C = 0.
We give a new proof for the following analog of the singular value decomposition, which is Theorem (2) of [CH2].
Corollary 2.8.9 Let A ∈ Mm,n be given. Then A can be written as A = Q1 Σ Q2, where Q1 ∈ Mm and Q2 ∈ Mn are orthogonal and Σ ≡ [σij] ∈ Mm,n is such that σij = 0 for i ≠ j, if and only if

(1) A^T A is diagonalizable, and
(2) rank A = rank A^T A.

Proof. The forward assertion can be verified easily. For the converse, suppose m ≥ n. Let orthogonal Q1 ∈ Mm and Q2 ∈ Mn be such that Q1 A Q2 has the form (2.8.43). It follows from Lemma 2.8.8 that S2 = 0 (or is absent), S3 = 0 (or is absent), and S1 may be taken to be diagonal. Now, Lemma 2.8.8(1d) and (2c) show that C has neither Ci nor Dj blocks. Hence, C = 0, or is absent. Hence Q1 A Q2 ≡ [σij] satisfies σij = 0 for i ≠ j, as desired.
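The necessity half of Corollary 2.8.9 is easy to sanity-check numerically. The sketch below is an illustration only, not part of the thesis: it generates complex orthogonal matrices by the Cayley transform of a complex skew-symmetric matrix (a standard device, assumed here, and not the construction used in the text) and confirms that A = Q1 Σ Q2 satisfies rank A = rank A^T A.

```python
import numpy as np

rng = np.random.default_rng(0)

def complex_orthogonal(k):
    # Cayley transform: for complex skew-symmetric S with I + S invertible,
    # Q = (I - S)(I + S)^{-1} satisfies Q^T Q = I (complex orthogonal).
    Z = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    S = Z - Z.T
    I = np.eye(k)
    return (I - S) @ np.linalg.inv(I + S)

m, n = 4, 3
Q1, Q2 = complex_orthogonal(m), complex_orthogonal(n)
Sigma = np.zeros((m, n), dtype=complex)
Sigma[0, 0], Sigma[1, 1] = 2.0, 3.0          # "diagonal" m-by-n matrix of rank 2
A = Q1 @ Sigma @ Q2

# Necessity of condition (2): rank A = rank A^T A.
rank_A = np.linalg.matrix_rank(A)
rank_ATA = np.linalg.matrix_rank(A.T @ A)
```

Condition (1) also holds here, since A^T A = Q2^T Σ^T Σ Q2 is similar to the diagonal matrix Σ^T Σ.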
The following (problem (34) on p. 488 of [HJ2]) is a generalization of the QS decomposition to the non-square case.

Corollary 2.8.10 Let integers m, n be given with m ≥ n, and let A ∈ Mm,n be given. There exist a Q ∈ Mm,n with Q^T Q = In, and a symmetric S ∈ Mn such that A = QS if and only if rank (A^T A)^k = rank (AA^T)^k for all k = 1, 2, 3, . . ..

Proof. Suppose A = QS with Q ∈ Mm,n, Q^T Q = In, and S ∈ Mn symmetric. Let A1 ≡ [A 0] ∈ Mm, so that rank (A1^T A1)^k = rank (A^T A)^k and rank (A1 A1^T)^k = rank (AA^T)^k for all k ≥ 1. Theorem (2.7) of [CH1] guarantees that there exists P ≡ [Q R] ∈ Mm such that P^T P = Im. Now,

	A1 = [A 0] = [QS 0] = [Q R] [ S  0 ]  =  P [ S  0 ]
	                            [ 0  0 ]       [ 0  0 ]

is a QS decomposition of A1. Hence, rank (A^T A)^k = rank (A1^T A1)^k = rank (A1 A1^T)^k = rank (AA^T)^k for all k = 1, 2, 3, . . .. To prove the converse, let orthogonal matrices Q1 ∈ Mm and Q2 ∈ Mn be such that Q1 A Q2 has the form (2.8.43). Lemma 2.8.8(3) ensures that C = 0, so that

	Q1 A Q2 = [ S  0 ]  =  [ In ] (S ⊕ 0)  ≡  P1 Z1,
	          [ 0  0 ]     [ 0  ]

where P1 ≡ [In 0]^T ∈ Mm,n and Z1 ≡ S ⊕ 0 ∈ Mn. Hence, A = Q1^T P1 Z1 Q2^T = QZ, where Z ≡ Q2 Z1 Q2^T ∈ Mn is symmetric (since Z1 is), and Q ≡ Q1^T P1 Q2^T ∈ Mm,n. Now, Q^T Q = Q2 P1^T P1 Q2^T = Q2 In Q2^T = In, as asserted.
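The rank condition in Corollary 2.8.10 can be checked numerically in the forward direction. The following sketch is an illustration under stated assumptions (Cayley-transform construction of a complex orthogonal matrix, random seed arbitrary), not the thesis's own argument:

```python
import numpy as np

rng = np.random.default_rng(1)

def complex_orthogonal(k):
    # Cayley transform of a complex skew-symmetric matrix gives Q with Q^T Q = I.
    Z = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    S = Z - Z.T
    I = np.eye(k)
    return (I - S) @ np.linalg.inv(I + S)

m, n = 5, 3
Q = complex_orthogonal(m)[:, :n]             # Q in M_{m,n} with Q^T Q = I_n
S = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = S + S.T                                  # complex symmetric
A = Q @ S

# A = QS forces rank (A^T A)^k = rank (A A^T)^k for every k, since
# A^T A = S^2 while A A^T = Q S^2 Q^T.
ranks = [(np.linalg.matrix_rank(np.linalg.matrix_power(A.T @ A, k)),
          np.linalg.matrix_rank(np.linalg.matrix_power(A @ A.T, k)))
         for k in (1, 2, 3)]
```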

Chapter 3
Linear Operators Preserving Orthogonal Equivalence on Matrices

3.1 Introduction and Statement of Result

Let Mm,n be the set of all m-by-n matrices with complex entries, and write Mn ≡ Mn,n. A matrix Q ∈ Mn is said to be (complex) orthogonal if Q^T Q = I. Two matrices A, B ∈ Mm,n are said to be (complex) orthogonally equivalent, denoted A ∼ B, if A = Q1 B Q2 for some orthogonal matrices Q1 ∈ Mm and Q2 ∈ Mn. One checks that ∼ is an equivalence relation on Mm,n. We are interested in studying linear orthogonal equivalence preservers on Mm,n, that is, linear operators T : Mm,n → Mm,n such that T(A) ∼ T(B) whenever A ∼ B. We prove the following, which is our main result.

Theorem 3.1.1 Let T be a given linear operator on Mm,n. Then T preserves orthogonal equivalence if and only if there exist complex orthogonal Q1 ∈ Mm and Q2 ∈ Mn and a scalar α ∈ C such that either
(1) T(A) = αQ1 A Q2 for all A ∈ Mm,n, or
(2) m = n and T(A) = αQ1 A^T Q2 for all A ∈ Mn.
We have divided the proof of the theorem into three parts. In Section 3.2, we establish that either T = 0 or T is nonsingular. In Section 3.3, we show that T has the form asserted in the theorem, except that Q1 and Q2 are nonsingular, but not necessarily orthogonal. Finally, in Section 3.4, the theorem is proved.

Similar problems have been studied in [HHL] and [HLT], and we use their general approach. We analyze the orbits

	O(A) = {X ∈ Mm,n : X ∼ A}

and their corresponding tangent spaces T_A. It is known that these orbits are homogeneous differentiable manifolds [Bo]. As in [HHL] and [HLT], it is necessary to develop some special techniques to supplement the general approach in solving the problem. Unlike the relations considered in [HLT], little is known about complex orthogonal equivalence, and a simple canonical form is not available. This makes the problem more difficult and more interesting. In fact, the results obtained in this paper may give more insight into, and better understanding of, the orthogonal equivalence relation.

We denote the standard basis of Mm,n by {E11, E12, . . . , Emn}. When Eij ∈ Mn, we set Fij ≡ Eij − Eji; to avoid confusion, when Eij ∈ Mm, we set Gij ≡ Eij − Eji. Notice that all Fij and Gij are skew-symmetric matrices. We denote the standard basis of C^n by {e1, . . . , en}. The n-by-n identity matrix is denoted by I, or when necessary for clarity, by In. A vector x ∈ C^n is said to be isotropic if x^T x = 0.

3.2 Preliminary Results

The following observation is used repeatedly in our arguments.

Lemma 3.2.1 Let V be a subspace of Mm,n that is invariant under orthogonal equivalence, and let T : V → Mm,n be a linear transformation that preserves orthogonal equivalence. Then

	T(span O(A)) ⊆ span O(T(A))  and  T(T_A) ⊆ T_{T(A)}

for every A ∈ V. Consequently, if T is nonsingular, then

	dim span O(A) ≤ dim span O(T(A))  and  dim T_A ≤ dim T_{T(A)}

for every A ∈ V.

Lemma 3.2.2 Let X, Y ∈ Mm,n be given, let T be a given linear orthogonal equivalence preserver on Mm,n, and suppose X ∈ T_X. If Y ∉ T_Y, then T(X) ∉ O(Y).

Proof. Suppose T(X) = Q1 Y Q2 for some orthogonal Q1 ∈ Mm and Q2 ∈ Mn. Let T1 ≡ Q1^T T Q2^T. Then Y = Q1^T T(X) Q2^T = T1(X) ∈ T1(T_X) ⊆ T_{T1(X)} = T_Y, contradicting Y ∉ T_Y.

Remark. The argument used to prove the preceding lemma can be used to obtain its conclusion under more general hypotheses: Let G be a given group of nonsingular linear operators on Mm,n, and say A ∼ B if A = L(B) for some L ∈ G. Let T be a given linear operator on Mm,n such that T(A) ∼ T(B) whenever A ∼ B. If X ∈ T_X and Y ∉ T_Y, then T(X) ≠ L(Y) for all L ∈ G.
Lemma 3.2.3 span O(A) = Mm,n for every nonzero A ∈ Mm,n.

Proof. Let A ∈ Mm,n be a given nonzero matrix. There exist orthogonal (in fact, permutation) matrices P ∈ Mm and Q ∈ Mn such that the (1,1) entry of B = PAQ, say b11, is nonzero. For a given positive integer k, define the diagonal orthogonal matrix

	Dk ≡ diag(1, −1, −1, . . . , −1) ∈ Mk.

Then 4b11 E11 = (Im + Dm) B (In + Dn) = B + Dm B + B Dn + Dm B Dn ∈ span O(A), so E11 ∈ span O(A). Since there exist permutation (and hence orthogonal) matrices Pi ∈ Mm and Pj ∈ Mn such that Eij = Pi E11 Pj, every Eij ∈ span O(A), and hence span O(A) = Mm,n.
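The averaging identity at the heart of this proof is easy to verify numerically. The following sketch (real entries for simplicity; an illustration, not part of the thesis) checks that (Im + Dm)B(In + Dn) collapses to 4 b11 E11:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 4
B = rng.standard_normal((m, n))

# D_k = diag(1, -1, ..., -1) is orthogonal, so D_m B, B D_n, and D_m B D_n
# all lie in O(B); their sum with B therefore lies in span O(B).
Dm = np.diag([1.0] + [-1.0] * (m - 1))
Dn = np.diag([1.0] + [-1.0] * (n - 1))
C = B + Dm @ B + B @ Dn + Dm @ B @ Dn

E11 = np.zeros((m, n))
E11[0, 0] = 1.0
```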
Lemma 3.2.4 Let T be a given linear operator on Mm,n. Suppose T preserves orthogonal equivalence. Then either T = 0 or T is nonsingular.

Proof. Suppose that ker T contains a nonzero matrix A. By Lemma 3.2.1, T(span O(A)) ⊆ span O(T(A)) = {0}. Now, Lemma 3.2.3 guarantees that span O(A) = Mm,n. Hence, T = 0.
We use the following result, which is Lemma (1) in [CH1].

Lemma 3.2.5 Let X1, X2 ∈ Mn,k with 1 ≤ k ≤ n. There exists a complex orthogonal Q ∈ Mn such that X1 = QX2 if and only if the following two conditions are satisfied:
(a) X1^T X1 = X2^T X2, and
(b) there exists a nonsingular B ∈ Mn such that X1 = BX2.

Note that if X1, X2 ∈ Mn,k have full rank, Lemma 3.2.5 ensures that there exists an orthogonal Q ∈ Mn such that X1 = QX2 if and only if X1^T X1 = X2^T X2. In particular, for n ≥ 2 and any given nonzero z ∈ C^n, there are two possibilities:
(a) If z^T z ≠ 0, then z = αQe1 for some orthogonal Q ∈ Mn and some nonzero α ∈ C, and
(b) if z ≠ 0 and z^T z = 0, then z = Q(e1 + ie2) for some orthogonal Q ∈ Mn.

Let A ∈ Mm,n be given. Then O(A) = {Q1 A Q2 : Q1 ∈ Mm, Q2 ∈ Mn, Q1^T Q1 = Im, and Q2^T Q2 = In}. If B ∼ A, then A^T A is similar to B^T B and AA^T is similar to BB^T. Suppose A has rank 1, so A = xy^T for some x ∈ C^m and y ∈ C^n. Then rank A^T A = 0 or 1 according to whether x^T x is zero or nonzero, and rank AA^T = 0 or 1 according to whether y^T y is zero or nonzero. Depending on the four possibilities for the pair (rank A^T A, rank AA^T), it follows that for some nonzero scalar α ∈ C, O(A) contains exactly one of the following: (a) αE11; (b) α(E11 + iE12); (c) α(E11 + iE21); or (d) α(E11 − E22 + iE12 + iE21).
The same reasoning leads immediately to the following lemma.

Lemma 3.2.6 Let E11, E12, E21, E22 ∈ Mm,n be standard basis matrices, and let E ≡ E11 − E22 + iE12 + iE21. Then
(a) O(E11) = {q1 q2^T : q1 ∈ C^m, q2 ∈ C^n, and q1^T q1 = q2^T q2 = 1};
(b) O(E11 + iE12) = {qy^T : q ∈ C^m, y ∈ C^n, q^T q = 1, and y^T y = 0};
(c) O(E11 + iE21) = {xq^T : x ∈ C^m, q ∈ C^n, x^T x = 0, and q^T q = 1};
(d) O(E) = {xy^T : x ∈ C^m, y ∈ C^n, and x^T x = y^T y = 0}.
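A minimal sketch of case (d): the representative E = E11 − E22 + iE12 + iE21 is x y^T with both factors isotropic (here with m = 2 and n = 3 for concreteness; an illustration, not part of the text):

```python
import numpy as np

x = np.array([1.0, 1.0j])        # isotropic in C^2: x^T x = 1 + i^2 = 0
y = np.array([1.0, 1.0j, 0.0])   # isotropic in C^3
E = np.outer(x, y)               # E = x y^T has rank 1

iso_x = x @ x                    # bilinear (unconjugated) form x^T x
iso_y = y @ y
```

The leading 2-by-2 block of E is exactly [[1, i], [i, −1]], the matrix E11 − E22 + iE12 + iE21.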
If Q(t) ∈ Mn is a differentiable family of orthogonal matrices with Q(0) = I, then differentiation of the identity Q(t)Q(t)^T = I at t = 0 shows that Q′(0) + Q′(0)^T = 0, that is, Q′(0) is skew-symmetric. Conversely, if B ∈ Mn is a given skew-symmetric matrix, then Q(t) ≡ e^{tB} is a differentiable family of orthogonal matrices such that Q(0) = I and Q′(0) = B. If A ∈ Mm,n is given and Q1(t) ∈ Mm and Q2(t) ∈ Mn are given differentiable families of orthogonal matrices with Q1(0) = Im and Q2(0) = In, one computes that

	(d/dt){Q1(t) A Q2(t)}|_{t=0} = Q1′(0) A + A Q2′(0).

Thus, the tangent space to O(A) at A is given explicitly by

	T_A = {XA + AY : X ∈ Mm, Y ∈ Mn, X + X^T = 0, and Y + Y^T = 0}.
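The derivative formula above can be checked by finite differences. The sketch below uses a Cayley path through the identity (an arbitrary choice of differentiable orthogonal family for this illustration, not the exponential used in the text):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, t = 3, 4, 1e-6

def cayley_path(S, t):
    # For skew-symmetric S, Q(t) = (I + tS/2)(I - tS/2)^{-1} is orthogonal
    # with Q(0) = I and Q'(0) = S.
    I = np.eye(S.shape[0])
    return (I + (t / 2) * S) @ np.linalg.inv(I - (t / 2) * S)

A = rng.standard_normal((m, n))
X = rng.standard_normal((m, m)); X = X - X.T   # skew-symmetric
Y = rng.standard_normal((n, n)); Y = Y - Y.T

# finite-difference derivative of Q1(t) A Q2(t) at t = 0
deriv = (cayley_path(X, t) @ A @ cayley_path(Y, t) - A) / t
```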
Definition 3.2.7 Let A ∈ Mm,n with n ≥ 2. Then

	S_A ≡ {A Fij : i = 1, 2, and i < j ≤ n}.

Notice that S_A ⊆ T_A for every A ∈ Mm,n.
Suppose A ∈ Mm,n and rank A ≥ 2. Then there exists a permutation matrix Q ∈ Mn such that the first two columns of AQ are linearly independent. One checks that S_{AQ} is a linearly independent subset of T_{AQ} with 2n − 3 elements. Thus, dim T_A = dim T_{AQ} ≥ 2n − 3 whenever rank A ≥ 2. A similar argument shows that dim T_A ≥ 3n − 6 whenever rank A ≥ 3.

Now suppose A ∈ Mm,n, rank A ≥ 2, and dim T_A = 2n − 3. Let Q ∈ Mn be a permutation matrix such that the first two columns of

	B ≡ AQ = [ b1 b2 b3 . . . bn ]

are independent. Then S_B ⊆ T_B and dim span S_B = 2n − 3 = dim T_A = dim T_B, so T_B = span S_B = {BY : Y ∈ Mn and Y + Y^T = 0}. If n > 3 and j > 3, note that

	B F3j = [ 0 0 −bj 0 . . . 0 b3 0 . . . 0 ]

and the only matrices in S_B with nonzero entries in the third column are B F13 = [ −b3 0 b1 0 . . . 0 ] and B F23 = [ 0 −b3 b2 0 . . . 0 ]; thus, b3 = 0 and bj is a linear combination of b1 and b2. Hence, rank A = rank B = 2 if n > 3. Note that if n = 2 or m = 2 and if A ∈ Mm,n with rank A ≥ 2, then rank A = 2.
Lemma 3.2.8 Let A ∈ Mm,n be nonzero. Then

	T_A = {XA + AY : X ∈ Mm, Y ∈ Mn, X + X^T = 0, and Y + Y^T = 0}

and
(a) dim T_A = m + n − 2 if A ∈ {E11, E11 + iE12, E11 + iE21};
(b) dim T_A = m + n − 3 if A = E11 − E22 + iE12 + iE21;
(c) dim T_A ≥ 2n − 3 if rank A ≥ 2. Moreover, if n > 3, rank A ≥ 2, and dim T_A = 2n − 3, then rank A = 2 and there exists a permutation matrix Q ∈ Mn such that T_{AQ} = span S_{AQ};
(d) dim T_A ≥ 3n − 6 if rank A ≥ 3;
(e) if rank A = 2 and dim T_A = 2n − 3, then there exists a permutation matrix Q ∈ Mn such that dim span S_{AQ} = 2n − 3.

Proof. The asserted form of T_A, as well as assertions (c) and (d), have been verified. Assertions (a) and (b) follow from direct computations. We consider in detail only the case in which A = E11 + iE12; the other cases can be dealt with similarly. Let X = [xij] ∈ Mm and Y = [yij] ∈ Mn be skew-symmetric. Then

	XA = [ 0    0     0  . . .  0 ]
	     [ x21  ix21  0  . . .  0 ]
	     [ .    .     .         . ]
	     [ xm1  ixm1  0  . . .  0 ]

and

	AY = [ iy21  y12  y13 + iy23  . . .  y1n + iy2n ]
	     [ 0     0    0           . . .  0          ]
	     [ .     .    .                  .          ]
	     [ 0     0    0           . . .  0          ]

Since y21 = −y12, dim T_A = m + n − 2.
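The dimension count in assertion (a) can also be verified directly by vectorizing the generators XA and AY of T_A. A small numpy sketch for A = E11 + iE12 with (m, n) = (3, 4) (an illustration only):

```python
import numpy as np

m, n = 3, 4
A = np.zeros((m, n), dtype=complex)
A[0, 0], A[0, 1] = 1.0, 1.0j                  # A = E11 + i E12

def skew_basis(k):
    basis = []
    for i in range(k):
        for j in range(i + 1, k):
            F = np.zeros((k, k))
            F[i, j], F[j, i] = 1.0, -1.0      # F_ij = E_ij - E_ji
            basis.append(F)
    return basis

# T_A = span{ XA + AY : X, Y skew-symmetric }, so its dimension is the
# rank of the collection of vectorized generators XA and AY.
gens = [(X @ A).ravel() for X in skew_basis(m)] + \
       [(A @ Y).ravel() for Y in skew_basis(n)]
dim_TA = np.linalg.matrix_rank(np.array(gens))
```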

3.3 A Rank-Preserving Property of T^{−1}

Proposition 3.3.1 Let T be a nonsingular linear orthogonal equivalence preserver on Mm,n. Then T^{−1}(E) has rank 1 whenever E has rank 1.

We have organized the proof of Proposition 3.3.1 into a sequence of ten lemmata.

Lemma 3.3.2 Let T be a nonsingular linear orthogonal equivalence preserver on Mm,n. If E ∈ Mm,n and rank E = 1, then dim T_{T^{−1}(E)} ≤ m + n − 2.

Proof. Lemma 3.2.1 shows that dim T_{T^{−1}(E)} ≤ dim T_E, while Lemma 3.2.6 and Lemma 3.2.8(a, b) give dim T_E ≤ m + n − 2.
Lemma 3.3.3 Proposition 3.3.1 holds if m = 1 or m + 1 < n.

Proof. If m = 1, then the nonsingularity of T implies that T^{−1}(E) ≠ 0 whenever E ≠ 0, and this is equivalent to the asserted rank property in this case. Let E ∈ Mm,n be given with 3 ≤ m + 1 < n. If rank E = 1, then dim T_{T^{−1}(E)} ≤ m + n − 2 < 2n − 3. The first inequality is from Lemma 3.3.2 and the strict inequality is from our assumption that 3 ≤ m + 1 < n. Lemma 3.2.8(c) now shows that T^{−1}(E) must have rank 1.

Lemma 3.3.4 Let T be a nonsingular linear orthogonal equivalence preserver on Mm,n. If m + 1 = n > 3 or m = n > 4, and if E ∈ Mm,n and rank E = 1, then rank T^{−1}(E) ≤ 2.

Proof. If rank E = 1, then Lemma 3.3.2 guarantees that dim T_{T^{−1}(E)} ≤ m + n − 2. If m + 1 = n > 3, then m + n − 2 = 2n − 3 < 2n − 2 ≤ 3n − 6. If m = n > 4, then m + n − 2 = 2n − 2 < 3n − 6. Thus, under the stated hypotheses we have dim T_{T^{−1}(E)} < 3n − 6, and hence rank T^{−1}(E) ≤ 2 by Lemma 3.2.8(d).
Let rank A = 2 and suppose that A = [ a1 a2 . . . an ] ∈ Mm,n. There are two possibilities: at least one ai is not isotropic, in which case we may suppose a1^T a1 ≠ 0, or all of them are isotropic. Moreover, we may assume that {a1, a2} is linearly independent.

Case 1. a1^T a1 ≠ 0.

Let α = (a1^T a1)^{1/2}, so α ≠ 0. Lemma 3.2.5 ensures that Qa1 = [ α 0 . . . 0 ]^T for some orthogonal Q ∈ Mm, so

	QA = [ α  ⋆  ]
	     [ 0  A1 ]

If we write A1 ≡ [ b2 b3 . . . bn ], then b2 ≠ 0 since a1 and a2 are linearly independent. There are two possibilities: b2 is isotropic or not.

(i) Suppose b2^T b2 ≠ 0. After applying the preceding argument to A1, we see that there exists an orthogonal Q1 ∈ Mm such that

	B ≡ Q1 A = [ α  ⋆  ⋆  ]
	           [ 0  β  ⋆  ]   with β ≠ 0,
	           [ 0  0  A2 ]

but A2 = 0 since rank A = 2. For n ≥ m ≥ 3, and referring to the discussion after Definition 3.2.7, we see that {G13 B, G23 B} ∪ S_B is a linearly independent subset of T_B. Hence, dim T_A = dim T_B ≥ 2n − 1 > m + n − 2.
(ii) If b2^T b2 = 0, a similar argument shows that there is an orthogonal Q1 ∈ Mm such that

	B ≡ Q1 A = [ α  ⋆   ⋆  ]
	           [ 0  β   z  ]
	           [ 0  iβ  iz ]
	           [ 0  0   0  ]

where z ∈ M1,n−2 and β ≠ 0. Let X = [ x1 . . . xn ] ∈ S_B. Then the columns of X have the form

	xj = [ ⋆  aj  iaj  0 . . . 0 ]^T,  j = 1, . . . , n,

for some aj ∈ C. Hence, for n ≥ m ≥ 3, {G12 B, G13 B} ∪ S_B is a linearly independent subset of T_B. Hence, dim T_A = dim T_B ≥ 2n − 1 > m + n − 2.

Case 2. a1^T a1 = a2^T a2 = 0.

Because a1^T a1 = 0, there exist an orthogonal Q ∈ Mm and α ≠ 0 such that

	QA = [ α   ⋆  ]
	     [ iα  ⋆  ]
	     [ 0   b2 ]

Notice that the second column of QA is isotropic and independent of the first column.

(i) If b2 = 0, then there is β ≠ 0 such that

	B ≡ QA = [ α   β    ⋆ ]
	         [ iα  −iβ  ⋆ ]
	         [ 0   0    0 ]

For n ≥ m ≥ 3, {G13 B, G23 B} ∪ S_B is a linearly independent subset of T_B. Hence, dim T_A ≥ 2n − 1 > m + n − 2.

(ii) If b2 ≠ 0 and b2^T b2 ≠ 0, there exist an orthogonal Q1 ∈ Mm and β ≠ 0 such that

	B ≡ Q1 A = [ α   ⋆  ⋆ ]
	           [ iα  ⋆  ⋆ ]
	           [ 0   β  ⋆ ]
	           [ 0   0  0 ]

For n ≥ m ≥ 4, {G14 B, G34 B} ∪ S_B is a linearly independent subset of T_B, and dim T_A ≥ 2n − 1 > m + n − 2.

(iii) If b2 ≠ 0 and b2^T b2 = 0, there exist an orthogonal Q1 ∈ Mm and β ≠ 0 such that

	B ≡ Q1 A = [ α   ⋆   ⋆  ]
	           [ iα  ⋆   ⋆  ]
	           [ 0   β   z  ]
	           [ 0   iβ  iz ]
	           [ 0   0   0  ]

Just as in Case 1(ii), {G13 B, G23 B} ∪ S_B ⊆ T_B is linearly independent. Hence, for n ≥ m ≥ 4, dim T_A = dim T_B ≥ 2n − 1 > m + n − 2.
Therefore, if A ∈ Mm,n, n ≥ m ≥ 4, and rank A = 2, then dim T_A ≥ 2n − 1 > m + n − 2 ≥ dim T_E for any E ∈ Mm,n with rank E = 1, by Lemma 3.2.8(a, b). Combining this result with Lemma 3.3.4 proves the following lemma.
Lemma 3.3.5 Proposition 3.3.1 holds if n > 4 and m = n or m = n − 1.

Let Δ ≡ {(2, 2), (2, 3), (3, 3), (3, 4), (4, 4)}. If E ∈ Mm,n with n ≥ m and (m, n) ∉ Δ, we have shown that rank T^{−1}(E) = 1 whenever rank E = 1. We treat the five special cases (m, n) ∈ Δ separately. We use the following two results.
Lemma 3.3.6 Let A ∈ Mn. Then XA is skew-symmetric for every skew-symmetric X ∈ Mn if and only if A = αI for some α ∈ C.

Proof. If A = αI, then XA = αX is evidently skew-symmetric. Conversely, if X and XA are skew-symmetric, then XA = −(XA)^T = −(A^T X^T) = A^T X. Let A = [aij] and consider the skew-symmetric matrices F1k = E1k − Ek1 for k = 2, . . . , n. The matrix F1k A has row 1 equal to (ak1, ak2, . . . , akn), row k equal to −(a11, a12, . . . , a1n), and all other rows zero, while A^T F1k has column 1 equal to −(ak1, ak2, . . . , akn)^T, column k equal to (a11, a12, . . . , a1n)^T, and all other columns zero. Comparing the first rows of F1k A and A^T F1k gives akk = a11 and aki = 0 for all i ≠ k. Thus, A = a11 I.
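Both directions of Lemma 3.3.6 are easy to probe numerically; a short sketch (an illustration, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4

# If A = aI, then XA = aX is skew-symmetric for every skew-symmetric X.
X = rng.standard_normal((n, n)); X = X - X.T
A = 2.5 * np.eye(n)
scalar_case_skew = np.allclose((X @ A).T, -(X @ A))

# For a non-scalar A, some F_1k = E_1k - E_k1 already fails the test.
A2 = np.diag([1.0, 2.0, 3.0, 4.0])
F12 = np.zeros((n, n)); F12[0, 1], F12[1, 0] = 1.0, -1.0
nonscalar_case_skew = np.allclose((F12 @ A2).T, -(F12 @ A2))
```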


Lemma 3.3.7 Let A ∈ Mm,n. If dim T_A = dim span S_{AQ} for some orthogonal Q ∈ Mn, then AA^T = αIm for some α ∈ C.

Proof. Suppose dim T_A = dim span S_{AQ}. Since dim T_A = dim T_{AQ}, we have T_{AQ} = span S_{AQ}. Hence, for each skew-symmetric Y ∈ Mm there exists a skew-symmetric X ∈ Mn such that Y AQ = AQX. Thus Y AA^T = Y (AQ)(AQ)^T = (AQ)X(AQ)^T is skew-symmetric for every skew-symmetric Y, and hence AA^T = αIm by Lemma 3.3.6.
Lemma 3.3.8 Proposition 3.3.1 holds if (m, n) = (2, 3).

Proof. Suppose E ∈ M2,3 has rank 1. Since rank T^{−1}(E) ≤ m = 2, there are only two possibilities: rank T^{−1}(E) = 1 (which is the assertion of Proposition 3.3.1), or rank T^{−1}(E) = 2. We wish to exclude the latter possibility. We look at

	D ≡ {Y ∈ M2,3 : dim T_Y ≤ 3}.

Since dim T_{T^{−1}(Y)} ≤ dim T_Y, it must be the case that T^{−1}(Y) ∈ D whenever Y ∈ D. If E has rank 1, then Lemma 3.2.8(a, b) shows that E ∈ D. Suppose X ∈ D and rank X = 2. Then Lemma 3.2.8(c) ensures that dim T_X ≥ 3, so that dim T_X = 3 in this case. Hence, Lemma 3.2.8 ensures that there exists an orthogonal Q ∈ M3 such that dim span S_{XQ} = 3. Lemma 3.3.7 now guarantees that XX^T = α² I2 for some α ∈ C. Moreover, Lemma (4.4) of [HHL] shows that α ≠ 0. It follows from Lemma 3.2.5 that there exists an orthogonal Q1 ∈ M3 such that

	X = [ α  0  0 ] Q1.				(3.3.1)
	    [ 0  α  0 ]

We will show that T^{−1}(X) ∈ λO(X) for some λ ≠ 0. Since X ∈ D, it suffices to show that T(E) ∉ O(X) for each E ∈ E ≡ {E11, E11 + iE12, E11 + iE21, E11 − E22 + iE12 + iE21}.

Let

	A ≡ [ α  0  0 ]
	    [ 0  α  0 ]

so that O(A) = O(X). One checks that

	T_A = { [ 0   x  y ] : x, y, z ∈ C }.
	        [ −x  0  z ]

Let B ∈ O(A) ∩ T_A. Then

	[ x² + y²  yz       ]  =  B B^T  =  α² I2.
	[ yz       x² + z²  ]

Hence, y = z = 0 and x² = α². Thus, O(A) ∩ T_A consists of exactly two matrices.

Let E ≡ E11 − E22 + iE12 + iE21. Then dim T_{T^{−1}(E)} ≤ dim T_E = 2 < dim T_B for any B ∉ {0} ∪ O(E). It follows that T^{−1}(E) ∈ O(E) and hence T(E) ∈ O(E). Thus, T(E) ∉ O(A).

Suppose that T(E11) ∈ O(A), say T(E11) = Q2 A Q3. Let T1 ≡ Q2^T T Q3^T. Then A = Q2^T T(E11) Q3^T = T1(E11). Notice that

	T_{E11} = { [ 0  x  y ] : x, y, z ∈ C }.
	            [ z  0  0 ]

Thus, {E12, E13, E21} ⊆ O(E11) ∩ T_{E11}, so that O(E11) ∩ T_{E11} contains at least three matrices. Moreover, {T1(E12), T1(E13), T1(E21)} contains three matrices since T1 is nonsingular. However, T1(O(E11) ∩ T_{E11}) ⊆ O(T1(E11)) ∩ T_{T1(E11)} = O(A) ∩ T_A, which contains exactly two matrices. This contradiction shows that T(E11) ∉ O(A).

Now let E ≡ E11 + iE12. Then E = E(iF12) ∈ T_E. Since A ∉ T_A, we have T(E) ∉ O(A) by Lemma 3.2.2. Similarly, if E ≡ E11 + iE21 = (−iG12)E, then T(E) ∉ O(A).

Thus, T^{−1}(A) cannot have rank 1, and hence it has rank 2 and T^{−1}(A) ∈ λO(A) for some λ ≠ 0. It follows that T(O(A)) ⊆ O(λ^{−1} A), and hence T^{−1}(E) ∉ μO(A) for all rank-one E ∈ M2,3 and all μ ≠ 0. Combining this result with the observation that T^{−1}(Y) ∈ D for any Y ∈ D, we see that rank T^{−1}(E) = 1 whenever rank E = 1.
Lemma 3.3.9 Proposition 3.3.1 holds if (m, n) = (3, 4).

Proof. Let E ∈ M3,4 have rank 1 and let A ≡ T^{−1}(E). Lemma 3.3.4 shows that rank A is either 1 or 2. Suppose rank A = 2. We apply the analysis of the proof of Lemma 3.3.5 to A. Notice that Cases 1(i, ii) and 2(i, iii) are not possible here. Thus, A has the form considered in Case 2(ii): the first column of A is [ α iα 0 ]^T with α ≠ 0, and all the columns of A are isotropic vectors. It now follows from Lemma 3.2.8(a, c) and Lemma 3.2.1 that dim T_A = 5. Moreover, Lemma 3.2.8(e) ensures that dim span S_{AP} = 5 for some permutation matrix P ∈ M4. Hence, Lemma 3.3.7 guarantees that there exists a ∈ C such that AA^T = aI3. Since it is always the case that rank A ≥ rank AA^T, we must have a = 0; that is, the rows of A are also isotropic vectors. Thus, there exists an orthogonal Q ∈ M4 such that the first row of AQ is [ β iβ 0 0 ]. Since AA^T = 0, the third row of AQ has the form [ 0 0 c d ], where c² + d² = 0 and d ≠ 0. It follows that

	AQ = [ β  iβ  0    0  ]
	     [ γ  iγ  ebd  ed ]
	     [ 0  0   bd   d  ]

with e² = b² = −1 and d ≠ 0. By considering all the possible cases, one checks that G12 AQ ∉ span S_{AQ}. We show one such case: b = e = i. Suppose that G12 AQ = AQ(a1 F12 + a2 F13 + a3 F14 + a4 F23 + a5 F24). Examining the (1,1), (2,1), (3,1), and (2,4) entries, we get a1 = 1 and a2 = a3 = a5 = 0. This is a contradiction, since for any a4, G12 AQ ≠ AQ F12 + a4 AQ F23. Hence, dim T_A = dim T_{AQ} ≥ 6, again a contradiction. Since we can exclude all the possibilities associated with rank A = 2, it follows that rank A = 1.
Lemma 3.3.10 Proposition 3.3.1 holds if (m, n) = (4, 4).

Proof. If A ∈ M4 has rank 2, then dim T_A ≥ 7, as shown in the proof of Lemma 3.3.5. Let B ≡ E11 − E22 + iE12 + iE21. Then dim T_{T^{−1}(B)} ≤ dim T_B < dim T_C for any C ∉ {0} ∪ O(B), and hence T^{−1}(B) ∈ O(B) has rank 1. Let D ≡ diag(1, −1, 1, 1). Then E11 + iE12 = ½(B + DB) and T^{−1}(E11 + iE12) = ½(T^{−1}(B) + T^{−1}(DB)), a sum of two rank-one matrices. Hence, T^{−1}(E11 + iE12) has rank at most 2. But since dim T_C ≤ 6 < 7 ≤ dim T_A for all A, C ∈ M4 such that rank C = 1 and rank A = 2, we must have rank T^{−1}(C) ≠ 2 for every rank-one C, so that rank T^{−1}(E11 + iE12) = 1. Similarly, since E11 + iE21 = ½(B + BD) and E11 = ½((E11 + iE12) + (E11 + iE12)D), both T^{−1}(E11 + iE21) and T^{−1}(E11) have rank 1. Thus, if E ∈ M4 has rank 1, then T^{−1}(E) must also have rank 1.
Let T be a nonsingular linear orthogonal equivalence preserver on Mm,n, with m ≤ n and (m, n) ∉ {(2, 2), (3, 3)}. Our arguments up to this point show that T^{−1} preserves rank-1 matrices, that is, T^{−1}(E) has rank 1 whenever E ∈ Mm,n has rank 1. Theorem 4.1 of [HLT] guarantees that there exist nonsingular X ∈ Mm and Y ∈ Mn such that either T^{−1}(A) = XAY for all A ∈ Mm,n, or m = n and T^{−1}(A) = XA^T Y for all A ∈ Mn. Hence, either T(A) = MAN for all A ∈ Mm,n, or m = n and T(A) = MA^T N for all A ∈ Mn, where M ≡ X^{−1} and N ≡ Y^{−1}. We will now show that the same conclusion can be drawn for the two remaining cases (m, n) = (2, 2), (3, 3), from which Proposition 3.3.1 follows in these two cases.
Lemma 3.3.11 Let T be a nonsingular linear orthogonal equivalence preserver on Mn. If n = 2 or n = 3, then there exists a scalar λ ≠ 0 such that T^{−1}(In) ∈ λO(In).

Proof. First note that Xn ≡ E11 − E22 + iE12 + iE21 = Xn(iF12) ∈ T_{Xn} for all n = 2, 3, . . ., where Eij, Xn ∈ Mn. Note also that T_{In} = {X ∈ Mn : X + X^T = 0} is the set of all skew-symmetric matrices in Mn. Hence, dim T_{In} = n(n − 1)/2. Moreover, In ∉ T_{In} for each n.

Let n = 2. Then dim T_{I2} = 1. It follows from Lemma 3.2.8(a, b, c) that either T^{−1}(I2) is nonsingular or T^{−1}(I2) ∈ O(X2). However, T^{−1}(I2) ∉ O(X2) by Lemma 3.2.2. Thus, T^{−1}(I2) is nonsingular, and Lemma 3.2.8(e) implies that dim T_{T^{−1}(I2)} = 1 = dim span S_{T^{−1}(I2)Q} for some permutation matrix Q ∈ M2. Lemma 3.3.7 guarantees that T^{−1}(I2) ∈ λO(I2) for some λ ≠ 0.

Let n = 3. A similar argument shows that either rank T^{−1}(I3) ≥ 2 or T^{−1}(I3) ∈ O(X3), and that the latter possibility is excluded. Hence, rank T^{−1}(I3) ≥ 2. Let A ≡ T^{−1}(I3). Then dim T_A = 3 = dim span S_{AQ} for some permutation matrix Q ∈ M3. Hence, AA^T = aI3 by Lemma 3.3.7, and Lemma (4.4) of [HHL] again guarantees that a ≠ 0. Therefore, T^{−1}(I3) ∈ λO(I3) with λ² = a ≠ 0.

Suppose n = 2 or 3. Then Lemma 3.3.11 ensures that T^{−1}(In) ∈ λO(In). Hence, T1 ≡ λT satisfies T1^{−1}(In) ∈ O(In). It follows that T1(O(In)) ⊆ O(In). Lemma (1) of [D] guarantees that T1(O(In)) = O(In). Thus, Lemma (6) of [BP] guarantees that there exist nonsingular M, N ∈ Mn such that either T1(A) = MAN or T1(A) = MA^T N for all A ∈ Mn. It follows that Proposition 3.3.1 holds for these two cases as well.
The preceding ten lemmata constitute a proof of all cases of Proposition 3.3.1 in which m ≤ n. The remaining cases follow from considering T1(X) ≡ T(X^T)^T and applying the known cases to T1 : Mn,m → Mn,m. The following proposition summarizes the main conclusions of this section.

Proposition 3.3.12 Let T be a nonsingular linear orthogonal equivalence preserver on Mm,n. Then there exist nonsingular M ∈ Mm and N ∈ Mn such that either
(1) T(A) = MAN for all A ∈ Mm,n, or
(2) m = n and T(A) = MA^T N for all A ∈ Mm,n.


3.4 Proof of the Main Theorem

Let T be a given linear operator on Mm,n. Suppose T preserves orthogonal equivalence. Then Lemma 3.2.4 guarantees that either T = 0 or T is nonsingular. If T = 0, then Theorem 3.1.1 holds with α = 0. If T ≠ 0, we will use the following to show that Theorem 3.1.1 still holds.
Proposition 3.4.1 Let A ∈ Mn be nonsingular. Suppose that

	x^T A^T A x = x^T P^T A^T A P x

for all orthogonal P ∈ Mn and all x ∈ C^n. Then there exist an orthogonal Q ∈ Mn and a scalar α ≠ 0 such that A = αQ.

Proof. An easy polarization argument shows that if C ∈ Mn is symmetric and x^T C x = 0 for all x ∈ C^n, then C = 0. Since x^T A^T A x = x^T P^T A^T A P x for all x ∈ C^n, it follows that A^T A = P^T A^T A P for all orthogonal P ∈ Mn. Taking P to be an arbitrary permutation matrix shows that all the diagonal entries of A^T A are equal, say to γ, and all the off-diagonal entries are equal, say to δ:

	A^T A = [ γ  δ  . . .  δ ]
	        [ δ  γ  . . .  δ ]
	        [ .  .   .     . ]
	        [ δ  δ  . . .  γ ]

Let x ≡ [ 1 1 0 . . . 0 ]^T and y ≡ [ √2 0 0 . . . 0 ]^T ∈ C^n. Then x^T x = y^T y, and hence there exists an orthogonal Q1 ∈ Mn such that y = Q1 x. Now, 2γ + 2δ = x^T A^T A x = x^T Q1^T A^T A Q1 x = y^T A^T A y = 2γ. Hence, δ = 0 and A^T A = γI with γ ≠ 0 since A is nonsingular. Thus, for any α with α² = γ, Q ≡ α^{−1} A is orthogonal and A = αQ.
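A quick numerical check of the easy direction of Proposition 3.4.1: if A = αQ with Q complex orthogonal, then x^T A^T A x is invariant under x ↦ Px. The Cayley-transform construction of Q and P below is an assumption of this sketch, not part of the text:

```python
import numpy as np

def complex_orthogonal(k, seed):
    # Cayley transform of a complex skew-symmetric matrix: Q^T Q = I.
    r = np.random.default_rng(seed)
    Z = r.standard_normal((k, k)) + 1j * r.standard_normal((k, k))
    S = Z - Z.T
    I = np.eye(k)
    return (I - S) @ np.linalg.inv(I + S)

n = 4
alpha = 1.7 + 0.3j
A = alpha * complex_orthogonal(n, 10)        # A = alpha Q, so A^T A = alpha^2 I
P = complex_orthogonal(n, 11)

rng = np.random.default_rng(6)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
lhs = x @ (A.T @ A) @ x
rhs = x @ (P.T @ A.T @ A @ P) @ x
```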
Lemma 3.4.2 Let T be a given nonsingular linear operator on Mm,n. Then T preserves orthogonal equivalence if and only if there exist orthogonal matrices Q1 ∈ Mm, Q2 ∈ Mn and a scalar α ≠ 0 such that either
(1) T(A) = αQ1 A Q2 for all A ∈ Mm,n, or
(2) m = n and T(A) = αQ1 A^T Q2 for all A ∈ Mn.

Proof. Under the stated assumptions, Proposition 3.3.12 ensures that there exist nonsingular M ∈ Mm and N ∈ Mn such that either T(A) = MAN for all A ∈ Mm,n, or m = n and T(A) = MA^T N for all A ∈ Mm,n. We consider only the case T(A) = MAN; the case T(A) = MA^T N can be dealt with similarly. Let an orthogonal P ∈ Mn be given. Since T preserves orthogonal equivalence, for each A ∈ Mm,n there exist orthogonal matrices Q ∈ Mm and Z ∈ Mn (which depend on A and P) such that T(A) = Q T(AP) Z. Hence,

	MAN(MAN)^T = T(A) T(A)^T
	           = Q T(AP) Z (Q T(AP) Z)^T		(3.4.2)
	           = Q MAPN(MAPN)^T Q^T.

Choose

	A ≡ [ x^T ]  ∈ Mm,n,
	    [ 0   ]

where x ∈ C^n. Then (3.4.2) becomes

	[ x^T N N^T x  0 ]  =  Q [ x^T P N N^T P^T x  0 ] Q^T.
	[ 0            0 ]       [ 0                  0 ]

Since Q^T = Q^{−1}, taking the trace of both sides shows that x^T N N^T x = x^T P N N^T P^T x. Since this identity holds for all x ∈ C^n and all orthogonal P ∈ Mn, Proposition 3.4.1 ensures that N = α2 Q2 for some orthogonal Q2 ∈ Mn and some scalar α2 ≠ 0. A similar analysis of T(A)^T T(A) shows that M = α1 Q1 for some orthogonal Q1 ∈ Mm and some scalar α1 ≠ 0.
This completes the proof of the forward implication of Theorem 3.1.1.
The converse can be easily verified.

Chapter 4
Linear Operators Preserving Unitary t-Congruence on Matrices

4.1 Introduction and Statement of Results

Let Mn(IF) be the set of all n-by-n matrices over the field IF, where either IF = C or IF = IR. The set of all matrices U ∈ Mn(IF) that satisfy U*U = I (that is, U is unitary) is denoted by Un(IF). We say that A, B ∈ Mn(IF) are t-congruent if there exists a nonsingular X ∈ Mn(IF) such that A = XBX^t; moreover, if X ∈ Un(IF), then A and B are said to be unitarily t-congruent. Notice that if IF = IR, then Un(IR) is the set of all real orthogonal matrices and unitary t-congruence becomes orthogonal similarity.

We shall also use the following notation in our discussion.

(a) Sn(IF) ≡ {X ∈ Mn(IF) : X^t = X} is the set of all n-by-n symmetric matrices.

(b) Kn(IF) ≡ {X ∈ Mn(IF) : X^t = −X} is the set of all n-by-n skew-symmetric matrices.

(c) Eij ∈ Mn(IF) is the matrix whose (i, j) entry is 1; all other entries are zero. Notice that {Eij : 1 ≤ i, j ≤ n} forms a basis for Mn(IF).

(d) Fij ≡ Eij − Eji. One checks that {Fij : 1 ≤ i < j ≤ n} forms a basis for Kn(IF).
(e) If

	X = [ 0     x12   x13   x14 ]
	    [ −x12  0     x23   x24 ]
	    [ −x13  −x23  0     x34 ]
	    [ −x14  −x24  −x34  0   ]  ∈ K4(IF),

we denote by X+ the matrix derived from X by interchanging x13 and x24. One checks that

	det X = (x12 x34 + x14 x23 − x13 x24)² = det X+.

Notice that the singular values of any A = [aij] ∈ K4(IF) are uniquely determined by tr A*A = Σ_{i,j=1}^{n} |aij|² and |det A|², and hence these two invariants determine A ∈ K4(IF) uniquely up to unitary t-congruence. Thus, X, Y ∈ K4(IF) are unitarily t-congruent if and only if X+ is unitarily t-congruent to Y+. In addition, X and X+ are unitarily t-congruent whenever X ∈ K4(IF).

Interchanging x14 and x23 is equivalent to X ↦ U X+ U^t, where

	U = [ 0  1  0  0 ]
	    [ 1  0  0  0 ]
	    [ 0  0  0  1 ]
	    [ 0  0  1  0 ]  ∈ U4(IR).

Similarly, interchanging x12 and x34 is equivalent to X ↦ V X+ V^t, where

	V = [ 0  0  0  1 ]
	    [ 0  0  1  0 ]
	    [ 0  1  0  0 ]
	    [ 1  0  0  0 ]  ∈ U4(IR).

In [HHL], the authors studied the linear operators T on Mn(IF) that preserve t-congruence, that is, T(A) is t-congruent to T(B) whenever A and B are t-congruent. At the end of the paper they raised the question of characterizing the linear operators on Mn(IF) that preserve unitary t-congruence. The purpose of this chapter is to answer their question with the following results.

Theorem 4.1.1 Let T be a given linear operator on Mn(C). Then T preserves unitary t-congruence if and only if one of the following three conditions holds:

(1) There exist U ∈ Un(C) and scalars α, β ∈ C such that

	T(X) = αU(X + X^t)U^t + βU(X − X^t)U^t  for all X ∈ Mn(C).

(2) n = 4 and there exist U ∈ U4(C) and a scalar α ∈ C such that

	T(X) = αU(X − X^t)+ U^t  for all X ∈ M4(C).

(3) n = 2 and there exists B ∈ M2(C) such that

	T(X) = (tr XF12)B  for all X ∈ M2(C).
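To see why forms like (1) preserve unitary t-congruence (the easy direction of Theorem 4.1.1), one can verify that if A = V B V^t with V unitary, then T(A) = W T(B) W^t with the unitary witness W = U V U*. A numpy sketch under that assumption (QR of a complex Gaussian matrix is used here to produce unitaries, an arbitrary choice for this illustration):

```python
import numpy as np

def random_unitary(k, seed):
    # QR factorization of a complex Gaussian matrix yields a unitary Q.
    r = np.random.default_rng(seed)
    Z = r.standard_normal((k, k)) + 1j * r.standard_normal((k, k))
    Q, _ = np.linalg.qr(Z)
    return Q

n = 3
U = random_unitary(n, 0)
alpha, beta = 2.0, 1.0j

def T(X):
    # form (1): scaled symmetric and skew-symmetric parts, conjugated by U
    return alpha * U @ (X + X.T) @ U.T + beta * U @ (X - X.T) @ U.T

rng = np.random.default_rng(7)
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
V = random_unitary(n, 1)
A = V @ B @ V.T                              # A unitarily t-congruent to B

W = U @ V @ U.conj().T                       # unitary witness: T(A) = W T(B) W^t
lhs, rhs = T(A), W @ T(B) @ W.T
```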

Theorem 4.1.2 Let an integer n ≥ 3 and a linear operator T on Mn(IR) be given. Then T preserves orthogonal similarity if and only if one of the following three conditions holds:

(1) There exists A0 ∈ Mn(IR) such that

	T(X) = (tr X)A0  for all X ∈ Mn(IR).

(2) There exist U ∈ Un(IR) and scalars α, β, γ ∈ IR such that

	T(X) = γ(tr X)I + αU(X + X^t)U^t + βU(X − X^t)U^t  for all X ∈ Mn(IR).

(3) n = 4 and there exist U ∈ U4(IR) and scalars α, γ ∈ IR such that

	T(X) = γ(tr X)I + αU(X − X^t)+ U^t  for all X ∈ M4(IR).

Theorem 4.1.3 A linear operator T on M2(IR) preserves orthogonal similarity if and only if one of the following three conditions holds:

(1) There exist A0, B0 ∈ M2(IR) with tr B0 = tr A0 B0 = tr A0^t B0 = 0 such that

	T(X) = (tr X)A0 + (tr XF12)B0  for all X ∈ M2(IR).

(2) There exist U ∈ U2(IR) and scalars α, β, γ ∈ IR such that

	T(X) = γ(tr X)I + αU(X + X^t)U^t + βU(X − X^t)U^t  for all X ∈ M2(IR).

(3) There exist U ∈ U2(IR), C0 ∈ K2(IR), and scalars α, γ ∈ IR such that

	T(X) = γ(tr X)I + (tr X)C0 + αU(X + X^t)U^t  for all X ∈ M2(IR).

As we shall see, even though the result in the complex case (Theorem 4.1.1) is very similar to Theorem (4.1) of [HHL], it requires a refinement of their proof and several additional ideas to obtain our result. The real case is even more interesting. Our Theorems 4.1.2 and 4.1.3 are quite different from the result in [HHL]. The complication is due to the fact that Mn(IR) can be decomposed as a direct sum of the subspaces

	{αI : α ∈ IR},
	Sn⁰(IR) ≡ {A ∈ Mn(IR) : A = A^t and tr A = 0}, and
	Kn(IR) ≡ {A ∈ Mn(IR) : A + A^t = 0},

and each of these is an irreducible subspace (in the group representation sense) of the group G of linear operators on Mn(IR) defined by A ↦ UAU^t for a given orthogonal matrix U. As a result, it is conceivable that a linear operator T on Mn(IR) can act quite independently on these three subspaces. However, it turns out that the behavior of T on the three subspaces cannot be too arbitrary, especially if T is nonsingular, as shown by Theorem 4.1.2. Nevertheless, this makes the proof of the theorem more difficult and makes the subject more interesting. One of the key steps in our proof (Lemma 4.4.3) makes use of a result by Friedland and Loewy [FL] concerning the minimum dimension of a subspace of Sn(IR) that is certain to contain a matrix whose largest eigenvalue has at least a given multiplicity r (see (4.4.40)). This result has been used in other linear preserver problems (see [L]).

One of the basic ideas in our proof is to study the orbits

	O(A) ≡ {UAU^t : U ∈ Un(IF)},  A ∈ Mn(IF),

and make use of their geometric properties to help solve our problem. These techniques have been used in [HLT], [HHL], [LRT], and in Chapter 3. We collect some useful results from [HHL] and put them in Section 4.2 for easy reference. In Section 4.3 we present the proof of Theorem 4.1.1. Section 4.4 is devoted to proving Theorems 4.1.2 and 4.1.3. For problems similar to ours and their variations, we refer the reader to [HLT] and [LRT]. For a gentle introduction to linear preserver problems, we refer the reader to [LT2].

4.2 Preliminaries

The following three results can be found in [HHL]. By private correspondence, the author has learned that Professor M. H. Lim has obtained different proofs for these results.
Lemma 4.2.1 A nonzero linear operator T on Kn (C) preserves unitary t-congruence if and only if there exist U ∈ Un (C) and a scalar μ > 0 such that
either
(a) T (X) = μ U XU^t for all X ∈ Kn (C); or
(b) n = 4 and T (X) = μ U X⁺ U^t for all X ∈ K4 (C).
Lemma 4.2.2 A nonzero linear operator T on Sn (C) preserves unitary t-congruence if and only if there exist U ∈ Un (C) and a scalar μ > 0 such that
T (X) = μ U XU^t for all X ∈ Sn (C).
Lemma 4.2.3 A nonzero linear operator T on Kn (IR) preserves orthogonal
similarity if and only if there exist U ∈ Un (IR) and a scalar μ ≠ 0 such that
either
(a) T (X) = μ U XU^t for all X ∈ Kn (IR); or
(b) n = 4 and T (X) = μ U X⁺ U^t for all X ∈ K4 (IR).
The following is Theorem (2.2) in [Hi].
Lemma 4.2.4 A nonzero linear operator T on Sn (IR) preserves orthogonal
similarity if and only if one of the following conditions holds:
(a) There exists a nonzero A0 ∈ Sn (IR) such that
T (X) = (tr X)A0 for all X ∈ Sn (IR).
(b) There exist U ∈ Un (IR) and scalars α, β ∈ IR with (α, β) ≠ (0, 0) such
that
T (X) = α U XU^t + β (tr X)I for all X ∈ Sn (IR).
The following is Lemma (2.2) in [HHL].
Lemma 4.2.5 Suppose IF = C or IF = IR. Then span O(A) = Kn (IF) for
every nonzero A ∈ Kn (IF). Consequently, if T is a linear operator on Mn (IF)
that preserves unitary t-congruence, then ker T ∩ Kn (IF) ≠ {0} if and only
if Kn (IF) ⊆ ker T.

The following is Lemma (3.2) in [HHL].
Lemma 4.2.6 span O(A) = Sn (C) for every nonzero A ∈ Sn (C). Consequently, if T is a linear operator on Mn (C) that preserves unitary t-congruence, then ker T ∩ Sn (C) ≠ {0} if and only if Sn (C) ⊆ ker T.
Recall that if A ∈ Sn0 (IR), then A is symmetric and tr A = 0. Since
unitary t-congruence is real orthogonal similarity when IF = IR, any B ∈
O(A) is also in Sn0 (IR), that is, tr B = 0 as well. Hence span O(A) ⊆ Sn0 (IR),
but span O(A) ≠ Sn (IR). This is one of the major differences between the
cases IF = IR and IF = C (see Lemma 4.2.6).
Lemma 4.2.7 span O(A) = Sn0 (IR) for every nonzero A ∈ Sn0 (IR). Consequently, if T is a linear operator on Mn (IR) that preserves orthogonal similarity, then ker T ∩ Sn0 (IR) ≠ {0} if and only if Sn0 (IR) ⊆ ker T.
Proof Suppose A ∈ Sn0 (IR). We have already shown that span O(A) ⊆
Sn0 (IR). We now prove that Sn0 (IR) ⊆ span O(A). There exist Q ∈
Un (IR) and λ1 , λ2 , . . . , λn−1 , λn ∈ IR such that

Q AQ^t = diag (λ1 , λ2 , . . . , λn−1 , λn ).

Since A ≠ 0 and tr A = 0, we may take λ1 ≠ 0 and λ1 ≠ λn . Notice
that

(λ1 − λn )(E11 − Enn ) = Q AQ^t − diag (λn , λ2 , . . . , λn−1 , λ1 ),

so E11 − Enn ∈ span O(A). Since Eij + Eji ∈ O(E11 − Enn ), we have
Eij + Eji ∈ span O(A) for all 1 ≤ i < j ≤ n. The conclusion follows.
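The last step relies on Eij + Eji being orthogonally similar to E11 − Enn; an explicit orthogonal Q realizing the similarity sends e1 and en to (ei + ej)/√2 and (ei − ej)/√2. A minimal numerical sketch (the choice n = 4 and indices i = 2, j = 3 are illustrative, not from the text):

```python
import math

n = 4
s = 1.0 / math.sqrt(2.0)
e = [[1.0 if r == k else 0.0 for r in range(n)] for k in range(n)]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

# A = E11 - Enn (1-based indices), as produced inside the proof of Lemma 4.2.7
A = [[0.0] * n for _ in range(n)]
A[0][0], A[n - 1][n - 1] = 1.0, -1.0

# Columns of Q: (e2 + e3)/sqrt(2), e1, e4, (e2 - e3)/sqrt(2) -- an orthonormal basis
u = [s * (e[1][r] + e[2][r]) for r in range(n)]
v = [s * (e[1][r] - e[2][r]) for r in range(n)]
cols = [u, e[0], e[3], v]
Q = transpose(cols)  # column c of Q is cols[c]

QAQt = matmul(matmul(Q, A), transpose(Q))

# Expected result: E23 + E32 (1-based), i.e., Eij + Eji with i = 2, j = 3
target = [[0.0] * n for _ in range(n)]
target[1][2] = target[2][1] = 1.0

err = max(abs(QAQt[r][c] - target[r][c]) for r in range(n) for c in range(n))
assert err < 1e-12
```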
Notice that span O(A) = span{I} for every nonzero A ∈ span{I}. This observation,
together with Lemma 4.2.7, proves the following; for a different approach,
see the proof of Theorem (3.2) of [LT1].
Lemma 4.2.8 Let A ∈ Sn (IR) be given. Suppose that A ∉ span{I} ∪ Sn0 (IR). Then
span O(A) = Sn (IR). Consequently, if T is a linear operator on Mn (IR) that
preserves orthogonal similarity, then ker T ∩ Sn (IR) ≠ {0} if and only if
Sn (IR) ⊆ ker T.

Proof Write A = αI + AS , where α ≡ (1/n) tr A and AS ∈ Sn0 (IR). Notice
that α ≠ 0 since A ∉ Sn0 (IR). Moreover, AS ≠ 0 since A ∉ span{I}. There
exists a U ∈ Un (IR) such that B ≡ U AU^t − A = U AS U^t − AS ≠ 0.
Notice that B ∈ Sn0 (IR). Hence, Lemma 4.2.7 guarantees that
Sn0 (IR) ⊆ span O(B) ⊆ span O(A). Thus, αI = A − AS ∈ span O(A)
and span{I} ⊆ span O(A). The conclusion now follows.
Lemma 4.2.9 Let IF = C or IF = IR, and let A ∈ Mn (IF) be given. Then
(a) A + A^t ∈ span O(A), and
(b) A − A^t ∈ span O(A).
Proof We follow and refine slightly the proof of Lemma (4.2) of [HHL]. Let
{Pi : 1 ≤ i ≤ 2^n} denote the group of all real diagonal orthogonal
matrices in Mn (IF) and define

P(X) ≡ Σ_{i=1}^{2^n} Pi X Pi^t for X ∈ Mn (IF).

Write X ≡ [xij ] and note that

P(X) = 2^n diag (x11 , . . . , xnn ) = P(X^t) ∈ span O(X)

for all X ∈ Mn (IF). Now, since A + A^t ∈ Sn (IF), there exists a
U ∈ Un (IF) such that D ≡ U (A + A^t)U^t is diagonal. Hence, 2^n D =
P(D) = P(U (A + A^t)U^t) = 2 P(U AU^t) ∈ span O(A). Thus, A + A^t ∈
span O(A). Finally, A − A^t = 2A − (A + A^t) ∈ span O(A).

4.3 The Complex Case

We begin the proof of the forward implication of Theorem 4.1.1 by considering a singular linear operator T on Mn (C) that preserves unitary t-congruence; the reverse implication can be proved by direct verification.
Lemma 4.3.1 Let A ∈ Mn be given and suppose that A ∉ Sn (C) ∪ Kn (C).
Then span O(A) = Mn (C). Consequently, if T is a linear operator on Mn (C)
that preserves unitary t-congruence, then A ∈ ker T if and only if T = 0.

Proof Lemma 4.2.9 guarantees that A + A^t, A − A^t ∈ span O(A). Since
A + A^t ≠ 0 and A − A^t ≠ 0, Lemmata 4.2.5 and 4.2.6 guarantee that
Sn (C) ∪ Kn (C) ⊆ span O(A). We conclude that span O(A) = Mn (C).
If T is a nonzero singular linear operator on Mn (C) that preserves unitary
t-congruence, then Lemma 4.3.1 implies that ker T ⊆ Sn (C) ∪ Kn (C), that
is, if T (A) = 0, then either A ∈ Sn (C) or A ∈ Kn (C). We look at these cases
separately.
If 0 ≠ A ∈ Kn (C), we can use (verbatim) the proof of Lemma (4.3) of
[HHL] to get the following analogous result.
Lemma 4.3.2 Let T be a nonzero linear operator on Mn (C) that preserves
unitary t-congruence. If ker T ∩ Kn (C) ≠ {0}, then ker T = Kn (C) and T
satisfies condition (1) of Theorem 4.1.1 with the corresponding scalar equal to 0.
Suppose now that 0 ≠ A ∈ Sn (C) ∩ ker T. Lemma 4.2.6 guarantees that
T (Sn (C)) = 0. Let n ≥ 3. We follow the proof of Lemma (4.6) of [HHL] to
show that T (Kn (C)) ⊆ Kn (C).
Define L : Kn (C) → Sn (C) by

L(X) ≡ (T (X) + T (X)^t)/2 for all X ∈ Kn (C).

One checks that L is a linear operator that preserves unitary t-congruence.
Moreover, if for some nonzero B ∈ Kn (C), L(B) = 0, then Lemma 4.2.6
guarantees that L(Kn (C)) = 0. Notice that T (Kn (C)) ⊆ Kn (C) if and
only if L = 0. Suppose then that L(A) ≠ 0 for some (necessarily) nonzero
A ∈ Kn (C), that is, suppose L is nonsingular. There exist U ∈ Un (C) and a
scalar a1 > 0 such that

U AU^t = a1 F12 + C    (4.3.1)

where C ≡ 02 ⊕ C1 ∈ Kn (C) for some C1 ∈ Kn−2 (C). Notice that for any
θ ∈ C with |θ| = 1, Z AZ^t = a1 F12 + θ² C is unitarily t-congruent to A, where
Z ≡ (I2 ⊕ θ In−2 )U ∈ Un (C). Hence,

rank L(A) = rank L(a1 F12 + θ² C) = rank (a1 L(F12 ) + θ² L(C))

for all such θ. If we put S12 ≡ L(F12 ), we have

rank L(X) ≥ rank L(F12 ) ≡ rank S12 for all 0 ≠ X ∈ Kn (C).

Moreover, if rank A = 2, then C = 0 in (4.3.1) and hence rank L(A) =
rank S12 ≤ rank L(X) for any 0 ≠ X ∈ Kn (C).
As in the case of nonsingular t-congruence, there exists an X0 ∈ Kn (C)
such that rank L(X0 ) ≥ 3 (see the sixth paragraph of the proof of Lemma
(4.4) of [HHL]). Thus, for any A ∈ Kn (C) having rank 2, we have
rank L(A) = rank S12 ≥ 3.
The analysis (for the complex case) of the proof of Lemma (4.4) of [HHL]
can be modified to show that this is not possible.
Lemma 4.3.3 Let an integer n ≥ 3 and a linear operator L : Kn (C) →
Sn (C) be given. Then L preserves unitary t-congruence if and only if L = 0.
If n ≥ 3 and ker T ∩ Sn (C) ≠ {0}, then Lemma 4.3.3 implies that

T (Kn (C)) ⊆ Kn (C).

Hence, for this case, we may regard T as a nonzero linear operator on Kn (C).
Lemma 4.2.1 guarantees that Theorem 4.1.1 holds for this case.
Suppose now that n = 2 and let 0 ≠ A ∈ ker T ∩ Sn (C). Again,
Lemma 4.2.6 guarantees that T (Sn (C)) = 0. Let

A0 ≡ −T (F12 )/2.

Since X − X^t = −(tr XF12 )F12 for all X ∈ M2 (C), we have

T (X) = (1/2)[T (X + X^t) + T (X − X^t)]
      = (1/2) T (X − X^t)
      = −(1/2)(tr XF12 ) T (F12 )
      = (tr XF12 )A0 .

Hence T satisfies condition (3) of Theorem 4.1.1. If A0 ≡ β F12 for some
β ∈ C, then we may take the scalar in condition (1) of Theorem 4.1.1 to be β
and U = I2 .

Lemma 4.3.4 Let T be a given linear operator on Mn (C) that preserves
unitary t-congruence. Suppose that ker T ∩ Sn (C) ≠ {0}. Then T (Sn (C)) =
{0} and one of the following conditions holds:
(a) There exist U ∈ Un (C) and a scalar λ ∈ C such that
T (X) = λ U (X − X^t)U^t for all X ∈ Mn (C).
(b) n = 4 and there exist U ∈ Un (C) and a scalar λ ∈ C such that
T (X) = λ U (X − X^t)⁺ U^t for all X ∈ M4 (C).
(c) n = 2 and there exists B ∈ M2 (C) such that
T (X) = (tr XF12 )B for all X ∈ M2 (C).
Lemmata 4.3.1–4.3.4 show that if T is a singular linear operator on Mn (C)
that preserves unitary t-congruence, then Theorem 4.1.1 holds. The following
result characterizes the nonsingular T. The proof is the same as that of
Lemma (4.7) of [HHL], and will be omitted.
Lemma 4.3.5 Let T be a nonsingular linear operator on Mn (C) that preserves unitary t-congruence. Then T (Sn (C)) = Sn (C) and T (Kn (C)) =
Kn (C).
Let us now consider the nonsingular T. Lemmata 4.2.1, 4.2.2, and 4.3.5
imply that there exist nonzero scalars α, β ∈ C and V1 , V2 ∈ Un (C) such that

T (X) = α V1 X V1^t for all X ∈ Sn (C)

and either
(A1) T (Y ) = β V2 Y V2^t for all Y ∈ Kn (C); or
(A2) n = 4 and T (Y ) = β V2 Y⁺ V2^t for all Y ∈ K4 (C).
By considering various choices for X and Y, we will show that we may take
V2 = γ V1 in (A1), for some nonzero γ ∈ C. Moreover, we will show that
(A2) cannot happen.

It is without loss of generality to assume T (X) = X for all X ∈ Sn (C)
and T (Y ) = μ V Y V^t for all Y ∈ Kn (C), where μ ≡ β/α, since we may
consider instead the linear transformation T1 ≡ α^{−1} V1^{−1} T (·) V1^{−t}. Thus, we
need to show that we may take V = γ I for some nonzero γ ∈ C.
Suppose that n = 2. Direct calculation shows that for a given V ∈ U2 (C),

V Y V^t = (det V ) Y for all Y ∈ K2 (C).

Hence, T (Y ) = μ V Y V^t = μ (det V ) Y for all Y ∈ K2 (C), as desired.
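The 2-by-2 identity V Y V^t = (det V)Y on K2 is in fact a polynomial identity that holds for every V ∈ M2 (unitarity is not needed): K2 is spanned by F12, and V F12 V^t = (det V)F12. A numerical spot-check, assuming the convention F12 = E12 − E21:

```python
import random

# J plays the role of F12 = E12 - E21 (an assumption about the text's convention).
def check(v11, v12, v21, v22):
    det = v11 * v22 - v12 * v21
    V = [[v11, v12], [v21, v22]]
    J = [[0.0, 1.0], [-1.0, 0.0]]
    # V J V^t computed entrywise
    VJ = [[sum(V[a][k] * J[k][b] for k in range(2)) for b in range(2)] for a in range(2)]
    VJVt = [[sum(VJ[a][k] * V[b][k] for k in range(2)) for b in range(2)] for a in range(2)]
    for a in range(2):
        for b in range(2):
            assert abs(VJVt[a][b] - det * J[a][b]) < 1e-9

random.seed(0)
for _ in range(100):
    check(*(random.uniform(-2, 2) for _ in range(4)))
```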
We now look at the case n ≥ 3. Let

X ≡ diag (1, 2, . . . , n) ∈ Sn (C),

and if n = 2k we let

Y ≡ B ⊕ 2B ⊕ · · · ⊕ kB ∈ Kn (C);

otherwise, if n = 2k + 1, then we take

Y ≡ B ⊕ 2B ⊕ · · · ⊕ kB ⊕ 0 ∈ Kn (C),

where B ≡ F12 ∈ K2 (C).
Notice that T (V XV^t + μ^{−1} Y ) = V (X + Y )V^t is unitarily t-congruent to
X + Y. Since T preserves unitary t-congruence and since unitary t-congruence
is an equivalence relation,

U V XV^t U^t + V U Y U^t V^t = T [U (V XV^t + μ^{−1} Y )U^t] and X + Y

are unitarily t-congruent for any U ∈ Un (C). Hence, there exists a W ∈
Un (C) such that

X + Y = W (U V XV^t U^t + V U Y U^t V^t)W^t.    (4.3.2)

Since X ∈ Sn (C) and Y ∈ Kn (C), we have

X = W U V XV^t U^t W^t = (W U V )X(W U V )^t    (4.3.3)

and

Y = W V U Y U^t V^t W^t = (W V U )Y (W V U )^t.    (4.3.4)

Since W, U, V ∈ Un (C), we have W U V, W V U ∈ Un (C) as well. Hence we
may rewrite the identity (4.3.3) as

X (W U V )‾ = (W U V )X    (4.3.5)

and we may rewrite the identity (4.3.4) as

Y (W V U )‾ = (W V U )Y.    (4.3.6)

(Here Z‾ denotes the entrywise conjugate of Z.) If we write W U V ≡ [zij ],
then the identity (4.3.5) implies that i zij‾ = j zij . Hence, if i ≠ j, then
zij = 0. Thus W U V = diag (z11 , z22 , . . . , znn ).
Similarly, if we partition W V U ≡ [Wij ] conformal to Y, then the identity
(4.3.6) implies that i B Wij‾ = j Wij B for each block index pair i, j. Notice
that B is unitary. Hence, if i ≠ j, then Wij = 0. Moreover, if n = 2k + 1,
then Wi,k+1 = 0 and Wk+1,i = 0 for each i = 1, 2, . . . , k. Thus,

W V U = W11 ⊕ W22 ⊕ · · · ⊕ Wkk if n = 2k,

or

W V U = W11 ⊕ W22 ⊕ · · · ⊕ Wkk ⊕ Wk+1,k+1 if n = 2k + 1,

where Wii ∈ U2 (C) for each i = 1, 2, . . . , k and Wk+1,k+1 ∈ U1 (C).
Since U was arbitrary, we may choose U so that U* V* U is diagonal.
Now, W U V = W V U (U* V* U )V. Hence V = (U* V* U )* (W V U )* (W U V ) is
block-diagonal and has the same form as W V U, that is, either

V = V1 ⊕ V2 ⊕ · · · ⊕ Vk if n = 2k,    (4.3.7)

or

V = V1 ⊕ V2 ⊕ · · · ⊕ Vk ⊕ Vk+1 if n = 2k + 1,    (4.3.8)

where Vi ∈ U2 (C) for each i = 1, 2, . . . , k and Vk+1 ∈ U1 (C).
We apply the same analysis to the same choices for X ∈ Sn (C) and
U ∈ Un (C), but this time we take

Y1 ≡ F23 = 01 ⊕ B ⊕ 0n−3 ∈ Kn (C).

As before, there exists a W1 ∈ Un (C) such that

X (W1 U V )‾ = (W1 U V )X    (4.3.9)

and

Y1 (W1 V U )‾ = (W1 V U )Y1 .    (4.3.10)

The identity (4.3.9) implies that D1 ≡ W1 U V is diagonal and the identity
(4.3.10) implies that W1 V U has the form

W1 V U =
[ A11  0    A13 ]
[ 0    A22  0   ]
[ A31  0    A33 ],

where A11 ∈ C, A22 ∈ M2 (C), and A33 ∈ Mn−3 (C). Since W1 V U =
(W1 U V )(V*)(U* V U ), it follows that A13 = 0, A31 = 0, and the V1 and the
V2 in (4.3.7) or (4.3.8) are diagonal. Similarly, we can show that every Vi in
(4.3.7) or (4.3.8) is diagonal. Hence, V is diagonal, say V = diag (λ1 , λ2 , . . . , λn )
with |λi | = 1 for each i = 1, 2, . . . , n. Thus, T (Fij ) = μ λi λj Fij for each i, j
with 1 ≤ i < j ≤ n.
We summarize this result in the following.
Lemma 4.3.6 Let T be a nonsingular linear operator on Mn (C) that preserves unitary t-congruence. Suppose that T (X) = X for all X ∈ Sn (C), and,
for some scalar μ ∈ C and V ∈ Un (C), T (Y ) = μ V Y V^t for all Y ∈ Kn (C).
Then there exist scalars λ1 , . . . , λn ∈ C with |λi | = 1 for each i = 1, . . . , n
such that V = diag (λ1 , . . . , λn ). Consequently, T (Fij ) = μ λi λj Fij for each
i, j with 1 ≤ i < j ≤ n.
The proof of Lemma (4.9) of [HHL] can be used (again, verbatim) to prove
the following result.
Lemma 4.3.7 Let T be a nonsingular linear operator on Mn (C) that preserves unitary t-congruence. Suppose T (X) = X for all X ∈ Sn (C), and,
for each i, j with 1 ≤ i < j ≤ n, there exists a nonzero αij ∈ C such that
T (Fij ) = αij Fij . Then

T (X) = (1/2)[(X + X^t) + α12 (X − X^t)] for all X ∈ Mn (C).

If T (X) = X for all X ∈ Sn (C) and T (Y ) = μ V Y V^t for all Y ∈ Kn (C),
then Lemmata 4.3.6 and 4.3.7 imply that we may take V = γ I for some
γ ∈ C, as desired.

We now look at the remaining case: n = 4, T (X) = α U XU^t for all X ∈
S4 (C), and T (Y ) = β V Y⁺ V^t for all Y ∈ K4 (C), where α, β ∈ C are nonzero
and U, V ∈ U4 (C). As before, it is without loss of generality to assume
T (X) = X for all X ∈ S4 (C) and T (Y ) = μ V Y⁺ V^t for all Y ∈ K4 (C).
By considering various choices for X and Y, we will show that this cannot
happen. In particular, we will prove the following.
Lemma 4.3.8 Let a scalar μ ∈ C and V ∈ U4 (C) be given. Then

T (X) ≡ (1/2)(X + X^t) + μ V (X − X^t)⁺ V^t for all X ∈ M4 (C)

preserves unitary t-congruence if and only if μ = 0.
If μ = 0, then direct verification shows that T preserves unitary t-congruence. Suppose μ ≠ 0 and T preserves unitary t-congruence. Let

X ≡ diag (1, 2, 3, 4), Y ≡ B ⊕ 2B, and U ≡ U1 ⊕ U2 ,

where B ≡ F12 ∈ K2 (C), and U1 , U2 ∈ U2 (C). Notice that (U Y U^t)⁺ =
U Y⁺ U^t. As before (see (4.3.2)), there exists a W ∈ U4 (C) such that

X + Y = W (U V XV^t U^t + V U Y U^t V^t)W^t.
Consequently (see (4.3.5)),

X (W U V )‾ = (W U V )X    (4.3.11)

and (see (4.3.6))

Y (W V U )‾ = (W V U )Y.    (4.3.12)

The identity (4.3.11) implies that W U V is diagonal and the identity (4.3.12)
implies that W V U = Z1 ⊕ Z2 for some Z1 , Z2 ∈ U2 (C). Now,

U V = V (W U V )* (W V U )U.    (4.3.13)

Since W U V is diagonal, and W V U and U are block-diagonal, we may write
(W U V )* (W V U )U ≡ W1 ⊕ W2 for some W1 , W2 ∈ U2 (C). Partition V ≡ [Vij ]
conformal to U. The identity (4.3.13) can be rewritten as

(a) V11 W1 = U1 V11
(b) V12 W2 = U1 V12
(c) V21 W1 = U2 V21
(d) V22 W2 = U2 V22 .    (4.3.14)

Let V11 = Q1 Σ Q2 be a (given) singular value decomposition of V11 , where
Σ ≡ diag (b1 , b2 ) with b1 ≥ b2 ≥ 0, and Q1 , Q2 ∈ U2 (C). Then the identity
(4.3.14a) implies that

Σ (Q2 W1 Q2*) = (Q1* U1 Q1 ) Σ.    (4.3.15)

If we write D1 ≡ Q2 W1 Q2* = [wij ] and D2 ≡ Q1* U1 Q1 = [uij ], the identity
(4.3.15) gives

(a) b1 w11 = b1 u11
(b) b1 w12 = b2 u12
(c) b2 w21 = b1 u21
(d) b2 w22 = b2 u22 .    (4.3.16)

Suppose b1 > b2 . Since b2 ≥ 0, the identity (4.3.16a) shows that w11 = u11 .
The identity (4.3.16b) and the fact that both D1 and D2 are unitary imply
that w12 = u12 = 0. Moreover, w21 = u21 = 0 as well. Hence D2 = Q1* U1 Q1 is
diagonal. Since U1 is arbitrary, this is a contradiction. Thus b1 = b2 . Hence
V11 = b1 Q1 Q2 ≡ b1 Q, where we put Q ≡ Q1 Q2 ∈ U2 (C). If b1 = 0, then
V11 = 0 and since V is unitary, V has the form

V =
[ 0    V12 ]
[ V21  0   ].    (4.3.17)

If b1 > 0, we let U1 ≡ I2 . The identity (4.3.14a) becomes b1 Q W1 = b1 Q.
Hence, W1 = I2 . Now, the identity (4.3.14c) becomes V21 = U2 V21 . Since U2
is arbitrary, we conclude that V21 = 0. Since V is unitary, V12 = 0 as well.
Hence V has the form

V =
[ V11  0   ]
[ 0    V22 ].    (4.3.18)
Now make the same choice of X, and let

Y1 ≡ F23 + 2F14 =
[ 0   0   0   2 ]
[ 0   0   1   0 ]
[ 0  −1   0   0 ]
[−2   0   0   0 ]

and U1 ≡ diag (θ1 , θ2 , θ3 , θ4 ) ∈ U4 (C). As before, (U1 Y1 U1^t)⁺ = U1 Y1⁺ U1^t,
and moreover, there exists a W1 ∈ U4 (C) such that

X (W1 U1 V )‾ = (W1 U1 V )X    (4.3.19)

and

Y1 (W1 V U1 )‾ = (W1 V U1 )Y1 .    (4.3.20)

Again, W1 U1 V is diagonal, and the identity (4.3.20) shows that W1 V U1 has
the form

[ z11  0    0    z14 ]
[ 0    z22  z23  0   ]
[ 0    z32  z33  0   ]
[ z41  0    0    z44 ].    (4.3.21)

Now,

U1 V = V (W1 U1 V )* (W1 V U1 )U1 .    (4.3.22)

Since W1 U1 V and U1 are diagonal, D ≡ (W1 U1 V )* (W1 V U1 )U1 has the form
(4.3.21). Since V has the form (4.3.17) or (4.3.18), the identity (4.3.22) shows
that D is diagonal, that is, z14 = z23 = z32 = z41 = 0. Now choose U1 having
distinct (diagonal) elements. Then the identity (4.3.22) implies that each of
the Vij in (4.3.17) and (4.3.18) has the form either

[ γ1  0  ]
[ 0   γ2 ]    (4.3.23)

or

[ 0   γ4 ]
[ γ3  0  ].    (4.3.24)
We will show that neither of these choices is possible.


To reiterate, we are assuming

T (X) ≡ (1/2)(X + X^t) + μ V (X − X^t)⁺ V^t for all X ∈ M4 (C), μ ≠ 0,

where V ∈ U4 (C) has either the form (4.3.17) or (4.3.18), and each Vij has
the form either (4.3.23) or (4.3.24). Notice that if T (A1 ) = W T (A2 )W^t, then

(1/2)(A1 + A1^t) = (1/2) W (A2 + A2^t)W^t    (4.3.25)

and

(1/2) μ V (A1 − A1^t)⁺ V^t = (1/2) μ W V (A2 − A2^t)⁺ V^t W^t.    (4.3.26)


Case 1. V ≡
[ 0    V12 ]
[ V21  0   ].

Let

A1 ≡
[ 1  0  1  0 ]
[ 0  0  0  0 ]
[−1  0  0  0 ]
[ 0  0  0  2 ]

and A2 ≡
[ 1  1  0  0 ]
[−1  0  0  0 ]
[ 0  0  0  0 ]
[ 0  0  0  2 ].

With

U ≡ 1 ⊕
[ 0  1 ]
[ 1  0 ]
⊕ 1 ∈ U4 (C),    (4.3.27)

one checks that A1 = U A2 U^t. Hence, there exists a W ∈ U4 (C) such that
T (A1 ) = W T (A2 )W^t. Now the identity (4.3.25) shows that W has the form

[ w11  0    0    0   ]
[ 0    w22  w23  0   ]
[ 0    w32  w33  0   ]
[ 0    0    0    w44 ].

The identity (4.3.26) shows that

[ 0            V21 B V12^t ]      [ 0  0            ]
[ V12 B V21^t  0           ] = W  [ 0  V21 C V21^t  ] W^t,    (4.3.28)

where

B ≡
[ 0  0 ]
[ 0  1 ]
and C ≡
[ 0  1 ]
[−1  0 ].

Since V21 C V21^t = (det V21 )C, where det V21 ≠ 0, direct computation shows
that

W
[ 0  0           ]
[ 0  V21 C V21^t ]
W^t = (det V21 ) ·
[ 0  0         0         0       ]
[ 0  0         0         w23 w44 ]
[ 0  0         0         w33 w44 ]
[ 0  −w23 w44  −w33 w44  0       ].    (4.3.29)

Now, use (4.3.28) and (4.3.29) to get

(det V21 ) w33 w44 = 0    (4.3.30)

and

V12 B V21^t = V12
[ 0  0 ]
[ 0  1 ]
V21^t = (det V21 ) ·
[ 0  0       ]
[ 0  w23 w44 ].    (4.3.31)

Since det V21 ≠ 0 and |w44 | = 1, (4.3.30) shows that w33 = 0. Since W is
unitary, w22 = 0 as well. The identity (4.3.31) implies that V12 and V21 are
diagonal, say

V12 ≡
[ γ1  0  ]
[ 0   γ2 ]
and V21 ≡
[ γ3  0  ]
[ 0   γ4 ].

Now, consider

D1 ≡
[ 1  0  0  2 ]
[ 2  0  0  3 ]
[ 2  0  3  0 ]
[ 0  3  0  4 ]

and D2 ≡
[ 1  2  0  0 ]
[ 2  0  3  0 ]
[ 0  2  0  3 ]
[ 0  0  3  4 ].

With the U in (4.3.27), we have D1 = U D2 U^t. Hence, there exists a W1 ∈
U4 (C) such that T (D1 ) = W1 T (D2 )W1^t. Moreover,

(1/2)(D1 + D1^t) = (1/2) W1 (D2 + D2^t)W1^t    (4.3.32)

and

(1/2) V (D1 − D1^t)⁺ V^t = (1/2) W1 V (D2 − D2^t)⁺ V^t W1^t.    (4.3.33)

The identity (4.3.32) shows that W1 has the form

W1 ≡
[ u11  0    0    0   ]
[ 0    0    u23  0   ]
[ 0    u32  0    0   ]
[ 0    0    0    u44 ].

Consequently, (4.3.33) becomes

[ 0         0         0        3 γ1 γ3 ]
[ 0         0         2 γ2 γ4  0       ]
[ 0        −2 γ2 γ4   0        0       ]
[ −3 γ1 γ3  0         0        0       ]
=
[ 0   0  2  0 ]
[ 0   0  0  3 ]
[ −2  0  0  0 ]
[ 0  −3  0  0 ],    (4.3.34)

up to the unimodular factors u11 u32 and u23 u44 . Since W1 and V are
unitary, |u11 u32 | = |u23 u44 | = |γi γj | = 1 for all i, j, and hence (4.3.34) is
impossible (the two sides are supported on different entries).
Case 2. V ≡ V11 ⊕ V22 .

For this case, we take

A1 ≡
[ 1  0  1  0 ]
[ 0  2  0  0 ]
[−1  0  0  0 ]
[ 0  0  0  0 ]

and A2 ≡
[ 1  0  0  1 ]
[ 0  2  0  0 ]
[ 0  0  0  0 ]
[−1  0  0  0 ].

One checks that A1 and A2 are unitarily t-congruent. Hence, there exists
a W ∈ U4 (C) such that T (A1 ) = W T (A2 )W^t. As in Case 1, the identities
(4.3.25) and (4.3.26) hold. The identity (4.3.25) now shows that W has the
form

W ≡ W1 ⊕ W2 , with W1 =
[ w11  0   ]
[ 0    w22 ]
and W2 =
[ w33  w43 ]
[ w34  w44 ].

Consequently, (4.3.26) reduces to

V1 B V2^t = W1 V1 C V2^t W2^t,    (4.3.35)

where

B ≡
[ 0  0 ]
[ 0  1 ]
and C ≡
[ 0  1 ]
[ 0  0 ].

The identity (4.3.35) can now be rewritten as

B (V2* W2 V2 )^t = (V1* W1 V1 )C.    (4.3.36)

Since V1 has the form either (4.3.23) or (4.3.24) and W1 is diagonal, V1* W1 V1
is diagonal. Thus, if we write (V2* W2 V2 )^t ≡ [uij ], (4.3.36) shows that u21 =
u22 = 0. Hence, V2* W2 V2 is singular, a contradiction to the fact that both V2
and W2 are unitary.

4.4 The Real Case

Lemma 4.4.1 Let T be a given linear operator on Mn (IR). Suppose T preserves orthogonal similarity. Then ker T is a direct sum of one or more of
the following spaces: span{I}, Sn0 (IR), and Kn (IR).

Proof Let 0 ≠ A ∈ ker T. Lemma 4.2.9 guarantees that AS ≡ A + A^t ∈
span O(A) and AK ≡ A − A^t ∈ span O(A). Thus, AS ∈ ker T and
AK ∈ ker T. If AK ≠ 0, Lemma 4.2.5 ensures that Kn (IR) ⊆ ker T ;
if AS ∈ span{I}, then span{I} ⊆ ker T ; if 0 ≠ AS ∈ Sn0 (IR), Lemma 4.2.7
guarantees that Sn0 (IR) ⊆ ker T ; and if AS ∉ span{I} ∪ Sn0 (IR), Lemma 4.2.8
ensures that Sn (IR) ⊆ ker T.
Let T be a given linear operator on Mn (IR) that preserves orthogonal
similarity. Notice that for every X ∈ Mn (IR), we can write X = XS + XK ,
where

XS ≡ (X + X^t)/2 ∈ Sn (IR) and XK ≡ (X − X^t)/2 ∈ Kn (IR).

Hence, T (X) = T1 (X) + T2 (X), where

T1 (X) ≡ T (XS ) and T2 (X) ≡ T (XK ) for all X ∈ Mn (IR).

One checks that both T1 and T2 preserve orthogonal similarity. Moreover,
Kn (IR) ⊆ ker T1 and Sn (IR) ⊆ ker T2 .
Since Sn (IR) = span{I} ⊕ Sn0 (IR), Lemma 4.4.1 guarantees that either
Kn (IR) ⊆ ker T2 (so that T2 = 0) or Kn (IR) ∩ ker T2 = {0} (so that T2 is
nonsingular on Kn (IR)). However, in the case of T1 , the direct sum of any
subset of {span{I}, Sn0 (IR)} could be contained in ker T1 . We begin with
this case.
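The splitting used here — X = XS + XK, and further XS = ((1/n) tr X)I + XS0 with XS0 trace-free — can be sketched as a quick numerical sanity check:

```python
import random

random.seed(1)
n = 4
X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

# Symmetric and skew-symmetric parts of X
XS = [[(X[a][b] + X[b][a]) / 2 for b in range(n)] for a in range(n)]
XK = [[(X[a][b] - X[b][a]) / 2 for b in range(n)] for a in range(n)]

for a in range(n):
    for b in range(n):
        assert abs(XS[a][b] - XS[b][a]) < 1e-12            # XS is symmetric
        assert abs(XK[a][b] + XK[b][a]) < 1e-12            # XK is skew-symmetric
        assert abs(XS[a][b] + XK[a][b] - X[a][b]) < 1e-12  # X = XS + XK

# Finer decomposition of XS along span{I} + Sn0(IR)
trX = sum(X[a][a] for a in range(n))
XS0 = [[XS[a][b] - (trX / n if a == b else 0.0) for b in range(n)] for a in range(n)]
assert abs(sum(XS0[a][a] for a in range(n))) < 1e-12       # XS0 has trace zero
```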
Lemma 4.4.2 Let T be a given linear operator on Mn (IR) that preserves
orthogonal similarity. Then T1 satisfies one of the following three conditions:
(A1) There exists A0 ∈ Mn (IR) such that
T1 (X) = (tr X)A0 for all X ∈ Mn (IR).
(A2) There exist U ∈ Un (IR) and scalars α, β ∈ IR such that
T1 (X) = α (tr X)I + β U (X + X^t)U^t for all X ∈ Mn (IR).
(A3) n = 2 and there exist U ∈ U2 (IR), B ∈ K2 (IR), and scalars α, β ∈ IR
such that
T1 (X) = α (tr X)I + (tr X)B + β U (X + X^t)U^t
for all X ∈ M2 (IR).
Proof Suppose T preserves orthogonal similarity. Then T1 also preserves
orthogonal similarity, and hence Lemma 4.4.1 guarantees that ker T1 is a direct sum of one or more of span{I}, Sn0 (IR), Kn (IR). Suppose Sn0 (IR) ⊆ ker T1 .
Since Kn (IR) ⊆ ker T1 and any X ∈ Mn (IR) can be written as X =
((1/n) tr X)I + XS0 + XK , where XS0 ∈ Sn0 (IR) and XK ∈ Kn (IR), we have

T1 (X) = (tr X)A0 for all X ∈ Mn (IR),

where we put A0 ≡ (1/n) T1 (I) ∈ Mn (IR). Hence condition (A1) holds.
Now suppose Sn0 (IR) ⊄ ker T1 . Notice that if a nonzero A ∈ Sn0 (IR)
satisfies T1 (A) = 0, then Lemma 4.2.7 ensures that Sn0 (IR) ⊆ ker T1 . Hence,
in this case, T1 (A) ≠ 0 for all nonzero A ∈ Sn0 (IR). Since

dim Sn0 (IR) = n(n + 1)/2 − 1 > n(n − 1)/2 = dim Kn (IR),

there exists a nonzero B ∈ Sn (IR) such that T1 (A) = B for some nonzero
A ∈ Sn0 (IR). It follows that T1 (Sn0 (IR)) ⊆ Sn (IR). Since span{I} ⊕ Sn0 (IR) =
Sn (IR), we may regard T1 as a mapping from Sn (IR) to Mn (IR). Define

TS (X) ≡ (T1 (X) + T1 (X)^t)/2 for all X ∈ Sn (IR)

and

TK (X) ≡ (T1 (X) − T1 (X)^t)/2 for all X ∈ Sn (IR).

Notice that TS (Sn (IR)) ⊆ Sn (IR). Lemma 4.2.4 guarantees that TS satisfies
one of the following conditions:
(1a) There exists B1 ∈ Sn (IR) such that TS (X) = (tr X)B1 for all X ∈
Sn (IR); or
(1b) There exist U ∈ Un (IR) and α, β ∈ IR such that TS (X) = α (tr X)I +
β U XU^t for all X ∈ Sn (IR).
For every nonzero A ∈ Sn0 (IR), T1 (A) is nonzero and symmetric. It follows
that TS (A) = T1 (A) ≠ 0 for all nonzero A ∈ Sn0 (IR). Hence (1a) cannot
happen. Notice also that TK (Sn (IR)) ⊆ Kn (IR), so that Sn0 (IR) ⊆ ker TK .
Thus,

TK (X) = (tr X)A1 for all X ∈ Sn (IR),

where we put A1 ≡ (1/n) TK (I) ∈ Kn (IR). Since T1 = TS + TK , we have

T1 (X) = (tr X)A1 + α (tr X)I + β U XU^t for all X ∈ Sn (IR).
If n = 2, then condition (A3) of Lemma 4.4.2 holds. If n ≥ 3, we consider
various choices for X to show that A1 = 0, and hence condition (A2) holds.
Let n ≥ 3 and suppose A1 ≡ [aij ] ≠ 0, that is, suppose that akh ≠ 0 for
some 1 ≤ k < h ≤ n. Let

U XU^t ≡ diag (1, . . . , k, . . . , h, . . . , n)

and

U Y1 U^t ≡ diag (1, . . . , h, . . . , k, . . . , n).

Notice that X and Y1 are orthogonally similar. Hence, there exists W ∈
Un (IR) such that T1 (X) = W T1 (Y1 )W^t. It follows that

U XU^t = W (U Y1 U^t)W^t    (4.4.37)

and

A1 = W A1 W^t.    (4.4.38)

The identity (4.4.37) and the fact that W ∈ Un (IR) show that W ≡ [wij ] has
entries wkh = ±1, whk = ±1, wii = ±1 for i = 1, . . . , n with i ≠ h and i ≠ k,
and wij = 0 otherwise. The k-th and h-th rows of the identity (4.4.38) show
that

(a) wkh whk ahk = akh ,
(b) wkh wii ahi = aki for i ≠ k, and
(c) whk wii aki = ahi for i ≠ h.    (4.4.39)

Since ahk = −akh ≠ 0, the identity (4.4.39a) shows that wkh whk = −1. The
identities (4.4.39b) and (4.4.39c) now show that ahi = aki = 0 for all integers
i with i ≠ h and i ≠ k.
Since n ≥ 3, there is an integer t such that 1 ≤ t ≤ n, t ≠ k, and t ≠ h.
Suppose t < k, and this time consider

U Y2 U^t ≡ diag (1, . . . , k, . . . , t, . . . , n).

Notice that X and Y2 are also orthogonally similar. Repeat the previous
analysis and conclude that akh = 0. Hence, A1 = 0, as desired.

One checks that if a given linear operator L on Mn (IR) satisfies any one
of the three conditions (A1)–(A3), then L preserves orthogonal similarity.
We now consider T2 . Recall that either T2 = 0, or T2 is nonsingular on
Kn (IR). Hence, T2 (A) ≠ 0 for some (necessarily) nonzero A ∈ Kn (IR) if and
only if Kn (IR) ∩ ker T2 = {0}.
Lemma 4.4.3 Let T be a given linear operator on Mn (IR) that preserves
orthogonal similarity. Then T2 satisfies one of the following three conditions:
(B1) There exist U ∈ Un (IR) and a scalar λ ∈ IR such that
T2 (X) = λ U (X − X^t)U^t for all X ∈ Mn (IR).
(B2) n = 4 and there exist U ∈ U4 (IR) and a scalar λ ∈ IR such that
T2 (X) = λ U (X − X^t)⁺ U^t for all X ∈ M4 (IR).
(B3) n = 2 and there exists B0 ∈ M2 (IR) with tr B0 = 0 such that
T2 (X) = (tr XF12 )B0 for all X ∈ K2 (IR).
Proof We consider the restriction of T2 to Kn (IR). Define

TS (X) ≡ (T2 (X) + T2 (X)^t)/2 for all X ∈ Kn (IR)

and

TK (X) ≡ (T2 (X) − T2 (X)^t)/2 for all X ∈ Kn (IR).

Notice that for any X ∈ Kn (IR), we have T2 (X) = TS (X) + TK (X).
Moreover, TS (Kn (IR)) ⊆ Sn (IR) and TK (Kn (IR)) ⊆ Kn (IR). We will show
that if n ≥ 3, then TS = 0, so that Lemma 4.2.3 ensures that either (B1) or
(B2) holds. If n = 2, we will show that T2 satisfies (B3).
Suppose TS ≠ 0. Since TS also preserves orthogonal similarity, TS is
nonsingular. Since −A ∈ O(A) for any A ∈ Kn (IR), we have −TS (A) ∈
O(TS (A)). Hence, if λ is an eigenvalue of TS (A) with multiplicity k, then
−λ is also an eigenvalue of TS (A) with multiplicity k. Moreover, if A ≠ 0,
then TS (A) has a positive eigenvalue.

Now, Im TS is a subspace of Sn (IR) with dimension n(n − 1)/2. If Z is a
subspace of Sn (IR) with dimension k, and if r ∈ {2, . . . , n − 1} and

k ≥ κ(r) ≡ (r − 1)(2n − r + 2)/2,    (4.4.40)

then Theorem (1) of [FL] guarantees that Z contains a nonzero matrix B
whose largest eigenvalue has multiplicity at least r. This result, together with
the fact that the eigenvalues of TS (A) are paired, will give us a contradiction.
Let n = 2t + 1, where t = 1, 2, . . . . Then 2 ≤ t + 1 ≤ n − 1 and

k = n(n − 1)/2 = t(2t + 1) ≥ 3t(t + 1)/2 = κ(t + 1).

Hence, there exists a nonzero B ∈ Im TS whose largest eigenvalue λ has
multiplicity at least t + 1. Since B ∈ Im TS and B ≠ 0, we must have λ > 0.
But since −λ is also an eigenvalue of B with multiplicity at least t + 1, this
is a contradiction.
Now let n = 2t, where t = 3, 4, . . . . Then 2 ≤ t + 1 ≤ n − 1 and

k = n(n − 1)/2 = t(2t − 1) ≥ t(3t + 1)/2 = κ(t + 1).

Hence, there exists a nonzero B ∈ Im TS whose largest eigenvalue is positive
and has multiplicity at least t + 1. Again, this is a contradiction.
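The two dimension counts against the Friedland–Loewy bound (4.4.40) are elementary arithmetic; the following sketch re-checks them for a range of t (the name kappa for the bound is ours):

```python
def kappa(n, r):
    # bound (4.4.40): kappa(r) = (r - 1)(2n - r + 2) / 2
    return (r - 1) * (2 * n - r + 2) / 2

# odd n = 2t + 1, t >= 1: dim Im TS = n(n - 1)/2 meets the bound for r = t + 1
for t in range(1, 50):
    n = 2 * t + 1
    assert n * (n - 1) // 2 >= kappa(n, t + 1)
    assert kappa(n, t + 1) == 3 * t * (t + 1) / 2

# even n = 2t, t >= 3: same conclusion
for t in range(3, 50):
    n = 2 * t
    assert n * (n - 1) // 2 >= kappa(n, t + 1)
    assert kappa(n, t + 1) == t * (3 * t + 1) / 2
```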
Suppose now that n = 4. Notice that dim Im TS = 6 > 4 = κ(2).
Hence, Im TS contains a matrix of the form Y0 ≡ Q diag (a, a, −a, −a)Q^t,
where Q ∈ U4 (IR) and a > 0. We will show that this cannot happen.
One checks that the tangent space to the orbit at any A ∈ M4 (IR) is given by

T_A ≡ {XA + A^t X^t : X ∈ K4 (IR)}.

Moreover,

dim T_{Y0} = dim T_{Q^t Y0 Q} = 2

and

dim T_A ≥ 4 for any nonzero A ∈ K4 (IR).

Since TS is nonsingular on K4 (IR), it must be that

dim T_A ≤ dim T_{TS (A)} for any A ∈ K4 (IR).
Hence, Y0 ∉ Im TS . Thus, if n = 4, then TS = 0.
Now consider the case n = 2. Notice that

X = −(1/2)(tr XF12 )F12 for every X ∈ K2 (IR).

It follows that

T2 (X) = (tr XF12 )A0 for all X ∈ K2 (IR),

where we put A0 ≡ −(1/2) T2 (F12 ). Since F12 and −F12 are orthogonally
similar, A0 is orthogonally similar to −A0 . Hence, tr A0 = tr (−A0 ) = −tr A0 ,
so that tr A0 = 0. Notice that if A0 ∈ K2 (IR), then condition (B1) also holds.
We now return to consider our original T = T1 + T2 . First, we consider
n ≥ 3. Lemma 4.4.2 guarantees that T1 satisfies either (A1) or (A2), and
Lemma 4.4.3 guarantees that T2 satisfies either (B1) or (B2). We consider
these cases separately.
Case 1. T1 (X) = (tr X)A0 and n ≥ 3.
First we consider T2 (X) = c V (X − X^t)V^t, that is,

T (X) = (tr X)A0 + c V (X − X^t)V^t for all X ∈ Mn (IR).
If c = 0, then condition (1) of Theorem 4.1.2 holds. Suppose now that c ≠ 0.
Write A0 = AS + AK , where AS ∈ Sn (IR) and AK ∈ Kn (IR). Consider

X1 ≡ (1/n)I + (1/(2c)) V^t AK V and X2 ≡ (1/n)I − (1/(2c)) V^t AK V.

One checks that X1 and X2 are orthogonally similar. Hence, T (X1 ) = AS +
2AK is orthogonally similar to T (X2 ) = AS . It follows that AK = 0 and
A0 = AS ∈ Sn (IR). We will show that A0 = α I for some α ∈ IR.
Since A0 ∈ Sn (IR), there exist a Q ∈ Un (IR) and a1 , a2 , a3 , . . . , an ∈ IR
such that A0 = Q diag (a1 , a2 , a3 , . . . , an )Q^t. Consider

X3 ≡ (1/n)I + (1/(2c)) V^t Q (
[ 0  0  1 ]
[ 0  0  0 ]
[−1  0  0 ]
⊕ 0n−3 ) Q^t V ∈ Mn (IR)
and

X4 ≡ (1/n)I + (1/(2c)) V^t Q (
[ 0  0  0 ]
[ 0  0  1 ]
[ 0 −1  0 ]
⊕ 0n−3 ) Q^t V ∈ Mn (IR).

One checks that X3 and X4 are orthogonally similar. Hence T (X3 ) is orthogonally similar to T (X4 ). It follows that

B1 ≡
[ a1  0  1  ]
[ 0   a2 0  ]
[−1   0  a3 ]
and B2 ≡
[ a1  0   0  ]
[ 0   a2  1  ]
[ 0  −1   a3 ]

are orthogonally similar. Thus,

a2 (a1 a3 + 1) = det B1 = det B2 = a1 (a2 a3 + 1),

and hence a2 = a1 . Similarly, we can show that ai = a1 for each i = 3, . . . , n.
Hence, A0 = a1 I and condition (2) of Theorem 4.1.2 holds with β = 0.
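The determinant computation behind a2 = a1 is easy to spot-check; the −1 entries in B1 and B2 come from the restored skew-symmetric parts, and the sample values for a1, a2, a3 below are ours:

```python
def det3(M):
    # cofactor expansion along the first row
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

a1, a2, a3 = 0.3, -1.7, 2.2  # arbitrary sample values
B1 = [[a1, 0.0, 1.0], [0.0, a2, 0.0], [-1.0, 0.0, a3]]
B2 = [[a1, 0.0, 0.0], [0.0, a2, 1.0], [0.0, -1.0, a3]]
assert abs(det3(B1) - a2 * (a1 * a3 + 1)) < 1e-12
assert abs(det3(B2) - a1 * (a2 * a3 + 1)) < 1e-12
```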
Suppose now that T2 satisfies (B2), that is, n = 4 and

T (X) = (tr X)A0 + c V (X − X^t)⁺ V^t for all X ∈ M4 (IR).

If c = 0, then condition (1) of Theorem 4.1.2 holds. Suppose that c ≠ 0. We
will show that A0 = aI for some a ∈ IR, so that condition (3) of Theorem 4.1.2
holds. We will use the facts that Y⁺ ∈ K4 (IR) and (Y⁺)⁺ = Y whenever
Y ∈ K4 (IR).
Write A0 ≡ AS + AK , where AS ∈ S4 (IR) and AK ∈ K4 (IR). Let

X1 ≡ (1/n)I + (1/(2c))(V^t AK V )⁺ and X2 ≡ (1/n)I − (1/(2c))(V^t AK V )⁺.

As in the previous case, X1 and X2 are orthogonally similar, so that T (X1 ) =
AS + 2AK is orthogonally similar to T (X2 ) = AS . It follows that AK = 0.
Moreover, the previous analysis can be used to show that A0 = aI, as desired.
Case 2. T1 (X) = a (tr X)I + b U (X + X^t)U^t.
If T2 satisfies (B1), we may follow the analysis of the complex case (see
the discussion after Lemma 4.3.5) to show that we may take V = U. Thus,
condition (2) of Theorem 4.1.2 holds.
Suppose n = 4 and T2 satisfies (B2). If b = 0, then

T (X) = a (tr X)I + c V (X − X^t)⁺ V^t for all X ∈ M4 (IR),

so that condition (3) of Theorem 4.1.2 holds. If b ≠ 0, we may again follow
the analysis of the complex case (see the discussion after Lemma 4.3.8) to
show that c = 0. Thus, T satisfies condition (2) of Theorem 4.1.2 with c = 0.
This proves the forward implication of Theorem 4.1.2. The converse can
be proven by direct verification.
For the remaining case, n = 2, notice that if K2 (IR) ⊆ ker T, then
Lemma 4.4.2 shows that either condition (1) or condition (2) of Theorem 4.1.3
holds. It will be useful to have the following observation.
Proposition 4.4.4 Let A, B ∈ M2 (IR) be given. There exists U ∈ U2 (IR)
such that A = U BU^t if and only if the following three conditions are satisfied:
(a) det A = det B,
(b) tr A = tr B, and
(c) tr A^t A = tr B^t B.
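The forward implication of Proposition 4.4.4 — that det, tr, and tr A^tA are invariant under real orthogonal similarity — can be sketched with random rotations:

```python
import math, random

def det2(M): return M[0][0] * M[1][1] - M[0][1] * M[1][0]
def tr(M): return M[0][0] + M[1][1]
def mul(A, B):
    return [[sum(A[a][k] * B[k][b] for k in range(2)) for b in range(2)] for a in range(2)]
def tp(M):  # transpose
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

random.seed(2)
for _ in range(50):
    B = [[random.uniform(-2, 2) for _ in range(2)] for _ in range(2)]
    th = random.uniform(0, 2 * math.pi)
    U = [[math.cos(th), math.sin(th)], [-math.sin(th), math.cos(th)]]
    A = mul(mul(U, B), tp(U))  # A = U B U^t
    assert abs(det2(A) - det2(B)) < 1e-9
    assert abs(tr(A) - tr(B)) < 1e-9
    assert abs(tr(mul(tp(A), A)) - tr(mul(tp(B), B))) < 1e-9
```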
Proof The forward implication is easily verified. For the converse, notice
that the conditions imply that A and B have the same eigenvalues and
singular values. Hence, the matrices Aα ≡ A + αI and Bα ≡ B + αI
also satisfy the three conditions for any α ∈ IR. Notice that A and B
are orthogonally similar if and only if there is some α ∈ IR such that Aα
and Bα are orthogonally similar. Thus, we may assume that tr A ≠ 0
and det A ≠ 0. Since A and B have the same singular values, there
exist X1 , X2 , V1 , V2 ∈ U2 (IR) such that

X1 A X1^t = Σ X2 and V1 B V1^t = Σ V2 ,    (4.4.41)

where Σ ≡ diag (σ1 , σ2 ) and σ1 ≥ σ2 > 0. Since the determinant is
invariant under orthogonal similarity, it follows from condition (a) and
(4.4.41) that det (Σ X2 ) = det (Σ V2 ). Now, any U ∈ U2 (IR) can be written
as

U ≡
[ cos θ   sin θ ]
[ ∓sin θ  ±cos θ ], for some θ ∈ IR.

Hence, (b) and the fact that tr A ≠ 0 show that Σ X2 and Σ V2 have
the same diagonal elements. Direct verification now shows that X2 =
W V2 W^t, where W ≡ diag (1, −1) ∈ U2 (IR). Hence,

Σ X2 = Σ W V2 W^t = W (Σ V2 )W^t.

It follows that A and B are orthogonally similar.
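The shift step of this proof — that A + αI and B + αI inherit conditions (a)–(c) — follows from det(A + αI) = det A + α tr A + α² and tr((A + αI)^t(A + αI)) = tr A^tA + 2α tr A + 2α² in M2(IR). A quick check with B = A^t (which always satisfies (a)–(c)):

```python
def det2(M): return M[0][0] * M[1][1] - M[0][1] * M[1][0]
def tr(M): return M[0][0] + M[1][1]
def tAA(M):  # tr(M^t M) = sum of squared entries
    return sum(M[a][b] ** 2 for a in range(2) for b in range(2))
def shift(M, al):  # M + al*I
    return [[M[0][0] + al, M[0][1]], [M[1][0], M[1][1] + al]]

A = [[1.0, 2.0], [0.5, -1.0]]
B = [[1.0, 0.5], [2.0, -1.0]]  # B = A^t satisfies (a)-(c)
for al in [-2.0, 0.7, 3.14]:
    Aa, Ba = shift(A, al), shift(B, al)
    assert abs(det2(Aa) - det2(Ba)) < 1e-12
    assert abs(tr(Aa) - tr(Ba)) < 1e-12
    assert abs(tAA(Aa) - tAA(Ba)) < 1e-12
```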
Suppose K2(ℝ) ⊄ ker T. Assume further that S2⁰(ℝ) ⊆ ker T. Write

X = ½(tr X)I + XS⁰ − ½(tr XF12)F12.   (4.4.42)

Then XS⁰ ∈ S2⁰(ℝ), and hence

T(X) = (tr X)A0 + (tr XF12)B0 for all X ∈ M2(ℝ),

where A0 ≡ ½T(I) and B0 ≡ −½T(F12). Consider

X1 ≡ F12 and X2 ≡ −F12

to conclude that tr B0 = 0. Now let

Y1 ≡ I − F12 and Y2 ≡ I + F12.

Then we have det T(Y1) = det T(Y2). Direct computation shows that this
implies tr A0B0 = 0. Moreover, we have

tr T(Y1)ᵗT(Y1) = tr T(Y2)ᵗT(Y2).

For this case, it follows that tr A0ᵗB0 = 0. Hence, T satisfies condition (1) of
Theorem 4.1.3.
Suppose S2⁰(ℝ) ⊄ ker T and write X as in (4.4.42). Since the restriction
of T = T1 + T2 to S2⁰(ℝ) is nonsingular, it follows from Lemma 4.4.2 that

T(X) = α(tr X)I + βU(X + Xᵗ)Uᵗ + (tr X)C0 + (tr XF12)D0

for all X ∈ M2(ℝ), with β ≠ 0, C0 ∈ K2(ℝ), and D0 ≡ −½T(F12). Now
use X1 = −X2 = F12 to show that tr D0 = 0. Hence, we may write D0 ≡
DS⁰ + γF12, where DS⁰ ∈ S2⁰(ℝ).

Use Proposition 4.4.4 to show that

X3 ≡ UᵗDS⁰U + F12 and X4 ≡ UᵗDS⁰U − F12

are orthogonally similar. It follows that T(X3) is orthogonally similar to
T(X4), and DS⁰ = 0, so that D0 = γF12. Since K2(ℝ) ⊄ ker T, we must
have γ ≠ 0.

Consider

X5 ≡ I + C0 and X6 ≡ I − C0.

Since C0 ∈ K2(ℝ), X5 and X6 are orthogonally similar. The orthogonal
similarity of T(X5) and T(X6) shows that C0 = 0. Now,

−(det U)(tr XF12)F12 = U(X − Xᵗ)Uᵗ for all X ∈ M2(ℝ).

Hence, T satisfies condition (2) of Theorem 4.1.3.

The converse can be verified with the help of Proposition 4.4.4.
∎
Appendix A
A New Look at the Jordan Canonical Form
The ultimate proof of the Jordan canonical form theorem is a kind of
mathematical holy grail, and we join others who have sought this treasure
in recent years ([Br], [FS], [GW], [Ho], [V]). Our matrix-theoretic approach
builds on the Schur triangularization theorem and shows how to reduce any
unispectral upper triangular matrix to a direct sum of Jordan blocks.
Connoisseurs of the Jordan canonical form will recognize in our argument key
elements of Pták's classic basis-free proof [P].
Our terminology and notation are standard, as in [HJ1]: we denote the set
of m-by-n complex matrices by Mm,n and write Mn ≡ Mn,n; Aᵀ denotes the
transpose of A, and A* denotes the conjugate transpose; the k-by-k upper
triangular Jordan block with eigenvalue λ is denoted by Jk(λ).
Lemma A.1 Let positive integers k, n be given with k < n. Let X1, Yᵀ ∈
Mk,n be such that X1Y ∈ Mk is nonsingular. Then there exists an X2 ∈
Mn−k,n such that X2Y = 0 and

X ≡ [X1ᵀ X2ᵀ]ᵀ ∈ Mn is nonsingular.
Proof Since Y has full (column) rank k, we let {η1, …, ηn−k} ⊆ Cⁿ be a
basis for the orthogonal complement of the span of the columns of Y,
and set X2 ≡ [η1 ⋯ ηn−k]*. Then X2Y = 0 and X2 has full (row)
rank. If X ≡ [X1ᵀ X2ᵀ]ᵀ is singular, then its rows are dependent and
there are vectors ξ ∈ Cᵏ and ζ ∈ Cⁿ⁻ᵏ, not both zero, such that

[ξᵀ ζᵀ]X = ξᵀX1 + ζᵀX2 = 0.

Then 0 = ξᵀX1Y + ζᵀX2Y = ξᵀX1Y, so ξ = 0 since X1Y is nonsingular.
Thus, ζᵀX2 = 0 and ζ = 0 since X2 has full row rank. We
conclude that X is nonsingular, as desired. ∎
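Lemma A.1 can be illustrated on a tiny example (a sketch with hand-picked data, not from the text): take n = 3, k = 1, X1 = [1 1 0], and Y = (1, 1, 0)ᵀ, so that X1Y = [2] is nonsingular; the rows of X2 below span the orthogonal complement of the column of Y.

```python
def det3(M):
    # cofactor expansion of a 3-by-3 determinant
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

X1 = [[1, 1, 0]]                    # X1 in M_{1,3}
Y = [[1], [1], [0]]                 # Y in M_{3,1}
# X1 Y = [2] is nonsingular
assert sum(X1[0][i] * Y[i][0] for i in range(3)) == 2

# rows spanning the orthogonal complement of span{(1, 1, 0)^T}
X2 = [[1, -1, 0],
      [0, 0, 1]]
for row in X2:
    assert sum(row[i] * Y[i][0] for i in range(3)) == 0   # X2 Y = 0

X = X1 + X2                         # stack X1 over X2 to form X in M_3
assert det3(X) != 0                 # X is nonsingular
```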
Let A ∈ Mn be nilpotent, and let k ≥ 1 be such that Aᵏ = 0 but
Aᵏ⁻¹ ≠ 0; if A = 0, then k = 1 and we set Aᵏ⁻¹ ≡ I. Let x, y ∈ Cⁿ be such
that x*Aᵏ⁻¹y ≠ 0. Define

X1 ≡ [ x*
       x*A
       ⋮
       x*Aᵏ⁻¹ ] ∈ Mk,n

and

Y ≡ [y Ay ⋯ Aᵏ⁻¹y] ∈ Mn,k.

The (i, j) entry of X1Y is x*Aⁱ⁺ʲ⁻²y, which is zero whenever i + j − 2 ≥ k
(because Aᵏ = 0) and equals x*Aᵏ⁻¹y ≠ 0 whenever i + j − 2 = k − 1. Hence
every entry of X1Y ∈ Mk below its antidiagonal vanishes and every
antidiagonal entry equals x*Aᵏ⁻¹y, so the columns of X1Y are independent
and X1Y is nonsingular (if k = n, this means X1 is nonsingular). Moreover,

X1A = Jk(0)X1   (A.0.1)

and

AY = Y Jk(0)ᵀ.   (A.0.2)

If k = n we have X1A = Jn(0)X1 and X1 is nonsingular, so Jn(0) = X1AX1⁻¹.
If k < n, Lemma A.1 ensures that there is an X2 ∈ Mn−k,n such that

X2Y = 0 and X ≡ [X1ᵀ X2ᵀ]ᵀ ∈ Mn is nonsingular.   (A.0.3)

Set B ≡ (X2A)X⁻¹ ∈ Mn−k,n and partition B = [B1 A2] with B1 ∈ Mn−k,k
and A2 ∈ Mn−k. Notice that

X2A = BX = [B1 A2][X1ᵀ X2ᵀ]ᵀ = B1X1 + A2X2.   (A.0.4)

Now use (A.0.4) and the identities (A.0.3) and (A.0.2) to compute

X2AY = (X2A)Y = B1X1Y + A2(X2Y) = B1(X1Y)

and

X2AY = X2(AY) = X2(Y Jk(0)ᵀ) = (X2Y)Jk(0)ᵀ = 0.

Since X1Y is nonsingular, we conclude that B1 = 0. Thus, (A.0.4) simplifies
to

X2A = A2X2   (A.0.5)
and we can use (A.0.1) and (A.0.5) to compute

XA = [ (X1A)ᵀ (X2A)ᵀ ]ᵀ = [ (Jk(0)X1)ᵀ (A2X2)ᵀ ]ᵀ = ( Jk(0) ⊕ A2 ) X.

Since X is nonsingular, we have

C ≡ XAX⁻¹ = Jk(0) ⊕ A2.

Hence, the given nilpotent matrix A and the block diagonal matrix C are
similar; either A2 is absent (k = n) or it has size strictly less than that of A
and A2ᵏ = 0.
Application of the preceding argument to A2 and its successors (finitely
many steps) now shows that any nilpotent matrix is similar to a direct sum
of singular Jordan blocks.
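The reduction step just described can be traced on a concrete nilpotent matrix. In the sketch below (all matrices hand-picked for illustration; the inverse of X is hardcoded and then checked, rather than computed), A ∈ M3 satisfies A² = 0 with k = 2, and one step already yields C = J2(0) ⊕ [0].

```python
def matmul(P, Q):
    rows, inner, cols = len(P), len(Q), len(Q[0])
    return [[sum(P[i][k] * Q[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

A = [[0, 1, 1],
     [0, 0, 0],
     [0, 0, 0]]           # A^2 = 0 but A != 0, so k = 2
assert matmul(A, A) == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

# x = e1, y = e2 give x* A^{k-1} y = 1 != 0
X1 = [[1, 0, 0],          # x*
      [0, 1, 1]]          # x* A
Y = [[0, 1],
     [1, 0],
     [0, 0]]              # columns y, Ay

J2 = [[0, 1],
      [0, 0]]             # J_2(0)
assert matmul(X1, A) == matmul(J2, X1)                 # (A.0.1)
assert matmul(A, Y) == matmul(Y, [[0, 0], [1, 0]])     # (A.0.2), J_2(0) transposed

# Lemma A.1: X2 = [0 0 1] satisfies X2 Y = 0; stack to get X
X2 = [[0, 0, 1]]
assert matmul(X2, Y) == [[0, 0]]
X = X1 + X2
Xinv = [[1, 0, 0],
        [0, 1, -1],
        [0, 0, 1]]        # hardcoded inverse of X
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert matmul(X, Xinv) == I3

C = matmul(matmul(X, A), Xinv)
assert C == [[0, 1, 0],
             [0, 0, 0],
             [0, 0, 0]]   # C = J_2(0) ⊕ [0], a direct sum of Jordan blocks
```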
Lemma A.2 Let A ∈ Mn be nilpotent. Then there exist a nonsingular
S ∈ Mn and positive integers m, k1 ≥ ⋯ ≥ km ≥ 1 such that k1 + ⋯ + km = n
and

SAS⁻¹ = Jk1(0) ⊕ Jk2(0) ⊕ ⋯ ⊕ Jkm(0).
It is a standard consequence of Schur's triangularization theorem that
any complex matrix is similar to a direct sum of upper triangular matrices,
each of which has only one eigenvalue; see Theorem (2.4.8) in [HJ1] for a
proof.
Lemma A.3 Suppose that A ∈ Mn has r eigenvalues λ1, …, λr with
respective multiplicities n1, …, nr. Then A is similar to a matrix of the form

T1 ⊕ T2 ⊕ ⋯ ⊕ Tr,   (A.0.6)

where each Ti ∈ Mni is upper triangular with all diagonal entries equal to λi,
i = 1, …, r.
For a block matrix of the form (A.0.6), notice that each Ti − λiIni is
nilpotent for i = 1, …, r. Lemma A.2 guarantees that there exist a
nonsingular Si ∈ Mni and positive integers mi, ki1 ≥ ⋯ ≥ kimi ≥ 1 such
that Jki1(0) ⊕ ⋯ ⊕ Jkimi(0) = Si(Ti − λiIni)Si⁻¹ = SiTiSi⁻¹ − λiIni. Hence,
SiTiSi⁻¹ = Jki1(λi) ⊕ ⋯ ⊕ Jkimi(λi). If we now consider the similarity of (A.0.6)
obtained with the direct sum S1 ⊕ ⋯ ⊕ Sr, we obtain the desired result:

Jordan Canonical Form Theorem Let A ∈ Mn. Then there exist positive
integers m, k1, …, km with k1 + ⋯ + km = n, scalars λ1, …, λm, and a
nonsingular S ∈ Mn such that

SAS⁻¹ = Jk1(λ1) ⊕ Jk2(λ2) ⊕ ⋯ ⊕ Jkm(λm).
Once one has the Jordan form, its uniqueness (up to permutation of the
diagonal blocks) can be established by showing that the sizes and
multiplicities of the blocks are determined by the finitely many integers
rank (A − λI)ᵏ, λ an eigenvalue of A, k = 1, …, n; see the proof of
Theorem (3.1.11) in [HJ1].
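For a nilpotent A (so λ = 0), these rank counts determine the block sizes explicitly: the number of Jordan blocks of size exactly k is rank Aᵏ⁻¹ − 2 rank Aᵏ + rank Aᵏ⁺¹. A small sketch over exact rational arithmetic (our own illustration, not from the text; the 4-by-4 test matrix is J3(0) ⊕ J1(0)):

```python
from fractions import Fraction

def rank(M):
    # Gaussian elimination over the rationals
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A = J_3(0) ⊕ J_1(0)
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
n = len(A)

# ranks of A^0, A^1, ..., A^{n+1}
powers = [[[int(i == j) for j in range(n)] for i in range(n)]]
for _ in range(n + 1):
    powers.append(matmul(powers[-1], A))
r = [rank(P) for P in powers]

# number of blocks of size exactly k: rank A^{k-1} - 2 rank A^k + rank A^{k+1}
blocks = {k: r[k - 1] - 2 * r[k] + r[k + 1] for k in range(1, n + 1)}
blocks = {k: v for k, v in blocks.items() if v > 0}
assert blocks == {3: 1, 1: 1}      # one J_3(0) and one J_1(0)
```

Here blocks comes out {3: 1, 1: 1}, recovering exactly the two Jordan blocks of the test matrix.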
Bibliography

[A] L. Autonne. Sur les Matrices Hypohermitiennes et sur les Matrices Unitaires. Annales de l'Université de Lyon, Nouvelle Série I, Fasc. 38 (1915), 1–77.

[Bo] W. M. Boothby. An Introduction to Differentiable Manifolds and Riemannian Geometry. Academic Press, New York, 1975.

[BP] E. P. Botta and S. Pierce. The Preservers of Any Orthogonal Group. Pacific J. Math. 70 (1977), 37–49.

[Br] R. A. Brualdi. The Jordan Canonical Form: an Old Proof. Amer. Math. Monthly 94 (1987), 257–267.

[CH1] D. Choudhury and R. A. Horn. An Analog of the Gram–Schmidt Algorithm for Complex Bilinear Forms and Diagonalization of Complex Symmetric Matrices. Technical Report No. 454, Department of Mathematical Sciences, The Johns Hopkins University, 1986.

[CH2] D. Choudhury and R. A. Horn. A Complex Orthogonal–Symmetric Analog of the Polar Decomposition. SIAM J. Alg. Disc. Meth. 8 (1987), 219–225.

[DBS] N. G. de Bruijn and G. Szekeres. On Some Exponential and Polar Representations of Matrices. Nieuw Archief voor Wiskunde (3) III (1955), 20–32.

[D] J. D. Dixon. Rigid Embedding of Simple Groups in the General Linear Group. Can. J. Math. XXIX (1977), 384–391.

[F] H. Flanders. Elementary Divisors of AB and BA. Proc. Amer. Math. Soc. 2 (1951), 871–874.

[FS] R. Fletcher and D. C. Sorensen. An Algorithmic Derivation of the Jordan Canonical Form. Amer. Math. Monthly 90 (1983), 12–16.

[FL] S. Friedland and R. Loewy. Subspaces of Symmetric Matrices Containing Matrices with a Multiple First Eigenvalue. Pacific J. Math. 62 (1976), 389–399.

[GW] A. Galperin and Z. Waksman. An Elementary Approach to Jordan Theory. Amer. Math. Monthly 87 (1981), 728–732.

[G] F. R. Gantmacher. The Theory of Matrices, Vol. 2. Chelsea Publishing Company, New York, 1974.

[Hi] F. Hiai. Similarity Preserving Linear Maps on Matrices. Linear Algebra Appl. 97 (1987), 127–139.

[Ho] Y. P. Hong. A Canonical Form Under ϕ-Equivalence. Linear Algebra Appl. 147 (1991), 501–549.

[HH1] Y. P. Hong and R. A. Horn. A Canonical Form for Matrices Under Consimilarity. Linear Algebra Appl. 102 (1988), 143–168.

[HH2] Y. P. Hong and R. A. Horn. A Characterization of Unitary Congruence. Linear and Multilinear Algebra 25 (1989), 105–119.

[HHL] Y. P. Hong, R. A. Horn, and C.-K. Li. Linear Operators Preserving t-Congruence on Matrices. Linear Algebra Appl., to appear.

[HJ1] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, New York, 1985.

[HJ2] R. A. Horn and C. R. Johnson. Topics in Matrix Analysis. Cambridge University Press, New York, 1991.

[HLT] R. A. Horn, C.-K. Li, and N.-K. Tsing. Linear Operators Preserving Certain Equivalence Relations on Matrices. SIAM J. Matrix Anal. Appl. 12 (1991), 195–204.

[K1] I. Kaplansky. Algebraic Polar Decomposition. SIAM J. Matrix Anal. Appl. 11 (1990), 213–217.

[K2] I. Kaplansky. Linear Algebra and Geometry: A Second Course. Chelsea Publishing Company, New York, 1974.

[LRT] C.-K. Li, L. Rodman, and N.-K. Tsing. Linear Operators Preserving Certain Equivalence Relations Originating in System Theory. Linear Algebra Appl. 161 (1992), 165–225.

[LT1] C.-K. Li and N.-K. Tsing. G-Invariant Norms and G(c)-Radii. Linear Algebra Appl. 150 (1991), 150–179.

[LT2] C.-K. Li and N.-K. Tsing. Linear Preserver Problems: A Brief Introduction and Some Special Techniques. Linear Algebra Appl. 162–164 (1992), 217–236.

[L] R. Loewy. Linear Maps Which Preserve a Balanced Nonsingular Inertia Class. Linear Algebra Appl. 134 (1990), 165–179.

[PM] W. V. Parker and B. E. Mitchell. Elementary Divisors of Certain Matrices. Duke Math. J. 19 (1952), 483–485.

[P] V. Pták. Eine Bemerkung zur Jordanschen Normalform von Matrizen. Acta Sci. Math. (Szeged) 17 (1956), 190–194.

[T] R. C. Thompson. On the Matrices AB and BA. Linear Algebra Appl. 1 (1968), 43–58.

[TK] R. Thompson and C. Kuo. Doubly Stochastic, Unitary, Unimodular, and Complex Orthogonal Power Embedding. Acta Sci. Math. 44 (1982), 345–357.

[V] H. Väliaho. An Elementary Approach to the Jordan Form of a Matrix. Amer. Math. Monthly 93 (1986), 711–714.
Vita
DENNIS I. MERINO
817 Scarlett Drive
Baltimore, MD 21204
(301) 296 3620
e-mail: dennis@jhuvms.bitnet

Personal
Marital Status: Single
Citizenship: Filipino
Visa: F-1
Education / Awards
The Johns Hopkins University, Baltimore, MD
Candidate for Ph.D. degree in Mathematical Sciences, expected May, 1992.
The Johns Hopkins University, Baltimore, MD
Master of Science in Engineering awarded May, 1990;
Recipient of the Abel Wolman Fellowship, 1988.
University of the Philippines, Quezon City, Philippines
Bachelor of Science in Mathematics, summa cum laude, November, 1986.
Awardee, Dean's Medal for Academic Excellence, 1987
College Scholar, 1983-1984
University Scholar, 1983-1987
National Science and Technology Authority Scholar, 1983-1987
Phi Kappa Phi Awardee, 1987
Most Outstanding Student of Oriental Mindoro, 1985
Employment Experience
1989–present: Teaching/Research Assistant
Department of Mathematical Sciences, The Johns Hopkins University, Baltimore, MD
Assisted in Discrete Mathematics, Combinatorial Analysis, Graph Theory, and Game Theory; research in Matrix Analysis; statistical data analysis for the Maryland DWI Law Institute.

1990 (summer): Mathematician
Applied Physics Laboratory, National Institutes of Health, Bethesda, MD
Assisted in research on fractals and lasers.

1986–1988: Instructor
Department of Mathematics, The University of the Philippines, Quezon City, Philippines
Taught elementary courses in Analysis and Algebra.
Chairman of the Student–Faculty Relations Committee, 1987.
Trainer for the participants in the International Mathematical Olympiad, 1988.

Membership in Professional Societies
Mathematical Association of America
Publications
R. A. Horn and D. I. Merino. A Real-Coninvolutory Analog of the
Polar Decomposition. (preprint)
R. A. Horn and D. I. Merino. A Canonical Form for Contragredient
Equivalence. (preprint)
R. A. Horn and D. I. Merino. A New Look at the Jordan Canonical Form. (preprint)
R. A. Horn, C. K. Li, and D. I. Merino. Linear Operators Preserving
Orthogonal Equivalence on Matrices. (preprint)
C. K. Li and D. I. Merino. Linear Operators Preserving
Unitary t-Congruence on Matrices. (preprint)
References
Dr. Roger A. Horn
Department of Mathematical Sciences
The Johns Hopkins University
Baltimore, MD 21218
Dr. Chi-Kwong Li
Department of Mathematics
The College of William and Mary
Williamsburg, VA 23185
Dr. Alan J. Goldman
Department of Mathematical Sciences
The Johns Hopkins University
Baltimore, MD 21218
