
Sankhya : The Indian Journal of Statistics

1998, Volume 60, Series B, Pt. 2, pp. 215-220


INFERENCES RELATING TO THE MULTIPLICITY OF THE
SMALLEST EIGENVALUE OF A CORRELATION MATRIX
By JAMES R. SCHOTT
University of Central Florida, Orlando
SUMMARY. The minimum chi-squared test is obtained for testing the hypothesis that
the smallest r eigenvalues of an m × m correlation matrix are equal, where r < m is specified
in advance. One advantage of this test is that it provides as a natural byproduct a correlation
matrix estimator that has multiplicity r for its smallest eigenvalue.
1. Introduction
The structure of a correlation matrix has important implications in some
statistical analyses, most notably, in principal components analysis. In particu-
lar, it is quite common in practice to find that a sample correlation matrix has
a few relatively large eigenvalues that are well separated, while the remaining
eigenvalues are fairly small and close together. This would then suggest that
the underlying population correlation matrix may be well modeled by the more
parsimonious model in which the correlation matrix has a multiple eigenvalue
as its smallest eigenvalue.
Let ρ denote an m × m correlation matrix and suppose that λ_1 ≥ · · · ≥ λ_m
are its eigenvalues, while y_1, . . . , y_m are corresponding orthonormal eigenvectors.
The particular structure that we are interested in has λ_1 > · · · > λ_s and
λ_{s+1} = · · · = λ_m = λ for some s; that is, the smallest eigenvalue of ρ has multiplicity
r = m - s. In this case, the correlation matrix may be expressed as

    ρ = Y_1 Λ Y_1' + λ Y_2 Y_2',    . . . (1)

where Y_1 = (y_1, . . . , y_s), Y_2 = (y_{s+1}, . . . , y_m), and Λ = diag(λ_1, . . . , λ_s). Note
that if we define Λ_d = diag(d_1, . . . , d_s), where d_i = λ_i - λ, then model (1) can
be equivalently written as ρ = Y_1 Λ_d Y_1' + λ I_m.
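To make this structure concrete, the following small numerical sketch (in Python; it is not part of the paper, and it uses the notation as reconstructed above) builds a 4 × 4 correlation matrix whose smallest eigenvalue has multiplicity r = 3 and checks that the two representations of model (1) agree. The eigenvector y_1 proportional to the vector of ones is chosen only so that the diagonal elements come out equal to one.

```python
import numpy as np

m, s = 4, 1
lam_large = np.array([2.5])                     # lambda_1, ..., lambda_s
lam = 0.5                                       # common smallest eigenvalue

Y1 = np.ones((m, s)) / np.sqrt(m)               # y_1 proportional to the vector of ones
Y2 = np.linalg.svd(np.eye(m) - Y1 @ Y1.T)[0][:, :m - s]   # an orthonormal complement

rho_1 = Y1 @ np.diag(lam_large) @ Y1.T + lam * (Y2 @ Y2.T)        # form (1)
rho_2 = Y1 @ np.diag(lam_large - lam) @ Y1.T + lam * np.eye(m)    # equivalent form

print(np.allclose(rho_1, rho_2))                # True
print(np.round(np.linalg.eigvalsh(rho_1), 6))   # 0.5 (multiplicity 3) and 2.5
print(np.diag(rho_1))                           # all ones, so rho is a correlation matrix
```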
Paper received April 1998; revised August 1998.
AMS (1991) subject classification. 62H15, 62H12.
Key words and phrases. Minimum chi-squared estimator; minimum chi-squared test; principal
components analysis.
The adequacy of model (1) is usually assessed through the sequence of tests
of the null hypotheses, H_{0k} : λ_{k+1} = · · · = λ_m, starting with k = m - 2 and then
continually decreasing k by 1 until a null hypothesis is rejected. If the initial
null hypothesis is rejected, then model (1) does not fit; otherwise the model
fits with the choice of s given by the smallest value of k for which H_{0k} was
accepted. Bartlett's statistic (Bartlett, 1951) can be used to test H_{0k}, although
the asymptotic null distribution of this statistic is not chi-squared, so that an
approximation based on some of its moments must be used (Lawley 1956, Schott
1988). More recently, Schott (1996) obtained a statistic for testing H_{0k} that does
have an asymptotically chi-squared null distribution.
If λ_{s+1} = · · · = λ_m = λ, then a simple consistent estimator of λ is given by
l̄ = (l_{s+1} + · · · + l_m)/(m - s), where l_1 ≥ · · · ≥ l_m are the eigenvalues of the sample
correlation matrix. Finding a suitable estimator of ρ under the restriction that
its smallest eigenvalue has multiplicity m - s is, however, not so simple, except
when s = 1. In this paper, we develop the minimum chi-squared test for testing
H_{0k}. Unlike the other tests for H_{0k} mentioned above, this test naturally leads
to an estimator of ρ satisfying (1) that can then be used to estimate ρ when
H_{0k} is accepted.
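As a small illustration (not taken from the paper, and with a hypothetical function name), the estimator l̄ can be computed directly from the ordered eigenvalues of the sample correlation matrix:

```python
import numpy as np

def lambda_bar(R, s):
    # average of the m - s smallest eigenvalues l_{s+1}, ..., l_m of R
    l = np.sort(np.linalg.eigvalsh(R))[::-1]   # l_1 >= ... >= l_m
    return float(l[s:].mean())
```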
2. The Minimum Chi-squared Test
Let R be the sample correlation matrix computed from a random sample,
x_1, . . . , x_n, of m × 1 vectors and let v(R) denote the m(m-1)/2 × 1 vector
obtained by stacking the columns in the strictly lower triangular portion of R,
one underneath the other. The asymptotic distribution of n^{1/2}{v(R) - v(ρ)}
is normal with zero mean vector and covariance matrix W. When sampling
from a normal distribution, this covariance matrix can be expressed as

    W = (1/2) L̃_m (I_{m²} + K_{mm}) Φ (I_{m²} + K_{mm}) L̃_m',

where

    Φ = (ρ ⊗ ρ) - (I_m ⊗ ρ) Ψ_m (ρ ⊗ ρ) - (ρ ⊗ ρ) Ψ_m (I_m ⊗ ρ) + (I_m ⊗ ρ) Ψ_m (ρ ⊗ ρ) Ψ_m (I_m ⊗ ρ),

K_{mm} is a commutation matrix, L̃_m is an elimination matrix (see Magnus 1988,
Section 6.5), Ψ_m = Σ_{i=1}^m (e_{i,m} e_{i,m}' ⊗ e_{i,m} e_{i,m}'), and e_{i,m} is the m × 1 vector with
one in the ith position and zeros elsewhere. An expression for W when sampling
from a nonnormal population can be found in Magnus (1988, Section 10.11).
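Under the normal-theory expression for W given above, the matrix can be assembled numerically as in the following sketch (Python; not from the paper, with hypothetical helper names). The factored form of Φ used in the code, (I_{m²} - (I_m ⊗ ρ)Ψ_m)(ρ ⊗ ρ)(I_{m²} - Ψ_m(I_m ⊗ ρ)), is algebraically identical to the expanded expression displayed above, and for m = 2 the computation reduces to the familiar asymptotic variance (1 - ρ²)² of a single sample correlation, which provides a quick check.

```python
import numpy as np

def commutation(m):
    # K_{mm}: K @ vec(A) = vec(A') for any m x m matrix A (column-stacking vec)
    K = np.zeros((m * m, m * m))
    for i in range(m):
        for j in range(m):
            K[j * m + i, i * m + j] = 1.0
    return K

def elimination_strict(m):
    # L~_m: picks out the strictly lower triangular elements of vec(A), column by column
    rows = []
    for j in range(m):
        for i in range(j + 1, m):
            r = np.zeros(m * m)
            r[j * m + i] = 1.0
            rows.append(r)
    return np.array(rows)

def psi(m):
    # Psi_m = sum_i (e_i e_i' kron e_i e_i'): selects the diagonal positions of vec(A)
    P = np.zeros((m * m, m * m))
    for i in range(m):
        P[i * m + i, i * m + i] = 1.0
    return P

def W_normal(rho):
    # normal-theory asymptotic covariance matrix of sqrt(n) v(R), as reconstructed above
    m = rho.shape[0]
    I2, K, Lt, Ps = np.eye(m * m), commutation(m), elimination_strict(m), psi(m)
    C = np.kron(np.eye(m), rho) @ Ps
    Phi = (I2 - C) @ np.kron(rho, rho) @ (I2 - C).T     # equals the expanded Phi above
    return 0.5 * Lt @ (I2 + K) @ Phi @ (I2 + K) @ Lt.T

rho = np.array([[1.0, 0.3], [0.3, 1.0]])
print(W_normal(rho)[0, 0])        # approx (1 - 0.3**2)**2 = 0.8281
```

A simple consistent estimate of W of the kind used later in this section can then be obtained by evaluating the same expression at R in place of ρ.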
We will let Γ_k denote the collection of all m × m correlation matrices for
which the smallest eigenvalue has multiplicity m - k. Then the minimum chi-squared
test statistic for H_{0k} is given by

    t_k = min_{ρ ∈ Γ_k} n{v(R) - v(ρ)}' M {v(R) - v(ρ)},

where M is an m(m-1)/2 × m(m-1)/2 positive definite matrix. The matrix in Γ_k,
say ρ̂_k, which yields the minimum t_k is then a minimum chi-squared estimator
of ρ under H_{0k}. The asymptotic covariance matrix of n^{1/2} v(ρ̂_k) is minimized
by choosing W^{-1}, or a consistent estimator of W^{-1}, for M (see, for example,
Ferguson 1996, Chapter 23), in which case the estimator ρ̂_k is asymptotically
efficient. Throughout the remainder of this section, we will let Ŵ^{-1} denote
some consistent estimator of W^{-1}. In the simulation and example that appear
in the next section, our choice of Ŵ^{-1} was the inverse of the simple estimator
Ŵ obtained by substituting R for ρ in the formula for W.
The set Γ_k consists of all m × m matrices of the form ρ = Y_1 Λ_d Y_1' + λ I_m,
where Y_1 = (y_1, . . . , y_k) and Λ_d = diag(d_1, . . . , d_k), subject to the constraints
Y_1' Y_1 = I_k and ρ_ii = 1, with ρ_ii being the (i, i)th element of ρ. In addition,
we must have λ > 0 and d_i > 0 for each i, but in all of our simulations and
examples, we found that these two constraints did not need to be imposed. Note
that the constraint on the diagonal elements of ρ can be expressed as
(Y_1 ∘ Y_1)d + λ j_m = j_m, where ∘ denotes the Hadamard product, d = (d_1, . . . , d_k)',
and j_m is the m × 1 vector of ones. Consequently, the estimator ρ̂_k can be obtained by
minimizing the Lagrangian function
    f(Y_1, d, λ) = v{R - (Y_1 Λ_d Y_1' + λ I_m)}' Ŵ^{-1} v{R - (Y_1 Λ_d Y_1' + λ I_m)}
                   + tr{Θ(Y_1' Y_1 - I_k)} + γ'{(Y_1 ∘ Y_1)d - (1 - λ) j_m},

where Θ is a k × k symmetric matrix of Lagrange multipliers and γ is an m × 1
vector of Lagrange multipliers. Differentiating f with respect to vec(Y_1), d, and
λ, and then equating to zero, leads to the equations
    (Λ_d Y_1' ⊗ I_m)(I_{m²} + K_{mm}) L̃_m' Ŵ^{-1} v{R - (Y_1 Λ_d Y_1' + λ I_m)}
        - vec(Y_1 Θ) - vec{diag(γ) Y_1 Λ_d} = 0,    . . . (2)

    2 Δ_k (Y_1' ⊗ Y_1') L̃_m' Ŵ^{-1} v{R - (Y_1 Λ_d Y_1' + λ I_m)} - (Y_1' ∘ Y_1')γ = 0,    . . . (3)

    j_m' γ = 0,    . . . (4)

where Δ_k = Σ_{i=1}^k e_{i,k}(e_{i,k}' ⊗ e_{i,k}'). Equations (2)-(4) can be used along with the
constraints, Y_1' Y_1 = I_k and (Y_1 ∘ Y_1)d - (1 - λ) j_m = 0, to find the solutions
for Y_1, d, and λ, which we will denote by Ŷ_1, d̂, and λ̂. These can then be
used to calculate the minimum chi-squared estimator as ρ̂_k = Ŷ_1 Λ̂_d Ŷ_1' + λ̂ I_m,
where Λ̂_d = diag(d̂_1, . . . , d̂_k). Explicit expressions for these solutions are not
possible, so numerical methods must be used. Note that equations (2)-(4) can
be written in the form Hθ = g, where θ is a vector containing all of the Lagrange
multipliers, that is,

    θ = (v(Θ)', γ')',

v(Θ) is the k(k + 1)/2 × 1 vector obtained by stacking the columns in the lower
triangular portion of Θ, and the {k(m + 1) + 1} × {m + k(k + 1)/2} matrix H
and the {k(m + 1) + 1} × 1 vector g have elements unrelated to those in v(Θ)
and γ. For fixed Y_1, d, and λ, and consequently, fixed H and g, let θ̃ denote
a least squares solution to Hθ = g so that (Hθ - g)'(Hθ - g) is minimized at
θ = θ̃. Note that if H, g, and θ̃ are computed using the true solutions Ŷ_1, d̂,
λ̂, then (Hθ̃ - g)'(Hθ̃ - g) = 0 since equations (2)-(4) must hold. Thus, our
solutions for Y_1, d, and λ will also be the solutions to

    (Hθ̃ - g)'(Hθ̃ - g) + {(Y_1 ∘ Y_1)d - (1 - λ)j_m}'{(Y_1 ∘ Y_1)d - (1 - λ)j_m} = 0,    . . . (5)

as long as the matrix Y_1 satisfies the semi-orthogonality condition Y_1' Y_1 = I_k.
Numerical methods can then be used to find the semi-orthogonal matrix Y_1, d,
and λ that satisfy equation (5). For instance, in the example and simulations
discussed later in this paper, we used the downhill simplex method (see, for
example, Press, Teukolsky, Vetterling, and Flannery 1992).
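The sketch below (Python with scipy; it is not the authors' code, and all function and argument names are hypothetical) illustrates one way of carrying out this computation. Instead of profiling out the Lagrange multipliers through the least-squares device of equation (5), it simply minimizes the weighted quadratic form with quadratic penalties enforcing Y_1'Y_1 = I_k and the unit-diagonal constraint, using scipy's Nelder-Mead routine, which is a downhill simplex method of the kind cited above. The argument W_inv stands for any consistent estimate of W^{-1}, such as the plug-in estimate sketched earlier.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def vech_strict(A):
    # v(.): stack the columns of the strictly lower triangular part of A
    m = A.shape[0]
    return np.concatenate([A[j + 1:, j] for j in range(m)])

def min_chisq_test(R, n, k, W_inv, penalty=1e4):
    # hypothetical helper: minimum chi-squared statistic t_k for H_0k
    m = R.shape[0]

    def unpack(theta):
        return theta[:m * k].reshape(m, k), theta[m * k:m * k + k], theta[-1]

    def rho_of(theta):
        Y1, d, lam = unpack(theta)
        return Y1 @ np.diag(d) @ Y1.T + lam * np.eye(m)

    def objective(theta):
        Y1, d, lam = unpack(theta)
        r = vech_strict(R - rho_of(theta))
        fit = n * r @ W_inv @ r
        orth = np.sum((Y1.T @ Y1 - np.eye(k)) ** 2)          # Y1'Y1 = I_k
        diag = np.sum(((Y1 * Y1) @ d - (1.0 - lam)) ** 2)    # unit diagonal of rho
        return fit + penalty * (orth + diag)

    # start from the sample eigenstructure, which already nearly satisfies model (1)
    l, E = np.linalg.eigh(R)
    l, E = l[::-1], E[:, ::-1]
    lam0 = l[k:].mean()
    theta0 = np.concatenate([E[:, :k].ravel(), l[:k] - lam0, [lam0]])
    res = minimize(objective, theta0, method="Nelder-Mead",
                   options={"maxiter": 50000, "maxfev": 50000})

    rho_hat = rho_of(res.x)
    r = vech_strict(R - rho_hat)
    t_k = float(n * r @ W_inv @ r)
    df = (m - k) * (m - k + 1) // 2 - 1
    return t_k, df, chi2.sf(t_k, df), rho_hat
```

Because the constraints are only enforced approximately through penalties, this is an illustration of the computation rather than a reproduction of the authors' procedure; the starting value based on the sample eigenstructure already nearly satisfies model (1), which keeps the unconstrained search manageable.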
It follows from the general theory of minimum chi-squared tests (Ferguson
1996, Chapter 24) that, under H_{0k}, t_k has an asymptotic chi-squared distribution.
The degrees of freedom will be given by the difference in the number of free
parameters for ρ in the general unrestricted case and when H_{0k} holds. Under
H_{0k}, we have k + 1 distinct eigenvalues, Σ_{i=1}^k (m - i) parameters to determine
the orthonormal eigenvectors corresponding to the k largest eigenvalues of ρ,
and the m constraints to guarantee that each diagonal element of ρ equals one.
Consequently, the degrees of freedom for t_k will be given by

    δ_k = (1/2) m(m - 1) - {(k + 1) + (1/2) m(m - 1) - (1/2)(m - k)(m - k - 1) - m}
        = (1/2)(m - k)(m - k + 1) - 1.
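As a quick arithmetic check of this formula (not part of the paper), the degrees of freedom for the m = 5 example considered in the next section are

```python
m = 5
for k in (1, 2, 3):
    print(k, (m - k) * (m - k + 1) // 2 - 1)   # prints: 1 9, 2 5, 3 2
```

which agree with the values δ_1 = 9, δ_2 = 5, and δ_3 = 2 used in Section 3.2.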
3. Some Empirical Results
3.1 Simulations. A small simulation study was performed so as to get some
idea of the adequacy of the chi-squared approximation to the asymptotic null
distribution of t_k for finite sample sizes. The actual type I error probability was
estimated for m = 4, 8, s = 1, 2, 3, and several choices of the sample size. In the
simulations, each component of the eigenvectors, y_1, . . . , y_s, was ±m^{-1/2}. All of
the components of y_1 were positive, only the odd components of y_2 were positive,
while y_3 had only its first m/2 components positive. In each case, the nominal
significance level used was 0.05 and the estimated significance level computed
was based on 1000 simulations.

The results of the simulations are given in Table 1. For the three cases for
which the smallest eigenvalue equals 0.5, we found that the estimated significance
levels exceeded the nominal level. The other cases, in which this smallest
eigenvalue is less than 0.5, produced significance levels below the nominal level.
For fixed values of m and n, the chi-squared approximation seems to improve as r
increases, that is, as s decreases. The convergence to the asymptotic distribution
can be rather slow in some cases, as illustrated in portion (vi) of Table 1.
Table 1. Estimated probabilities of type I error when the
nominal significance level is 0.05

                  m = 4                             m = 8
          n=30    n=50    n=100            n=50    n=100   n=150
 (i)      0.088   0.059   0.066    (iv)    0.072   0.055   0.053
 (ii)     0.027   0.034   0.047    (v)     0.097   0.081   0.083
 (iii)    0.008   0.026   0.038    (vi)    0.005   0.012   0.011

Parameter settings (λ_1, . . . , λ_s, λ): (i) (2.5, 0.5); (ii) (2.0, 1.5, 0.25); (iii) (2.3, 1.5, 0.1);
(iv) (4.5, 0.5); (v) (3.0, 2.0, 0.5); (vi) (2.5, 2.0, 1.5, 0.4).
3.2 An example. We will use the open/closed book data set given in Mardia,
Kent & Bibby (1979) to illustrate the methods developed in this paper. This
particular data set consists of scores on each of five exams corresponding to
five different subjects, Mechanics, Vectors, Algebra, Analysis, and Statistics.
The first two of these exams were closed-book exams, while the last three were
open-book. Scores were available for each of 88 students.

The sample correlation matrix computed from this data is

            1.00  0.55  0.55  0.41  0.39
            0.55  1.00  0.61  0.49  0.44
      R =   0.55  0.61  1.00  0.71  0.66
            0.41  0.49  0.71  1.00  0.61
            0.39  0.44  0.66  0.61  1.00 ,

and its eigenvalues are 3.18, 0.74, 0.44, 0.39, and 0.25. Mardia et al. (1979) performed
tests of the equality of the smallest eigenvalues of the underlying population
correlation matrix, approximating the null distribution of Bartlett's statistic by
using the chi-squared distribution with an effective number of degrees of freedom.
We will conduct a similar sequence of tests using the minimum chi-squared test.
We begin by testing the null hypothesis, H_{03} : λ_4 = λ_5. Our minimum
chi-squared test statistic is t_3 = 4.40 with degrees of freedom given by δ_3 = 2.
This corresponds to an observed significance level greater than 0.1, and so we
proceed to the next hypothesis, H_{02} : λ_3 = λ_4 = λ_5. This hypothesis also seems
reasonable since t_2 = 9.38, which is only slightly larger than the 0.90 quantile of
the chi-squared distribution with δ_2 = 5 degrees of freedom. However, the last
of our null hypotheses, H_{01} : λ_2 = · · · = λ_5, is rejected; the statistic t_1 = 27.36,
when compared to the chi-squared distribution with δ_1 = 9 degrees of freedom,
yields an observed significance level less than 0.005.
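These comparisons can be reproduced from the reported statistics with a short check (Python with scipy; not part of the paper):

```python
from scipy.stats import chi2

for label, t, df in [("H_03", 4.40, 2), ("H_02", 9.38, 5), ("H_01", 27.36, 9)]:
    print(label, df, round(chi2.sf(t, df), 4))
# observed significance levels of roughly 0.111, 0.095, and 0.001,
# consistent with the conclusions above
```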
The minimum chi-squared estimator of ρ, under the restriction λ_3 = λ_4 = λ_5,
is given by

               1.00  0.68  0.56  0.45  0.41
               0.68  1.00  0.62  0.53  0.49
      ρ̂_2 =   0.56  0.62  1.00  0.67  0.66
               0.45  0.53  0.67  1.00  0.69
               0.41  0.49  0.66  0.69  1.00 .
Its three distinct eigenvalues are 3.31, 0.77, and 0.31, while the normalized
eigenvectors corresponding to the two largest eigenvalues are given by
(0.41, 0.45, 0.48, 0.45, 0.44)' and (0.62, 0.44, -0.12, -0.40, -0.49)'. Thus, the first
principal component could be described as an average of the five test scores, while
the second principal component is a variable that contrasts the closed-book and
open-book exam scores.
References

Bartlett, M. S. (1951). The effect of standardization on a χ² approximation in factor
analysis. Biometrika 38, 337-344.
Ferguson, T. S. (1996). A Course in Large Sample Theory. London: Chapman and Hall.
Lawley, D. N. (1956). Tests of significance for the latent roots of covariance and correlation
matrices. Biometrika 43, 128-136.
Magnus, J. R. (1988). Linear Structures. New York: Oxford University Press.
Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979). Multivariate Analysis. London:
Academic Press.
Press, W. H., Teukolsky, S. A., Vetterling, W. T. and Flannery, B. P. (1992). Numerical
Recipes in Fortran (2nd ed.). Cambridge, U.K.: Cambridge University Press.
Schott, J. R. (1988). Testing the equality of the smallest latent roots of a correlation matrix.
Biometrika 75, 794-796.
Schott, J. R. (1996). Eigenprojections and the equality of latent roots of a correlation matrix.
Comp. Statist. & Data Anal. 23, 229-238.
James R. Schott
Department of Statistics
University of Central Florida
Orlando, FL 32816-2370
e-mail: Schott@cs.ucf.edu