
AN ELEMENTARY APPROACH TO THE PROBLEM OF COLUMN SELECTION IN A RECTANGULAR MATRIX

arXiv:1509.00748v1 [math.FA] 2 Sep 2015

STÉPHANE CHRÉTIEN AND SÉBASTIEN DARSES

Abstract. The problem of extracting a well conditioned submatrix from any rectangular matrix (with normalized columns) has been studied for some time in functional and harmonic analysis; see [1, 4, 6] for methods using random column selection. More constructive approaches have been proposed recently; see the contributions of [3, 7]. The column selection problem we consider in this paper is concerned with extracting a well conditioned submatrix, i.e. a matrix whose singular values all lie in $[1-\varepsilon, 1+\varepsilon]$. We provide individual lower and upper bounds for each singular value of the extracted matrix at the price of conceding only one log factor in the number of columns, when compared to the Restricted Invertibility Theorem of Bourgain and Tzafriri. Our method is fully constructive and the proof is short and elementary.

1. Introduction

Let $X \in \mathbb{R}^{n \times p}$ be a matrix such that all columns of $X$ have unit euclidean $\ell_2$-norm. We denote by $\|x\|_2$ the $\ell_2$-norm of a vector $x$ and by $\|X\|$ (resp. $\|X\|_{HS}$) the associated operator norm (resp. the Hilbert-Schmidt norm). Let $X_T$ denote the submatrix of $X$ obtained by extracting the columns of $X$ indexed by $T \subset \{1, \ldots, p\}$. For any real symmetric matrix $A$, let $\lambda_k(A)$ denote the $k$-th eigenvalue of $A$, and we order the eigenvalues as $\lambda_1(A) \ge \lambda_2(A) \ge \cdots$. We also write $\lambda_{\min}(A)$ (resp. $\lambda_{\max}(A)$) for the smallest (resp. largest) eigenvalue of $A$. We finally write $|S|$ for the size of a set $S$.

The problem of well conditioned column selection that we consider here consists in finding the largest subset of columns of $X$ such that the corresponding submatrix has all singular values in a prescribed interval $[1-\varepsilon, 1+\varepsilon]$. The one-sided problem of finding the largest possible $T$ such that $\lambda_{\min}(X_T^t X_T) \ge 1-\varepsilon$ is called the Restricted Invertibility Problem and has a long history starting with the seminal work of Bourgain and Tzafriri [1]. Applications of such results are well known in the domain of harmonic analysis [1]. The condition number is also a subject of extensive study in statistics and signal processing [5].
Here, we propose an elementary approach to this problem based on two simple ingredients:

(1) choosing recursively $y \in V$, the set of remaining columns of $X$, verifying
\[
Q(y) \;\le\; \frac{1}{|V|} \sum_{x \in V} Q(x),
\]
where $Q$ is a relevant quantity depending on the previously chosen vectors;

(2) a well-known equation (sometimes called the secular equation) whose roots are the eigenvalues of a square matrix after appending a row and a column.

We obtain a slightly weaker bound (up to a log factor) on the size of the involved largest subset of columns, but also a more precise result: equispaced upper and lower bounds for all ordered individual singular values of the extracted matrix $X_T$.

1.1. Historical background. Concerning the Restricted Invertibility problem, Bourgain and Tzafriri [1] obtained the following result for square matrices:

Theorem 1.1 ([1]). Given a $p \times p$ matrix $X$ whose columns have unit $\ell_2$-norm, there exists $T \subset \{1, \ldots, p\}$ with
\[
|T| \;\ge\; d\, \frac{p}{\|X\|^2}
\]
such that $C \le \lambda_{\min}(X_T^t X_T)$, where $d$ and $C$ are absolute constants.
See also [4] for a simpler proof. Vershynin [6] generalized Bourgain and Tzafriri's result to the case of rectangular matrices and the estimate of $|T|$ was improved as follows.

Theorem 1.2 ([6]). Given an $n \times p$ matrix $X$, let $\widetilde{X}$ be the matrix obtained from $X$ by $\ell_2$-normalizing its columns. Then, for any $\varepsilon \in (0, 1)$, there exists $T \subset \{1, \ldots, p\}$ with
\[
|T| \;\ge\; (1-\varepsilon)\, \frac{\|X\|_{HS}^2}{\|X\|^2}
\]
such that $C_1(\varepsilon) \le \lambda_{\min}(\widetilde{X}_T^t \widetilde{X}_T) \le \lambda_{\max}(\widetilde{X}_T^t \widetilde{X}_T) \le C_2(\varepsilon)$.

Recently, Spielman and Srivastava proposed in [3] a deterministic construction of $T$ which allows them to obtain the following result.

Theorem 1.3 ([3]). Let $X$ be a $p \times p$ matrix and $\varepsilon \in (0, 1)$. Then there exists $T \subset \{1, \ldots, p\}$ with
\[
|T| \;\ge\; (1-\varepsilon)^2\, \frac{\|X\|_{HS}^2}{\|X\|^2}
\]
such that
\[
\varepsilon^2\, \frac{\|X\|_{HS}^2}{p} \;\le\; \lambda_{\min}(X_T^t X_T).
\]

The technique of proof relies on new constructions and inequalities which are thoroughly explained in the Bourbaki seminar of Naor [2]. Using these techniques, Youssef [7] improved Vershynin's result as follows.

Theorem 1.4 ([7]). Given an $n \times p$ matrix $X$, let $\widetilde{X}$ be the matrix obtained from $X$ by $\ell_2$-normalizing its columns. Then, for any $\varepsilon \in (0, 1)$, there exists $T \subset \{1, \ldots, p\}$ with
\[
|T| \;\ge\; \frac{\varepsilon^2}{9}\, \frac{\|X\|_{HS}^2}{\|X\|^2}
\]
such that $1-\varepsilon \le \lambda_{\min}(\widetilde{X}_T^t \widetilde{X}_T) \le \lambda_{\max}(\widetilde{X}_T^t \widetilde{X}_T) \le 1+\varepsilon$.
1.2. Our contribution. We propose a short and elementary proof of the following result:

Theorem 1.5. Given an $n \times p$ matrix $X$ whose columns have unit $\ell_2$-norm and a constant $\varepsilon \in (0, 1)$, there exists $T \subset \{1, \ldots, p\}$ with $|T| = R$ and
\[
(1.1) \qquad R \log R \;\le\; \frac{\varepsilon^2}{4(1+\varepsilon)}\, \frac{p}{\|X\|^2},
\]
such that $1-\varepsilon \le \lambda_{\min}(X_T^t X_T) \le \lambda_{\max}(X_T^t X_T) \le 1+\varepsilon$.

Notice that when the columns of $X$ have unit $\ell_2$-norm, we have $\|X\|_{HS}^2 = \mathrm{Tr}(XX^t) = p$. The price to pay for this short proof is a log factor in (1.1), but we are able to obtain an individual control of each eigenvalue, see Lemma 2.2, which might be interesting in its own right.
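As a numerical illustration (not from the paper), one can compute the largest integer $R$ satisfying (1.1) for a given matrix. The sketch below uses a hypothetical random matrix with normalized columns; the dimensions and $\varepsilon$ are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: a random 50 x 1000 matrix with unit-norm columns.
n, p = 50, 1000
X = rng.standard_normal((n, p))
X /= np.linalg.norm(X, axis=0)

eps = 0.9
op_norm_sq = np.linalg.norm(X, 2) ** 2  # ||X||^2, squared operator norm

# Largest integer R with R log R <= eps^2 p / (4 (1 + eps) ||X||^2), as in (1.1).
budget = eps**2 * p / (4 * (1 + eps) * op_norm_sq)
R = 1
while (R + 1) * np.log(R + 1) <= budget:
    R += 1
print(R)
```

For well-spread random columns $\|X\|^2 \approx p/n$ up to constants, so the admissible $R$ grows with $p$ only logarithmically slower than the dimension-free benchmark.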
2. Proof of Theorem 1.5
2.1. Suitable choice of the extracted vectors. Consider the set of vectors $V_0 = \{x_1, \ldots, x_p\}$. At step 1, choose $y_1 \in V_0$. By induction, let us be given $y_1, \ldots, y_r$ at step $r$. Let $Y_r$ denote the matrix whose columns are $y_1, \ldots, y_r$ and let $v_k$ be a unit eigenvector of $Y_r^t Y_r$ associated to $\lambda_{k,r} := \lambda_k(Y_r^t Y_r)$. Let us choose $y_{r+1} \in V_r := \{x_1, \ldots, x_p\} \setminus \{y_1, \ldots, y_r\}$ so that
\[
(2.2) \qquad \sum_{k=1}^{r} \frac{(v_k^t Y_r^t y_{r+1})^2}{k} \;\le\; \frac{1}{p-r} \sum_{x \in V_r} \sum_{k=1}^{r} \frac{(v_k^t Y_r^t x)^2}{k} \;=\; \frac{1}{p-r} \sum_{k=1}^{r} \frac{\sum_{x \in V_r} (v_k^t Y_r^t x)^2}{k}.
\]

Lemma 2.1. For all $r \ge 1$, $y_{r+1}$ verifies
\[
\sum_{k=1}^{r} \frac{(v_k^t Y_r^t y_{r+1})^2}{k} \;\le\; \frac{\lambda_{1,r}\, \|X\|^2 \log(r)}{p-r}.
\]

Proof. Let $X_r$ be the matrix whose columns are the $x \in V_r$, i.e. $X_r X_r^t = \sum_{x \in V_r} x x^t$. Then
\[
\sum_{x \in V_r} (v_k^t Y_r^t x)^2 = \mathrm{Tr}\big(Y_r v_k v_k^t Y_r^t X_r X_r^t\big) \;\le\; \mathrm{Tr}(Y_r v_k v_k^t Y_r^t)\, \|X_r X_r^t\| \;\le\; \lambda_{k,r} \|X\|^2,
\]
which yields the conclusion by plugging into (2.2), since $\lambda_{k,r} \le \lambda_{1,r}$.
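The averaging argument behind (2.2) only requires picking a column whose weighted score is at most the mean over the remaining columns; a minimizer always qualifies. A minimal numpy sketch of one selection step, on hypothetical random data (the matrix, dimensions, and helper `Q` below are illustrative assumptions, not the paper's notation for an algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: X with unit-norm columns; the first r columns play y_1, ..., y_r.
n, p, r = 20, 100, 5
X = rng.standard_normal((n, p))
X /= np.linalg.norm(X, axis=0)
chosen = list(range(r))
remaining = [j for j in range(p) if j not in chosen]
Yr = X[:, chosen]

# Eigenpairs of Y_r^t Y_r, reordered so eigenvalues decrease as in the text.
lam, V = np.linalg.eigh(Yr.T @ Yr)
lam, V = lam[::-1], V[:, ::-1]

def Q(x):
    # Weighted quantity of (2.2): sum_k (v_k^t Y_r^t x)^2 / k.
    proj = V.T @ (Yr.T @ x)
    return np.sum(proj**2 / np.arange(1, r + 1))

scores = np.array([Q(X[:, j]) for j in remaining])
j_next = remaining[int(np.argmin(scores))]  # candidate y_{r+1}

# The minimizer is automatically below the average required by (2.2).
assert Q(X[:, j_next]) <= scores.mean()
```

Taking the minimizer rather than any below-average column is the simplest way to satisfy (2.2) in practice.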

2.2. Controlling the individual eigenvalues. Let us define $\alpha$ as
\[
\alpha = \sqrt{\frac{(1+\varepsilon)\|X\|^2 \log R}{p}},
\]
so that, from (1.1), $2\sqrt{R}\,\alpha \le \varepsilon$.
Lemma 2.2. For all $r$ and $k$ with $1 \le k \le r \le R$, we have
\[
(2.3) \qquad 1 - \frac{r+k-1}{\sqrt{r}}\,\alpha \;\le\; \lambda_{k,r} \;\le\; 1 + \frac{2r-k}{\sqrt{r}}\,\alpha.
\]

Proof. It is clear that (2.3) holds for $r = 1$ since then, 1 is the only singular value because the columns are supposed to be normalized.

Assume the induction hypothesis $(H_r)$: for all $k$ with $1 \le k \le r < R$, (2.3) holds. Let us then show that $(H_{r+1})$ holds. By the Cauchy interlacing theorem, we have
\[
\lambda_{k+1,r+1} \le \lambda_{k,r}, \quad 1 \le k \le r,
\]
\[
\lambda_{k+1,r+1} \ge \lambda_{k+1,r}, \quad 0 \le k \le r-1.
\]
Using $(r+1)(2r-k)^2 \le r(2r+1-k)^2$ and $(r+1)(r+k)^2 \le r(r+1+k)^2$, we thus deduce
\[
\lambda_{k+1,r+1} \;\le\; 1 + \frac{2r-k}{\sqrt{r}}\,\alpha \;\le\; 1 + \frac{2(r+1)-(k+1)}{\sqrt{r+1}}\,\alpha, \quad 1 \le k \le r,
\]
\[
\lambda_{k+1,r+1} \;\ge\; 1 - \frac{r+k}{\sqrt{r}}\,\alpha \;\ge\; 1 - \frac{(r+1)+(k+1)-1}{\sqrt{r+1}}\,\alpha, \quad 0 \le k \le r-1.
\]
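The two interlacing inequalities invoked above can be checked numerically; the sketch below uses hypothetical random matrices and the decreasing eigenvalue ordering of the text.

```python
import numpy as np

rng = np.random.default_rng(2)

# Append one column to Y_r and verify Cauchy interlacing for the Gram matrices.
n, r = 15, 6
Y = rng.standard_normal((n, r))
y = rng.standard_normal((n, 1))
lam_r = np.sort(np.linalg.eigvalsh(Y.T @ Y))[::-1]          # lambda_{k,r}, decreasing
Yp = np.hstack([y, Y])
lam_r1 = np.sort(np.linalg.eigvalsh(Yp.T @ Yp))[::-1]       # lambda_{k,r+1}, decreasing

# lambda_{k+1,r+1} <= lambda_{k,r} for 1 <= k <= r
assert np.all(lam_r1[1:] <= lam_r + 1e-10)
# lambda_{k+1,r+1} >= lambda_{k+1,r} for 0 <= k <= r-1
assert np.all(lam_r1[:r] >= lam_r - 1e-10)
```

Interlacing holds here because $Y_r^t Y_r$ is a principal submatrix of $Y_{r+1}^t Y_{r+1}$.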

It remains to obtain the upper estimate for $\lambda_{1,r+1}$ and the lower one for $\lambda_{r+1,r+1}$. We write
\[
(2.4) \qquad Y_{r+1}^t Y_{r+1} = \begin{pmatrix} y_{r+1}^t \\ Y_r^t \end{pmatrix} \begin{pmatrix} y_{r+1} & Y_r \end{pmatrix} = \begin{pmatrix} 1 & y_{r+1}^t Y_r \\ Y_r^t y_{r+1} & Y_r^t Y_r \end{pmatrix},
\]
and it is well known that the eigenvalues of $Y_{r+1}^t Y_{r+1}$ are the zeros of the secular equation:
\[
(2.5) \qquad q(\lambda) := 1 - \lambda + \sum_{k=1}^{r} \frac{(v_k^t Y_r^t y_{r+1})^2}{\lambda - \lambda_{k,r}} = 0.
\]
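The secular equation (2.5) can be verified numerically: build the bordered Gram matrix (2.4) from generic random data and evaluate $q$ at each computed eigenvalue. The data below is a hypothetical example (the appended column is normalized so the corner entry is 1, as in the paper).

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical Y_r and appended unit column y_{r+1}.
n, r = 12, 4
Yr = rng.standard_normal((n, r))
y = rng.standard_normal((n, 1))
y /= np.linalg.norm(y)                  # unit norm, so the corner entry of (2.4) is 1

lam = np.linalg.eigvalsh(Yr.T @ Yr)     # lambda_{k,r}
_, V = np.linalg.eigh(Yr.T @ Yr)
b = (V.T @ (Yr.T @ y)).ravel()          # coefficients v_k^t Y_r^t y_{r+1}

# Bordered matrix (2.4) and its eigenvalues.
M = np.block([[np.ones((1, 1)), y.T @ Yr], [Yr.T @ y, Yr.T @ Yr]])
eigs = np.linalg.eigvalsh(M)

def q(x):
    # Secular function of (2.5).
    return 1.0 - x + np.sum(b**2 / (x - lam))

for mu in eigs:
    assert abs(q(mu)) < 1e-6
```

Generically the eigenvalues of the bordered matrix avoid the $\lambda_{k,r}$, so $q$ is finite at each of them and vanishes there up to rounding.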


We first estimate $\lambda_{1,r+1}$, which is the greatest zero of $q$, and assume for contradiction that
\[
(2.6) \qquad \lambda_{1,r+1} > 1 + 2\sqrt{r}\,\alpha.
\]
From $(H_r)$, we then obtain that for $\lambda \ge 1 + 2\sqrt{r}\,\alpha \ge \lambda_{1,r} + \alpha/\sqrt{r}$,
\[
q(\lambda) \;\le\; 1 - \lambda + \frac{\sqrt{r}}{\alpha} \sum_{k=1}^{r} \frac{(v_k^t Y_r^t y_{r+1})^2}{k} =: g(\lambda).
\]
Let $\lambda_0$ be the zero of $g$. We have $g(\lambda_{1,r+1}) \ge q(\lambda_{1,r+1}) = 0 = g(\lambda_0)$. But $g$ is decreasing, so
\[
\lambda_{1,r+1} \;\le\; \lambda_0 = 1 + \frac{\sqrt{r}}{\alpha} \sum_{k=1}^{r} \frac{(v_k^t Y_r^t y_{r+1})^2}{k}.
\]
By $(H_r)$, $\lambda_{1,r} \le 1 + 2\sqrt{R}\,\alpha \le 1 + \varepsilon$. Thus, using Lemma 2.1 and noting that $r \le p/2$,
\[
\lambda_{1,r+1} \;\le\; 1 + \frac{2\sqrt{r}\,(1+\varepsilon)\|X\|^2 \log(R)}{\alpha\, p} = 1 + 2\sqrt{r}\,\alpha,
\]
which yields a contradiction with inequality (2.6). Thus, we have $\lambda_{1,r+1} \le 1 + 2\sqrt{r}\,\alpha$, and therefore $\lambda_{1,r+1} \le 1 + \frac{2r+1}{\sqrt{r+1}}\,\alpha$. This shows that the upper bound in $(H_{r+1})$ holds.
Finally, to estimate $\lambda_{r+1,r+1}$, which is the smallest zero of $q$, we write using $(H_r)$ that for $\lambda \le 1 - 2\sqrt{r}\,\alpha \le \lambda_{r,r} - \alpha/\sqrt{r}$,
\[
q(\lambda) \;\ge\; 1 - \lambda - \frac{\sqrt{r}}{\alpha} \sum_{k=1}^{r} \frac{(v_k^t Y_r^t y_{r+1})^2}{k} =: \widetilde{g}(\lambda).
\]
By means of the same reasoning as above, we prove by contradiction that $\lambda_{r+1,r+1} \ge 1 - 2\sqrt{r}\,\alpha$, which gives $\lambda_{r+1,r+1} \ge 1 - \frac{2r+1}{\sqrt{r+1}}\,\alpha$ and shows that the lower bound in $(H_{r+1})$ holds. This completes the proof of Lemma 2.2.

In particular, we have for all $r \le R$, $\lambda_{1,r} \le 1 + 2\sqrt{R}\,\alpha \le 1 + \varepsilon$ and $\lambda_{r,r} \ge 1 - 2\sqrt{R}\,\alpha \ge 1 - \varepsilon$. This concludes the proof of Theorem 1.5.

Remark 2.3. Many other induction hypotheses may be proposed: $\lambda_{k,r} \le u(k,r)$, where $u$ is required to verify $u(k,r) \le u(k+1,r+1)$. The criterion to choose the next vector $y_{r+1}$ then has to be modified accordingly. For instance, it can also be proven that one can extract a submatrix so that $\lambda_{k,r} \le 1 + \alpha\sqrt{r-k}$. This yields as well the weaker bound with the log factor.
References

1. Bourgain, J. and Tzafriri, L., Invertibility of "large" submatrices with applications to the geometry of Banach spaces and harmonic analysis. Israel J. Math. 57 (1987), no. 2, 137–224.
2. Naor, A., Sparse quadratic forms and their geometric applications [following Batson, Spielman and Srivastava]. Séminaire Bourbaki: Vol. 2010/2011. Exposés 1027–1042. Astérisque No. 348 (2012), Exp. No. 1033, 189–217.
3. Spielman, D. A. and Srivastava, N., An elementary proof of the restricted invertibility theorem. Israel J. Math. 190 (2012), 83–91.
4. Tropp, J., The random paving property for uniformly bounded matrices. Studia Math. 185 (2008), no. 1, 67–82.
5. Tropp, J., Norms of random submatrices and sparse approximation. C. R. Acad. Sci. Paris, Ser. I 346 (2008), 1271–1274.
6. Vershynin, R., John's decompositions: selecting a large part. Israel J. Math. 122 (2001), 253–277.
7. Youssef, P., A note on column subset selection. Int. Math. Res. Not. IMRN 2014, no. 23, 6431–6447.


National Physical Laboratory, Hampton Road, Teddington TW11 0LW, UK
E-mail address: stephane.chretien@npl.co.uk

LATP, UMR 6632, Université d'Aix-Marseille, Technopôle Château-Gombert, 39 rue Joliot Curie, 13453 Marseille Cedex 13, France
E-mail address: sebastien.darses@univ-amu.fr
