
We extend the transient analysis of the earlier chapters to more general filter recursions, starting with the normalized LMS algorithm.


25.1 NLMS FILTER
We thus consider the ε-NLMS recursion

$$ w_i = w_{i-1} + \mu\,\frac{u_i^*}{\epsilon + \|u_i\|^2}\,\big(d(i) - u_i w_{i-1}\big) \tag{25.1} $$

for which the data normalization in (22.2) is given by

$$ g[u_i] = \epsilon + \|u_i\|^2 \tag{25.2} $$

In this case, relations (22.26)-(22.27) and (22.29) become

$$ \begin{aligned} E\|\tilde w_i\|^2_\Sigma &= E\|\tilde w_{i-1}\|^2_{\Sigma'} + \mu^2\sigma_v^2\, E\left[\frac{\|u_i\|^2_\Sigma}{(\epsilon+\|u_i\|^2)^2}\right] \\ \Sigma' &= \Sigma - \mu\, E\left[\frac{u_i^* u_i \Sigma + \Sigma u_i^* u_i}{\epsilon+\|u_i\|^2}\right] + \mu^2\, E\left[\frac{\|u_i\|^2_\Sigma\, u_i^* u_i}{(\epsilon+\|u_i\|^2)^2}\right] \\ E\,\tilde w_i &= \left(I - \mu\, E\left[\frac{u_i^* u_i}{\epsilon+\|u_i\|^2}\right]\right) E\,\tilde w_{i-1} \end{aligned} \tag{25.3} $$

and we see that we need to evaluate the moments

$$ E\left[\frac{u_i^* u_i}{\epsilon+\|u_i\|^2}\right], \qquad E\left[\frac{\|u_i\|^2_\Sigma\, u_i^* u_i}{(\epsilon+\|u_i\|^2)^2}\right], \qquad E\left[\frac{\|u_i\|^2_\Sigma}{(\epsilon+\|u_i\|^2)^2}\right] $$
Unfortunately, closed-form expressions for these moments are not available in general, even for Gaussian regressors. Still, we will be able to show that the filter is convergent in the mean and is also mean-square stable for step-sizes satisfying $\mu < 2$, and regardless of the input distribution (Gaussian or otherwise) - see App. 25.B. We therefore treat the general case directly. Since the arguments are similar to those in Chapter 24 for LMS, we shall be brief.
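As a concrete numerical sketch (not from the book), the recursion (25.1) can be simulated in a few lines of NumPy. Everything here is illustrative: real-valued Gaussian regressors, and made-up values for $\mu$, $\epsilon$, and the noise variance $\sigma_v^2$; these names are reused by the later snippets.

```python
import numpy as np

def eps_nlms(d, U, M, mu, eps):
    """Run the eps-NLMS recursion w_i = w_{i-1} + mu*u_i^T e(i)/(eps + ||u_i||^2)
    for real data; returns the final weight estimate and the error sequence."""
    w = np.zeros(M)
    e = np.zeros(len(d))
    for i in range(len(d)):
        u = U[i]
        e[i] = d[i] - u @ w                     # a priori error e(i) = d(i) - u_i w_{i-1}
        w = w + mu * e[i] * u / (eps + u @ u)   # normalized update (25.1)
    return w, e

# Identify an unknown FIR vector w_o from noisy data d(i) = u_i w_o + v(i).
rng = np.random.default_rng(0)
M, N = 4, 5000
mu, eps, sigma_v2 = 0.5, 1e-3, 0.01             # illustrative values
w_o = rng.standard_normal(M)
U = rng.standard_normal((N, M))                 # rows are the regressors u_i
d = U @ w_o + np.sqrt(sigma_v2) * rng.standard_normal(N)
w_hat, e = eps_nlms(d, U, M, mu, eps)
print(np.sum((w_hat - w_o) ** 2))               # squared deviation ||w~||^2
```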
Thus, introduce the $M^2 \times 1$ vectors

$$ \sigma \triangleq \mathrm{vec}(\Sigma), \qquad r \triangleq \mathrm{vec}(R_u) $$
as well as the $M^2 \times M^2$ matrices

$$ A \triangleq \left(E\left[\frac{u_i^* u_i}{\epsilon+\|u_i\|^2}\right]\right)^T \otimes I \;+\; I \otimes \left(E\left[\frac{u_i^* u_i}{\epsilon+\|u_i\|^2}\right]\right) \tag{25.4} $$

$$ B \triangleq E\left[\left(\frac{u_i^* u_i}{\epsilon+\|u_i\|^2}\right)^T \otimes \left(\frac{u_i^* u_i}{\epsilon+\|u_i\|^2}\right)\right] \tag{25.5} $$

and the $M \times M$ matrix

$$ P \triangleq E\left[\frac{u_i^* u_i}{\epsilon+\|u_i\|^2}\right] \tag{25.6} $$
The matrix $A$ is positive-definite, while $B$ is nonnegative-definite - see Prob. V.6. Applying the vec notation to both sides of the expression for $\Sigma'$ in (25.3), we find that it reduces to

$$ \sigma' = F\sigma \tag{25.7} $$

where $F$ is $M^2 \times M^2$ and given by

$$ F \triangleq I - \mu A + \mu^2 B \tag{25.8} $$
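Continuing the illustrative sketch above, the moments (25.4)-(25.6) can be estimated by sample averages over the regressors, after which $F$ in (25.8) is formed directly; the helper name `nlms_moments` is our own.

```python
import numpy as np

def nlms_moments(U, eps):
    """Sample estimates of P (25.6), A (25.4) and B (25.5) for eps-NLMS,
    using real regressor rows U[i] (so u_i^* u_i becomes an outer product)."""
    N, M = U.shape
    I = np.eye(M)
    P = np.zeros((M, M))
    A = np.zeros((M * M, M * M))
    B = np.zeros((M * M, M * M))
    for u in U:
        R1 = np.outer(u, u) / (eps + u @ u)     # rank-one normalized outer product
        P += R1
        A += np.kron(R1.T, I) + np.kron(I, R1)
        B += np.kron(R1.T, R1)
    return P / N, A / N, B / N

P, A, B = nlms_moments(U, eps)
F = np.eye(M * M) - mu * A + mu**2 * B          # (25.8)
print(np.max(np.abs(np.linalg.eigvals(F))))     # < 1: mean-square stable
```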
Moreover, the recursion for $E\,\tilde w_i$ can be written as

$$ E\,\tilde w_i = (I - \mu P)\, E\,\tilde w_{i-1} \tag{25.9} $$

The same arguments that were used in Sec. 24.1 will show that the mean-square behavior of ε-NLMS is characterized by the following $M^2$-dimensional state-space model:

$$ \mathcal{W}_i = \mathcal{F}\, \mathcal{W}_{i-1} + \mu^2 \sigma_v^2\, \mathcal{Y} \tag{25.10} $$

where $\mathcal{F}$ is the companion matrix

$$ \mathcal{F} = \begin{bmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ -p_0 & -p_1 & \cdots & -p_{M^2-2} & -p_{M^2-1} \end{bmatrix} \qquad (M^2 \times M^2) $$

with

$$ p(x) \triangleq \det(xI - F) = x^{M^2} + \sum_{k=0}^{M^2-1} p_k x^k $$

denoting the characteristic polynomial of $F$, $\mathcal{W}_i$ is the state vector

$$ \mathcal{W}_i \triangleq \mathrm{col}\left\{ E\|\tilde w_i\|^2_{\sigma},\; E\|\tilde w_i\|^2_{F\sigma},\; E\|\tilde w_i\|^2_{F^2\sigma},\; \ldots,\; E\|\tilde w_i\|^2_{F^{M^2-1}\sigma} \right\} \tag{25.11} $$

and the $k$-th entry of $\mathcal{Y}$ is given by

$$ \mathcal{Y}(k) = E\left[\frac{\|u_i\|^2_{F^k\sigma}}{(\epsilon+\|u_i\|^2)^2}\right], \qquad k = 0, 1, \ldots, M^2 - 1 \tag{25.12} $$
The definitions of $\{\mathcal{W}_i, \mathcal{Y}\}$ are in terms of any $\sigma$ of interest, e.g., most commonly, $\sigma = q$ or $\sigma = r$. It is shown in App. 25.B that any $\mu < 2$ is a sufficient condition for the stability of (25.10).
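The companion matrix $\mathcal{F}$ and the vector $\mathcal{Y}$ can then be assembled numerically from $F$, again as a rough sketch that reuses the variables defined above (plain reshape is used for vec(·), which agrees with column stacking on the real symmetric matrices involved here):

```python
import numpy as np

M2 = M * M
coeffs = np.poly(F)                   # [1, c_1, ..., c_{M2}]: char. polynomial of F
calF = np.zeros((M2, M2))
calF[:-1, 1:] = np.eye(M2 - 1)        # shift structure of the companion form
calF[-1, :] = -coeffs[1:][::-1]       # last row: -p_0, -p_1, ..., -p_{M2-1}

S = np.zeros((M, M))                  # S = E u_i^T u_i / (eps + ||u_i||^2)^2
for u in U:
    S += np.outer(u, u) / (eps + u @ u) ** 2
S /= len(U)
s = S.reshape(-1)                     # vec(S)

sigma = np.eye(M).reshape(-1)         # sigma = q = vec(I): the MSD weighting
Y = np.array([s @ np.linalg.matrix_power(F, k) @ sigma for k in range(M2)])

# Propagate W_i = calF W_{i-1} + mu^2 sigma_v2 Y from w_{-1} = 0, i.e. from
# initial entries w_o^T vec^{-1}(F^k sigma) w_o; the top entry tracks E||w~_i||^2.
W = np.array([np.kron(w_o, w_o) @ np.linalg.matrix_power(F, k) @ sigma
              for k in range(M2)])
for _ in range(500):
    W = calF @ W + mu**2 * sigma_v2 * Y
print(W[0])                           # approaches the steady-state MSD
```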
Theorem 25.1 (Stability of ε-NLMS) Consider the ε-NLMS recursion (25.1) and assume the data $\{d(i), u_i\}$ satisfy model (22.1) and the independence assumption (22.23). The regressors need not be Gaussian. Then the filter is convergent in the mean and is also mean-square stable for any $\mu < 2$. Moreover, the transient behavior of the filter is characterized by the state-space recursion (25.10)-(25.12), and the mean-square deviation and the excess mean-square error are given by

$$ \mathrm{MSD} = \mu^2\sigma_v^2\, \gamma^T (I - F)^{-1} q \qquad\text{and}\qquad \mathrm{EMSE} = \mu^2\sigma_v^2\, \gamma^T (I - F)^{-1} r $$

where $\gamma = \mathrm{vec}\big( E\big[ u_i^* u_i/(\epsilon+\|u_i\|^2)^2 \big] \big)$, $r = \mathrm{vec}(R_u)$, and $q = \mathrm{vec}(I)$.
The expressions for the MSD and EMSE in the statement of the theorem are derived in a manner similar to (23.51) and (23.56). They can be rewritten as

$$ \mathrm{MSD} = \mu^2\sigma_v^2\, \mathrm{Tr}(S\,\Sigma_{\rm msd}) \qquad\text{and}\qquad \mathrm{EMSE} = \mu^2\sigma_v^2\, \mathrm{Tr}(S\,\Sigma_{\rm emse}) $$

where

$$ S \triangleq E\left[ \frac{u_i^* u_i}{(\epsilon + \|u_i\|^2)^2} \right] $$

and the weighting matrices $\{\Sigma_{\rm msd}, \Sigma_{\rm emse}\}$ correspond to the vectors $\sigma_{\rm msd} = (I-F)^{-1} q$ and $\sigma_{\rm emse} = (I-F)^{-1} r$. That is,

$$ \Sigma_{\rm msd} = \mathrm{vec}^{-1}(\sigma_{\rm msd}) \qquad\text{and}\qquad \Sigma_{\rm emse} = \mathrm{vec}^{-1}(\sigma_{\rm emse}) $$
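Both forms of the MSD/EMSE expressions are easy to evaluate numerically; the following sketch (continuing the ones above) checks that they agree:

```python
import numpy as np

q = np.eye(M).reshape(-1)                       # q = vec(I)
R_u = (U.T @ U) / len(U)                        # sample covariance of u_i
r = R_u.reshape(-1)                             # r = vec(R_u)
IF = np.eye(M * M) - F

msd = mu**2 * sigma_v2 * (s @ np.linalg.solve(IF, q))    # gamma^T (I-F)^{-1} q
emse = mu**2 * sigma_v2 * (s @ np.linalg.solve(IF, r))   # gamma^T (I-F)^{-1} r

# Same numbers in the trace form with the weighting matrices:
Sigma_msd = np.linalg.solve(IF, q).reshape(M, M)
Sigma_emse = np.linalg.solve(IF, r).reshape(M, M)
print(np.isclose(msd, mu**2 * sigma_v2 * np.trace(S @ Sigma_msd)))    # True
print(np.isclose(emse, mu**2 * sigma_v2 * np.trace(S @ Sigma_emse)))  # True
```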
Learning Curve
Observe that since $E|e_a(i)|^2 = E\|\tilde w_{i-1}\|^2_{R_u}$, we again find that the time evolution of $E|e_a(i)|^2$ is described by the top entry of the state vector $\mathcal{W}_i$ in (25.10)-(25.12) with $\sigma$ chosen as $\sigma = r$. The learning curve of the filter will be $E|e(i)|^2 = \sigma_v^2 + E|e_a(i)|^2$.
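A quick ensemble simulation (again illustrative, reusing `eps_nlms` and the parameters above) shows the learning curve settling near $\sigma_v^2 + \mathrm{EMSE}$:

```python
import numpy as np

runs, n_iter = 200, 400
mse = np.zeros(n_iter)
for _ in range(runs):
    Ur = rng.standard_normal((n_iter, M))
    dr = Ur @ w_o + np.sqrt(sigma_v2) * rng.standard_normal(n_iter)
    _, er = eps_nlms(dr, Ur, M, mu, eps)
    mse += er**2                            # accumulate |e(i)|^2 across runs
mse /= runs
print(mse[-50:].mean(), sigma_v2 + emse)    # steady state vs. model prediction
```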
Small Step-Size Approximations
Several approximations for the EMSE and MSD expressions that appear in the above theorem are derived in Probs. V.18-V.20. The ultimate conclusion from these problems is that for small enough $\mu$ and $\epsilon$, we get

$$ \mathrm{EMSE} = \frac{\mu\sigma_v^2}{2-\mu}\, E\left(\frac{1}{\epsilon+\|u_i\|^2}\right) \mathrm{Tr}(R_u) \qquad\text{and}\qquad \mathrm{MSD} = \frac{\mu\sigma_v^2 M}{2-\mu}\, E\left(\frac{1}{\epsilon+\|u_i\|^2}\right) \tag{25.13} $$

The expression for the EMSE is the same one we derived earlier in Lemma 17.1.
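The approximations (25.13), as reconstructed above, can be compared against the exact expressions from the previous snippet; the agreement improves as $\mu$ and $\epsilon$ shrink:

```python
import numpy as np

Einv = np.mean([1.0 / (eps + u @ u) for u in U])      # E 1/(eps + ||u_i||^2)
emse_approx = mu * sigma_v2 * Einv * np.trace(R_u) / (2 - mu)
msd_approx = mu * sigma_v2 * M * Einv / (2 - mu)
print(emse, emse_approx)    # comparable; closer for smaller mu and eps
print(msd, msd_approx)
```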
Gaussian Regressors

If the regressors happen to be Gaussian, then it can be shown that the $M^2$-dimensional state-space model (25.10)-(25.12) reduces to an $M$-dimensional model - this assertion is proved in Probs. V.16 and V.17.
25.2 DATA-NORMALIZED FILTERS
The arguments that were employed in the last two sections for LMS and ε-NLMS are general enough and can be applied to adaptive filters with generic data nonlinearities of the form (22.2)-(22.3). To see this, consider again the variance and mean relations (22.26)-(22.27) and (22.29), which are reproduced below:

$$ \begin{aligned} E\|\tilde w_i\|^2_\Sigma &= E\|\tilde w_{i-1}\|^2_{\Sigma'} + \mu^2\sigma_v^2\, E\left[\frac{\|u_i\|^2_\Sigma}{g^2[u_i]}\right] \\ \Sigma' &= \Sigma - \mu\, E\left[\frac{u_i^* u_i \Sigma + \Sigma u_i^* u_i}{g[u_i]}\right] + \mu^2\, E\left[\frac{\|u_i\|^2_\Sigma\, u_i^* u_i}{g^2[u_i]}\right] \\ E\,\tilde w_i &= \left(I - \mu\, E\left[\frac{u_i^* u_i}{g[u_i]}\right]\right) E\,\tilde w_{i-1} \end{aligned} \tag{25.14} $$
If we now introduce the $M^2 \times M^2$ matrices

$$ A \triangleq \left(E\left[\frac{u_i^* u_i}{g[u_i]}\right]\right)^T \otimes I \;+\; I \otimes \left(E\left[\frac{u_i^* u_i}{g[u_i]}\right]\right) \tag{25.15} $$

$$ B \triangleq E\left[\left(\frac{u_i^* u_i}{g[u_i]}\right)^T \otimes \left(\frac{u_i^* u_i}{g[u_i]}\right)\right] \tag{25.16} $$

and the $M \times M$ matrix

$$ P \triangleq E\left[\frac{u_i^* u_i}{g[u_i]}\right] \tag{25.17} $$

then

$$ E\,\tilde w_i = (I - \mu P)\, E\,\tilde w_{i-1} $$

and the expression for $\Sigma'$ can be written in terms of the linear vector relation

$$ \sigma' = F\sigma \tag{25.18} $$

where $F$ is $M^2 \times M^2$ and given by

$$ F \triangleq I - \mu A + \mu^2 B \tag{25.19} $$
Let

$$ H \triangleq \begin{bmatrix} A/2 & -B/2 \\ I & 0 \end{bmatrix} \qquad (2M^2 \times 2M^2) \tag{25.20} $$
Then the same arguments that were used in Chapter 24 will lead to the statement of Thm. 25.2 listed further ahead. The expressions for the MSD and EMSE in the statement of the theorem are derived in a manner similar to (23.51) and (23.56). They can be rewritten as

$$ \mathrm{MSD} = \mu^2\sigma_v^2\,\mathrm{Tr}(S\,\Sigma_{\rm msd}) \qquad\text{and}\qquad \mathrm{EMSE} = \mu^2\sigma_v^2\,\mathrm{Tr}(S\,\Sigma_{\rm emse}) $$

where

$$ S = E\left[\frac{u_i^* u_i}{g^2[u_i]}\right] $$

and the weighting matrices $\{\Sigma_{\rm msd}, \Sigma_{\rm emse}\}$ correspond to the vectors $\sigma_{\rm msd} = (I-F)^{-1} q$ and $\sigma_{\rm emse} = (I-F)^{-1} r$. That is,

$$ \Sigma_{\rm msd} = \mathrm{vec}^{-1}(\sigma_{\rm msd}) \qquad\text{and}\qquad \Sigma_{\rm emse} = \mathrm{vec}^{-1}(\sigma_{\rm emse}) $$
Theorem 25.2 (Stability of data-normalized filters) Consider data-normalized adaptive filters of the form (22.2)-(22.3), and assume the data $\{d(i), u_i\}$ satisfy model (22.1) and the independence assumption (22.23). Then the filter is convergent in the mean and is mean-square stable for step-sizes satisfying

$$ 0 < \mu < \min\left\{ 2/\lambda_{\max}(P),\;\; 1/\lambda_{\max}(A^{-1}B),\;\; 1/\max\{\lambda(H) \in \mathbb{R}_+\} \right\} $$

where the matrices $\{A, B, P, H\}$ are defined by (25.15)-(25.17) and (25.20), and $B$ is assumed finite. Moreover, the transient behavior of the filter is characterized by the $M^2$-dimensional state-space recursion $\mathcal{W}_i = \mathcal{F}\,\mathcal{W}_{i-1} + \mu^2\sigma_v^2\,\mathcal{Y}$, where $\mathcal{F}$ is the companion matrix

$$ \mathcal{F} = \begin{bmatrix} 0 & 1 & & & \\ & 0 & 1 & & \\ & & \ddots & \ddots & \\ & & & 0 & 1 \\ -p_0 & -p_1 & \cdots & -p_{M^2-2} & -p_{M^2-1} \end{bmatrix} \qquad (M^2 \times M^2) $$

with

$$ p(x) \triangleq \det(xI - F) = x^{M^2} + \sum_{k=0}^{M^2-1} p_k x^k $$

denoting the characteristic polynomial of $F$ in (25.19). Also,

$$ \mathcal{W}_i \triangleq \mathrm{col}\left\{ E\|\tilde w_i\|^2_{\sigma},\; E\|\tilde w_i\|^2_{F\sigma},\; \ldots,\; E\|\tilde w_i\|^2_{F^{M^2-1}\sigma} \right\}, \qquad \mathcal{Y}(k) = E\left[\frac{\|u_i\|^2_{F^k\sigma}}{g^2[u_i]}\right] $$

for any $\sigma$ of interest, e.g., $\sigma = q$ or $\sigma = r$. In addition, the mean-square deviation and the excess mean-square error are given by

$$ \mathrm{MSD} = \mu^2\sigma_v^2\, \gamma^T (I-F)^{-1} q \qquad\text{and}\qquad \mathrm{EMSE} = \mu^2\sigma_v^2\, \gamma^T (I-F)^{-1} r $$

where $\gamma = \mathrm{vec}(S)$, $r = \mathrm{vec}(R_u)$, and $q = \mathrm{vec}(I)$.
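The stability bound in Thm. 25.2 is straightforward to evaluate numerically. The sketch below (our own helper, applied to the ε-NLMS moments estimated earlier as a test case) returns the smallest of the three bounds; for ε-NLMS it comes out at 2 or larger, consistent with Thm. 25.1:

```python
import numpy as np

def stability_bound(A, B, P):
    """Smallest of the three step-size bounds in Thm. 25.2 (our own helper).
    Assumes A > 0, and drops the H-condition when H has no positive real
    eigenvalue, as discussed in App. 25.A."""
    bounds = [2.0 / np.max(np.linalg.eigvalsh(P))]
    lam_AB = np.max(np.real(np.linalg.eigvals(np.linalg.solve(A, B))))
    if lam_AB > 0:
        bounds.append(1.0 / lam_AB)
    M2 = A.shape[0]
    H = np.block([[A / 2, -B / 2],
                  [np.eye(M2), np.zeros((M2, M2))]])
    lam_H = np.linalg.eigvals(H)
    pos_real = lam_H.real[(np.abs(lam_H.imag) < 1e-8) & (lam_H.real > 1e-12)]
    if pos_real.size:
        bounds.append(1.0 / pos_real.max())
    return min(bounds)

print(stability_bound(A, B, P))   # for eps-NLMS this is >= 2 (cf. App. 25.B)
```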
Learning Curve

As before, since $E|e_a(i)|^2 = E\|\tilde w_{i-1}\|^2_{R_u}$, we find that the time evolution of $E|e_a(i)|^2$ is described by the top entry of the state vector $\mathcal{W}_i$ with $\sigma$ chosen as $\sigma = r$. The learning curve of the filter will be $E|e(i)|^2 = \sigma_v^2 + E|e_a(i)|^2$.
Small Step-Size Approximations
In Prob. V.39 it is shown that, under a boundedness requirement on the matrix $B$ of fourth moments, data-normalized adaptive filters can be guaranteed to be mean-square stable for sufficiently small step-sizes. That is, there always exists a small-enough step-size that lies within the stability range described in Thm. 25.2.

Now observe that the performance results of Thm. 25.2 are in terms of the moment matrices $\{A, B, P\}$. These moments are generally hard to evaluate for arbitrary input distributions and data nonlinearities $g[\cdot]$. However, some simplifications occur when the step-size is sufficiently small. This is because, in this case, we may ignore the quadratic term in $\mu$ that appears in the expression for $\Sigma'$ in (25.14), and thereby approximate the variance and mean relations by

$$ \begin{aligned} E\|\tilde w_i\|^2_\Sigma &= E\|\tilde w_{i-1}\|^2_{\Sigma'} + \mu^2\sigma_v^2\, E\left[\frac{\|u_i\|^2_\Sigma}{g^2[u_i]}\right] \\ \Sigma' &= \Sigma - \mu\, E\left[\frac{u_i^* u_i \Sigma + \Sigma u_i^* u_i}{g[u_i]}\right] \\ E\,\tilde w_i &= (I - \mu P)\, E\,\tilde w_{i-1} \end{aligned} $$

where $P$ is as in (25.17). Using the weighting vector notation, we can write

$$ E\,\tilde w_i = (I - \mu P)\, E\,\tilde w_{i-1} \tag{25.21} $$

$$ E\|\tilde w_i\|^2_\sigma = E\|\tilde w_{i-1}\|^2_{F\sigma} + \mu^2\sigma_v^2\, E\left[\frac{\|u_i\|^2_\sigma}{g^2[u_i]}\right] \tag{25.22} $$

where now

$$ F = I - \mu A, \qquad A = (P^T \otimes I) + (I \otimes P) \tag{25.23} $$
The variance relation (25.22) would then lead to the following approximate expressions for the filter EMSE and MSD:

$$ \mathrm{EMSE} = \mu^2\sigma_v^2\,\mathrm{Tr}(S\,\Sigma_{\rm emse}) \qquad\text{and}\qquad \mathrm{MSD} = \mu^2\sigma_v^2\,\mathrm{Tr}(S\,\Sigma_{\rm msd}) $$

where

$$ S = E\left[\frac{u_i^* u_i}{g^2[u_i]}\right] $$

and the weighting matrices $\{\Sigma_{\rm emse}, \Sigma_{\rm msd}\}$ correspond to the vectors $\sigma_{\rm emse} = A^{-1}\mathrm{vec}(R_u)/\mu$ and $\sigma_{\rm msd} = A^{-1}\mathrm{vec}(I)/\mu$. That is, $\{\Sigma_{\rm emse}, \Sigma_{\rm msd}\}$ are the unique solutions of the Lyapunov equations

$$ \mu P\,\Sigma_{\rm msd} + \mu\,\Sigma_{\rm msd} P = I $$

and

$$ \mu P\,\Sigma_{\rm emse} + \mu\,\Sigma_{\rm emse} P = R_u $$

It is easy to verify that $\Sigma_{\rm msd} = \mu^{-1}P^{-1}/2$, so that the performance expressions can be rewritten as

$$ \mathrm{EMSE} = \mu^2\sigma_v^2\,\mathrm{Tr}(S\,\Sigma_{\rm emse}), \qquad \mathrm{MSD} = \mu\sigma_v^2\,\mathrm{Tr}(S P^{-1})/2 $$
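These small step-size expressions can be checked numerically with a standard Lyapunov solver; the sketch below reuses the ε-NLMS moments from earlier (so $g[u_i] = \epsilon + \|u_i\|^2$) and verifies the closed form $\Sigma_{\rm msd} = \mu^{-1}P^{-1}/2$:

```python
import numpy as np
from scipy.linalg import solve_lyapunov          # solves a x + x a^H = q

Sigma_msd = solve_lyapunov(mu * P, np.eye(M))    # mu(P X + X P) = I
Sigma_emse = solve_lyapunov(mu * P, R_u)         # mu(P X + X P) = R_u
print(np.allclose(Sigma_msd, np.linalg.inv(P) / (2 * mu)))     # True
print(mu * sigma_v2 * np.trace(S @ np.linalg.inv(P)) / 2)      # approx. MSD
print(mu**2 * sigma_v2 * np.trace(S @ Sigma_emse))             # approx. EMSE
```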
Remark 25.1 (Filters with error nonlinearities) There is more to say about the transient per-
formance of adaptive filters, especially for filters with error nonlinearities in their update equations.
This is a more challenging class of filters to study and their performance is examined in App. 9.C
of Sayed (2003) by using the same energy-conservation arguments of this part. The derivation used
in that appendix to study adaptive filters with error nonlinearities can also be used to provide an
alternative simplified transient analysis for data-normalized filters. The derivation is based on a long
filter assumption in order to justify a Gaussian condition on the distribution of the a priori error
signal. Among other results, it is shown in App. 9.C of Sayed (2003) that the transient behavior of
data-normalized filters can be approximated by an M-dimensional linear time-invariant state-space
model even for non-Gaussian regressors. Appendix 9.E of the same reference further examines the
learning abilities of adaptive filters and shows, among other interesting results, that the learning
behavior of LMS cannot be fully captured by relying solely on mean-square analysis!
25.A APPENDIX: STABILITY BOUND
Consider a matrix $F$ of the form $F = I - \mu A + \mu^2 B$ with $A > 0$, $B \geq 0$, and $\mu > 0$. Matrices of this form arise frequently in the study of the mean-square stability of adaptive filters (see, e.g., (25.19)). The purpose of this section is to find conditions on $\mu$ in terms of $\{A, B\}$ in order to guarantee that all eigenvalues of $F$ are strictly inside the unit circle, i.e., so that $-1 < \lambda(F) < 1$.

To begin with, in order to guarantee $\lambda(F) < 1$, the step-size $\mu$ should be such that (cf. the Rayleigh-Ritz characterization of eigenvalues from Sec. B.1):

$$ \max_{\|x\|=1} x^*(I - \mu A + \mu^2 B)x < 1 $$

or, equivalently, $A - \mu B > 0$. The arguments in parts (b) and (c) of Prob. V.3 then show that this condition holds if, and only if,

$$ \mu < 1/\lambda_{\max}(A^{-1}B) \tag{25.24} $$
Moreover, in order to guarantee $\lambda(F) > -1$, the step-size $\mu$ should be such that

$$ \min_{\|x\|=1} x^*(I - \mu A + \mu^2 B)x > -1 $$

or, equivalently, $G(\mu) \triangleq 2I - \mu A + \mu^2 B > 0$. When $\mu = 0$, the eigenvalues of $G$ are all positive and equal to 2. As $\mu$ increases, the eigenvalues of $G$ vary continuously with $\mu$. Indeed, the eigenvalues of $G(\mu)$ are the roots of $\det[\lambda I - G(\mu)] = 0$. This is a polynomial equation in $\lambda$ and its coefficients are functions of $\mu$. A fundamental result in function theory and matrix analysis states that the zeros of a polynomial depend continuously on its coefficients and, consequently, the eigenvalues of $G(\mu)$ vary continuously with $\mu$. This means that $G(\mu)$ will first become singular before becoming indefinite. For this reason, there is an upper bound on $\mu$, say, $\mu_{\max}$, such that $G(\mu) > 0$ for all $\mu < \mu_{\max}$. This bound on $\mu$ is equal to the smallest value of $\mu$ that makes $G(\mu)$ singular, i.e., for which $\det[G(\mu)] = 0$. Now note that the determinant of $G(\mu)$ is equal to the determinant of the block matrix

$$ K(\mu) \triangleq \begin{bmatrix} 2I - \mu A & \mu B \\ -\mu I & I \end{bmatrix} $$

since

$$ \det\left(\begin{bmatrix} X & Y \\ W & Z \end{bmatrix}\right) = \det(Z)\det(X - YZ^{-1}W) $$

whenever $Z$ is invertible. Moreover, since we can write

$$ K(\mu) = \begin{bmatrix} 2I & 0 \\ 0 & I \end{bmatrix}\left(I - \mu H\right), \qquad H \triangleq \begin{bmatrix} A/2 & -B/2 \\ I & 0 \end{bmatrix} $$

we find that the condition $\det[K(\mu)] = 0$ is equivalent to $\det(I - \mu H) = 0$.
In this way, the smallest positive $\mu$ that results in $\det[K(\mu)] = 0$ is equal to

$$ \mu_{\max} = \frac{1}{\max\{\lambda(H) \in \mathbb{R}_+\}} \tag{25.25} $$

in terms of the largest positive real eigenvalue of $H$, when it exists. The results (25.24)-(25.25) can be grouped together to yield the condition

$$ \mu < \min\left\{ 1/\lambda_{\max}(A^{-1}B),\;\; 1/\max\{\lambda(H) \in \mathbb{R}_+\} \right\} \tag{25.26} $$

If $H$ does not have any real positive eigenvalue, then the corresponding condition is removed and we only require $\mu < 1/\lambda_{\max}(A^{-1}B)$. The result (25.26) is valid for general $A > 0$ and $B \geq 0$; the above derivation does not exploit any particular structure in the matrices $A$ and $B$ defined by (25.15)-(25.16).
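The characterization (25.25) is easy to sanity-check numerically on randomly generated $A > 0$ and $B \geq 0$ (a self-contained sketch, with a deliberately rank-deficient $B$ so that a positive real eigenvalue of $H$ exists):

```python
import numpy as np

rng2 = np.random.default_rng(7)
n = 5
X = rng2.standard_normal((n, n))
A_ = X @ X.T + n * np.eye(n)                     # A > 0
Yl = rng2.standard_normal((n, 2))
B_ = Yl @ Yl.T                                   # B >= 0, rank deficient
H = np.block([[A_ / 2, -B_ / 2],
              [np.eye(n), np.zeros((n, n))]])
lam = np.linalg.eigvals(H)
pos = lam.real[(np.abs(lam.imag) < 1e-8) & (lam.real > 1e-12)]
mu_max = 1.0 / pos.max()                         # (25.25)
G = 2 * np.eye(n) - mu_max * A_ + mu_max**2 * B_
print(np.min(np.linalg.eigvalsh(G)))             # ~ 0: G(mu_max) is singular
```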
25.B APPENDIX: STABILITY OF NLMS
The purpose of this appendix is to show that, for ε-NLMS, any $\mu < 2$ is sufficient to guarantee mean-square stability. Thus, refer again to the discussion in Sec. 25.1 and to the definitions of the matrices $\{A, B, P, F\}$ in (25.4)-(25.8). We already know from the result in App. 25.A that stability in the mean and mean-square senses is guaranteed for step-sizes in the range

$$ \mu < \min\left\{ 2/\lambda_{\max}(P),\;\; 1/\lambda_{\max}(A^{-1}B),\;\; 1/\max\{\lambda(H) \in \mathbb{R}_+\} \right\} $$

where the third condition is in terms of the largest positive real eigenvalue of the block matrix

$$ H = \begin{bmatrix} A/2 & -B/2 \\ I & 0 \end{bmatrix} $$

The first condition on $\mu$, namely, $\mu < 2/\lambda_{\max}(P)$, guarantees convergence in the mean. The second condition on $\mu$, namely, $\mu < 1/\lambda_{\max}(A^{-1}B)$, guarantees $\lambda(F) < 1$. The last condition, $\mu < 1/\max\{\lambda(H) \in \mathbb{R}_+\}$, enforces $\lambda(F) > -1$. The point now is that these conditions on $\mu$ are met by any $\mu < 2$ (i.e., $F$ is stable for any $\mu < 2$). This is because there are some important relations between the matrices $\{A, B, P\}$ in the ε-NLMS case. To see this, observe first that the term

$$ \frac{u_i^* u_i}{\epsilon + \|u_i\|^2} \tag{25.27} $$

which appears in the expression (25.6) for $P$ is generally a rank-one matrix (unless $u_i = 0$); it has $M-1$ zero eigenvalues and one possibly nonzero eigenvalue that is equal to⁸ $\|u_i\|^2/(\epsilon + \|u_i\|^2)$. This eigenvalue is less than unity, so that

$$ \lambda_{\max}\left(\frac{u_i^* u_i}{\epsilon + \|u_i\|^2}\right) \leq 1 \tag{25.28} $$
Now recalling the following Rayleigh-Ritz characterization of the maximum eigenvalue of any Hermitian matrix $R$ (from Sec. B.1):

$$ \lambda_{\max}(R) = \max_{\|x\|=1} x^* R x \tag{25.29} $$

⁸Every rank-one matrix of the form $xx^*$, where $x$ is a column vector of size $M$, has $M-1$ zero eigenvalues and one nonzero eigenvalue that is equal to $\|x\|^2$.
we conclude from (25.28) that

$$ x^*\left(\frac{u_i^* u_i}{\epsilon + \|u_i\|^2}\right)x \leq 1, \qquad \text{for any unit-norm } x $$

Applying the same characterization (25.29) to the matrix $P$ in (25.6), and using the above inequality, we find that

$$ \lambda_{\max}(P) = \max_{\|x\|=1} x^* P x = \max_{\|x\|=1} E\left[\frac{x^* u_i^* u_i x}{\epsilon + \|u_i\|^2}\right] \leq 1 \tag{25.30} $$

In other words, the maximum eigenvalue of $P$ is bounded by one, so that the condition $\mu < 2/\lambda_{\max}(P)$ can be met by any $\mu < 2$.
Let us now examine the condition $\mu < 1/\lambda_{\max}(A^{-1}B)$. Using again the fact that the matrix in (25.27) has rank one, it is shown in Prob. V.8 that

$$ 2\left[\left(\frac{u_i^* u_i}{\epsilon + \|u_i\|^2}\right)^T \otimes \left(\frac{u_i^* u_i}{\epsilon + \|u_i\|^2}\right)\right] \;\leq\; \left[\left(\frac{u_i^* u_i}{\epsilon + \|u_i\|^2}\right)^T \otimes I\right] + \left[I \otimes \left(\frac{u_i^* u_i}{\epsilon + \|u_i\|^2}\right)\right] \tag{25.31} $$

Taking expectations of both sides, and using the definitions (25.4)-(25.5) for $A$ and $B$, we conclude that $2B - A \leq 0$, so that the condition $\mu < 1/\lambda_{\max}(A^{-1}B)$ can be met by any $\mu$ satisfying $\mu < 2$.
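The inequality $2B - A \leq 0$ can be verified directly on the sample moments estimated in the earlier sketch:

```python
import numpy as np

D = 2 * B - A                                     # both from the earlier sketch
print(np.max(np.linalg.eigvalsh((D + D.T) / 2)))  # <= 0 up to sampling noise
lam_AB = np.max(np.real(np.linalg.eigvals(np.linalg.solve(A, B))))
print(lam_AB)                                     # <= 1/2, hence 1/lam_AB >= 2
```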
What about the third condition on $\mu$ in terms of the positive eigenvalues of the matrix $H$? It turns out that $\mu < 2$ is also sufficient since it already guarantees mean-square convergence of the filter, as can be seen from the following argument. Choosing $\Sigma = I$ in the variance relation (25.3) we get

$$ E\|\tilde w_i\|^2 = E\|\tilde w_{i-1}\|^2_{\Sigma'} + \mu^2\sigma_v^2\, E\left[\frac{\|u_i\|^2}{(\epsilon + \|u_i\|^2)^2}\right], \qquad \Sigma' = I - 2\mu P + \mu^2 \bar S $$

where

$$ \bar S \triangleq E\left[\frac{\|u_i\|^2\, u_i^* u_i}{(\epsilon + \|u_i\|^2)^2}\right] $$

Obviously, $\bar S \leq P$, so that $\Sigma' \leq I - 2\mu P + \mu^2 P$ and, hence,

$$ E\|\tilde w_i\|^2 \;\leq\; E\|\tilde w_{i-1}\|^2_{(I - 2\mu P + \mu^2 P)} + \mu^2\sigma_v^2\, E\left[\frac{\|u_i\|^2}{(\epsilon + \|u_i\|^2)^2}\right] $$

Now from the result of part (a) of Prob. V.6 we know that $R_u > 0$ implies $P > 0$. We also know from (25.30) that $\lambda_{\max}(P) \leq 1$. Therefore, all the eigenvalues of $P$ are positive and lie inside the open interval $(0, 1)$. Moreover, over the interval $0 < \mu < 2$, the following quadratic function of $\mu$,

$$ k(\mu) \triangleq 1 - 2\mu\lambda + \mu^2\lambda $$

assumes values between 1 and $1 - \lambda$ for each of the eigenvalues $\lambda$ of $P$. Therefore, it holds that

$$ I - 2\mu P + \mu^2 P \;\leq\; \left[1 - 2\mu\lambda_{\min}(P) + \mu^2\lambda_{\min}(P)\right] I $$

from which we conclude that

$$ E\|\tilde w_i\|^2 \;\leq\; \alpha\, E\|\tilde w_{i-1}\|^2 + \mu^2\sigma_v^2\, E\left[\frac{\|u_i\|^2}{(\epsilon + \|u_i\|^2)^2}\right] $$
where the scalar coefficient $\alpha = 1 - 2\mu\lambda_{\min}(P) + \mu^2\lambda_{\min}(P)$ is positive and strictly less than one for $0 < \mu < 2$. It then follows that $E\|\tilde w_i\|^2$ remains bounded for all $i$.
