
TSINGHUA SCIENCE AND TECHNOLOGY

ISSN 1007-0214 04/19 pp24-29


Volume 9, Number 1, February 2004

Wavelet Neural Networks for Adaptive Equalization by Using the Orthogonal Least Square Algorithm*

JIANG Minghu1,**, DENG Beixing2, Georges Gielen3

1. Lab of Computational Linguistics, Department of Chinese Language, Tsinghua University, Beijing 100084, China;
2. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China;
3. Department of Electrical Engineering, K.U.Leuven, Kasteelpark Arenberg 10, B3001 Heverlee, Belgium

Abstract: Equalizers are widely used in digital communication systems for corrupted or time-varying channels. To overcome the performance decline in noisy and nonlinear channels, many kinds of neural network models have been used for nonlinear equalization. In this paper, we propose a new nonlinear channel equalizer structured by wavelet neural networks. The orthogonal least square algorithm is applied to update the weighting matrix of the wavelet networks and to form a more compact set of wavelet basis units, thus obtaining good equalization performance. The experimental results show that the proposed equalizer based on wavelet networks can significantly improve the neural modeling accuracy and outperforms conventional neural network equalization over a range of signal-to-noise ratios and channel non-linearities.
Key words: adaptive equalization; wavelet neural networks (WNNs); orthogonal least square (OLS)

Introduction

The aim of equalization is to recover an unknown signal from a corrupted and imperfect observation. Early equalization algorithms dealt only with linear operations, and the performance of such equalizers degrades greatly under nonlinear operation. To solve this problem, researchers have used a number of nonlinear neural network models, such as the multilayer perceptron (MLP)[1], radial basis function (RBF) networks[2,3] and high-order function neural networks[4], to carry out nonlinear channel equalization in digital communication systems. In the last decade, wavelet neural networks (WNNs)[5] have been successfully applied to system identification and control, function approximation, pattern recognition, signal detection and compression, and rapid classification of varying signals. In each case many promising results have been reported[6-11] on the powerful nonlinear ability of WNNs and on their ability to form arbitrarily complex decision regions. WNNs have a single-hidden-layer structure with scaling functions and wavelets as activation functions. They can be understood as neural structures which employ a wavelet layer to perform adaptive feature extraction in the time-frequency domain. These wavelet analyses, in combination with neural networks, are used for feature extraction and dimension reduction of the input space. WNNs can be used for classifying channel equalization disturbances. Not only the weights, but also the parameters of the wavelet functions (translation, dilation) can be jointly fitted from the input data. In the current study, the number of wavelet functions may be chosen by the user while the parameters are optimized by a learning process; the more wavelets used, the more precise the classification.

Received: 2002-06-24
* Supported by the Tsinghua University Research Foundation, the Excellent Young Teacher Program of the Ministry of Education, and the Returnee Science Research Startup Fund of the Ministry of Education of China
** To whom correspondence should be addressed. E-mail: jiang.mh@tsinghua.edu.cn; Tel: 86-10-62788647
1 Modeling of WNNs for Adaptive Equalization by the OLS Algorithm

Assume that the transmitted sequence x(n) is passed through a dispersive channel, where the channel output y(n) is corrupted by an additive zero-mean white noise u(n) (independent of x(n)) with variance \sigma_n^2, so that

y(n) = g(n) + u(n) = h(n) * x(n) + u(n) = \sum_{i=0}^{N_c} h(i) x(n-i) + u(n)    (1)

where g(n) denotes the output of the noise-free channel, and h(i) and N_c are the i-th channel coefficient and the channel order, respectively.

The role of an equalizer is to produce an estimate of the value x(n-d) from the observed data y(n), y(n-1), ..., y(n-m+1). The integers m and d are known as the order and the decision delay of the equalizer. The signal x(n) is distorted while it propagates through the nonlinear channel. The estimate \hat{x}(n-d) is given as

v(n) = \mathrm{sgn}(\hat{x}(n-d)) = f(y(n), y(n-1), \ldots, y(n-m+1))    (2)

where f(\cdot) is a nonlinear discriminant function specified by the channel characteristic and the noise. The WNNs are adopted to approximate this function under the minimum mean square error (MSE) criterion.
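As a concrete illustration of Eqs. (1) and (2), the following sketch simulates a dispersive FIR channel with additive white Gaussian noise and assembles the equalizer's observation vector. The tap values echo Eq. (26) below; the function names and seed are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_output(x, h, noise_var):
    """Eq. (1): y(n) = sum_i h(i) x(n-i) + u(n), zero-mean white noise u."""
    g = np.convolve(x, h)[:len(x)]                  # noise-free output g(n)
    u = rng.normal(0.0, np.sqrt(noise_var), len(x)) # additive noise u(n)
    return g + u

def observation_vector(y, n, m):
    """Equalizer input [y(n), y(n-1), ..., y(n-m+1)] used to estimate x(n-d)."""
    return y[n - m + 1:n + 1][::-1]

x = rng.choice([-1.0, 1.0], size=1000)              # equiprobable binary symbols
h = np.array([0.181, 0.272, 0.905, 0.272])          # illustrative channel taps
y = channel_output(x, h, noise_var=0.01)
Y = observation_vector(y, n=10, m=3)                # m is the equalizer order
```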
It is well known that wavelet decomposition allows us to decompose any function using a family of functions obtained by dilating and translating a single mother wavelet function \psi(y). Let P be a wavelet frame in the space L^2(R^1)[5]:

P = \{\psi_{q,r}(y)\} = \{a^{r/2} \psi(a^r y - qb)\}, \quad r \in Z, \; q \in Z    (3)

where b and a are the shift and dilation of each daughter wavelet, e.g., a = 2, b = 1. For the i-th pattern, the input pattern is defined by

Y(i) = [y(i), y(i-1), \ldots, y(i-m+1)] = [y_1^{(i)}, y_2^{(i)}, \ldots, y_m^{(i)}].

WNNs perform a nonlinear mapping on the state obtained at the input layer such that more compact clusters are created for the different classes at the output of the network. The equalizer order, channel order and delay are the determining factors in establishing the number of wavelet units. Increasing the equalizer order can improve the performance of the system to some extent, but too large an order is also concomitant with increasing noise power at the equalizer input. Generally, the equalizer order is chosen to be not less than the channel delay[12]. The constructed WNNs are expected to give the best performance for a certain complexity, or to provide a certain performance level with minimum complexity.

The WNNs are constructed to approximate the nonlinear discriminant function f(\cdot):

v(n) = \sum_{k=0}^{m} w_k y_k^{(n)} + \sum_{j=1}^{l} w_{m+j} \psi_{q(j),r(j)}\Big(\sum_{k=1}^{m} y_k^{(n)}\Big) + \sum_{i=1}^{s} w_{m+l+i} \psi_{q(i),r(i)}\Big(\prod_{k=1}^{m} y_k^{(n)}\Big) + u(n) = \sum_{k=0}^{K} w_k p_k^{(n)} + u(n)    (4)

where y_0^{(n)} \equiv 1 supplies the bias term; the number of wavelet basis units is K + 1 = m + l + s + 1; l and s are the numbers of wavelet units applied to the sum and to the product of the inputs, respectively; and w_k and p_k are the k-th weight coefficient (equalizer coefficient) and wavelet basis. Because the second and third terms in Eq. (4) can easily express any non-linearity, Eq. (4) can realize arbitrary nonlinear transformations within certain accuracy ranges when l and s are large enough. The number of WNN units grows exponentially with the channel filter length. The hidden-layer vector

P^{(n)} = \{p_0^{(n)}, p_1^{(n)}, \ldots, p_K^{(n)}\} = \{1, y_1^{(n)}, \ldots, y_m^{(n)}, \psi_{q(1),r(1)}(\sum_{k=1}^{m} y_k^{(n)}), \ldots, \psi_{q(s),r(s)}(\prod_{k=1}^{m} y_k^{(n)})\}

is the input vector Y expanded through the wavelet functions to a higher dimension. The higher performance of the WNN structure is easily obtained due to its capability of forming more complex nonlinear discriminant functions in the input pattern space and of using wavelet-hidden spaces with larger dimensions. The output of the WNNs performs a nonlinear mapping on the state obtained at the input layer such that more compact clusters are created for the different classes. The expanded model can easily yield a flat network solution W, which means the model is linearly separable.
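To make the functional expansion of Eq. (4) concrete, here is a minimal sketch of the hidden-layer vector P^(n): a constant term, the linear terms y_1..y_m, and daughter wavelets of Eq. (3) applied to the sum and the product of the inputs. The Mexican-hat mother wavelet and the small (q, r) grid are illustrative assumptions; the paper leaves the exact grid to the user.

```python
import numpy as np
from itertools import product

def mexican_hat(x):
    """Mexican-hat mother wavelet (normalization constant omitted)."""
    return (1.0 - x**2) * np.exp(-x**2 / 2.0)

def daughter(x, q, r, a=2.0, b=1.0):
    """Daughter wavelet of Eq. (3): a^{r/2} * psi(a^r * x - q * b)."""
    return a**(r / 2.0) * mexican_hat(a**r * x - q * b)

def wavelet_expansion(Y, grid):
    """Hidden-layer vector of Eq. (4): [1, y_1..y_m, psi(sum y), psi(prod y)]."""
    s, p = np.sum(Y), np.prod(Y)
    feats = [1.0, *Y]
    feats += [daughter(s, q, r) for q, r in grid]   # wavelets on the sum
    feats += [daughter(p, q, r) for q, r in grid]   # wavelets on the product
    return np.array(feats)

grid = list(product(range(-1, 2), range(-1, 2)))    # 9 (q, r) pairs, illustrative
P_n = wavelet_expansion(np.array([0.3, -0.8, 1.1]), grid)
```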
Although the wavelet functional expansion easily obtains higher nonlinear ability, the expanded wavelet basis is often redundant for estimating v(n). In practice, such full expansions are computationally too inefficient to reduce the number of relevant channel states. It is more reasonable to use a limited expansion in terms of a few basis functions with appropriate parameter values, so that both the training and testing sets can still be represented with acceptable accuracy. For the WNN equalizer the channel states are linearly separable, and we can make use of the OLS method to reduce the number of relevant channel states.

Take Eq. (4) into consideration. If N input and output measurements are available, we have the matrix form

V_{N \times 1} = [v(1), v(2), \ldots, v(N)]^T = P_{N \times (K+1)} W_{(K+1) \times 1} + U_{N \times 1} \in R^N    (5)

where P_{N \times (K+1)} = [P^{(1)}, P^{(2)}, \ldots, P^{(N)}]^T \in R^{N \times (K+1)}. The weights w_k can be obtained by minimizing the MSE between the input x(n-d) and the estimated signal \hat{x}(n-d), i.e.,

E = 0.5 \sum_{n=m}^{N-m+1} (v(n) - \hat{v}(n))^2 = 0.5 \sum_{n=m}^{N-m+1} \Big(v(n) - \sum_{k=0}^{K} w_k p_k\Big)^2    (6)

Minimization of Eq. (6) to determine the parameters w_k can be performed by error back-propagation with gradient descent[5]. In order to reduce the number of wavelet basis terms and to increase the computational efficiency, we use the OLS algorithm to select the wavelet nodes.
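Stacking the N expanded patterns row by row gives the regression problem of Eqs. (5) and (6). As a baseline, the minimum-MSE weights can be obtained in one shot by ordinary least squares (a sketch reusing the hypothetical wavelet_expansion above); the OLS procedure that follows replaces this one-shot solve with an orthogonalized, term-by-term construction so that uninformative units can be dropped:

```python
import numpy as np

def fit_weights_ls(Ys, v, grid):
    """Eqs. (5)-(6): stack P (N x (K+1)) and minimize ||V - P W||^2."""
    P = np.stack([wavelet_expansion(Y, grid) for Y in Ys])
    W, *_ = np.linalg.lstsq(P, v, rcond=None)
    return P, W
```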
OLS algorithms have been used to construct RBF networks and to reduce the number of basis units[13-15]. Although RBF networks are used in adaptive equalization, a WNN is generally not an RBF network, since its multi-dimensional scaling functions are not radially symmetric[11]. The training data do not provide any information for determining the coefficients of empty wavelet basis units, which should therefore be eliminated; a small number of basis units may be adequate for determining the solution of the minimized MSE. The OLS method selects wavelet basis units more efficiently than a random-based approach. It does not necessarily produce the smallest network for a given approximation accuracy; instead, it tries to retain those good features which contain only problem-specific information of the input patterns and to remove much of the additional irrelevant information. This approach leads to a simpler and more reasonable structure, and reducing the number of wavelet basis units yields a more practical design for the equalizer. The equalizer order, channel order and delay are the determining factors in establishing the number of wavelet units. We expect to reduce the number of wavelet basis units needed to achieve the same performance, provided that only those wavelets which contain useful information are retained. The WNNs are recursively established by decreasing the number of units while maintaining network convergence at each stage. The algorithm for carrying out this procedure is described as follows. Equation (5) can be written out in matrix form as

\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{bmatrix} = \begin{bmatrix} p_0(1) & p_1(1) & \cdots & p_K(1) \\ p_0(2) & p_1(2) & \cdots & p_K(2) \\ \vdots & \vdots & & \vdots \\ p_0(N) & p_1(N) & \cdots & p_K(N) \end{bmatrix} \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_K \end{bmatrix} + [u]    (7)

where [u] is assumed to be a zero-mean white noise sequence which is not correlated with the input and output data. P_{N \times (K+1)} is a wavelet basis matrix that can be decomposed into the product of an orthogonal matrix H and an upper triangular matrix A; W_{(K+1) \times 1} represents the unknown parameters to be estimated. Equation (7) can then be re-arranged to yield

V_{N \times 1} = P_{N \times (K+1)} W_{(K+1) \times 1} + U_{N \times 1} = H_{N \times (K+1)} A_{(K+1) \times (K+1)} W_{(K+1) \times 1} + U_{N \times 1} = H_{N \times (K+1)} G_{(K+1) \times 1} + U_{N \times 1}    (8)

where

H = [H_0, H_1, \ldots, H_K] = \begin{bmatrix} h_0(1) & h_1(1) & \cdots & h_K(1) \\ h_0(2) & h_1(2) & \cdots & h_K(2) \\ \vdots & \vdots & & \vdots \\ h_0(N) & h_1(N) & \cdots & h_K(N) \end{bmatrix}    (9)

According to the Gram-Schmidt algorithm[16], orthogonal optimal wavelet basis units are used to construct the WNNs. The set of orthogonal vectors \{H_k\} is
constructed from the \{P_k\} vectors by:

H_0 = P_0,
H_1 = P_1 - C_{1,0} H_0,
\ldots
H_K = P_K - \sum_{i=0}^{K-1} C_{K,i} H_i    (10)

Equation (10) is equivalent to the following two equations:

h_0(l) = p_0(l), \quad l = 1, 2, \ldots, N    (11a)

h_k(l) = p_k(l) - \sum_{i=0}^{k-1} C_{k,i} h_i(l)    (11b)

where C_{k,i} is determined by the orthogonality condition over the data record:

\langle H_j, H_k \rangle = 0 = \langle H_j, P_k \rangle - C_{k,j} \langle H_j, H_j \rangle, \quad j \neq k    (12)

C_{k,j} = \frac{\langle H_j, P_k \rangle}{\langle H_j, H_j \rangle} = \frac{\sum_{l=1}^{N} h_j(l) p_k(l)}{\sum_{l=1}^{N} h_j^2(l)}, \quad j = 0, 1, \ldots, k-1; \; k = 1, 2, \ldots, K    (13)
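The construction of Eqs. (10)-(13) is the classical Gram-Schmidt procedure applied to the columns of P; a direct sketch follows (for numerical robustness a modified variant is usually preferred, but this mirrors the equations as written):

```python
import numpy as np

def gram_schmidt(P):
    """Eqs. (10)-(13): factor P = H A, with orthogonal columns H_k and a
    unit upper-triangular A whose entry A[i, k] stores C_{k,i}."""
    N, K1 = P.shape
    H = np.zeros_like(P, dtype=float)
    A = np.eye(K1)
    for k in range(K1):
        H[:, k] = P[:, k]
        for i in range(k):
            c = (H[:, i] @ P[:, k]) / (H[:, i] @ H[:, i])  # Eq. (13)
            A[i, k] = c
            H[:, k] -= c * H[:, i]                         # Eq. (11b)
    return H, A
```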
The orthogonal weight vector follows as

G_{(K+1) \times 1} = (H^T H)^{-1} H^T V_{N \times 1}    (14)

where \langle \cdot, \cdot \rangle denotes the inner product and

(H^T H)^{-1} = \begin{bmatrix} (H_0^T H_0)^{-1} & 0 & \cdots & 0 \\ 0 & (H_1^T H_1)^{-1} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & (H_K^T H_K)^{-1} \end{bmatrix}    (15)

is a diagonal matrix. Equation (14) is equivalent to the following form:

g_k = \frac{\sum_{l=1}^{N} h_k(l) v_l}{\sum_{l=1}^{N} h_k^2(l)}, \quad k = 0, 1, \ldots, K    (16)

Taking into consideration the constraint of Eq. (8), we have

V = \sum_{k=0}^{K} H_k g_k + [u]    (17)

and

W_{(K+1) \times 1} = A_{(K+1) \times (K+1)}^{-1} G_{(K+1) \times 1} = A_{(K+1) \times (K+1)}^{-1} (g_0, \ldots, g_K)^T    (18)

A_k^{-1} can be calculated recursively. Since

A_2 = \begin{bmatrix} 1 & C_1 \\ 0 & 1 \end{bmatrix}, \quad A_3 = \begin{bmatrix} A_2 & C_2 \\ 0 & 1 \end{bmatrix}, \quad \ldots, \quad A_k = \begin{bmatrix} A_{k-1} & C_{k-1} \\ 0 & 1 \end{bmatrix}    (19)

where C_k = (C_{k,0}, C_{k,1}, \ldots, C_{k,k-1})^T, we can, according to Eq. (19), easily obtain the inverse matrix:

A_k^{-1} = \begin{bmatrix} A_{k-1}^{-1} & -A_{k-1}^{-1} C_{k-1} \\ 0 & 1 \end{bmatrix}    (20)
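The block structure of Eqs. (19) and (20) means A^{-1} never requires a general matrix inversion; each step extends the previous inverse. A small numeric sketch of the recursion (the C_k values are arbitrary test data):

```python
import numpy as np

def inv_step(Ak_inv, Ck):
    """Eq. (20): A_k^{-1} -> A_{k+1}^{-1} for A_{k+1} = [[A_k, C_k], [0, 1]]."""
    top = np.hstack([Ak_inv, -Ak_inv @ Ck.reshape(-1, 1)])
    bottom = np.zeros((1, Ak_inv.shape[1] + 1))
    bottom[0, -1] = 1.0
    return np.vstack([top, bottom])

A_inv = np.array([[1.0]])                        # A_1 = [1]
for Ck in (np.array([0.5]), np.array([0.2, -0.3])):
    A_inv = inv_step(A_inv, Ck)                  # builds A_3^{-1} recursively
```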
After some simple derivations, one easily arrives at recursive formulas for the unknown weights:

w_K = g_K    (21)

w_k = g_k - \sum_{j=k+1}^{K} C_{j,k} w_j, \quad k = K-1, K-2, \ldots, 1, 0    (22)
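In code, Eq. (16) reduces each g_k to a scalar projection onto the orthogonal column H_k, and Eqs. (21)-(22) recover the original weights by back-substitution, so A^{-1} is never formed explicitly. A sketch consistent with the gram_schmidt routine above (recall that A[k, j] stores C_{j,k}):

```python
import numpy as np

def ols_weights(H, A, v):
    """g_k from Eq. (16); w_k from the back-substitution of Eqs. (21)-(22)."""
    K1 = H.shape[1]
    g = np.array([(H[:, k] @ v) / (H[:, k] @ H[:, k]) for k in range(K1)])
    w = np.zeros(K1)
    w[-1] = g[-1]                                # Eq. (21): w_K = g_K
    for k in range(K1 - 2, -1, -1):              # Eq. (22), k = K-1, ..., 0
        w[k] = g[k] - A[k, k + 1:] @ w[k + 1:]
    return g, w
```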
Taking the square of Eq. (17) yields

v^2(l) = \sum_{k=0}^{K} h_k^2(l) g_k^2 + u^2(l)    (23)

The error contribution of the k-th term is expressed as

E_k = \frac{\sum_{l=1}^{N} h_k^2(l) g_k^2}{\sum_{l=1}^{N} v^2(l)} \times 100\%    (24)

The value of E_k is computed together with the parameter estimates to indicate the significance of each term, and the terms are then ranked according to their contributions to the overall mean square error. If E_k is very small, indicating an insignificant term, the k-th term is cancelled. The procedure provides an optimal reduction of the wavelet basis units by removing redundant units at each stage so as to retain the highest achievable accuracy with the remaining units. The selection procedure is terminated when a desired error tolerance T is achieved:

1 - \sum_{k=1}^{K_c} E_k < T    (25)

that is, the process of selecting terms is continued until the sum of the error reduction ratios approaches 100%.
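A complete term-selection pass can then be sketched as a greedy forward procedure: at each stage the candidate with the largest error-reduction ratio of Eq. (24) is kept and the remaining candidates are deflated against it, stopping once Eq. (25) is met. This re-orthogonalizing greedy ordering is an assumption about the paper's procedure, which only specifies ranking by E_k and the stopping rule:

```python
import numpy as np

def ols_select(P, v, tol=0.01):
    """Greedy OLS unit selection by error-reduction ratio, Eqs. (23)-(25)."""
    N, K1 = P.shape
    selected, err_sum = [], 0.0
    residual = P.astype(float).copy()
    vv = v @ v
    while err_sum < 1.0 - tol and len(selected) < K1:
        err = np.zeros(K1)
        for k in range(K1):
            if k in selected:
                continue
            hk = residual[:, k]
            nrm = hk @ hk
            if nrm > 1e-12:
                err[k] = (hk @ v) ** 2 / (nrm * vv)   # Eq. (24) as a fraction
        k_best = int(np.argmax(err))
        if err[k_best] <= 0.0:
            break                                     # nothing useful remains
        h = residual[:, k_best].copy()
        selected.append(k_best)
        err_sum += err[k_best]
        proj = (residual.T @ h) / (h @ h)             # Eq. (11b) deflation
        residual -= np.outer(h, proj)
    return selected
```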
After application of the OLS algorithm, the basis units form a more compact representation, enabling the input patterns to be free from redundant or irrelevant information and to correlate well with the process states. From this point of view, the constructed WNNs will give the best performance for a certain complexity, or provide a certain performance level with minimum complexity.

2 Experiments

To show the validity of the proposed method and the associated learning algorithm, experimental simulations are performed on linear and nonlinear channel equalization problems. Assume the transmitted sequence is real and independent and that equiprobable binary symbols x(n) pass through a linear or nonlinear channel to produce the observation signal y(n), from which the vector Y(i) = [y(i), y(i-1), \ldots, y(i-m+1)] is obtained. This vector is fed to equalizers based on WNNs, MLP and LLMS, respectively. We design one channel whose dispersion characteristic is represented as

H_1(z) = 0.181 + 0.272 z^{-1} + 0.905 z^{-2} + 0.272 z^{-3}    (26)

The corresponding equalizer input of the nonlinear channel is

y(n) = g(n) + 0.15 g(n) \tanh(g(n)) + u(n)    (27)

where u(n) is an additive zero-mean white Gaussian noise and g(n) denotes the output of the noise-free channel. Generally, the equalizer order is chosen to be not less than the channel delay. For the H_1(z) channel, N_c = 3 and d = 2 are the channel order and delay, and we choose the equalizer order m = 3, which is larger than d. Assuming that the channel order is equal to the equalizer order, the maximum number of different (possible) equalizer inputs is 2^{3+3+1} = 128. We design another, more complex channel; its impulse response and the corresponding equalizer input of the nonlinear channel are expressed as

H_2(z) = 0.202 + 0.364 z^{-1} + 0.808 z^{-2} + 0.364 z^{-3} + 0.202 z^{-4}    (28)

y(n) = g(n) + 0.15 \cos(2\pi g(n)) + 0.28 g^2(n) + 0.09 \mathrm{sigmoid}^3(g(n)) + 0.21 g^4(n) + u(n)    (29)

For the latter channel, we use channel order N_c = 4, delay parameter d = 3 and equalizer order m = 4, giving a maximum number of different equalizer inputs equal to 2^{4+4+1} = 512.
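The two test channels and their nonlinear distortions, Eqs. (26)-(29), translate directly into functions of the noise-free output g(n). A sketch follows; since the printed signs of the channel taps are garbled in the source, all-positive taps are assumed, and the "sigmoid" of Eq. (29) is taken to be the logistic function; both are assumptions:

```python
import numpy as np

h1 = np.array([0.181, 0.272, 0.905, 0.272])         # H1(z), Eq. (26), signs assumed
h2 = np.array([0.202, 0.364, 0.808, 0.364, 0.202])  # H2(z), Eq. (28), signs assumed

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))                 # assumed logistic form

def nonlinear_1(g):
    """Eq. (27): y = g + 0.15 g tanh(g); noise u(n) is added separately."""
    return g + 0.15 * g * np.tanh(g)

def nonlinear_2(g):
    """Eq. (29): y = g + 0.15 cos(2 pi g) + 0.28 g^2 + 0.09 sigmoid(g)^3
    + 0.21 g^4; the inter-term signs are assumptions."""
    return (g + 0.15 * np.cos(2 * np.pi * g) + 0.28 * g**2
            + 0.09 * sigmoid(g)**3 + 0.21 * g**4)
```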
We choose the Mexican hat \psi(x) = \frac{2}{\sqrt{3}} \pi^{-1/4} (1 - x^2) e^{-x^2/2} as the mother wavelet. The signal-to-noise ratio (SNR) of the equalizer input is defined as \sum_i h_i^2 / \sigma_n^2 (h_i is the i-th channel coefficient). Signal sequences at different SNRs are applied as training data, and 500 000 training data pairs (y(n), x(n)) are used to train the networks. Performance comparisons among the equalizers based on the WNNs, the MLP and the linear LMS (LLMS) are carried out under different noise conditions; the processes are simulated with different noise variances. The LLMS algorithm uses a linear adaptive filter which has no hidden nodes and no nonlinear activation function. The MLP structure consists of (3+1) input nodes, (12+1) hidden nodes and 1 output node for channel 1 (i.e., H_1(z)), and of (4+1) input nodes, (15+1) hidden nodes and 1 output node for channel 2 (i.e., H_2(z)); here the 1 denotes a bias. The numbers of initially selected wavelets for H_1(z) and H_2(z) are 25 (m = 3, l = 12, s = 9) and 30 (m = 4, l = 14, s = 11). The tolerance is set at T = 0.005-0.01. After application of the OLS algorithm, the wavelet basis units are optimally reduced to 10 and 13 units, respectively.

Figures 1-4 show the simulation results. Figures 1 and 3 show a comparison of the error probability (EP), averaged over 50 independent runs, for the linear channel equalizers (WNNs, MLP and LLMS); 10 000 data samples are used to train the networks in each run, with different random initial weights and parameters in each run. Figures 2 and 4 show the corresponding comparison of EP, averaged over 50 independent runs, for the nonlinear channel equalizers (WNNs, MLP and LLMS). The simulation results show that the first channel performs better than the second, and that the nonlinear deterioration of the former is also smaller than that of the latter. The WNNs and the MLP are capable of forming complex decision regions in the input pattern space; the former exhibits a more powerful nonlinear ability and gives superior performance with the lowest bit error rate (BER). It can be observed that the WNN equalizer outperforms the MLP equalizer, and far outperforms the LLMS equalizer, over a wide range of SNR conditions.
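Putting the pieces together, one SNR point of the kind plotted in Figs. 1-4 might be evaluated as in the sketch below, which chains the hypothetical helpers introduced above (wavelet_expansion and ols_select as defined earlier); the hard decision is the sign of the network output, compared against the d-delayed transmitted symbol:

```python
import numpy as np

def ber_at_snr(x, h, nonlinearity, snr_db, grid, m=3, d=2, tol=0.01):
    """Train an OLS-pruned WNN equalizer at one SNR and estimate its BER."""
    g = np.convolve(x, h)[:len(x)]                     # noise-free channel output
    noise_var = np.sum(h**2) / 10.0**(snr_db / 10.0)   # SNR = sum_i h_i^2 / var
    rng = np.random.default_rng(1)
    y = nonlinearity(g) + rng.normal(0.0, np.sqrt(noise_var), len(x))
    rows = np.arange(m - 1, len(x))
    P = np.stack([wavelet_expansion(y[n - m + 1:n + 1][::-1], grid) for n in rows])
    v = x[rows - d]                                    # desired symbols x(n - d)
    keep = ols_select(P, v, tol)                       # prune wavelet units
    W, *_ = np.linalg.lstsq(P[:, keep], v, rcond=None)
    return np.mean(np.sign(P[:, keep] @ W) != v)       # hard-decision BER
```

Sweeping snr_db and averaging over independent runs would reproduce the shape of the BER-versus-SNR comparisons described above.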
The reduced WNN equalizer obtained with the OLS algorithm is practical to implement and outperforms the conventional neural network equalizers. The OLS algorithm for WNNs gives a small number of wavelet units and a lower BER under all tested SNR conditions for both the linear and the nonlinear channels. The superior performance is due to the powerful nonlinear ability of the WNN equalizer and to the computational refinement arising from the use of the OLS optimization algorithm.

Fig. 1 BER performance of several equalizers for channel H1(z) at different SNRs: linear channel
Fig. 2 BER performance of several equalizers for channel H1(z) at different SNRs: nonlinear channel
Fig. 3 BER performance of several equalizers for channel H2(z) at different SNRs: linear channel
Fig. 4 BER performance of several equalizers for channel H2(z) at different SNRs: nonlinear channel

3 Conclusions

WNNs are capable of forming arbitrarily complex nonlinear decision boundaries to solve complex classification problems. Equalizers based on WNNs perform quite well in compensating the nonlinear distortions introduced by a channel. The OLS algorithm for WNNs gives a small number of wavelet units (forming a more compact set of basis units), free from redundant or irrelevant information. Experimental results show that the proposed WNN-based equalizer can significantly improve the neural modeling accuracy, and that it outperforms conventional neural networks across signal-to-noise ratios and channel non-linearities.

References

[1] Chen S, Gibson G J, Cowan C F N, et al. Adaptive equalization of finite non-linear channels using multilayer perceptrons. Signal Processing, 1990, 20(2): 107-119.
[2] Gan Q, Saratchandran P, Sundararajan N. A complex valued radial basis function network for equalization of fast time varying channels. IEEE Transactions on Neural Networks, 1999, 10(4): 958-960.
[3] Chen S, Mulgrew B, Grant P M. A clustering technique for digital communications channel equalization using radial basis function networks. IEEE Transactions on Neural Networks, 1993, 4(4): 570-579.
[4] Patra J C, Pal R N. A functional link artificial neural network for adaptive channel equalization. Signal Processing, 1995, 43: 181-195.
[5] Zhang Q, Benveniste A. Wavelet networks. IEEE Transactions on Neural Networks, 1992, 3: 889-898.
[6] Fang Y, Chow T W S. Orthogonal wavelet neural networks applying to identification of Wiener model. IEEE Transactions on Circuits and Systems-I, 2000, 47(4): 591-593.

(Continued on page 37)
