Newton-Like Minimum Entropy Equalization Algorithm for APSK Systems
Anum Ali, Shafayat Abrar, Azzedine Zerguine and Asoke K. Nandi
Abstract: In this paper, we design and analyse a Newton-like blind equalization algorithm for APSK systems. Specifically, we exploit the principle of minimum entropy deconvolution to derive a blind equalization cost function for APSK signals and optimize it using Newton's method. We study and evaluate the steady-state excess mean square error performance of the proposed algorithm using the concept of energy conservation. Numerical results depict a significant performance enhancement for the proposed scheme over well established blind equalization algorithms. Further, the analytical excess mean square error of the proposed algorithm is verified with computer simulations and is found to be in good agreement.
Index Terms: constant modulus algorithm, blind equalizer, recursive least squares algorithm, Newton's method, tracking performance, amplitude phase shift keying
I. INTRODUCTION
In digital communications at high enough data rates, almost all
physical channels exhibit inter-symbol interference (ISI). One of the
solutions to this problem is blind equalization, which is a method to
equalize distortive communication channels and mitigate ISI without
supervision. Blind equalization algorithms do not require training
at either the startup period or restart after system breakdown. This
independence of blind schemes with respect to training sequence
results in improved system bandwidth efficiency.
In blind equalization, the desired signal is unknown to the receiver,
except for its probabilistic or statistical properties over some known
alphabets. As both the channel and its input are unknown, the
objective of blind equalization is to recover the unknown input
sequence based solely on its probabilistic and statistical properties
[1], [2], [3]. From the available literature, it can be found that any
admissible blind objective (or cost) function has two main attributes:
1) it makes use of the statistics which are significantly modified as
the signal propagates through the channel [4], and 2) optimization of
the cost function modifies the statistics of the signal at the equalizer
output, aligning them with the statistics of the signal at the channel
input [5].
One of the earliest methods of blind equalization was suggested by
Benveniste et al. [5]. Their proposed method assumed the transmitted
signal to be non-Gaussian, independent and identically distributed
(i.i.d.) sequence of a known statistical distribution. It sought to match
the distribution of its output (deconvolved sequence) with the distri-
bution of the transmitted signal and the adaptation continued until the
said objective was achieved. Another approach to this problem was
devised by Donoho [4], who defined a partial ordering, measuring the relative Gaussianity between random variables. He suggested adjusting the equalizer until the distribution of the deconvolved sequence is as non-Gaussian as possible. A somewhat informal version of Donoho's method appeared in the work of Wiggins [6], [7]. According to Wiggins, assuming the transmitted signal is a non-Gaussian signal with a certain distribution $p_a$, the equalizer must be adjusted to make its output signal distribution resemble $p_a$. He termed this approach the minimum entropy deconvolution (MED) criterion.
S. Abrar is with the Department of Electrical Engineering, COMSATS Institute of Information Technology, Islamabad, Pakistan. Email: sabrar@comsats.edu.pk. A. Ali and A. Zerguine are with the Department of Electrical Engineering, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia. A.K. Nandi is with the Department of Electronic and Computer Engineering, Brunel University, Uxbridge, Middlesex, UK.
In this work, the MED criterion is the subject of our concern
for designing (and optimizing) cost functions to equalize signals
blindly in amplitude phase shift keying (APSK) systems. The APSK
signals are very important in modern day communication systems
due to their robustness against nonlinear channel distortion and
advantageous lower peak-to-average power ratio compared to the
conventional quadrature amplitude modulation signals (refer to APSK
based systems in [8], [9], [10], [11], [12], [13], [14] and references
therein).
This paper is organized as follows: Section II discusses the
baseband communication system model, notion of Gaussianity, and
traditional blind equalizers. Section III describes the MED criteria
for channel equalization, discusses the admissibility of costs tailored
for APSK, and stochastic gradient-based optimization. Section IV
formulates the proposed adaptive MED-based blind equalization
scheme for APSK systems exploiting a Newton-like update. Section
V provides a steady-state tracking performance analysis of the proposed
algorithm in a time-varying scenario. Simulation results are discussed
in Section VI and conclusions are provided in Section VII.
II. SYSTEM MODEL AND TRADITIONAL BLIND EQUALIZERS
The baseband model for a typical complex-valued data communication system, as shown in Fig. 1, consists of an unknown finite-impulse response filter $h_n$, which represents the physical interconnection between the transmitter and the receiver. A zero-mean, i.i.d., circularly symmetric, complex-valued data sequence $\{a_n\}$ is transmitted through the channel, whose output $x_n$ is recorded by the receiver. The input/output relationship of the channel can be written as $x_n = \sum_k a_{n-k} h_k + \nu_n$, where the additive noise $\nu_n$ is assumed to be stationary, Gaussian, and independent of the channel input $a_n$. The function of the equalizer at the receiver is to estimate the delayed version of the original data, $a_{n-\delta}$, from the received signal $x_n$. Let $\mathbf{w}_n = [w_{n,0}, w_{n,1}, \cdots, w_{n,N-1}]^T$ be a vector of equalizer coefficients with $N$ elements and $\mathbf{x}_n = [x_n, x_{n-1}, \cdots, x_{n-N+1}]^T$ be the vector of channel observations ($^T$ denotes the transpose operation). The output of the equalizer is then given by $y_n = \mathbf{w}_{n-1}^H \mathbf{x}_n$ ($^H$ denotes the Hermitian conjugate operator). If $t_n = h_n \star w^*_{n-1}$ represents the overall channel-equalizer impulse response ($\star$ denotes convolution), then
$$ y_n = \sum_i w^*_{n-1,i}\, x_{n-i} = \underbrace{t_{n,\delta}\, a_{n-\delta} + \sum_{l \neq \delta} t_{n,l}\, a_{n-l} + \nu'_n}_{\text{signal + ISI + noise}} \tag{1} $$
which demonstrates the adverse effect of inter-symbol interference (ISI) and additive noise. The ISI is quantified as [15]:
$$ \mathrm{ISI} = \frac{\sum_i |t_{n,i}|^2 - \max\{|t_n|^2\}}{\max\{|t_n|^2\}} \tag{2} $$
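As an illustration of the ISI measure in (2), the following minimal Python sketch evaluates it for a given combined channel-equalizer response. The function name `residual_isi` and the example responses are hypothetical and are not taken from the paper.

```python
def residual_isi(t):
    """ISI of Eq. (2): (sum_i |t_i|^2 - max_i |t_i|^2) / max_i |t_i|^2."""
    powers = [abs(ti) ** 2 for ti in t]
    peak = max(powers)
    if peak == 0.0:
        raise ValueError("combined response is identically zero")
    return (sum(powers) - peak) / peak


# A single (possibly rotated) tap means perfect equalization: ISI = 0.
print(residual_isi([0, 0, 1j, 0]))
# Residual side taps leak energy into neighbouring symbols: ISI > 0.
print(residual_isi([0.1, 1.0, -0.2j]))
```

Because the measure is normalized by the strongest tap, it is insensitive to the arbitrary complex gain and delay that a blind equalizer cannot resolve.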
In subsequent discussions we drop the subscript $n$ from $t$ for notational convenience. The idea behind a Bussgang blind equalizer is to minimize (or maximize), through the choice of $\mathbf{w}$, a certain cost function $J$ depending on $y_n$, such that $y_n$ provides an estimate of $a_n$ up to some inherent indeterminacies, giving $y_n = \alpha\, a_{n-\delta}$, where $\alpha = |\alpha| e^{j\theta} \in \mathbb{C}$ represents an arbitrary complex-valued gain. Hence, a Bussgang blind equalizer tries to solve the following optimization problem:
$$ \mathbf{w}^{\dagger} = \arg \underset{\mathbf{w}}{\mathrm{optimize}}\; J, \quad \text{with } J = \mathrm{E}\,\phi(y_n) \tag{3} $$
The cost $J$ is a function of implicitly embedded statistics of $y_n$ and $\phi(\cdot)$ is a real-valued function. Ideally, the cost $J$ makes use of statistics which are significantly modified as the signal propagates through the channel, and the optimization of the cost modifies the statistics of the signal at the equalizer output, aligning them with those at the channel input. The equalization is accomplished when the equalized sequence $y_n$ acquires a distribution identical to that of the channel input $a_n$. More formally, we have the following theorem [5]:
Theorem 1: If the transmitted signal is composed of non-Gaussian i.i.d. samples, both channel and equalizer are linear time-invariant filters, noise is negligible, and the probability density functions (PDF) of transmitted and equalized signals are equal, then the channel has been perfectly equalized.
This mathematical result is very important since it establishes the possibility of obtaining an equalizer with the sole aid of the signal's statistical properties and without requiring any knowledge of the channel impulse response or training data sequence. Meanwhile, Donoho [4] noted that, as a consequence of the central limit theorem, linear combinations of identically distributed random variables become more Gaussian than the individual variables. Therefore, the received signal $x_n$ will have a distribution that is more nearly Gaussian than the distribution of $a_n$. Any suitable objective function capable of measuring Gaussianity or non-Gaussianity can therefore be used for deconvolution.
One of the measures of Gaussianity is the (normalized) kurtosis, $\kappa$, which is a statistic based on second and fourth-order moments. For a circularly-symmetric complex-valued random variable $X$, kurtosis is defined as [15]:
$$ \kappa_X = \frac{\mathrm{E}|x|^4}{(\mathrm{E}|x|^2)^2} - 2 \tag{4} $$
$\kappa_X$ is greater than zero, equal to zero, and less than zero for super-Gaussian, Gaussian, and sub-Gaussian random variables, respectively.
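The sign convention of (4) can be checked with a small sample-based sketch. The estimator below replaces the expectations by sample averages; the function name and the two test constellations (a unit-modulus 8-PSK ring and a hypothetical two-ring set with radii 1 and 2) are illustrative assumptions only.

```python
import cmath
import math


def normalized_kurtosis(samples):
    """Sample version of Eq. (4): E|x|^4 / (E|x|^2)^2 - 2."""
    n = len(samples)
    m2 = sum(abs(s) ** 2 for s in samples) / n
    m4 = sum(abs(s) ** 4 for s in samples) / n
    return m4 / m2 ** 2 - 2.0


# Constant-modulus ring: E|x|^4 = (E|x|^2)^2, so the kurtosis is -1.
psk8 = [cmath.exp(2j * math.pi * k / 8) for k in range(8)]
# Hypothetical two-ring constellation, four points on each ring.
two_ring = [r * cmath.exp(2j * math.pi * k / 4) for r in (1.0, 2.0) for k in range(4)]

print(normalized_kurtosis(psk8))
print(normalized_kurtosis(two_ring))
```

Both values are negative, which is consistent with the sub-Gaussian nature of digitally modulated signals noted in the text.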
Expectedly, most of the existing blind equalizers use second and fourth-order statistics of the equalized sequence. The first method to rely on the aforesaid statistics is the constant modulus criterion [16]; it is given by
$$ \min_{\mathbf{w}} \left\{ \mathrm{E}|y_n|^4 - 2R^2\, \mathrm{E}|y_n|^2 \right\}, \tag{5} $$
where $R$ is the Godard radius and $R^2 = \mathrm{E}|a_n|^4 / \mathrm{E}|a_n|^2$ is the Godard dispersion constant [17]. For an input signal that has a constant modulus $|a_n| = R$, the constant modulus criterion penalizes output samples $y_n$ that do not have the desired constant modulus characteristics [18].
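For a finite constellation with equiprobable symbols, the Godard dispersion constant $R^2 = \mathrm{E}|a_n|^4/\mathrm{E}|a_n|^2$ reduces to a ratio of sums. The sketch below computes it; the function name and the two-ring example (radii 1 and 2) are hypothetical and serve only to illustrate the formula.

```python
import cmath
import math


def godard_dispersion(constellation):
    """R^2 = E|a|^4 / E|a|^2 for equiprobable constellation points."""
    m2 = sum(abs(a) ** 2 for a in constellation)
    m4 = sum(abs(a) ** 4 for a in constellation)
    return m4 / m2


ring = [cmath.exp(2j * math.pi * k / 8) for k in range(8)]
two_ring = [r * cmath.exp(2j * math.pi * k / 4) for r in (1.0, 2.0) for k in range(4)]

print(godard_dispersion(ring))      # unit-modulus ring
print(godard_dispersion(two_ring))  # dominated by the outer ring
```

For a single ring the constant equals the squared radius, which is the case in which the criterion (5) is exactly matched to the signal.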
Shalvi and Weinstein [15] demonstrated that the condition of equality between the PDF of the transmitted and equalized signals, due to Theorem 1, was excessively tight. Under assumptions similar to those laid out in Theorem 1, they discussed the possibility of performing blind equalization by satisfying the condition $\mathrm{E}|y_n|^2 = \mathrm{E}|a_n|^2$ and ensuring that a nonzero cumulant (of any order higher than 2) of $a_n$ and $y_n$ are equal. For a two dimensional signal $a_n$ with four-quadrant symmetry (i.e., $\mathrm{E}a_n^2 = 0$), they suggested maximizing the following exemplary cost containing second and fourth-order statistics:
$$ \max_{\mathbf{w}} \left| \mathrm{E}|y_n|^4 - 2\left(\mathrm{E}|y_n|^2\right)^2 \right| \quad \text{s.t. } \mathrm{E}|y_n|^2 = \mathrm{E}|a_n|^2 \tag{6} $$
In the next section, we discuss equalization/deconvolution techniques
which evolved around the notion of entropy as a measure of Gaus-
sianity.
III. MINIMUM ENTROPY DECONVOLUTION
In statistics, the most powerful test of the null hypothesis against
another is one which maximizes discrimination and minimizes the
probability of accepting the null hypothesis as true when actually the
alternative is true. In information theory, a statistic that maximizes
detectability of a signal while minimizing the probability of a false
alarm is most powerful. In the seismic community, however, a statistical
(rank-discriminating) test against Gaussianity, which can be used to
design a deconvolution filter, is loosely termed the MED criterion [6],
[4].
Hogg [19] developed a scale-invariant and most powerful rank-discriminating test for one member of the generalized Gaussian family against another by considering the following PDF:
$$ p_Y(y, \alpha) = \frac{\alpha}{2\,\Gamma\!\left(\frac{1}{\alpha}\right)} \exp\left(-|y|^{\alpha}\right), \quad |y| < \infty \tag{7} $$
where $\alpha > 0$ is the shape parameter, and $\Gamma(\cdot)$ is the Gamma function. To determine if a random set of samples $\{y_1, y_2, \cdots, y_B\}$ is drawn from the distribution $p_Y(y, \alpha_2)$ as opposed to $p_Y(y, \alpha_1)$, a ratio test was derived (based on the procedure described in [20]) as follows:
$$ V_Y(\alpha_1, \alpha_2) = \frac{\left(\frac{1}{B}\sum_{i=1}^{B} |y_i|^{\alpha_1}\right)^{B/\alpha_1}}{\left(\frac{1}{B}\sum_{i=1}^{B} |y_i|^{\alpha_2}\right)^{B/\alpha_2}} \;\underset{\alpha=\alpha_1}{\overset{\alpha=\alpha_2}{\gtrless}}\; \tau \tag{8} $$
where $\tau$ is some threshold. The larger $V$ becomes, the more probable it is that the sample set $Y$ is drawn from the distribution $p_Y(y, \alpha_2)$ and not $p_Y(y, \alpha_1)$. It would be interesting to look at the entropy, $H(\alpha)$, associated with (7); it is given as follows [21]:
$$ H(\alpha) = \log\left( \frac{2}{\alpha} \sqrt{\frac{\Gamma\!\left(\frac{1}{\alpha}\right)^{3}}{\Gamma\!\left(\frac{3}{\alpha}\right)}} \right) + \frac{1}{\alpha} \tag{9} $$
This function is illustrated in Fig. 2 for different members of the family. The entropy is maximum for the Gaussian case (when $\alpha = 2$). Note that $\alpha < 2$ and $\alpha > 2$ represent super-Gaussian and sub-Gaussian cases, respectively. As $\alpha \to 0$, $H(\alpha)$ rapidly goes to $-\infty$, which is the entropy of the certain event; on the other side, however, $H(\alpha)$ falls off slowly to another minimum, which is the entropy of the uniform distribution.
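The behaviour described for (9) can be checked numerically. The sketch below evaluates the entropy of the generalized Gaussian family in its unit-variance form, $H(\alpha) = \log\big(\tfrac{2}{\alpha}\sqrt{\Gamma(1/\alpha)^3/\Gamma(3/\alpha)}\big) + 1/\alpha$; this normalization is an assumption, chosen because it reproduces the maximum $\log\sqrt{2\pi e}$ quoted in the caption of Fig. 2. The function name is hypothetical.

```python
import math


def gg_entropy(alpha):
    """Entropy in nats of the unit-variance generalized Gaussian, Eq. (9).

    Log-gamma is used instead of gamma to avoid overflow for small alpha.
    """
    log_g1 = math.lgamma(1.0 / alpha)
    log_g3 = math.lgamma(3.0 / alpha)
    return math.log(2.0 / alpha) + 1.5 * log_g1 - 0.5 * log_g3 + 1.0 / alpha


for alpha in (0.5, 1.0, 2.0, 4.0, 50.0):
    print(alpha, gg_entropy(alpha))
```

The printed values peak at $\alpha = 2$ and decrease on both sides, in line with the super-Gaussian and sub-Gaussian regimes discussed above.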
From the work of Geary [22], it became known to the statistical community that the test against Gaussianity $\frac{1}{B}\sum_{i=1}^{B} |x_i|^4 \big/ \left(\frac{1}{B}\sum_{i=1}^{B} |x_i|^2\right)^2$ is most efficient when no information on the distribution of the random sample is available. Remarkably, Wiggins exploited the same idea and sought to determine the inverse channel $\mathbf{w}^{\dagger}$ that maximizes the kurtosis of the deconvolved seismic data $y_n$ [6]. Since seismic data are super-Gaussian in nature, given $B$ samples of $y_n$, Wiggins suggested maximizing the following test (or cost):
$$ W(y_n) = \frac{\frac{1}{B}\sum_{n=1}^{B} |y_n|^4}{\left(\frac{1}{B}\sum_{n=1}^{B} |y_n|^2\right)^2} \;\xrightarrow{\text{large } B}\; \frac{\mathrm{E}|y_n|^4}{(\mathrm{E}|y_n|^2)^2} \tag{10} $$
This scheme seeks the smallest number of large spikes (or impulses) consistent with the data, thus maximizing the order, or equivalently, minimizing the entropy or disorder in the data [23]. Wiggins coined the term MED criterion for the test (10)$^{1}$. Immediately after Wiggins, Ooe and Ulrych [24] realized that better deconvolution results may be obtained for super-Gaussian signals if non-Gaussianity is maximized with some $\alpha$ less than two; they used $\alpha = 1$ and suggested
$$ \mathrm{OU}(y_n) = \frac{\frac{1}{B}\sum_{n=1}^{B} |y_n|^2}{\left(\frac{1}{B}\sum_{n=1}^{B} |y_n|\right)^2} \;\xrightarrow{\text{large } B}\; \frac{\mathrm{E}|y_n|^2}{(\mathrm{E}|y_n|)^2} \tag{11} $$
Later, Gray presented a generic MED criterion with two degrees of freedom [25]:
$$ G(y_n) = \frac{\frac{1}{B}\sum_{n=1}^{B} |y_n|^p}{\left(\frac{1}{B}\sum_{n=1}^{B} |y_n|^q\right)^{p/q}} \;\xrightarrow{\text{large } B}\; \frac{\mathrm{E}|y_n|^p}{(\mathrm{E}|y_n|^q)^{p/q}} \tag{12} $$
Note that costs (10)-(12) are members of (8) for different values of the shape parameters. In the context of digital communication, where the underlying distribution of the transmitted (possibly pulse amplitude modulated) data symbols is closer to a uniform density (sub-Gaussian), we can obtain a blind equalizer by optimizing Gray's cost (12) as follows [26], [27]:
$$ \mathbf{w}^{\dagger} = \arg\max_{\mathbf{w}}\; G(y_n), \quad \text{for } p = 2 \text{ and } q > 2. \tag{13} $$
Recently, Abrar and Nandi [28] discussed the case $(p, q) = (2, \infty)$ for the blind equalization of APSK signals:
$$ \mathrm{AN}(y_n) = \frac{\frac{1}{B}\sum_{n=1}^{B} |y_n|^2}{(\max\{|y_n|\})^2} \;\xrightarrow{\text{large } B}\; \frac{\mathrm{E}|y_n|^2}{(\max\{|y_n|\})^2} \tag{14} $$
Maximizing (14) can be interpreted as determining the equalizer
coefficients, $\mathbf{w}$, which drive the distribution of the output, $y_n$,
away from the Gaussian distribution toward the uniform one, thus
successfully removing the interference from the received signal. Note that the
cost (14) is an optimal, scale-invariant test for APSK signal against
Gaussian (refer to for details).
A. Admissibility of the Cost AN(yn)
In this section, we discuss the admissibility of $\mathrm{AN}(y_n)$ in (14). By admissibility, we mean that the cost $\mathrm{AN}(y_n)$ yields consistent estimates of the exact channel equalizer when the transmitted signal is i.i.d.; in other words, the steady-state overall impulse response ($t$) is a delta function with arbitrary delay. Without loss of generality, we can assume that the channel, equalizer and transmitted signal are real-valued. Note that $\max\{|y_n|\} = \max\{|a_n|\} \sum_l |t_l| < \infty$ and, owing to the i.i.d. property, $\mathrm{E}y_n^2 = \mathrm{E}a_n^2 \sum_l t_l^2$. Next we can express the cost (14) in the $t$ domain as follows (assuming large $B$):
$$ \mathrm{AN}(t) = \frac{\sum_l |t_l|^2}{\left(\sum_l |t_l|\right)^2} \tag{15} $$
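The shape of the cost (15) is easy to probe numerically: it equals one only when a single tap is non-zero and drops to $1/L$ when $L$ taps share the same magnitude. The sketch below is illustrative; the function name `an_cost` is hypothetical.

```python
def an_cost(t):
    """Cost of Eq. (15): sum_l |t_l|^2 / (sum_l |t_l|)^2."""
    num = sum(abs(x) ** 2 for x in t)
    den = sum(abs(x) for x in t) ** 2
    return num / den


print(an_cost([0.0, 1.0, 0.0]))   # single tap: global maximum
print(an_cost([1.0, 0.3]))        # residual ISI lowers the cost
print(an_cost([0.5, 0.5]))        # equal taps: lowest two-tap value
```

Since the cost is a ratio of homogeneous functions of the same degree, scaling $t$ by any non-zero gain leaves it unchanged, which reflects the scale invariance mentioned in the text.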
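Evaluating the gradient w.r.t. the $k$-th tap, we obtain
$$ \frac{\partial\, \mathrm{AN}}{\partial t_k} = \frac{\left(\sum_l |t_l|\right)^2 \frac{\partial}{\partial t_k}\sum_j t_j^2 - \left(\sum_l t_l^2\right) \frac{\partial}{\partial t_k}\left(\sum_j |t_j|\right)^2}{\left(\sum_l |t_l|\right)^4} = \frac{2\left(\sum_l |t_l|\right)^2 t_k - 2\left(\sum_l t_l^2\right)\left(\sum_j |t_j|\right)\frac{t_k}{|t_k|}}{\left(\sum_l |t_l|\right)^4} \tag{16} $$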
Equating the above to zero, we get $|t_k| \sum_j |t_j| - \sum_l t_l^2 = 0$. It shows that for a doubly infinite equalizer, the stable global maxima of $\mathrm{AN}$ are along the axes, i.e.,
$$ \{|t_k|\} = \delta_{k-k^{\dagger}}, \quad k^{\dagger} = \cdots, -1, 0, 1, \cdots \tag{17} $$
and unstable equilibria are along the diagonal, located at $\{|t_k|\} = (1/L)\sum_{j \in I_L} \delta_{k-j}$, where $I_L$ ($L \ge 2$) is any $L$-element subset of the integer set. The surface of cost (15) is depicted in Fig. 3(a) for a two-tap scenario; it can be seen that the cost is maximized only for the solution specified in (17). Next, incorporating the a priori signal knowledge $\gamma := \max\{|a_n|\}$, the cost (14) can be written in a constrained form as follows:
$$ \mathbf{w}^{\dagger} = \arg\max_{\mathbf{w}}\; \mathrm{E}|y_n|^2 \quad \text{s.t. } \max\{|y_n|\} \le \gamma. \tag{18} $$
Footnote 1: The simple structure of the impulsive function led Wiggins to call his technique MED. According to [24], however, the approach which Wiggins actually adopts is not entropy; it is rather a variant of varimax rotation, which is widely used in obtaining a simple factor structure in factor analysis.
The geometry of the cost (18) is depicted in Fig. 3(b) for a two-tap scenario; it can be seen that the cost is maximized when the two balls, $\sum_i |t_i|^2$ and $\sum_i |t_i|$, coincide. The cost (18) is quadratic, and the feasible region (constraint) is a convex set. The problem, however, is non-convex and may have multiple local maxima. Nevertheless, we have the following theorem (refer to [29] for proof):
Theorem 2: Assume $\mathbf{w}^{\dagger}$ is a local optimum in
$$ \mathbf{w}^{\dagger} = \arg\max_{\mathbf{w}}\; \mathrm{E}|y_n|^2 \quad \text{s.t. } \max\{|y_n|\} \le \gamma $$
and $t^{\dagger}$ is the corresponding total channel-equalizer impulse response and channel noise is negligible. Then it holds that $\{|t_k|\} = \delta_{k-k^{\dagger}}$, where $k^{\dagger} = \cdots, -1, 0, 1, \cdots$.
Thus an equalizer which maximizes the output energy while constraining the largest amplitude is able to mitigate the ISI induced by the channel.
B. Stochastic Gradient-Based Optimization of AN
When an equalizer $\mathbf{w}$ optimizes a cost $J = \mathrm{E}\,\phi(y_n)$ by a stochastic gradient-based adaptive method, the resulting algorithm is $\mathbf{w}_n = \mathbf{w}_{n-1} \pm \mu\, \partial\phi/\partial\mathbf{w}^*_{n-1}$, where $\mu > 0$ is the step-size, governing the speed of convergence and the level of steady-state performance [30]. The positive and negative signs are for maximization and minimization, respectively.
Note that a straightforward gradient-based adaptive implementation of $\mathrm{AN}$ is not possible. The reason is that the order statistic $\max\{|y_n|\}$ is not a differentiable quantity. In [28], however, the authors presented an instantaneous and differentiable version of the constrained $\mathrm{AN}$ and obtained the following stochastic gradient-based algorithm:
$$ \mathbf{w}_n = \mathbf{w}_{n-1} + \mu f_n y_n^* \mathbf{x}_n, \qquad f_n = \begin{cases} 1, & \text{if } |y_n| < \gamma, \\ -\beta, & \text{if } |y_n| \ge \gamma. \end{cases} \tag{19} $$
where $\beta$ is a constant defined as follows (refer to ):
$$ \beta := \frac{\mathrm{E}\,|y_n|^2_{\{|y_n| < \gamma\}}}{\mathrm{E}\,|y_n|^2_{\{|y_n| \ge \gamma\}}} \tag{20} $$
The algorithm (19) was termed the beta constant modulus algorithm (βCMA). The βCMA may be kept stable in the mean-square sense if its step-size satisfies the condition $0 < \mu < 3/(2\,\mathrm{tr}(\mathbf{R}))$, where $\mathrm{tr}(\cdot)$ is the trace operator and $\mathbf{R} = \mathrm{E}\,\mathbf{x}_n\mathbf{x}_n^H$ is the autocorrelation matrix [31].
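A single iteration of the update (19) can be sketched as follows. This is a minimal illustration under the reading that $f_n = 1$ for $|y_n| < \gamma$ and $f_n = -\beta$ otherwise; the function name `beta_cma_step` and the numerical values in the usage lines are hypothetical.

```python
def beta_cma_step(w, x, mu, beta, gamma):
    """One update of Eq. (19): w_n = w_{n-1} + mu * f_n * conj(y_n) * x_n."""
    # Equalizer output y_n = w_{n-1}^H x_n.
    y = sum(wi.conjugate() * xi for wi, xi in zip(w, x))
    # Error weighting: reward outputs inside the radius, penalize the rest.
    f = 1.0 if abs(y) < gamma else -beta
    w_new = [wi + mu * f * y.conjugate() * xi for wi, xi in zip(w, x)]
    return w_new, y, f


w0 = [1.0 + 0.0j]
inside, y_in, f_in = beta_cma_step(w0, [0.5 + 0.0j], mu=0.1, beta=2.0, gamma=1.0)
outside, y_out, f_out = beta_cma_step(w0, [2.0 + 0.0j], mu=0.1, beta=2.0, gamma=1.0)
print(inside, f_in)    # output below gamma: the tap grows
print(outside, f_out)  # output above gamma: the tap shrinks
```

The two usage lines show the mechanism behind the constrained cost (18): output energy is pushed up until the largest amplitude reaches $\gamma$, after which the update reverses sign.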
IV. ADAPTIVE NEWTON-LIKE OPTIMIZATION OF AN
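In this section, we aim to optimize $\mathrm{AN}$ using a Newton-like adaptive method. When Newton-like optimization is used by the equalizer, the update for a minimization scenario is given as follows [32]:
$$ \mathbf{w}_n = \mathbf{w}_{n-1} - \mu \left( \frac{\partial^2 \phi}{\partial \mathbf{w}_{n-1}\, \partial \mathbf{w}_{n-1}^H} \right)^{-1} \left( \frac{\partial \phi}{\partial \mathbf{w}_{n-1}} \right)^{*} \tag{21} $$
Firstly, to simplify algebraic manipulation, we suggest the following instantaneous cost function:
$$ \mathbf{w}^{\dagger} = \arg\min_{\mathbf{w}}\; \phi_n, \quad \text{where } \phi_n := -f_n |y_n|^2 \tag{22} $$
where $f_n$ is as specified in (19). It is simple to show that solving (22) by a gradient-based method results in βCMA. For the formulation of a Newton-like scheme, we introduce exponential weights (memory) in (22) as follows:
$$ \mathbf{w}^{\dagger} = \arg\min_{\mathbf{w}}\; \bar{\phi}_n, \quad \text{where } \bar{\phi}_n := -\sum_{k=0}^{n} \lambda^{n-k} f_k \big| \underbrace{\mathbf{w}_{n-1}^H \mathbf{x}_k}_{=:y_k} \big|^2 \tag{23} $$
and $\lambda$ is the forgetting factor. When $\lambda$ equals 1, we have a cost which may be considered as a sort of (infinite memory) least-squares problem. For $0 < \lambda < 1$, however, the cost has effective finite memory ($1/(1-\lambda)$) and it may be optimized adaptively using (21).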
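We readily evaluate the gradient $\mathbf{g}_n$ and Hessian $\mathbf{H}_n$ for (23) as follows:
$$ \mathbf{g}_n := \frac{\partial \bar{\phi}_n}{\partial \mathbf{w}_{n-1}} = -\sum_{k=0}^{n} \lambda^{n-k} f_k y_k \mathbf{x}_k^{*}, \qquad \mathbf{H}_n := \frac{\partial^2 \bar{\phi}_n}{\partial \mathbf{w}_{n-1}\, \partial \mathbf{w}_{n-1}^H} = -\sum_{k=0}^{n} \lambda^{n-k} f_k \mathbf{x}_k \mathbf{x}_k^H \tag{24} $$
Here, we encounter a problem. The required steady-state condition $|y_k| < \gamma$ implies $f_k = +1\ \forall k$; this, however, leads to the undesirable situation
$$ \mathbf{H}_n = -\frac{1}{1-\lambda}\, \hat{\mathbf{R}}_n < 0 \tag{25} $$
where $\hat{\mathbf{R}}_n := \sum_{k=0}^{n} \mathbf{x}_k \mathbf{x}_k^H > 0$. So, assuming a converging equalizer, the Hessian is found to be negative definite. It means that such an equalizer will try to maximize the cost instead of minimizing it. This is contrary to the problem definition and we conclude that a straight Newton-like βCMA equalizer, implementing the Hessian as specified in (24), will diverge. Our simulations are found to agree with this argument. One possible way to resolve this matter is to ensure that the recursively computed Hessian remains positive definite. Consider the following solution:
$$ \mathbf{H}_n = \sum_{k=0}^{n} \lambda^{n-k} |f_k|\, \mathbf{x}_k \mathbf{x}_k^H = \lambda \mathbf{H}_{n-1} + |f_n|\, \mathbf{x}_n \mathbf{x}_n^H \tag{26} $$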
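Invoking the matrix inversion lemma, we obtain:
$$ \mathbf{P}_n = \mathbf{H}_n^{-1} = \frac{1}{\lambda} \left( \mathbf{P}_{n-1} - \frac{\mathbf{P}_{n-1} \mathbf{x}_n \mathbf{x}_n^H \mathbf{P}_{n-1}^H}{\lambda z_n + \mathbf{x}_n^H \mathbf{P}_{n-1} \mathbf{x}_n} \right) \tag{27} $$
where $z_n = |f_n|^{-1}$. It was shown in [33] that the performance of a Newton-like algorithm may be enhanced if $z_n$ is computed as an iterative estimate, that is, $z_n \to \hat{z}_n = \langle |f_n|^{-1} \rangle$, where $\langle\cdot\rangle$ represents some averaged estimate of the enclosed entity. One of the possibilities is $\hat{z}_n = \hat{z}_{n-1} + \frac{1}{n+1}\left( |f_n|^{-1} - \hat{z}_{n-1} \right)$. Further note that $\mathbf{g}_n = \lambda \mathbf{g}_{n-1} - f_n y_n \mathbf{x}_n^{*}$. For $\lambda$ close to one, the vector $\mathbf{g}_n$ will be dominated by its former estimate $\mathbf{g}_{n-1}$, that is $\mathbf{g}_n \approx \mathbf{g}_{n-1}$, and this leads to a simpler expression $\mathbf{g}_n \approx -(1-\lambda)^{-1} f_n y_n \mathbf{x}_n^{*}$. With this consideration, the proposed Newton-like βCMA (βNL-CMA) update is given as follows:
$$ \mathbf{w}_n = \mathbf{w}_{n-1} + \mu \mathbf{P}_n f_n y_n^{*} \mathbf{x}_n, \tag{28} $$
where $\mathbf{P}_n$ is as evaluated in (27) and the factor $(1-\lambda)^{-1}$ in $\mathbf{g}_n$ is merged with $\mu$. Note that the recursive calculation of $\mathbf{P}_n$ keeps the computational complexity of the proposed scheme to $O(N^2)$ per iteration. The proposed scheme is summarized in Table I.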
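The recursion (27)-(28) can be sketched in a few lines of dependency-free Python. The sketch assumes the denominator $\lambda z_n + \mathbf{x}_n^H\mathbf{P}_{n-1}\mathbf{x}_n$ that follows from applying the matrix inversion lemma to (26), and a running-mean estimate of $|f_n|^{-1}$ started from a caller-supplied value; the function names, the argument order and the initial value of the running mean are hypothetical choices, not details given in the paper.

```python
def mat_vec(P, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(pij * vj for pij, vj in zip(row, v)) for row in P]


def beta_nl_cma_step(w, P, z_avg, n, x, mu, lam, beta, gamma):
    """One Newton-like update: Eq. (27) for P_n, then Eq. (28) for w_n."""
    y = sum(wi.conjugate() * xi for wi, xi in zip(w, x))   # y_n = w_{n-1}^H x_n
    f = 1.0 if abs(y) < gamma else -beta                    # f_n of Eq. (19)
    z_avg = z_avg + (1.0 / abs(f) - z_avg) / (n + 1)        # running mean of 1/|f_n|
    Px = mat_vec(P, x)                                      # P_{n-1} x_n
    quad = sum(xi.conjugate() * pxi for xi, pxi in zip(x, Px)).real
    denom = lam * z_avg + quad
    size = len(x)
    # (P x)(P x)^H equals P x x^H P^H, the rank-one correction of Eq. (27).
    P_new = [[(P[i][j] - Px[i] * Px[j].conjugate() / denom) / lam
              for j in range(size)] for i in range(size)]
    Pnx = mat_vec(P_new, x)                                 # P_n x_n
    w_new = [wi + mu * f * y.conjugate() * pi for wi, pi in zip(w, Pnx)]
    return w_new, P_new, z_avg, y


# Scalar example: here H_n = lam * H_{n-1} + |f_n| * |x_n|^2 = 0.9 + 0.25.
w1, P1, z1, y1 = beta_nl_cma_step([1.0 + 0.0j], [[1.0 + 0.0j]], 1.0, 0,
                                  [0.5 + 0.0j], mu=0.1, lam=0.9,
                                  beta=2.0, gamma=1.0)
print(P1[0][0], w1[0])
```

In the scalar example the recursively obtained $P_n$ matches the direct inverse of $\lambda H_{n-1} + |f_n||x_n|^2$, which is a quick sanity check on the rank-one update; each step costs $O(N^2)$ operations because no explicit matrix inverse is formed.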
V. STEADY-STATE PERFORMANCE OF βNL-CMA
In this section, we study the steady-state tracking performance of βNL-CMA in a non-stationary environment. Although the steady-state performance essentially corresponds to only one point on the learning curve of the adaptive filter, there are many situations where this information is of value by itself. The addressed approach is based on studying the energy flow through each iteration of an adaptive filter [34], [35], and it relies on a fundamental error variance relation that avoids the weight-error variance recursion altogether. We may remark that although we focus on the steady-state performance of βNL-CMA, the same approach can also be used to study the transient (i.e., convergence and stability) behaviour of this filter. These details will be provided elsewhere.
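Consider a non-stationary environment with time-varying optimal weight-vector $\mathbf{w}^o_n$, also called the Wiener filter, given by
$$ \mathbf{w}^o_n = \mathbf{w}^o_{n-1} + \mathbf{q}_n \tag{29} $$
where $\mathbf{q}_n$ is some random perturbation such that $\mathrm{E}\,\mathbf{q}_n\mathbf{q}_n^H = \mathbf{Q} = \sigma_q^2 \mathbf{I}$. Using the Wiener solution, the data $a_n$ can be expressed as $a_n = (\mathbf{w}^o_{n-1})^H \mathbf{x}_n + v_n$, where $v_n$ is the disturbance and is uncorrelated with $\mathbf{x}_n$ ($\mathrm{E}\,v_n^{*}\mathbf{x}_n = 0$) [36]. The purpose of the tracking analysis of an adaptive filter is to study its ability to track such time variations. The weight error vector $\tilde{\mathbf{w}}_n$, which quantifies how far the weight vector is from the Wiener solution, is given as
$$ \tilde{\mathbf{w}}_n = \mathbf{w}^o_n - \mathbf{w}_n \tag{30} $$
The so-called a posteriori and a priori errors are defined as $e_{p,n} = (\tilde{\mathbf{w}}_n - \mathbf{q}_n)^H \mathbf{x}_n$ and $e_{a,n} = \tilde{\mathbf{w}}_{n-1}^H \mathbf{x}_n$, respectively. First consider a generic update expression $\mathbf{w}_n = \mathbf{w}_{n-1} + \mu \mathbf{P}_n \Phi_n^{*} \mathbf{x}_n$. Subtracting $\mathbf{w}^o_n$ from both sides of this update, substituting its value from (29) and exploiting (30), we get
$$ \tilde{\mathbf{w}}_n = \tilde{\mathbf{w}}_{n-1} - \mu \mathbf{P}_n \Phi_n^{*} \mathbf{x}_n + \mathbf{q}_n \tag{31} $$
Taking the Hermitian transpose of (31) and post-multiplying by $\mathbf{x}_n$, we obtain:
$$ (\tilde{\mathbf{w}}_n - \mathbf{q}_n)^H \mathbf{x}_n = \tilde{\mathbf{w}}_{n-1}^H \mathbf{x}_n - \mu\, \mathbf{x}_n^H \mathbf{P}_n \mathbf{x}_n \Phi_n \tag{32a} $$
$$ e_{p,n} = e_{a,n} - \mu \Phi_n \|\mathbf{x}_n\|^2_{\mathbf{P}_n} \tag{32b} $$
where $\|\mathbf{x}_n\|^2_{\mathbf{P}_n} := \mathbf{x}_n^H \mathbf{P}_n \mathbf{x}_n$ is the weighted Euclidean norm. From (32b), we have $\Phi_n = \frac{1}{\mu \|\mathbf{x}_n\|^2_{\mathbf{P}_n}} (e_{a,n} - e_{p,n})$, $\mathbf{x}_n \neq 0$. Now, substituting the value of $\Phi_n$ in (31) we get
$$ \tilde{\mathbf{w}}_n - \mathbf{q}_n = \tilde{\mathbf{w}}_{n-1} - \bar{\mu}_n \mathbf{P}_n (e_{a,n} - e_{p,n})^{*} \mathbf{x}_n \tag{33} $$
where $\bar{\mu}_n := \left( \|\mathbf{x}_n\|^2_{\mathbf{P}_n} \right)^{+}$ is used to define the pseudo-inverse of $\|\mathbf{x}_n\|^2_{\mathbf{P}_n}$. Taking the $\mathbf{P}_n^{-1}$-weighted norm on both sides and simplifying, we get $\|\tilde{\mathbf{w}}_n - \mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}}$ on the left and the quantity $\|\tilde{\mathbf{w}}_{n-1}\|^2_{\mathbf{P}_n^{-1}} - \bar{\mu}_n |e_{a,n}|^2 + \bar{\mu}_n |e_{p,n}|^2$ on the right. This results in the energy conservation relation, which is summarized as
$$ \|\tilde{\mathbf{w}}_n - \mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} + \bar{\mu}_n |e_{a,n}|^2 = \|\tilde{\mathbf{w}}_{n-1}\|^2_{\mathbf{P}_n^{-1}} + \bar{\mu}_n |e_{p,n}|^2 \tag{34} $$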
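The relation (34) is an algebraic identity and can be verified numerically for arbitrary data. The sketch below does so for a diagonal, positive weighting matrix $\mathbf{P}_n$, a simplification adopted here only so that $\mathbf{P}_n^{-1}$ is trivial to form without a linear-algebra library; the function names and the numerical values are hypothetical.

```python
def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def energy_balance(w_prev, x, q, phi, mu, p_diag):
    """Return both sides of Eq. (34) for P_n = diag(p_diag), p_diag > 0."""
    Px = [p * xi for p, xi in zip(p_diag, x)]
    # Weight-error recursion of Eq. (31).
    w_new = [wp - mu * phi.conjugate() * pxi + qi
             for wp, pxi, qi in zip(w_prev, Px, q)]
    e_a = inner(w_prev, x)                       # a priori error
    diff = [wn - qi for wn, qi in zip(w_new, q)]
    e_p = inner(diff, x)                         # a posteriori error
    mu_bar = 1.0 / inner(x, Px).real             # 1 / ||x||^2_P

    def norm_p_inv(v):
        # ||v||^2 weighted by P^{-1} for diagonal P.
        return sum(abs(vi) ** 2 / p for vi, p in zip(v, p_diag))

    lhs = norm_p_inv(diff) + mu_bar * abs(e_a) ** 2
    rhs = norm_p_inv(w_prev) + mu_bar * abs(e_p) ** 2
    return lhs, rhs


lhs, rhs = energy_balance([0.3 + 0.1j, -0.2 + 0.4j], [1 - 1j, 0.5 + 2j],
                          [0.01j, -0.02 + 0j], 0.7 - 0.3j, 0.05, [2.0, 0.5])
print(lhs, rhs)
```

The two printed numbers agree to rounding error regardless of the step-size or the error function value, which illustrates why the relation holds without any statistical assumption.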
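Examining the expectation of the leftmost term of the energy relation, we have
$$ \begin{aligned} \mathrm{E}\|\tilde{\mathbf{w}}_n - \mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} &= \mathrm{E}\|\tilde{\mathbf{w}}_n\|^2_{\mathbf{P}_n^{-1}} + \mathrm{E}\|\mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} - 2\,\mathrm{E}\,\mathbf{q}_n^H \mathbf{P}_n^{-1} \tilde{\mathbf{w}}_n \\ &= \mathrm{E}\|\tilde{\mathbf{w}}_n\|^2_{\mathbf{P}_n^{-1}} + \mathrm{E}\|\mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} - 2\left( \mathrm{E}\,\mathbf{q}_n^H \mathbf{P}_n^{-1} \tilde{\mathbf{w}}_{n-1} + \mathrm{E}\|\mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} - \mu\, \mathrm{E}\,\mathbf{q}_n^H \mathbf{x}_n \Phi_n^{*} \right) \\ &= \mathrm{E}\|\tilde{\mathbf{w}}_n\|^2_{\mathbf{P}_n^{-1}} - \mathrm{E}\|\mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} \end{aligned} \tag{35} $$
where the vanished terms are a result of the following assumptions about the random-walk driving-noise sequence $\{\mathbf{q}_n\}$.
A.1) $\mathbf{q}_n$ is i.i.d. and $\mathrm{E}\,\mathbf{q}_n = 0$; $\mathbf{Q} = \mathrm{E}\,\mathbf{q}_n\mathbf{q}_n^H = \sigma_q^2 \mathbf{I} > 0$; $\{\mathbf{q}_n\} \perp \{\mathbf{x}_n\}$ and $\{\mathbf{q}_n\} \perp \{\Phi_n\}$.
Using (35) with (34), and noting that in steady state $\mathrm{E}\|\tilde{\mathbf{w}}_n\|^2_{\mathbf{P}_n^{-1}} = \mathrm{E}\|\tilde{\mathbf{w}}_{n-1}\|^2_{\mathbf{P}_n^{-1}}$, we obtain
$$ \bar{\mu}_n \mathrm{E}|e_{a,n}|^2 = \mathrm{E}\|\mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} + \bar{\mu}_n \mathrm{E}|e_{p,n}|^2 \tag{36} $$
This equation can now be solved for the steady-state excess mean-square-error (EMSE), which is defined by
$$ \zeta := \lim_{n \to \infty} \mathrm{E}|e_{a,n}|^2 \tag{37} $$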
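We emphasize that (36) is an exact relation that holds without any approximation or assumption, except for the assumption that the filter is in steady state. The procedure of finding the EMSE through (36) avoids the need for evaluating $\mathrm{E}\|\tilde{\mathbf{w}}_n\|^2$ or its steady-state value $\mathrm{E}\|\tilde{\mathbf{w}}_\infty\|^2$. To proceed further, we make the following assumptions (refer to [37] for similar treatment):
A.2) $\mathrm{E}\,\mathbf{P}_n \mathbf{x}_n \mathbf{x}_n^H = \mathrm{E}\,\mathbf{P}_n\, \mathrm{E}\,\mathbf{x}_n \mathbf{x}_n^H$ (holds in steady state as $\mathbf{P}_n$ can be replaced by $\mathrm{E}\,\mathbf{P}_n$, see [38])
A.3) $\mathrm{E}\,\|\mathbf{x}_n\|^2_{\mathbf{P}_n} |e_{a,n}|^2 = \mathrm{E}\,\|\mathbf{x}_n\|^2_{\mathbf{P}_n}\, \mathrm{E}|e_{a,n}|^2$ (see [38])
A.4) $\lim_{n\to\infty} \mathrm{E}\,\mathbf{H}_n^{-1} = (\lim_{n\to\infty} \mathrm{E}\,\mathbf{H}_n)^{-1}$ (justified for the complex Wishart distribution in [30] and shown to be reasonable via simulations in [38])
From (32), we have $e_{p,n} = e_{a,n} - \mu \Phi_n \|\mathbf{x}_n\|^2_{\mathbf{P}_n}$; substituting it into (36), we obtain
$$ 2\,\mathrm{E}\left[ e_{a,n}^{*} \Phi_n \right] = \mu\, \mathrm{E}\,\|\mathbf{x}_n\|^2_{\mathbf{P}_n} |\Phi_n|^2 + \mu^{-1}\, \mathrm{E}\|\mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} \tag{38} $$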
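To solve the right hand side (RHS) of (38), we employ
A.5) $\mathrm{E}\,\|\mathbf{x}_n\|^2_{\mathbf{P}_n} |\Phi_n|^2 = \mathrm{E}\,\|\mathbf{x}_n\|^2_{\mathbf{P}_n}\, \mathrm{E}|\Phi_n|^2$ (separation principle [38])
First, let us solve $\mathrm{E}\,\|\mathbf{x}_n\|^2_{\mathbf{P}_n}$ as
$$ \mathrm{E}\,\|\mathbf{x}_n\|^2_{\mathbf{P}_n} = \mathrm{tr}(\mathrm{E}\,\mathbf{P}_n \mathbf{x}_n \mathbf{x}_n^H) = \mathrm{tr}(\mathrm{E}\,\mathbf{P}_n\, \mathrm{E}\,\mathbf{x}_n \mathbf{x}_n^H) = \mathrm{tr}(\mathrm{E}\,\mathbf{P}_n \mathbf{R}) \tag{39} $$
Here $\mathrm{E}\,\mathbf{P}_n$ requires further investigation. It is observed that
$$ \begin{aligned} \mathrm{E}\,\mathbf{H}_n &= \sum_{i=1}^{n} \lambda^{n-i}\, \mathrm{E}\,|f_n|\, \mathbf{x}_n \mathbf{x}_n^H + \lambda^n \mathbf{I} \\ &= \mathrm{E}|f_n|\; \mathrm{E}\,\mathbf{x}_n \mathbf{x}_n^H \sum_{i=1}^{n} \lambda^{n-i} + \lambda^n \mathbf{I} \\ &= \varrho\, \mathbf{R}\, \frac{1 - \lambda^n}{1 - \lambda} + \lambda^n \mathbf{I} \quad \text{for } \lambda < 1 \end{aligned} \tag{40} $$
where $\varrho := \mathrm{E}|f_n|$. Further, exploiting A.4, we get
$$ \lim_{n\to\infty} \mathrm{E}\,\mathbf{P}_n \approx \left( \lim_{n\to\infty} \mathrm{E}\,\mathbf{H}_n \right)^{-1} = \frac{1}{\varrho} (1 - \lambda)\, \mathbf{R}^{-1} \quad \text{for } \lambda < 1 \tag{41} $$
By using (41) with (39), the following equation results:
$$ \mathrm{E}\,\|\mathbf{x}_n\|^2_{\mathbf{P}_n} = \mathrm{tr}\left( \frac{1}{\varrho} (1 - \lambda)\, \mathbf{R}^{-1} \mathbf{R} \right) = \frac{1}{\varrho} (1 - \lambda)\, \mathrm{tr}(\mathbf{I}) = \frac{1}{\varrho} (1 - \lambda) N. \tag{42} $$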
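Proceeding to the second term on the RHS of (38), we get
$$ \begin{aligned} \mathrm{E}\|\mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} &= \mathrm{E}\,\mathbf{q}_n^H \mathbf{H}_n \mathbf{q}_n = \mathrm{tr}(\mathrm{E}\,\mathbf{q}_n \mathbf{q}_n^H \mathbf{H}_n) \\ &= \mathrm{tr}\left( \mathrm{E}\,\mathbf{q}_n \mathbf{q}_n^H \sum_{i=1}^{n} \lambda^{n-i}\, \mathrm{E}\,\mathbf{x}_n \mathbf{x}_n^H + \lambda^n\, \mathrm{E}\,\mathbf{q}_n \mathbf{q}_n^H \right) \\ &= \mathrm{tr}(\mathbf{Q}\mathbf{R}) \sum_{i=1}^{n} \lambda^{n-i} + \lambda^n\, \mathrm{tr}(\mathbf{Q}) \\ &= \frac{1 - \lambda^n}{1 - \lambda}\, \mathrm{tr}(\mathbf{Q}\mathbf{R}) + \lambda^n\, \mathrm{tr}(\mathbf{Q}) \quad \text{for } \lambda < 1 \end{aligned} \tag{43} $$
so that
$$ \lim_{n\to\infty} \mathrm{E}\|\mathbf{q}_n\|^2_{\mathbf{P}_n^{-1}} = (1 - \lambda)^{-1}\, \mathrm{tr}(\mathbf{Q}\mathbf{R}) \quad \text{for } \lambda < 1 \tag{44} $$
Using (42) and (44), the RHS of (38) becomes
$$ \mathrm{RHS} = \frac{\mu}{\varrho} N (1 - \lambda)\, \mathrm{E}|\Phi_n|^2 + \mu^{-1} (1 - \lambda)^{-1}\, \mathrm{tr}(\mathbf{Q}\mathbf{R}) \tag{45} $$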
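The parameter $\varrho$ is computed as follows:
$$ \begin{aligned} \varrho := \mathrm{E}|f_n| &= \Pr[\,|y_n| < \gamma\,] + \beta \Pr[\,|y_n| > \gamma\,] \\ &= 1 + (\beta - 1) \Pr[\,|y_n| > \gamma\,] \\ &= 1 + (\beta - 1) \sum_{j=1}^{L} \frac{M_j}{M} \int_{\gamma}^{\infty} p(|y_n|; R_j)\, d|y_n| \\ &= 1 + (\beta - 1)\, \mathrm{E}\, Q_{1,0}\!\left( \frac{|a_n|}{\sigma}, \frac{\gamma}{\sigma} \right). \end{aligned} \tag{46} $$
where $Q_{m,v}(a, b)$ is the Nuttall Q-function [39], defined as
$$ Q_{m,v}(a, b) := \int_{b}^{\infty} x^m e^{-\frac{1}{2}(x^2 + a^2)} I_v(ax)\, dx, \quad a, b > 0 \tag{47} $$
where $m > -1$, $v \ge 0$ and $I_v(\cdot)$ is the $v$th-order modified Bessel function of the first kind. Now to solve $\mathrm{E}|\Phi_n|^2$, where $\Phi_n = f_n y_n$, we see that $f_n$ is a piecewise function with a discontinuity at $|y_n| = \gamma$. We start as
$$ \begin{aligned} \mathrm{E}|\Phi_n|^2 = \mathrm{E}\, f_n^2 |y_n|^2 &= \mathrm{E}\,|y_n|^2_{\{0 < |y_n| < \gamma\}} + \beta^2\, \mathrm{E}\,|y_n|^2_{\{\gamma \le |y_n| < \infty\}} \\ &= \mathrm{E}|y_n|^2 + \left( \beta^2 - 1 \right) \underbrace{\mathrm{E}\,|y_n|^2_{\{\gamma \le |y_n| < \infty\}}}_{=:A} \end{aligned} \tag{48} $$
where the factor $A$ is expressed and evaluated as follows:
$$ \begin{aligned} A := \mathrm{E}\,|y_n|^2_{\{\gamma \le |y_n| < \infty\}} &= \sum_{j=1}^{L} \frac{M_j}{M} \int_{\gamma}^{\infty} |y_n|^2\, p(|y_n|; R_j)\, d|y_n| \\ &= \sum_{j=1}^{L} \frac{M_j}{M} \int_{\gamma}^{\infty} \frac{|y_n|^3}{\sigma^2} \exp\left( -\frac{|y_n|^2 + R_j^2}{2\sigma^2} \right) I_0\!\left( \frac{|y_n| R_j}{\sigma^2} \right) d|y_n| \\ &= \sigma^2 \sum_{j=1}^{L} \frac{M_j}{M}\, Q_{3,0}\!\left( \frac{R_j}{\sigma}, \frac{\gamma}{\sigma} \right) = \sigma^2\, \mathrm{E}\, Q_{3,0}\!\left( \frac{|a_n|}{\sigma}, \frac{\gamma}{\sigma} \right). \end{aligned} \tag{49} $$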
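The Nuttall Q-function in (47) has no elementary closed form in general, but for order $v = 0$ it is straightforward to evaluate numerically. The sketch below is one possible approach, assuming a power-series evaluation of $I_0$ and composite Simpson integration on a truncated interval; the function names, the truncation rule and the number of steps are illustrative choices and not part of the paper.

```python
import math


def bessel_i0(z):
    """Modified Bessel function I_0(z) from its power series."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= (z / (2.0 * k)) ** 2
        total += term
    return total


def nuttall_q(m, a, b, upper=None, steps=4000):
    """Q_{m,0}(a, b) of Eq. (47) by Simpson's rule on [b, upper]."""
    if upper is None:
        upper = max(a, b) + 12.0   # the Gaussian tail beyond this is negligible
    h = (upper - b) / steps        # steps must be even for Simpson's rule

    def g(x):
        return x ** m * math.exp(-0.5 * (x * x + a * a)) * bessel_i0(a * x)

    s = g(b) + g(upper)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * g(b + i * h)
    return s * h / 3.0


# Limiting cases with known closed forms (a = 0 removes the Bessel factor).
print(nuttall_q(1, 0.0, 1.0), math.exp(-0.5))
print(nuttall_q(3, 0.0, 1.0), 3.0 * math.exp(-0.5))
```

For moderate arguments this is accurate enough to evaluate terms such as $Q_{1,0}$ and $Q_{3,0}$ in (46) and (49); for large arguments the product of the exponential and $I_0$ should be formed in the log domain to avoid overflow.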
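To solve (48), we need to calculate statistical moments of the modulus $|y_n|$. Next we compute the left hand side (LHS) of (38),
$$ \begin{aligned} \mathrm{LHS} = 2\,\mathrm{E}[e_{a,n}^{*} \Phi_n] &= 2\,\mathrm{E}[(a_n - y_n)^{*} f_n y_n] \\ &= 2\,\mathrm{E}[f_n a_n^{*} y_n] - 2\,\mathrm{E}\, f_n |y_n|^2 \\ &= 2\Big( \underbrace{\mathrm{E}\, f_n |a_n|^2}_{=:B} - \underbrace{\mathrm{E}[f_n a_n^{*} e_{a,n}]}_{=:C} - \underbrace{\mathrm{E}\, f_n |y_n|^2}_{=:D} \Big) \end{aligned} \tag{50} $$
The term $B$ is computed as follows:
$$ \begin{aligned} B &= \mathrm{E}\,|a_n|^2_{\{0 < |y_n| < \gamma\}} - \beta\, \mathrm{E}\,|a_n|^2_{\{\gamma \le |y_n| < \infty\}} \\ &= \mathrm{E}|a_n|^2 - (\beta + 1)\, \mathrm{E}\left[ |a_n|^2\, Q_{1,0}\!\left( \frac{|a_n|}{\sigma}, \frac{\gamma}{\sigma} \right) \right], \end{aligned} \tag{51} $$
Next, the term $D$ is computed as follows:
$$ \begin{aligned} D &= \mathrm{E}\,|y_n|^2_{\{0 < |y_n| < \gamma\}} - \beta\, \mathrm{E}\,|y_n|^2_{\{\gamma \le |y_n| < \infty\}} \\ &= \mathrm{E}|y_n|^2 - (\beta + 1) \underbrace{\mathrm{E}\,|y_n|^2_{\{\gamma \le |y_n| < \infty\}}}_{=:A} \end{aligned} \tag{52} $$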
where A is as obtained in (49). Next the term C is computed:
C = E[a

n
ea,n]
{0<|yn|<}
E[a

n
ea,n]
{|yn|<}
= E[a

n
ea,n]
. .
=0
( + 1) E[a

n
ea,n]
{|yn|<}
= ( + 1)
_
E
_
[an[
2
+
2
Q1,0
_
[an[

__

2
2
EQ3,0
_
[an[

__
.
(53)
Combining (45)-(53), we obtain the following expression to solve for
EMSE of the proposed NL-CMA,
NL.CMA
, as follows:
2

_
1

N(1 )( 1)
_
EQ3,0
__
2

[an[,
_
2

2
1 +
_
tr(QR)
(1 )
+

N(1 )( +E[an[
2
) + 2
_
+2E
_
[an[
2
_
Q1,0
__
2

[an[,
_
2

_
= 0.
(54)
where is as specied in (46).
VI. SIMULATION RESULTS
In [28], the βCMA has already been compared and shown to be better than four existing algorithms, which include the traditional CMA [16] and three of its variants: the (unnormalized) relaxed CMA (RCMA) [40], the Shtrom-Fan algorithm (SFA) [41] and the generalized CMA (GCMA) [42]. In this work, we provide a performance comparison of βNL-CMA (as summarized in Table I) with Newton-like CMA (NL-CMA) [32] and recursive least squares CMA (RLS-CMA) [43]. The performances of CMA and βCMA are shown as standard benchmarks. Moreover, we consider an unorthodox benchmark [44], which is an adaptive method for blind equalization that relies on explicit estimation of the PDF using a Parzen window and is termed stochastic gradient algorithm (SQD). For reference, CMA, βCMA, SQD, RLS-CMA and NL-CMA are summarized in Table II.
Here we would like to highlight the following important implementation details. First, for the SQD scheme the required compensation factor $F(\sigma)$ is computed numerically (see [44] for details). The compensation factor $F(\sigma)$ is a function of the constellation scheme and hence we find the values of $F(\sigma)$ for 16APSK(8,8) using the bisection method (see Fig. 4). The authors of [44] highlight that the use of a smaller $\sigma$ (e.g., 1) results in an increased number of local minima of the cost function. It is for this reason that we use a higher value of $\sigma$ (i.e., $\sigma = 15$) in all simulations (the corresponding $F(\sigma)$ is 1.325 as shown in Fig. 4). Secondly, to ensure the stability of NL-CMA, we slightly modify it. According to Miranda et al. [45], one must check the reliability of the error quantity in the Hessian. If the statistic of the error quantity in the Hessian is not reliable, one must adhere to the simple autocorrelation matrix. Incorporating this recommendation and owing to the sub-Gaussian nature of the transmitted signal, the
Footnote 2: We have observed that traditional root-finding methods like bisection or line-search are good enough for solving (54) and secondly, possibly due to the monotonicity of the Nuttall Q-function, the expression (54) yielded unique solutions in all simulation examples considered in this work.
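[Figure: block diagram. The data $a_n$ passes through the channel $h_n$; noise $v_n$ is added to give $x_n$; the equalizer $w_n$, adapted by the blind algorithm, outputs $y_n$, which feeds the decision device.]
Fig. 1: A typical baseband communication system.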
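TABLE I: βNL-CMA
$\mathbf{w}_0 = [\mathbf{0}_{1\times(N-1)/2}, 1, \mathbf{0}_{1\times(N-1)/2}]^T$, $\beta > 1$,
$\mathbf{P}_0 = \mathbf{I}_{N\times N}$, $\mathbf{I}$ is the identity matrix and $0 \ll \lambda < 1$
$f_n = 1$ if $|y_n| < \gamma$; $f_n = -\beta$ if $|y_n| \ge \gamma$
$\hat{z}_n = \hat{z}_{n-1} + \frac{1}{n+1}\left( |f_n|^{-1} - \hat{z}_{n-1} \right)$
$\mathbf{P}_n = \frac{1}{\lambda}\left( \mathbf{P}_{n-1} - \frac{\mathbf{P}_{n-1}\mathbf{x}_n\mathbf{x}_n^H\mathbf{P}_{n-1}^H}{\lambda \hat{z}_n + \mathbf{x}_n^H\mathbf{P}_{n-1}\mathbf{x}_n} \right)$
$\mathbf{w}_n = \mathbf{w}_{n-1} + \mu \mathbf{P}_n f_n y_n^{*} \mathbf{x}_n$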
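TABLE II: Summary of Compared Algorithms
(a) CMA
$f_n = R^2 - |y_n|^2$
$\mathbf{w}_n = \mathbf{w}_{n-1} + \mu f_n y_n^{*} \mathbf{x}_n$
(b) βCMA
$f_n = 1$ if $|y_n| < \gamma$; $f_n = -\beta$ if $|y_n| \ge \gamma$
$\mathbf{w}_n = \mathbf{w}_{n-1} + \mu f_n y_n^{*} \mathbf{x}_n$
(c) SQD
$K_\sigma(x) = \frac{x}{\sqrt{2\pi}\,\sigma^3} \exp(-x^2/2\sigma^2)$
$F(\sigma)$: compensation factor
$N_s$: size of alphabet $\{a\}$
$d_i = (|y_n|^2 - F(\sigma)|a_i|^2)$
$\nabla_{\mathbf{w}} J(\mathbf{w}) = \frac{1}{N_s}\sum_{i=1}^{N_s} K_\sigma(d_i)\, y_n^{*} \mathbf{x}_n$
$\mathbf{w}_n = \mathbf{w}_{n-1} - \mu \nabla_{\mathbf{w}} J(\mathbf{w})$
(d) RLS-CMA
$\mathbf{P}_0 = \mathbf{I}_{N\times N}$, $0 \ll \lambda < 1$
$\mathbf{z}_n = \mathbf{x}_n y_n^{*}$
$\mathbf{g}_n = \mathbf{P}_{n-1}\mathbf{z}_n / (\lambda + \mathbf{z}_n^H \mathbf{P}_{n-1} \mathbf{z}_n)$
$\mathbf{P}_n = (\mathbf{P}_{n-1} - \mathbf{g}_n \mathbf{z}_n^H \mathbf{P}_{n-1}) / \lambda$
$f_n = R^2 - |y_n|^2$
$\mathbf{w}_n = \mathbf{w}_{n-1} + \mathbf{g}_n f_n$
(e) NL-CMA
$\mathbf{P}_0 = \mathbf{I}_{N\times N}$, $0 \ll \lambda < 1$
$f_n = R^2 - |y_n|^2$
$z_n = (2|y_n|^2 - R^2)^{-1}$ if $(2|y_n|^2 - R^2)^{-1} \ge 0$; $z_n = 1$ otherwise.
$\mathbf{P}_n = \frac{1}{\lambda}\left( \mathbf{P}_{n-1} - \frac{\mathbf{P}_{n-1}\mathbf{x}_n\mathbf{x}_n^H\mathbf{P}_{n-1}^H}{\lambda z_n + \mathbf{x}_n^H\mathbf{P}_{n-1}\mathbf{x}_n} \right)$
$\mathbf{w}_n = \mathbf{w}_{n-1} + \mu (1 - \lambda^n) \mathbf{P}_n f_n y_n^{*} \mathbf{x}_n$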
[Figure: entropy (in nats) plotted against the shape parameter $\alpha$.]
Fig. 2: Entropy $H(\alpha)$ versus shape parameter $\alpha$. For $\alpha = 2$, entropy is maximum and its value is $\log(\sqrt{2\pi e})$.
[Figure: (a) surface of $\sum_i t_i^2 / (\sum_j |t_j|)^2$ over the $(t_0, t_1)$ plane; (b) the unit balls $t_0^2 + t_1^2$ and $|t_0| + |t_1|$ in the $(t_0, t_1)$ plane.]
Fig. 3: (a): The surface of cost $\mathrm{AN}(t)$ for a two-tap equalizer and (b): coincidence of unit balls.
stability of NL-CMA requires the use of
$$ z_n = \begin{cases} (2|y_n|^2 - R^2)^{-1}, & \text{if } (2|y_n|^2 - R^2)^{-1} \ge 0 \\ 1, & \text{otherwise.} \end{cases} $$
[Figure: $F(\sigma)$ plotted against the kernel size $\sigma$ for 16(8,8)APSK.]
Fig. 4: Numerically computed $F(\sigma)$ for 16APSK(8,8)
For simulation purposes, the equalizer length is set to $N = 7$ (unless otherwise noted) and $\mathbf{w}_0 = [\mathbf{0}_{1\times(N-1)/2}, 1, \mathbf{0}_{1\times(N-1)/2}]^T$, i.e., center tap initialization is used. The data is modulated using
8APSK(4,4) and 16APSK(8,8) shown in Fig. 5(b) and 5(c). For all
experiments, the value of is obtained using (68) (this value is
1.9319 for 8APSK(4,4) and 3 for 16APSK(8,8)). ISI (as defined
in (2)) is used as the index to evaluate the performance of the different
equalization schemes. Experiments comparing the proposed scheme
to existing equalization methods are discussed in Sections VI-A and VI-B.
Furthermore, additional experiments are carried out to compare the
theoretical and practical EMSE of the proposed scheme for both
8APSK(4,4) and 16APSK(8,8). The details of these experiments are
discussed in Section VI-C.
[Figure: constellation diagrams on the $(a_R, a_I)$ plane; labelled ring radii 1 and 1.932 in b), 1.586 and 3 in c).]
Fig. 5: a) A hypothetical dense APSK, b) practical 8APSK(4,4), and
c) practical 16APSK(8,8).
A. Experiment 1
In this experiment, the performance of the proposed scheme is
tested for a channel with low eigenvalue spread. The signal is
modulated using 16APSK(8,8), and a complex-valued channel with
coefficients $h_R$ = [0.005, 0.009, 0.024, 0.854, 0.218,
0.049, 0.016] and $h_I$ = [0.004, 0.03, 0.104, 0.52,
0.273, 0.074, 0.02], where $h = h_R + \mathrm{j}\,h_I$, is used [46]. The
eigenvalue spread of the channel is 5.83. The signal-to-noise ratio
(SNR) is set to 30 dB and a stationary environment is assumed (i.e.,
$\sigma_q = 0$, where $\sigma_q$ is the standard deviation of the perturbation
$q_n$ in (29)). Learning curves are obtained for 7,000 iterations and
are ensemble averaged over 200 independent runs. The step sizes
and forgetting factors of all schemes are adjusted for comparable
performances (i.e., the same steady state ISI) and are mentioned in
Fig. 6. It can be observed from Fig. 6 that NL-CMA shows the
fastest convergence rate followed by CMA.
[Figure: ISI [dB] versus iterations (0 to 7000). Legend: CMA: $\mu$ = 5e-5; βCMA: $\mu$ = 43e-5; RLS-CMA: $\mu$ = 8e-2, $\lambda$ = 0.98; NL-CMA: $\mu$ = 14e-2, $\lambda$ = 0.98; SQD: $\mu$ = 18e-5; NL-βCMA: $\mu$ = 65e-3, $\lambda$ = 0.98.]
Fig. 6: Learning curves for channel with low eigenvalue spread:
N = 7, 16APSK(8,8)
B. Experiment 2
In this experiment, a channel with higher eigenvalue spread is
considered. Generally, an increase in the eigenvalue spread of the
channel results in poor performance of equalization schemes. For
this experiment, 16APSK(8,8) modulated signal is passed through
a channel with impulse response h = [0, 0.1, 0.4, 0.8, 0.4, 0.1,
0] (the eigenvalue spread of this channel is calculated to be 65.28).
Again, SNR is set to 30 dB and stationary environment is assumed.
The step sizes and forgetting factors of all schemes are adjusted for
comparable performances and are summarized in Fig. 7. The learning
curves are obtained for 40,000 iterations and are ensemble averaged
over 200 independent runs. It can be observed from Fig. 7 that the
performance of SQD, CMA and βCMA degrades significantly in
comparison with the first experiment. However, it should be noted
that NL-CMA, RLS-CMA and NL-βCMA converge relatively faster.
Further, among the three fast converging schemes, NL-CMA shows
the fastest convergence rate.
[Figure: ISI [dB] versus iterations (0 to 4 x 10^4). Legend: CMA: $\mu$ = 1e-4; βCMA: $\mu$ = 48e-5; RLS-CMA: $\mu$ = 14e-2, $\lambda$ = 0.98; NL-CMA: $\mu$ = 16e-2, $\lambda$ = 0.98; SQD: $\mu$ = 33e-5; NL-βCMA: $\mu$ = 12e-2, $\lambda$ = 0.98.]
Fig. 7: Learning curves for channel with high eigenvalue spread:
N = 7, 16APSK(8,8)
C. Experiment 3
In this experiment, the analytical and empirical performance of the
proposed scheme is compared in a non-stationary environment. In steady
state, we assume that the proposed scheme converges in the mean to
a zero forcing solution, i.e., the mean of the combined channel-equalizer
response converges to a delta function with arbitrary delay and phase
rotation. We consider a non-stationary channel with $\sigma_q = 10^{-3}$,
where $\sigma_q$ is the standard deviation of the perturbation $q_n$ in (29).
The equalizer length is set to N = 11 and the steady state EMSE
is evaluated for 20,000 symbols in a noise-free environment. The
forgetting factor is set to 0.98 and the step size is varied to obtain the
desired curves for both 8APSK(4,4) and 16APSK(8,8). A close match
between the analytic and measured EMSE for both constellation types
is observed as depicted in Fig. 8.
[Figure: EMSE [dB] versus step size (logarithmic horizontal axis); simulation and theory curves for 16APSK(8,8) and 8APSK(4,4).]
Fig. 8: Steady state EMSE using 8APSK(4,4) and 16APSK(8,8).
VII. CONCLUSION
Exploiting the notion of minimum entropy deconvolution, a cost
function was specifically derived for the blind channel equalization
of amplitude phase shift keying signals in digital communication
systems. The cost was optimized adaptively using Newton's method
and it yielded a Newton-like constant modulus algorithm, NL-βCMA.
Tracking analysis of the proposed algorithm was performed based on
Sayed-Rupp's feedback approach. Simulation results demonstrated
significant improvement in performance compared with conventional
algorithms. Further, observations of excess mean square error
demonstrated good agreement between theoretical and practical findings.
[Figure: two surface plots of PDFs over the complex plane, panels a) and b).]
Fig. 9: PDFs (not to scale) of a) hypothetical (dense) APSK and b)
Gaussian distributed received signal.
APPENDIX A: DERIVATION OF AN(yn)
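Consider a continuous APSK signal, where signal alphabets $y \in \mathcal{A} = \{a_R + \mathrm{j}\,a_I\}$ are uniformly distributed (theoretically) over a circular region of radius $R_a$, with the center at the origin. The joint PDF of $a_R$ and $a_I$ is given by (refer to Fig. 9(a))
$$p_A(y) = \begin{cases} \dfrac{1}{\pi R_a^2}, & \sqrt{a_R^2 + a_I^2} = |y| \le R_a, \\ 0, & \text{otherwise.} \end{cases} \tag{55}$$
Now consider the transformation $\tilde{Y} = |y| = \sqrt{a_R^2 + a_I^2}$ and $\tilde{\Theta} = \angle(a_R, a_I)$, where $\tilde{Y}$ is the modulus and $\angle(i, j)$ denotes the angle in the range $(0, 2\pi)$ that is defined by the point $(i, j)$. The joint distribution of the modulus $\tilde{Y}$ and $\tilde{\Theta}$ can be obtained as $p_{\tilde{Y},\tilde{\Theta}}(\tilde{y}, \tilde{\theta}) = \tilde{y}/(\pi R_a^2)$, $\tilde{y} \ge 0$, $0 \le \tilde{\theta} < 2\pi$. Since $\tilde{Y}$ and $\tilde{\Theta}$ are independent, we obtain $p_{\tilde{Y}}(\tilde{y} : H_0) = 2\tilde{y}/R_a^2$, $\tilde{y} \ge 0$, where $H_0$ denotes the hypothesis that the signal is distortion-free. Let $\tilde{Y}_1, \tilde{Y}_2, \ldots, \tilde{Y}_B$ be a sequence, of size $B$, obtained by taking the modulus of randomly generated distortion-free signal alphabets $\mathcal{A}$, where subscript $n$ indicates discrete time index. Let $Z_1, Z_2, \ldots, Z_B$ be the order statistic of sequence $\{\tilde{Y}\}$. Let $p_{\tilde{Y}}(\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_B : H_0)$ be a $B$-variate density of the continuous type, then, under the hypothesis $H_0$, we obtain
$$p_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_0) = \frac{2^B}{R_a^{2B}} \prod_{k=1}^{B} \tilde{y}_k. \tag{56}$$
Next we find the scale-invariant PDF $p^{si}_{\tilde{Y}}(\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_B : H_0)$ for given $B$ realizations of $\tilde{y}$ as follows (below, $\lambda$ is some positive scale):
$$\begin{aligned} p^{si}_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_0) &= \int_0^{\infty} p_{\tilde{Y}}(\lambda\tilde{y}_1, \ldots, \lambda\tilde{y}_B : H_0)\, \lambda^{B-1}\, d\lambda \\ &= \frac{2^B}{R_a^{2B}} \prod_{k=1}^{B} \tilde{y}_k \int_0^{R_a/(z_B - z_1)} \lambda^{2B-1}\, d\lambda \\ &= \frac{2^{B-1}}{B\,(z_B - z_1)^{2B}} \prod_{k=1}^{B} \tilde{y}_k, \end{aligned} \tag{57}$$
where $z_1, z_2, \ldots, z_B$ are the order statistic of elements $\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_B$, so that $z_1 = \min\{\tilde{y}\}$ and $z_B = \max\{\tilde{y}\}$.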
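Now consider the next hypothesis ($H_1$) that the signal suffers from multi-path interference as well as additive Gaussian noise (refer to Fig. 9(b)). Thus, the in-phase and quadrature components of the received signal are modeled as normally distributed; owing to the central limit theorem, this is theoretically justified. It means that the modulus of the received signal follows the Rayleigh distribution,
$$p_{\tilde{Y}}(\tilde{y} : H_1) = \frac{\tilde{y}}{\sigma_{\tilde{y}}^2} \exp\left(-\frac{\tilde{y}^2}{2\sigma_{\tilde{y}}^2}\right), \quad \tilde{y} \ge 0,\ \sigma_{\tilde{y}} > 0. \tag{58}$$
The $B$-variate densities $p_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_1)$ and $p^{si}_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_1)$ are obtained as
$$p_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_1) = \frac{1}{\sigma_{\tilde{y}}^{2B}} \prod_{k=1}^{B} \tilde{y}_k \exp\left(-\frac{\tilde{y}_k^2}{2\sigma_{\tilde{y}}^2}\right), \tag{59a}$$
$$p^{si}_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_1) = \frac{1}{\sigma_{\tilde{y}}^{2B}} \prod_{k=1}^{B} \tilde{y}_k \int_0^{\infty} \exp\left(-\frac{\lambda^2 \sum_{k'=1}^{B} \tilde{y}_{k'}^2}{2\sigma_{\tilde{y}}^2}\right) \lambda^{2B-1}\, d\lambda. \tag{59b}$$
Substituting $u = \frac{1}{2}\lambda^2 \sigma_{\tilde{y}}^{-2} \sum_{k'=1}^{B} \tilde{y}_{k'}^2$, we obtain
$$p^{si}_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_1) = \frac{2^{B}\,\Gamma(B) \prod_{k=1}^{B} \tilde{y}_k}{2\left(\sum_{k=1}^{B} \tilde{y}_k^2\right)^{B}} \tag{60}$$
The scale-invariant uniformly most powerful test of $p^{si}_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_0)$ against $p^{si}_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_1)$ provides us, see [20]:
$$O(\tilde{y}_1, \ldots, \tilde{y}_B) = \frac{p^{si}_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_0)}{p^{si}_{\tilde{Y}}(\tilde{y}_1, \ldots, \tilde{y}_B : H_1)} = \frac{1}{B!}\left(\frac{\sum_{k=1}^{B} \tilde{y}_k^2}{(z_B - z_1)^2}\right)^{B} = \frac{1}{B!}\left(\frac{\sum_{n=1}^{B} |y_n|^2}{\left(\max\{|y_n|\} - \min\{|y_n|\}\right)^2}\right)^{B} \tag{61}$$
In the present context, where $y_n$ is the deconvolved sequence, we have $\min\{|y_n|\} = 0$. Further taking the $B$th root of (61), ignoring constants and after some manipulations, we get (14).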
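The bracketed ratio in (61), with $\min\{|y_n|\} = 0$, is simple to evaluate on a block of equalizer outputs. The following is a minimal sketch, not the cost (14) itself (whose exact form is given in the main text): the function name `med_ratio` is hypothetical, and the constant $1/B!$ and the $B$th power are dropped.

```python
def med_ratio(samples):
    """Return sum(|y_n|^2) / (max |y_n|)^2 for a block of complex samples."""
    moduli = [abs(v) for v in samples]
    peak = max(moduli)
    if peak == 0.0:
        raise ValueError("all samples are zero; the ratio is undefined")
    # Energy of the block normalized by the squared peak modulus.
    return sum(m * m for m in moduli) / (peak * peak)
```

Since (61) is the ratio of the $H_0$ density to the $H_1$ density, larger values of this quantity favour the distortion-free hypothesis: a block whose moduli all sit at the peak returns $B$, while a block dominated by a single large sample returns a value close to 1.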
APPENDIX B: EVALUATION OF β FOR APSK
Expressing $y_n \approx a_n - e_{a,n}$, note that the amplitude $|a_n|$ is perturbed by $e_{a,n}$; since $e_{a,n}$ is assumed to be zero-mean (complex-valued) narrowband Gaussian, the modulus $|y_n|$ becomes Rician distributed and its PDF (conditioned on $|a_n|$) can be expressed as
$$p(|y_n|; |a_n|) = \frac{|y_n|}{\sigma^2} \exp\left(-\frac{|y_n|^2 + |a_n|^2}{2\sigma^2}\right) I_0\left(\frac{|y_n|\,|a_n|}{\sigma^2}\right) \tag{62}$$
where $\sigma^2 := E(\Re[e_a])^2 = E(\Im[e_a])^2$, and $I_0$ is the zeroth-order modified Bessel function of the first kind (where $\Re[\cdot]$ and $\Im[\cdot]$ refer to the real and imaginary components of the enclosed complex-valued entity). Using (62), a $k$th-order moment of the modulus $|y_n|$ can be computed as:
$$E|y_n|^k = \sum_{j=1}^{L} \frac{M_j}{M} \int_0^{\infty} |y_n|^k\, p(|y_n|; R_j)\, d|y_n| \tag{63}$$
Consider a distortion-free $M$-symbol complex-valued constellation $\{a\}$ which comprises $L$ unique moduli, that is $|a_n| \in \{R_1, R_2, \ldots, R_L\}$, satisfying $0 < R_1 < R_2 < \cdots < R_L = R_a$. Now let $M_j$ be the number of unique symbols on the $j$th modulus $R_j$; this implies that $E|a_n|^2 = (1/M)\sum_{j=1}^{L} M_j R_j^2$ is the per-symbol average energy of constellation $\{a\}$. The parameter $\beta$ is defined as
$$\beta := \frac{E\big[|y_n|^2\big]_{\{|y_n| < R_a\}}}{E\big[|y_n|^2\big]_{\{|y_n| \ge R_a\}}} = \frac{E|y_n|^2 - E\big[|y_n|^2\big]_{\{|y_n| \ge R_a\}}}{E\big[|y_n|^2\big]_{\{|y_n| \ge R_a\}}} \tag{64}$$
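Using (62), we obtain $E|y_n|^2 = 2\sigma^2 + E|a_n|^2$ and $E\big[|y_n|^2\big]_{\{|y_n| \ge R_a\}} = \sigma^2 E\,Q_{3,0}(|a_n|/\sigma, R_a/\sigma)$, where $Q_{m,v}(a, b)$ is the Nuttall Q-function as defined below [39]:
$$Q_{m,v}(a, b) := \int_b^{\infty} x^m e^{-\frac{1}{2}(x^2 + a^2)} I_v(ax)\, dx, \quad a, b > 0 \tag{65}$$
for $v \ge 0$, $m \ge 1$, and $I_v(\cdot)$ is the $v$th-order modified Bessel function of the first kind. An approximate expression for $Q_{3,0}(a, b)$ is given as follows [47]:
$$Q_{3,0}(a, b) \approx \frac{2 + (a + b)^2 + b^2}{\sqrt{8\pi ab}} \exp\left(-\frac{(b - a)^2}{2}\right) + \frac{a(3 + a^2) + b(1 + a^2)}{4\sqrt{ab}}\, \mathrm{erfc}\left(\frac{b - a}{\sqrt{2}}\right). \tag{66}$$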
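The integral definition (65) can be checked numerically for moderate arguments. The sketch below is an illustration only, not the evaluation method used in the paper: it assumes integer order $v \ge 0$, sums the power series of the modified Bessel function, and applies a plain trapezoidal rule on a truncated interval; the names `bessel_i` and `nuttall_q` and the truncation parameters `span` and `steps` are hypothetical, and the series overflows for large arguments.

```python
import math

def bessel_i(v, x, max_terms=500):
    """Modified Bessel function of the first kind, integer order v >= 0 (series)."""
    half = 0.5 * x
    term = half ** v / math.factorial(v)   # k = 0 term of the series
    total = 0.0
    for k in range(max_terms):
        total += term
        # Ratio of consecutive series terms: (x/2)^2 / ((k+1)(k+1+v)).
        term *= half * half / ((k + 1) * (k + 1 + v))
        if term <= 1e-17 * total:
            break
    return total

def nuttall_q(m, v, a, b, span=12.0, steps=4000):
    """Trapezoidal estimate of the integral in (65), truncated at max(a, b) + span."""
    upper = max(a, b) + span
    h = (upper - b) / steps

    def integrand(x):
        return x ** m * math.exp(-0.5 * (x * x + a * a)) * bessel_i(v, a * x)

    total = 0.5 * (integrand(b) + integrand(upper))
    for i in range(1, steps):
        total += integrand(b + i * h)
    return total * h
```

As a sanity check, letting $b$ approach zero recovers the full moments of a unit-variance Rician variable: `nuttall_q(1, 0, a, 1e-9)` is close to 1 and `nuttall_q(3, 0, a, 1e-9)` is close to $2 + a^2$, which matches the relation $E|y_n|^2 = 2\sigma^2 + E|a_n|^2$ after scaling by $\sigma^2$.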
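Exploiting (66), we can evaluate $\beta$ under the limit $\sigma \to 0$. Denoting $z = (R_a - |a_n|)/(\sqrt{2}\,\sigma)$, note that for $R_a > |a_n|$, we have $\lim_{\sigma\to 0} \exp(-z^2) = \lim_{\sigma\to 0} \mathrm{erfc}(z) \to 0$. And for $R_a = |a_n|$, we have $\lim_{\sigma\to 0} \exp(-z^2) = \lim_{\sigma\to 0} \mathrm{erfc}(z) \to 1$. So it is simple to show that:
$$\lim_{\sigma\to 0} E\big[|y_n|^2\big]_{\{|y_n| \ge R_a\}} = \lim_{\sigma\to 0} \sigma^2 E\,Q_{3,0}\left(\frac{|a_n|}{\sigma}, \frac{R_a}{\sigma}\right) = \frac{M_L R_a^2}{2M}. \tag{67}$$
Finally, the asymptotic value of $\beta$ for an APSK signal under the condition of vanishing convolutional noise of variance $2\sigma^2$ is given by
$$\lim_{\sigma\to 0} \beta = \frac{2M\,E|a_n|^2 - M_L R_a^2}{M_L R_a^2}. \tag{68}$$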
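The closed form (68) depends only on the ring radii and the number of symbols per ring, so it is easy to evaluate and to cross-check against the ratio definition (64) by simulation. The following is a minimal sketch under stated assumptions: the function names `beta_asymptotic` and `beta_monte_carlo` are hypothetical, the threshold is taken as the largest ring radius, and the perturbation is modelled as circular Gaussian noise with per-component standard deviation `sigma`.

```python
import random

def beta_asymptotic(radii, counts):
    """Evaluate (68) for rings with the given radii and symbol counts."""
    M = sum(counts)
    energy = sum(m * r * r for r, m in zip(radii, counts)) / M   # E|a_n|^2
    R_a = max(radii)                                             # largest modulus
    M_L = sum(m for r, m in zip(radii, counts) if r == R_a)      # symbols on it
    return (2 * M * energy - M_L * R_a ** 2) / (M_L * R_a ** 2)

def beta_monte_carlo(radii, counts, sigma, trials, seed=0):
    """Estimate the ratio in (64) from noisy symbols (illustrative only)."""
    rng = random.Random(seed)
    R_a = max(radii)
    below = above = 0.0
    for r in rng.choices(radii, weights=counts, k=trials):
        # The symbol phase does not affect the modulus statistics under
        # circular noise, so the symbol is placed on the real axis.
        y = complex(r + rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        power = abs(y) ** 2
        if abs(y) < R_a:
            below += power
        else:
            above += power
    return below / above
```

For two rings carrying the same number of symbols, (68) reduces to $1 + 2(R_1/R_2)^2$; for example, `beta_asymptotic([1.586, 3.0], [8, 8])` can be compared with `beta_monte_carlo([1.586, 3.0], [8, 8], 0.01, 200000)`, which should agree to within Monte Carlo error when `sigma` is small.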
REFERENCES
[1] S. Haykin. Blind deconvolution. Prentice Hall, 1994.
[2] C.R. Johnson, Jr., P. Schniter, T.J. Endres, J.D. Behm, D.R. Brown, and R.A. Casas. Blind equalization using the constant modulus criterion: A review. Proc. IEEE, 86(10):1927–1950, 1998.
[3] Z. Ding and Y. Li. Blind Equalization and Identification. Marcel Dekker Inc., New York, 2001.
[4] D. Donoho. On minimum entropy deconvolution. Proc. 2nd Applied Time Series Symp., pages 565–608, Mar. 1980.
[5] A. Benveniste, M. Goursat, and G. Ruget. Robust identification of a nonminimum phase system: Blind adjustment of a linear equalizer in data communication. IEEE Trans. Automat. Contr., 25(3):385–399, 1980.
[6] R.A. Wiggins. Minimum entropy deconvolution. Proc. Int. Symp. Computer Aided Seismic Analysis and Discrimination, 1977.
[7] R.A. Wiggins. Minimum entropy deconvolution. Geoexploration, 16:21–35, 1978.
[8] A. Morello and V. Mignone. DVB-S2: the second generation standard for satellite broad-band services. IEEE Proc., 94(1):210–227, 2006.
[9] R. De Gaudenzi, A. Guillén i Fàbregas, and A. Martinez. Turbo-coded APSK modulations design for satellite broadband communications. Intl. Jnl. Sat. Commun. Net., 24(4):261–281, 2006.
[10] C.-Y. Kao, M.-C. Tseng, and C.-Y. Chen. The performance analysis of backward compatible modulation with higher spectrum efficiency for DAB Eureka 147. IEEE Trans. Broadcasting, 54(1):62–69, 2008.
[11] C. Shaw and M. Rice. Turbo-coded APSK for aeronautical telemetry. IEEE Aerospace and Electronic Systems Magazine, 25(4):37–43, 2010.
[12] Z. Liu, Q. Xie, K. Peng, and Z. Yang. APSK constellation with Gray mapping. IEEE Commun. Lett., 15(12):1271–1273, 2011.
[13] H. Wang, Y. Li, X. Yi, D. Kong, J. Wu, and J. Lin. APSK modulated CO-OFDM system with increased tolerance toward fiber nonlinearity. IEEE Photonics Tech. Lett., 24(13):1085–1087, 2012.
[14] S. Fan, H. Wang, Y. Li, W. Du, X. Zhang, J. Wu, and J. Lin. Optimal 16-ary APSK encoded coherent optical OFDM for long-haul transmission. IEEE Photonics Tech. Lett., 25(13):1199–1202, 2013.
[15] O. Shalvi and E. Weinstein. New criteria for blind equalization of non-minimum phase systems. IEEE Trans. Inf. Theory, 36(2):312–321, 1990.
[16] D.N. Godard. Self-recovering equalization and carrier tracking in two-dimensional data communications systems. IEEE Trans. Commun., 28(11):1867–1875, 1980.
[17] L.R. Litwin. Blind channel equalization. IEEE Potentials, 18(4):9–12, 1999.
[18] J.R. Treichler and B.G. Agee. A new approach to multipath correction of constant modulus signals. IEEE Trans. Acoust. Speech Sig. Process., 31(2):459–471, 1983.
[19] R.V. Hogg. More light on the kurtosis and related statistics. Jnl. Amer. Stat. Assc., 67(338):422–424, 1972.
[20] Z. Sidak, P.K. Sen, and J. Hajek. Theory of Rank Tests. Academic Press; 2/e, 1999.
[21] J.P. Marques de Sa, L.M.A. Silva, J.M.F. Santos, and L.A. Alexandre. Minimum Error Entropy Classification. Springer-Verlag Berlin Heidelberg, 2013.
[22] R.C. Geary. Testing for normality. Biometrika, 34(3/4):209–242, 1947.
[23] A.T. Walden. Non-Gaussian reflectivity, entropy, and deconvolution. Geophysics, 50(12):2862–2888, 1985.
[24] M. Ooe and T.J. Ulrych. Minimum entropy deconvolution with an exponential transformation. Geophysical Prospecting, 27:458–473, 1979.
[25] W. Gray. Variable Norm Deconvolution. PhD thesis, Stan. Univ., 1979.
[26] E.H. Satorius and J.J. Mulligan. Minimum entropy deconvolution and blind equalisation. IEE Elect. Lett., 28(16):1534–1535, 1992.
[27] E.H. Satorius and J.J. Mulligan. An alternative methodology for blind equalization. Dig. Sig. Process.: A Rev. Jnl., 3(3):199–209, 1993.
[28] S. Abrar and A.K. Nandi. Adaptive minimum entropy equalization algorithm. IEEE Commun. Lett., 14(10):966–968, 2010.
[29] S. Abrar, A. Zerguine, and A.K. Nandi. Digital Communication, chapter Adaptive Blind Channel Equalization, pages 93–118. C. Palanisamy (Ed.), InTech Publishers, Rijeka, Croatia, 2012.
[30] S. Haykin. Adaptive Filtering Theory. Prentice-Hall, 1996.
[31] A.K. Nandi and S. Abrar. Adaptive blind equalization based on the minimum entropy principle. In 5th Intl. Conf. Comp. Dev. Commun. (CODEC), Dec. 2012.
[32] G. Yan and H.H. Fan. A Newton-like algorithm for complex variables with applications in blind equalization. IEEE Trans. Sig. Process., 48(2):553–556, 2000.
[33] S. Abrar and A. Zerguine. Enhancing the convergence speed of a multi-modulus blind equalization algorithm. In IEEE SCONEST, pages 41–44, 2004.
[34] M. Rupp and A.H. Sayed. A time-domain feedback analysis of filtered-error adaptive gradient algorithms. IEEE Trans. Sig. Process., 44(6):1428–1439, 1996.
[35] N.R. Yousef and A.H. Sayed. A feedback analysis of the tracking performance of blind adaptive equalization algorithms. IEEE Conf. Decision and Control, 1:174–179, 1999.
[36] T.Y. Al-Naffouri and A.H. Sayed. Adaptive filters with error nonlinearities: mean-square analysis and optimum design. EURASIP Jnl. Appld. Sig. Process., 15:192–205, 2001.
[37] X. Wang and G. Feng. Performance analysis of RLS linearly constrained constant modulus algorithm for multiuser detection. Signal Processing, 89:181–186, 2009.
[38] A.H. Sayed. Fundamentals of adaptive filtering. Wiley-Interscience and IEEE Press, 2003.
[39] A.H. Nuttall. Some integrals involving the Q-function. Naval Underwater Systems Center Tech. Rep. 4297, April 1972.
[40] O. Tanrikulu, A.G. Constantinides, and J.A. Chambers. New normalized constant modulus algorithms with relaxation. IEEE Sig. Process. Lett., 4(9):256–258, 1997.
[41] V. Shtrom and H.H. Fan. New class of zero-forcing cost functions in blind equalization. IEEE Trans. Sig. Process., 46(10):2674, 1998.
[42] F.R.P. Cavalcanti, A.L. Brandao, and J.M.T. Romano. The generalized constant modulus algorithm applied to multiuser space-time equalization. Proc. IEEE-SPAWC, 94–97, 1999.
[43] C. Yuxin, T. Le-Ngoc, B. Champagne, and X. Changjiang. Recursive least squares constant modulus algorithm for blind adaptive array. IEEE Trans. Sig. Process., 52(5):1452–1456, May 2004.
[44] M. Lazaro, I. Santamaria, D. Erdogmus, K.E. Hild, C. Pantaleon, and J.C. Principe. Stochastic blind equalization based on PDF fitting using Parzen estimator. IEEE Trans. Sig. Process., 53(2):696–704, 2005.
[45] M.D. Miranda, M.T.M. Silva, and V.H. Nascimento. Avoiding Divergence in the Shalvi–Weinstein Algorithm. IEEE Trans. Sig. Process., 56(11):5403–5413, 2008.
[46] G. Picchi and G. Prati. Blind equalization and carrier recovery using a stop-and-go decision-directed algorithm. IEEE Trans. Commun., 35(9):877–887, 1987.
[47] S. Abrar, A. Ali, A. Zerguine, and A.K. Nandi. Tracking performance of two constant modulus equalizers. IEEE Commun. Lett., 17(5):830–833, 2013.
Anum Ali (S'12) received the B.S. degree in electrical engineering (with
institute gold medal) from COMSATS Institute of Information Technology,
Islamabad, Pakistan, in 2011. He is currently pursuing the M.S. degree in
electrical engineering at King Fahd University of Petroleum and Minerals,
Dhahran, Saudi Arabia. His research interests lie in signal processing and
communications. Specifically, he is interested in the applications of
compressive sensing for communications.
Shafayat Abrar (S'03, M'14) was born in Karachi, Pakistan, in 1972. He
holds a B.E., an M.S., and a Ph.D. degree, all in electrical engineering,
respectively from NED University of Engineering and Technology, Karachi,
Pakistan (1996), King Fahd University of Petroleum and Minerals, Dhahran,
Saudi Arabia (2000), and University of Liverpool, Liverpool, UK (2010).
He is currently serving as a faculty member at COMSATS Institute of
Information Technology, Islamabad, Pakistan. He is a co-recipient of IEEE
Communications Society Heinrich Hertz award for best Communications
Letters for the year 2012. His research interest is signal processing for
communication systems.
Azzedine Zerguine (SM'03) received the B.Sc. degree from Case Western
Reserve University, Cleveland, OH, in 1981, the M.Sc. degree from King Fahd
University of Petroleum and Minerals (KFUPM), Dhahran, Saudi Arabia, in
1990, and the Ph.D. degree from Loughborough University, Loughborough,
U.K., in 1996, all in electrical engineering. From 1981 to 1987, he was with
different Algerian state-owned companies. From 1987 to 1990, he was a
Research and Teaching Assistant in the Electrical Engineering Department at
KFUPM. Currently, he is Professor in the Electrical Engineering Department,
KFUPM, working in the areas of signal processing and communications.
He is an Associate Editor of the EURASIP Journal on Advances in Signal
Processing. His research interests include signal processing for
communications, adaptive filtering, neural networks, multiuser detection, and
interference cancellation.
Asoke K. Nandi (F'10) received the degree of Ph.D. in High Energy Physics
from the University of Cambridge (Trinity College), Cambridge, UK, in 1979.
He held several research positions in Rutherford Appleton Laboratory (UK),
European Organisation for Nuclear Research (Switzerland), Department of
Physics, Queen Mary College (London, UK) and Department of Nuclear
Physics (Oxford, UK). In 1987, he joined the Imperial College, London, UK,
as the Solartron Lecturer in the Signal Processing Section of the Electrical
Engineering Department. In 1991, he joined the Signal Processing Division
of the Electronic and Electrical Engineering Department in the University
of Strathclyde, Glasgow, UK, as a Senior Lecturer; subsequently, he was
appointed a Reader in 1995 and a Professor in 1998. In March 1999 he moved
to the University of Liverpool, Liverpool, UK, to take up his appointment to
the David Jardine Chair of Signal Processing in the Department of Electrical
Engineering and Electronics. Professor Nandi is a Finland Distinguished
Professor. In 1983 he was a member of the UA1 team at CERN that discovered
the three fundamental particles known as $W^+$, $W^-$ and $Z^0$, providing the
evidence for the unification of the electromagnetic and weak forces, which
was recognized by the Nobel Committee for Physics in 1984. Currently,
he is the Head of the Signal Processing and Communications Research
Group with interests in the areas of signal processing, machine learning, and
communications research. With his group he has been carrying out research
in developments and applications of machine learning, biomedical signal
processing, bioinformatics, machine condition monitoring, communications
signal processing, and blind source separation. He has authored or co-authored
over 400 technical publications; these include two books, entitled Automatic
Modulation Recognition of Communications Signals (Boston, MA: Kluwer
Academic, 1996) and Blind Estimation Using Higher-Order Statistics (Boston,
MA: Kluwer Academic, 1999), and over 180 journal papers. The h-index of
his publications, according to the Web of Science, is 38. Professor Nandi was
awarded the Mountbatten Premium, Division Award of the Electronics and
Communications Division, of the Institution of Electrical Engineers of the
U.K. in 1998 and the Water Arbitration Prize of the Institution of Mechanical
Engineers of the U.K. in 1999. He is a Fellow of the Institute of Electrical
and Electronics Engineers (USA), the Cambridge Philosophical Society, the
Institution of Engineering and Technology, the Institute of Mathematics and its
applications, the Institute of Physics, the Royal Society for Arts, the Institution
of Mechanical Engineers, and the British Computer Society.
