Abstract—The aim of this paper is twofold. First, we present a thorough experimental study to show the superiority of the generalization capability of the support vector machine (SVM) approach in the automatic classification of electrocardiogram (ECG) beats. Second, we propose a novel classification system based on particle swarm optimization (PSO) to improve the generalization performance of the SVM classifier. For this purpose, we have optimized the SVM classifier design by searching for the best value of the parameters that tune its discriminant function, and upstream by looking for the best subset of features that feed the classifier. The experiments were conducted on the basis of ECG data from the Massachusetts Institute of Technology–Beth Israel Hospital (MIT–BIH) arrhythmia database to classify five kinds of abnormal waveforms and normal beats. In particular, they were organized so as to test the sensitivity of the SVM classifier and that of two reference classifiers used for comparison, i.e., the k-nearest neighbor (kNN) classifier and the radial basis function (RBF) neural network classifier, with respect to the curse of dimensionality and the number of available training beats. The obtained results clearly confirm the superiority of the SVM approach as compared to traditional classifiers, and suggest that further substantial improvements in terms of classification accuracy can be achieved by the proposed PSO–SVM classification system. On average, over three experiments making use of a different total number of training beats (250, 500, and 750, respectively), the PSO–SVM yielded an overall accuracy of 89.72% on 40438 test beats selected from 20 patient records against 85.98%, 83.70%, and 82.34% for the SVM, the kNN, and the RBF classifiers, respectively.

Index Terms—Electrocardiogram (ECG) signal classification, feature detection, feature reduction, generalization capability, model selection issue, particle swarm optimization (PSO), support vector machine (SVM).

Manuscript received June 21, 2007; revised December 31, 2007 and March 4, 2008. First published April 11, 2008; current version published September 4, 2008.
F. Melgani is with the Department of Information Engineering and Computer Science, University of Trento, I-38050 Trento, Italy (e-mail: melgani@disi.unitn.it).
Y. Bazi is with the College of Engineering, Al Jouf University, Al Jouf 2014, Saudi Arabia (e-mail: yakoub.bazi@ju.edu.sa).
Digital Object Identifier 10.1109/TITB.2008.923147

I. INTRODUCTION

For several years, the automatic classification of electrocardiogram (ECG) signals has received great attention from the biomedical engineering community. This is mainly due to the fact that ECG provides cardiologists with useful information about the rhythm and functioning of the heart. Therefore, its analysis represents an efficient way to detect and treat different kinds of cardiac diseases.

In the literature, several methods have been proposed for the automatic classification of ECG signals. Among the most recently published works are those presented in [1]–[10]. In greater detail, the method presented in [1] is based on a hybrid fuzzy neural network that consists of a fuzzy self-organizing subnetwork connected in a cascade with a multilayer perceptron. The authors proposed to use high-order statistics (i.e., cumulants of the second, third, and fourth orders) as input features for feeding their classifier. In [2], a neuro-fuzzy approach for the ECG-based classification of heart rhythms is described. Here, the QRS complex signal is characterized by Hermite polynomials, whose coefficients feed the neuro-fuzzy classifier. In [3], the authors implemented two classification systems based on the support vector machine (SVM) approach. The first exploits features based on high-order statistics, while the second uses the coefficients of Hermite polynomials. For improved performance, the authors propose to combine the two classifiers by means of a weighting mechanism, whose weights are determined according to a least square estimation method. Detection of premature ventricular contractions (PVCs) by means of a fuzzy-neural network classifier with features derived from a quadratic spline wavelet transform is proposed in [4]. In [5], different classification systems based on linear discriminant classifiers are explored, together with different morphological and timing features obtained from single and multiple ECG leads. In [6], a high-order spectral analysis method is proposed for the analysis and classification of cardiac arrhythmias, based on bispectral analysis techniques. In particular, the bispectrum is estimated using an autoregressive model, and the frequency support of the bispectrum is extracted as a quantitative measure to classify atrial and ventricular tachyarrhythmias. In [7], an automatic online beat segmentation and classification system based on a Markovian approach is proposed. The system carries out ECG signal analysis through two processing layers. In the first, the ECG signal is segmented into beat waveforms by means of a robust and precise waveform modeling with hidden Markov models (HMMs). In the second, the system identifies premature ventricular contraction beats using a simple set of rules. In [8], a rule-based rough-set decision system is presented for the development of an inference engine for disease identification using time-domain features. In [9], a patient-adapting heartbeat classifier system based on linear discriminants is proposed. The classification system processes an incoming recording with a global classifier to produce the first set of beat annotations. Then, an expert validates, and, if necessary, corrects a fraction of the beats of the recording. The system then adapts by first training a local classifier using the newly annotated beats, and combines both local and global classifiers to form an adapted classification system. Finally, in [10], the authors present an approach for classifying beats of a large dataset by training a neural network classifier using wavelet and timing features. The authors found that the fourth scale of a dyadic wavelet transform with a quadratic spline wavelet together with the pre/post RR-interval ratio is very effective in distinguishing normal and PVC from other beats.

From these works, it appears clear that research in the field of automatic ECG classification has reached a good level of maturation. However, in the design of an ECG classification system, there are still some open issues, which, if suitably addressed, may lead to the development of more robust and efficient classifiers. One of these issues is related to the choice of the classification approach to be adopted. In particular, we think that despite its great potential, the SVM approach has not received the attention it deserves in the ECG classification literature as compared to other research fields. Indeed, the SVM classifier exhibits a promising generalization capability, thanks to the maximal margin principle (MMP) it is based on [11]. Another important property is that it is less sensitive to the curse of dimensionality than traditional classification approaches. This is explained by the fact that the MMP makes it unnecessary to estimate explicitly the statistical distributions of classes in the hyperdimensional feature space in order to carry out the classification task. Thanks to these interesting properties, the SVM classifier has proved successful in a number of different application fields, such as 3-D object recognition [12], biomedical imaging [13], image compression [14], and remote sensing [15], [16]. Turning back to ECG classification, other issues that need to be addressed are the following: 1) feature selection is not performed in a completely automatic way and 2) the selection of the best free parameters of the adopted classifier is generally done empirically (model selection issue).

In this paper, in order to address the aforementioned issues, in a first step, we present a thorough experimental exploration of the SVM capabilities for ECG classification. In a second step, we propose to optimize further the performances of the SVM approach in terms of classification accuracy: 1) by automatically detecting the best discriminating features from the whole considered feature space and 2) by solving the model selection issue. Unlike traditional feature selection methods, where the user has to specify the number of desired features, the proposed system makes it possible to carry out what we term “feature detection.” Feature selection and feature detection have the common characteristic of searching for the best discriminative features. The latter, however, has the advantage of determining their number automatically. In other words, feature detection does not require the desired number of most discriminative features from the user a priori. The detection process is implemented through a particle swarm optimization (PSO) framework that exploits a criterion intrinsically related to SVM classifier properties, namely, the number of support vectors (SVs). This framework is formulated in such a way that it also solves the model selection issue, i.e., to estimate the best values of the SVM classifier parameters, which are the regularization and kernel parameters.

The rest of the paper is organized as follows. The basic mathematical formulation of SVMs for solving binary and multiclass classification problems is recalled in Section II. The main concepts and principles of PSO are introduced in Section III. The proposed PSO–SVM classification system is described in Section IV. The experimental results obtained on ECG data from the Massachusetts Institute of Technology–Beth Israel Hospital (MIT–BIH) arrhythmia database [17] are reported in Sections V and VI. Finally, conclusions are drawn in Section VII.

II. SUPPORT VECTOR MACHINES

Let us first consider, for simplicity, a supervised binary classification problem. Let us assume that the training set consists of $N$ vectors $x_i \in \mathbb{R}^{d}$ ($i = 1, 2, \ldots, N$) from the $d$-dimensional feature space $X$. To each vector $x_i$, we associate a target $y_i \in \{-1, +1\}$. The linear SVM classification approach consists of looking for a separation between the two classes in $X$ by means of an optimal hyperplane that maximizes the separating margin [11]. In the nonlinear case, which is the most commonly used as data are often linearly nonseparable, the two classes are first mapped with a kernel method in a higher dimensional feature space, i.e., $\Phi(X) \in \mathbb{R}^{d'}$ ($d' > d$). The membership decision rule is based on the function $\mathrm{sign}[f(x)]$, where $f(x)$ represents the discriminant function associated with the hyperplane in the transformed space and is defined as

$$f(x) = w^{*} \cdot \Phi(x) + b^{*}. \tag{1}$$

The optimal hyperplane defined by the weight vector $w^{*} \in \mathbb{R}^{d'}$ and the bias $b^{*} \in \mathbb{R}$ is the one that minimizes a cost function that expresses a combination of two criteria: margin maximization and error minimization. It is expressed as [11]

$$\Psi(w, \xi) = \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{N} \xi_i. \tag{2}$$

This cost function minimization is subject to the following constraints:

$$y_i\,(w \cdot \Phi(x_i) + b) \ge 1 - \xi_i, \quad i = 1, 2, \ldots, N \tag{3}$$

and

$$\xi_i \ge 0, \quad i = 1, 2, \ldots, N \tag{4}$$

where the $\xi_i$'s are slack variables introduced to account for nonseparable data. The constant $C$ represents a regularization parameter that controls the shape of the discriminant function. The aforementioned optimization problem can be reformulated through a Lagrange functional, for which the Lagrange multipliers can be found by means of a dual optimization leading to a quadratic programming (QP) solution [11], i.e.,

$$\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \tag{5}$$

under the constraints

$$\alpha_i \ge 0, \quad \text{for } i = 1, 2, \ldots, N \tag{6}$$
MELGANI AND BAZI: CLASSIFICATION OF ECG SIGNALS WITH SVMs AND PSO 669
and

$$\sum_{i=1}^{N} \alpha_i y_i = 0 \tag{7}$$

where $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]$ is the vector of Lagrange multipliers and $K(\cdot, \cdot)$ is a kernel function. The final result is a discriminant function conveniently expressed as a function of the data in the original (lower) dimensional feature space $X$

$$f(x) = \sum_{i \in S} \alpha_i^{*} y_i K(x_i, x) + b^{*}. \tag{8}$$

The set $S$ is a subset of the indexes $\{1, 2, \ldots, N\}$ corresponding to the nonzero Lagrange multipliers $\alpha_i$'s, which define the so-called SVs. The kernel $K(\cdot, \cdot)$ must satisfy the condition stated in Mercer’s theorem so as to correspond to some type of inner product in the transformed (higher) dimensional feature space $\Phi(X)$ [11]. A typical example of such kernels is represented by the following Gaussian function:

$$K(x_i, x) = \exp\left(-\gamma \|x_i - x\|^{2}\right) \tag{9}$$

where $\gamma$ represents a parameter inversely proportional to the width of the Gaussian kernel.

As described before, SVMs are intrinsically binary classifiers. But, the classification of ECG signals often involves the simultaneous discrimination of numerous information classes. In order to face this issue, a number of multiclass classification strategies can be adopted [15], [18]. The most popular ones are the one-against-all (OAA) and the one-against-one (OAO) strategies. The former involves a reduced number of binary decompositions (and thus, of SVMs), which are, however, more complex. The latter requires a shorter training time, but may incur conflicts between classes due to the nature of the score function used for decision. Both strategies generally lead to similar results in terms of classification accuracy. In this paper, we shall consider the OAA strategy. Briefly, this strategy is based on the following procedure. Let $\Omega = \{\omega_1, \omega_2, \ldots, \omega_T\}$ be the set of $T$ possible labels (information classes) associated with the ECG beats that we desire to classify. First, an ensemble of $T$ (parallel) SVM classifiers is trained. Each classifier aims at solving a binary classification problem defined by the discrimination between one information class $\omega_i$ ($i = 1, 2, \ldots, T$) against all others (i.e., $\Omega - \{\omega_i\}$). Then, in the classification phase, the “winner-takes-all” rule is used to decide which label to assign to each beat. This means that the winning class is the one that corresponds to the SVM classifier of the ensemble that shows the highest output (discriminant function value).

III. PARTICLE SWARM OPTIMIZATION

PSO is a stochastic optimization technique introduced recently by Kennedy and Eberhart, inspired by the social behavior of bird flocking and fish schooling [19]. Similar to other evolutionary computation algorithms, such as genetic algorithms (GAs) [16], PSO is a population-based search method that exploits the concept of social sharing of information. This means that each individual (called particle) of a given population (called swarm) can benefit from the previous experiences of all other individuals in the same population. During the iterative search process in the $d$-dimensional solution space, each particle (i.e., candidate solution) will adjust its flying velocity and position according to its own flying experience as well as those of the other companion particles in the swarm. PSO has proved promising in solving a number of engineering problems such as automatic control [20], antenna design [21], and inverse problems [22]. In the following, we will briefly describe the main concepts of the basic PSO algorithm.

Let us consider a swarm of size $S$. Each particle $P_i$ ($i = 1, 2, \ldots, S$) in the swarm is characterized by: 1) its current position $p_i(t) \in \mathbb{R}^{d}$, which refers to a candidate solution of the optimization problem at iteration $t$; 2) its velocity $v_i(t) \in \mathbb{R}^{d}$; and 3) the best position $p_{bi}(t) \in \mathbb{R}^{d}$ identified during its past trajectory. Let $p_g(t) \in \mathbb{R}^{d}$ be the best global position found over all trajectories traveled by the particles of the swarm. Position optimality is measured by means of one or more fitness functions defined in relation to the considered optimization problem. During the search process, the particles move according to the following equations:

$$v_i(t+1) = w\,v_i(t) + c_1 r_1(t)\,(p_{bi}(t) - p_i(t)) + c_2 r_2(t)\,(p_g(t) - p_i(t)) \tag{10}$$

$$p_i(t+1) = p_i(t) + v_i(t) \tag{11}$$

where $r_1(\cdot)$ and $r_2(\cdot)$ are random variables drawn from a uniform distribution in the range [0,1] so as to provide a stochastic weighting of the different components participating in the particle velocity definition. $c_1$ and $c_2$ are two acceleration constants regulating the relative velocities with respect to the best local and global positions, respectively. In greater detail, these parameters are considered as scaling factors that determine the relative pull of the best position of the particle and the global best position. They are sometimes referred to as the cognitive and social rates, respectively. They are factors determining how much the particle is influenced by the memory of its best location and by the rest of the swarm, respectively. The inertia weight $w$ is used as a tradeoff between the global and local exploration capabilities of the swarm. Large values of this parameter permit better global exploration, while small values lead to a fine search in the solution space. Equation (10) allows the computation of the velocity at iteration $t + 1$ for each particle in the swarm by combining linearly its current velocity (at iteration $t$) and the distances that separate the current particle position from its best previous position and the best global position, respectively. The particle position is updated with (11). Both (10) and (11) are iterated until convergence of the search process is reached. Typical convergence criteria are based on the iterative behavior of the best value of the adopted fitness function(s) and/or simply on a user-defined maximum number of iterations.

IV. PROPOSED PSO–SVM CLASSIFICATION SYSTEM

In this section, we describe the proposed SVM system for the classification of ECG signals. As mentioned in the Introduction, the aim of this system is to optimize the SVM classifier
670 IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE, VOL. 12, NO. 5, SEPTEMBER 2008
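For illustration, the basic PSO search recalled in Section III can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used in the experiments: the function name `pso_minimize`, the toy `sphere` fitness, the search bounds, the fixed seed, and the zero initial velocities are all assumptions introduced here. The sketch also draws r1 and r2 once per particle and per iteration, and advances each position with the freshly updated velocity, which is a common variant of (11). The default values (S = 40, w = 0.4, c1 = c2 = 1, 40 iterations) mirror the PSO settings quoted in the Experiment Settings.

```python
import random


def pso_minimize(fitness, dim, swarm_size=40, w=0.4, c1=1.0, c2=1.0,
                 max_iter=40, bounds=(-5.0, 5.0), seed=0):
    """Minimize `fitness` with a basic PSO built on the velocity rule (10)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions p_i and zero initial velocities v_i (assumption).
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    # Personal best positions p_bi and their fitness values.
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    # Global best position p_g.
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(max_iter):
        for i in range(swarm_size):
            r1, r2 = rng.random(), rng.random()
            for k in range(dim):
                # Velocity update: inertia + cognitive pull + social pull.
                vel[i][k] = (w * vel[i][k]
                             + c1 * r1 * (pbest[i][k] - pos[i][k])
                             + c2 * r2 * (gbest[k] - pos[i][k]))
                # Position update with the new velocity (assumed variant).
                pos[i][k] += vel[i][k]
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(p):
    """Toy fitness (hypothetical): squared distance to the origin."""
    return sum(x * x for x in p)


best, best_val = pso_minimize(sphere, dim=3)
```

In the proposed system the fitness would instead be tied to the SVM classifier, whereas the toy function above only serves to exercise the update rules.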
TABLE I
NUMBERS OF TRAINING AND TEST BEATS USED IN THE EXPERIMENTS
TABLE II
OVERALL (OA), AVERAGE (AA), AND CLASS PERCENTAGE ACCURACIES ACHIEVED ON THE TEST BEATS WITH
THE DIFFERENT INVESTIGATED CLASSIFIERS WITH A TOTAL NUMBER OF 500 TRAINING BEATS
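As a reading aid for the accuracy figures, the following minimal Python sketch computes the three kinds of accuracy named in the caption. It assumes the usual definitions, which are not spelled out in this part of the text: OA is the percentage of correctly classified test beats over all classes, and AA is the mean of the per-class percentage accuracies. The function name `accuracies` and the toy labels are hypothetical.

```python
def accuracies(true_labels, predicted_labels):
    """Return (OA, AA, per-class accuracies), all in percent."""
    classes = sorted(set(true_labels))
    correct = {c: 0 for c in classes}
    total = {c: 0 for c in classes}
    for t, p in zip(true_labels, predicted_labels):
        total[t] += 1
        if t == p:
            correct[t] += 1
    # Class accuracy: share of beats of a class that were labeled correctly.
    per_class = {c: 100.0 * correct[c] / total[c] for c in classes}
    # OA weights every beat equally; AA weights every class equally.
    oa = 100.0 * sum(correct.values()) / len(true_labels)
    aa = sum(per_class.values()) / len(classes)
    return oa, aa, per_class


# Toy example (hypothetical labels): 6 normal, 2 V, and 2 A beats.
y_true = ["N"] * 6 + ["V", "V", "A", "A"]
y_pred = ["N"] * 6 + ["V", "N", "A", "N"]
oa, aa, per_class = accuracies(y_true, y_pred)
```

With unbalanced classes, as in this toy case, OA is dominated by the largest class while AA penalizes errors on the rare classes, which is why both figures are worth reporting.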
C. Experiment Settings

In the experiments, we considered the nonlinear SVM based on the popular Gaussian kernel (referred to as SVM-RBF or simply SVM). The related parameters $C$ and $\gamma$ for this kernel were varied in the arbitrarily fixed ranges $[10^{-3}, 200]$ and $[10^{-3}, 2]$ so as to cover high and small regularization of the classification model, and fat as well as thin kernels, respectively. In addition, for comparison purposes, we implemented, in the first experiment, the SVM classifier with two other kernels, which are the linear and the polynomial kernels, leading thus to two other SVM classifiers termed as SVM-linear and SVM-poly, respectively. The degree $d$ of the polynomial kernel was varied in the range [2,5] in order to span polynomials with low and high flexibility. The K value and the number of hidden nodes (h) of the kNN and the RBF classifiers were tuned in the arbitrarily fixed intervals [1,15] and [10,60], respectively. The other RBF parameters, which include the center and the width of each RBF (kernel), were computed by applying the K-means clustering algorithm separately to each class [26]. Concerning the PSO algorithm, we considered the following standard parameters: swarm size $S = 40$, inertia weight $w = 0.4$, acceleration constants $c_1$ and $c_2$ equal to unity, and maximum number of iterations fixed at 40.

VI. EXPERIMENTAL RESULTS

A. Experiment 1: Classification in the Whole Original Hyperdimensional Feature Space

As mentioned earlier, in this experiment, we applied the SVM classifier directly on the entire original hyperdimensional feature space, which is made up of 303 features. During the training phase, the SVM parameters were selected according to an m-fold cross-validation (CV) procedure [27], first by randomly splitting the 500 training beats into m mutually exclusive subsets (folds) of equal size, and then, by training m times an SVM classifier modeled with predefined values: $C$ for the linear kernel, ($C$ and $\gamma$) for the Gaussian kernel, and ($C$ and $d$) for the polynomial kernel. Each time we left one of the subsets out of the training, and only used it to obtain an estimate of the classification accuracy. From m times of training and accuracy computation, the AA yielded a prediction of the classification accuracy of the considered SVM classifier. We chose the best SVM classifier parameter values to maximize this prediction. In all experiments reported in this paper, we adopted a fivefold CV. The same procedure was adopted to find the best parameters for the kNN and RBF classifiers. We recall that this empirical parameter estimation procedure and all the classification experiments were repeated three times, each with one of the three different training sets generated randomly.

As reported in Table II, the OA and AA accuracies achieved with the SVM classifier based on the Gaussian kernel (SVM-RBF) on the test set were equal to 87.76% and 87.48%, respectively. These results were better than those achieved by the SVM-linear, the SVM-poly, the RBF, and the kNN classifiers. Indeed, the OA (and AA) accuracies were equal to 80.55% (78.90%) for the SVM-linear classifier, 85.25% (85.75%) for the SVM-poly classifier, 82.74% (82.07%) for the RBF classifier, and 81.36% (80.70%) for the kNN classifier. Note that depending on the classifier, the most difficult classes to discriminate were the paced beat (/), the ventricular premature beat (V), and the atrial premature beat (A) classes, which are also the most overlapped ones according to Fig. 1.

This experiment appears to confirm what was observed in other application fields, i.e., the superiority of SVM based on the Gaussian kernel as compared to traditional classifiers when dealing with feature spaces of very high dimensionality. In addition, it provides reference classification accuracies in order to quantify the capability of the proposed PSO–SVM classification system to further improve these interesting results.

B. Experiment 2: Classification Based on Feature Reduction

In this experiment, we trained the SVM classifier based on the Gaussian kernel, which proved in the previous experiments to be the most appropriate kernel for ECG signal classification, in
TABLE III
STATISTICAL SIGNIFICANCE OF DIFFERENCES IN CLASSIFICATION ACCURACY BETWEEN THE NINE INVESTIGATED CLASSIFIERS
EXPRESSED BY MEANS OF THE MCNEMAR’S TEST WITH A TOTAL OF (a) 500, (b) 250, AND (c) 750 TRAINING BEATS
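McNemar's test compares two classifiers on the same test beats by looking only at the beats on which exactly one of them is correct. The following minimal Python sketch assumes the common normal-approximation form of the statistic, Z = (f_ab − f_ba) / sqrt(f_ab + f_ba), where f_ab counts the beats classified correctly by classifier A and wrongly by classifier B, and f_ba the reverse. Under this form, |Z| > 1.96 indicates a difference that is significant at the 5% level. The function name `mcnemar_z` and the toy labels are hypothetical.

```python
import math


def mcnemar_z(true_labels, pred_a, pred_b):
    """Normal-approximation McNemar statistic for two classifiers A and B."""
    # Beats that A classifies correctly and B misclassifies.
    f_ab = sum(1 for t, a, b in zip(true_labels, pred_a, pred_b)
               if a == t and b != t)
    # Beats that B classifies correctly and A misclassifies.
    f_ba = sum(1 for t, a, b in zip(true_labels, pred_a, pred_b)
               if a != t and b == t)
    if f_ab + f_ba == 0:
        return 0.0  # the two classifiers never disagree in correctness
    return (f_ab - f_ba) / math.sqrt(f_ab + f_ba)


# Toy example (hypothetical): ten normal beats, two classifiers.
y_true = ["N"] * 10
pred_a = ["N"] * 8 + ["V", "V"]              # A is correct on beats 0..7
pred_b = ["N"] * 4 + ["V"] * 4 + ["N", "V"]  # B is correct on beats 0..3 and 8
z = mcnemar_z(y_true, pred_a, pred_b)
```

A positive value favors classifier A and a negative value favors classifier B; beats on which both classifiers agree in correctness do not affect the statistic.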
TABLE IV
NUMBER OF FEATURES DETECTED FOR EACH CLASS WITH THE PSO–SVM CLASSIFICATION SYSTEM TRAINED ON 500 BEATS

and maximum numbers of features were obtained for the ventricular premature (V) and normal (N) classes with 35 and 63 features, respectively.
TABLE V
OVERALL (OA), AVERAGE (AA), AND CLASS PERCENTAGE ACCURACIES ACHIEVED ON THE TEST BEATS WITH
THE DIFFERENT EXPLORED CLASSIFIERS WITH A TOTAL NUMBER OF (a) 250 AND (b) 750 TRAINING BEATS
In the second step, we fixed $w = 0.6$ (corresponding to the best obtained accuracy) and we varied $c_1$ and $c_2$ in the range [1,2] (according to [19]). In this case, the OA and AA accuracies were less affected by the variation of these parameters. Indeed, they fluctuated from 90.18% (91.65%) for $c_1 = c_2 = 1.2$ to 90.88% (92.70%) for $c_1 = c_2 = 1$.

As this empirical analysis shows, the PSO optimizer appears more sensitive to the inertia weight parameter than to the two other parameters. However, even when nonstandard parameter values are adopted, the achieved accuracies remain above those yielded by the reference classifiers.

VII. CONCLUSION

From the obtained experimental results, we can strongly recommend the use of the SVM approach for classifying ECG signals on account of its superior generalization capability as compared to traditional classification techniques. This capability generally provides SVM classifiers with higher classification accuracies and a lower sensitivity to the curse of dimensionality.

The main novelty of this paper is in the proposed PSO-based approach, which aims at optimizing the performances of SVM classifiers in terms of classification accuracy by detecting the best subset of available features and solving the tricky model selection issue. The fact that it is entirely automatic makes it particularly useful and attractive. The results confirm that the PSO–SVM classification system substantially boosts the generalization capability achievable with the SVM classifier, and its robustness against the problem of limited training beat availability, which may characterize pathologies of rare occurrence. Another advantage of the PSO–SVM approach can be found in its high sparseness, which is explained by the fact that the adopted optimization criterion is based on minimizing the number of SVs. This criterion favors the definition of compact discriminant functions, which are thus easy to implement on a hardware platform. For such purpose, the PSO–SVM classifier should first be run on a PC for determining the best features for each class and the discrimination model (SVs and related weights) of the corresponding SVM.

Finally, it is noteworthy that, thanks to its general nature, the proposed PSO–SVM system is applicable not only to morphology and temporal features, but also to other types of features such as those based on wavelets and high-order statistics. Furthermore, other optimization criteria could be considered as well, individually or jointly depending on the application requirements.

ACKNOWLEDGMENT

The authors would like to thank Dr. C.-C. Chang and Dr. C.-J. Lin for supplying the software LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm) used in this research.

REFERENCES

[1] S. Osowski and T. H. Linh, “ECG beat recognition using fuzzy hybrid neural network,” IEEE Trans. Biomed. Eng., vol. 48, no. 11, pp. 1265–1271, Nov. 2001.
[2] T. H. Linh, S. Osowski, and M. L. Stodoloski, “On-line heart beat recognition using Hermite polynomials and neuron-fuzzy network,” IEEE Trans. Instrum. Meas., vol. 52, no. 4, pp. 1224–1231, Aug. 2003.
[3] S. Osowski, T. H. Linh, and T. Markiewicz, “Support vector machine-based expert system for reliable heart beat recognition,” IEEE Trans. Biomed. Eng., vol. 51, no. 4, pp. 582–589, Apr. 2004.
[4] L. Y. Shyu, Y. H. Wu, and W. Hu, “Using wavelet transform and fuzzy neural network for VPC detection from the Holter ECG,” IEEE Trans. Biomed. Eng., vol. 51, no. 7, pp. 1269–1273, Jul. 2004.
[5] F. de Chazal, M. O’Dwyer, and R. B. Reilly, “Automatic classification of ECG heartbeats using ECG morphology and heartbeat interval features,” IEEE Trans. Biomed. Eng., vol. 51, no. 7, pp. 1196–1206, Jul. 2004.
[6] L. Khadra, A. S. Al-Fahoum, and S. Binajjaj, “A quantitative analysis approach for cardiac arrhythmia classification using higher order spectral techniques,” IEEE Trans. Biomed. Eng., vol. 52, no. 11, pp. 1840–1845, Nov. 2005.
[7] R. V. Andreao, B. Dorizzi, and J. Boudy, “ECG signal analysis through hidden Markov models,” IEEE Trans. Biomed. Eng., vol. 53, no. 8, pp. 1541–1549, Aug. 2006.
[8] S. Mitra, M. Mitra, and B. B. Chaudhuri, “A rough set-based inference engine for ECG classification,” IEEE Trans. Instrum. Meas., vol. 55, no. 6, pp. 2198–2206, Dec. 2006.
[9] F. de Chazal and R. B. Reilly, “A patient adapting heart beat classifier using ECG morphology and heartbeat interval features,” IEEE Trans. Biomed. Eng., vol. 53, no. 12, pp. 2535–2543, Dec. 2006.
[10] T. Inan, L. Giovangrandi, and J. T. A. Kovacs, “Robust neural network based classification of premature ventricular contractions using wavelet transform and timing interval features,” IEEE Trans. Biomed. Eng., vol. 53, no. 12, pp. 2507–2515, Dec. 2006.
[11] V. Vapnik, Statistical Learning Theory. New York: Wiley, 1998.
[12] M. Pontil and A. Verri, “Support vector machines for 3D object recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 6, pp. 637–646, Jun. 1998.
[13] Y. Y. El-Naqa, M. N. Wernick, N. P. Galatsanos, and R. M. Nishikawa, “A support vector machine approach for detection of microcalcifications,” IEEE Trans. Med. Imag., vol. 21, no. 12, pp. 1552–1563, Dec. 2002.
[14] J. Robinson and V. Kecman, “Combining support vector machine learning with the discrete cosine transform in image compression,” IEEE Trans. Neural Netw., vol. 14, no. 4, pp. 950–958, Jul. 2003.
[15] F. Melgani and L. Bruzzone, “Classification of hyperspectral remote sensing images with support vector machine,” IEEE Trans. Geosci. Remote Sens., vol. 42, no. 8, pp. 1778–1790, Aug. 2004.
[16] Y. Bazi and F. Melgani, “Toward an optimal SVM classification system for hyperspectral remote sensing images,” IEEE Trans. Geosci. Remote Sens., vol. 44, no. 11, pp. 3374–3385, Nov. 2006.
[17] R. Mark and G. Moody, MIT-BIH Arrhythmia Database, 1997. [Online]. Available: http://ecg.mit.edu/dbinfo.html
[18] C.-W. Hsu and C.-J. Lin, “A comparison of methods for multiclass support vector machines,” IEEE Trans. Neural Netw., vol. 13, no. 2, pp. 415–425, Mar. 2002.
[19] J. Kennedy and R. C. Eberhart, Swarm Intelligence. San Mateo, CA: Morgan Kaufmann, 2001.
[20] Z. L. Gaing, “A particle swarm optimization approach for optimum design of PID controller in AVR system,” IEEE Trans. Energy Convers., vol. 19, no. 2, pp. 384–391, Jun. 2004.
[21] M. Donelli, R. Azzaro, F. G. B. De Natale, and A. Massa, “An innovative computational approach based on a particle swarm strategy for adaptive phased-arrays control,” IEEE Trans. Antennas Propag., vol. 54, no. 3, pp. 888–898, Mar. 2006.
[22] W. H. Slade, H. W. Ressom, M. T. Musavi, and R. L. Miller, “Inversion of ocean color observations using particle swarm optimization,” IEEE Trans. Geosci. Remote Sens., vol. 42, no. 9, pp. 1915–1923, Sep. 2004.
[23] J. J. Wei, C. J. Chang, N. K. Shou, and G. J. Jan, “ECG data compression using truncated singular value decomposition,” IEEE Trans. Biomed. Eng., vol. 5, no. 4, pp. 290–299, Dec. 2001.
[24] R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.
[25] A. Agresti, Categorical Data Analysis, 2nd ed. New York: Wiley, 2002.
[26] L. Bruzzone and D. Prieto, “A technique for the selection of kernel function parameters in RBF neural networks for classification of remote sensing images,” IEEE Trans. Geosci. Remote Sens., vol. 37, no. 2, pp. 1179–1184, Mar. 1999.
[27] M. Stone, “Cross-validatory choice and assessment of statistical predictions,” J. R. Statist. Soc. B, vol. 36, pp. 111–147, 1974.
Farid Melgani (M’04–SM’06) received the State Engineer degree in electronics from the University of Batna, Batna, Algeria, in 1994, the M.Sc. degree in electrical engineering from the University of Baghdad, Baghdad, Iraq, in 1999, and the Ph.D. degree in electronic and computer engineering from the University of Genoa, Genoa, Italy, in 2003.
From 1999 to 2002, he was with the Signal Processing and Telecommunications Group, Department of Biophysical and Electronic Engineering, University of Genoa. Since 2002, he has been with the University of Trento, Trento, Italy, where he is an Assistant Professor of Telecommunications, and currently, the Head of the Intelligent Information Processing (I2P) Laboratory, Department of Information Engineering and Computer Science. His current research interests include processing, pattern recognition, and machine learning techniques applied to remote sensing and biomedical signals/images (classification, regression, multitemporal analysis, and data fusion). He is the author or coauthor of more than 80 scientific papers and is a referee for several international journals.
Dr. Melgani was on the scientific committees of several international conferences and is an Associate Editor of the IEEE GEOSCIENCE AND REMOTE SENSING LETTERS.

Yakoub Bazi (S’05–M’07) received the State Engineer and M.Sc. degrees in electronics from the University of Batna, Batna, Algeria, in 1994 and 2000, respectively, and the Ph.D. degree in information and communication technology from the University of Trento, Trento, Italy, in 2005.
From 2000 to 2002, he was a Lecturer at the University of M’sila, M’sila, Algeria. From January 2006 to June 2006, he was a Postdoctoral Researcher at the University of Trento. He is currently an Assistant Professor in the College of Engineering, Al Jouf University, Al Jouf, Saudi Arabia. His current research interests include pattern recognition and evolutionary computation methodologies applied to remote sensing images and biomedical signal/images (change detection, classification, and semisupervised learning).
Dr. Bazi is a Referee for several international journals.