
Principal Component Extraction and Its Feature Selection for ECG Beats

Student: M. Y. Li
Advisor: S. N. Yu
Date: 2008/10/24
Outline
 Introduction.
 Principal Component extraction.
 Feature selection.
 Fisher linear discriminant.
 Correlation coefficients.
 Selection concept.
 Experiment results.
 Conclusions.
Introduction
 The ECG is noninvasive in nature and valuable in the diagnosis
of heart diseases.

 The high mortality rate of heart diseases (e.g., arrhythmias) calls for:
 Faithful detection.
 Classification.
 In recent years, many algorithms have been developed for the
detection and classification of the ECG signals.
 Feature extraction.
 PCA
 Feature selection.
 Entropy selection.
 Fisher linear discriminant.
Principal Component Extraction
 PCA can be used for dimensionality reduction in a data
set
 keeping lower-order principal components
 low-order components often contain the "most
important" aspects of the data
 The PCA algorithm (a sketch follows below).
 Zero-mean the data.
 Eigendecompose the covariance matrix.
 The eigenvectors with the largest eigenvalues correspond
to the dimensions that have the strongest correlation in the
data set.
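A minimal sketch of these two steps in Python/NumPy; the function name pca_extract, the beat matrix X, and n_components are illustrative assumptions, not taken from the slides.

import numpy as np

def pca_extract(X, n_components):
    # X: one ECG beat (or feature vector) per row -- illustrative input layout
    # Step 1: zero-mean the data
    X_centered = X - X.mean(axis=0)
    # Step 2: eigendecompose the covariance matrix
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the eigenvectors with the largest eigenvalues
    # (the directions of strongest correlation in the data set)
    order = np.argsort(eigvals)[::-1][:n_components]
    return X_centered @ eigvecs[:, order]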
Principal Component Extraction
 Properties and Limitations.

 Assumption of linearity.
 Assumption on the statistical importance of mean and covariance.
 PCA relies on the eigenvectors of the covariance matrix, so it only finds the independent axes of the data under the Gaussian assumption.

 When PCA is used for clustering, it does not account for class separability, since it makes no use of the class labels of the feature vectors.
Feature Selection
(Fisher linear discriminant)
 Fisher discrimination power

S_k = S_B / S_W

S_W = \sum_{i=1}^{c} \sum_{f_{ki} \in D_i} (f_{ki} - \bar{f}_{ki})^2

S_B = \sum_{i=1}^{c} n_i (\bar{f}_{ki} - \bar{f}_k)^2

where n_i is the number of features in class i, D_i is the feature set associated with class i, and \bar{f}_{ki} and \bar{f}_k are the means of the k-th feature in class D_i and in the entire feature set, respectively.
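A sketch of this per-feature Fisher discrimination power, assuming a feature matrix F (one row per beat, one column per feature) and integer class labels y; these names are illustrative, not from the slides.

import numpy as np

def fisher_power(F, y):
    # Returns S_k = S_B / S_W for every feature column k of F
    classes = np.unique(y)
    overall_mean = F.mean(axis=0)          # mean of each feature over the entire set
    S_W = np.zeros(F.shape[1])
    S_B = np.zeros(F.shape[1])
    for i in classes:
        Fi = F[y == i]                     # feature set D_i of class i (n_i rows)
        class_mean = Fi.mean(axis=0)       # mean of feature k within class i
        S_W += ((Fi - class_mean) ** 2).sum(axis=0)
        S_B += len(Fi) * (class_mean - overall_mean) ** 2
    return S_B / S_W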
Correlation coefficients
 The correlation coefficient (CC) is used to evaluate the dependency between two random variables:

\rho_{kl} = \sigma_{kl} / (\sigma_k \sigma_l)

 \sigma_{kl} is the covariance, and \sigma_k and \sigma_l are the standard deviations of the k-th and l-th features, respectively.
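The same quantity can be computed directly from the sample covariance and standard deviations; a small sketch, with the column indices k and l as illustrative arguments.

import numpy as np

def correlation(F, k, l):
    # rho_kl = sigma_kl / (sigma_k * sigma_l) for the k-th and l-th feature columns
    sigma_kl = np.cov(F[:, k], F[:, l])[0, 1]
    return sigma_kl / (F[:, k].std(ddof=1) * F[:, l].std(ddof=1))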
Selection concept
 For p-dimensional features, there are p(p-1)/2 possible correlation coefficients and p Fisher discrimination powers.
 Selection procedure (a sketch follows below).
 Sort the p(p-1)/2 correlation coefficients in descending order.
 In each highly correlated pair, the feature with the lower Fisher discrimination power is treated as redundant and removed.
[Diagram: the correlation coefficients are sorted in descending order (ρ_ij > ρ_kl > ...); for each highly correlated pair (i, j), the Fisher powers S_i and S_j are compared and the feature with the smaller value is discarded.]
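A sketch of the whole selection concept, combining the Fisher powers and correlation coefficients above; the 0.9 correlation cut-off and the helper fisher_power are assumptions made for illustration, not values given on the slides.

import numpy as np

def select_features(F, y, threshold=0.9):
    # From each highly correlated feature pair, drop the one with lower Fisher power
    power = fisher_power(F, y)                    # p Fisher discrimination powers
    corr = np.abs(np.corrcoef(F, rowvar=False))   # pairwise correlation coefficients
    p = F.shape[1]
    pairs = sorted(((corr[i, j], i, j) for i in range(p) for j in range(i + 1, p)),
                   reverse=True)                  # p(p-1)/2 coefficients, descending
    keep = set(range(p))
    for rho, i, j in pairs:
        if rho < threshold:                       # assumed cut-off, not specified on the slides
            break
        if i in keep and j in keep:
            keep.discard(i if power[i] < power[j] else j)
    return sorted(keep)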
Experiment results
 Methods compared
 ICA (Chu)
 ICA+FLD+CC
 PCA+FLD+CC
 PCA
 Database
 MIT-BIH arrhythmia database
 Lead II
 Feature order from m=1 to m=17
 Beat types
 Norm, LBBB, RBBB, PB, PVC, APB, VFW, VEB
 Total
 Training: 4900 beats
 Testing: 4900 beats
Experiment results
 Accuracy comparison (%)

Method        m=1      m=4      m=9      m=13     m=15     m=17
ICA           52.59    82.85    94.52    97.26    97.80    98.19
PCA           59.61    92.34    98.02    98.77    98.77    98.73
ICA+FLD+CC    55.53    66.33    77.31    85.95    88.41    90.40
PCA+FLD+CC    59.61    91.98    97.89    98.34    98.59    98.73
Experiment results
 Per-class accuracy for other beat types (m=4, m=9, m=17)
 PB

Method        m=4      m=9      m=17
ICA           98.90    99.65    99.85
PCA           97.00    100.00   100.00
ICA+FLD+CC    55.47    82.95    99.37
PCA+FLD+CC    98.50    99.75    100.00

 VFW

Method        m=4      m=9      m=17
ICA           68.47    86.99    90.63
PCA           79.23    84.74    87.71
ICA+FLD+CC    40.80    59.23    78.43
PCA+FLD+CC    73.72    86.86    88.13
Experiment results
 VEB

Method        m=4      m=9      m=17
ICA           5.57     78.46    91.73
PCA           0.00     92.30    94.23
ICA+FLD+CC    0.00     5.385    32.69
PCA+FLD+CC    80.76    92.30    94.23
Conclusions
