Abstract:
Although the support vector machine (SVM) ensemble has been proved to improve classification performance greatly over a single SVM, the classification result of a practically implemented SVM ensemble is often far from the theoretically expected level, because existing methods do not evaluate the degree of importance of each individual component SVM's output to the final decision. This paper proposes a boosting least squares support vector machine (LS-SVM) ensemble method based on the fuzzy integral to improve this limited classification performance. The proposed method is built in three steps: construct the component LS-SVMs; obtain a probabilistic output model for each component LS-SVM; and combine the component predictions based on the fuzzy integral. The trained individual LS-SVMs are aggregated to make a final decision. Simulation results demonstrate that the proposed boosting LS-SVM ensemble outperforms a single SVM and the traditional SVM (or LS-SVM) ensemble technique via majority voting in terms of classification accuracy.
Keywords: LS-SVM; SVM ensemble; Boosting; Fuzzy integral; Information fusion
1. Introduction
The LS-SVM was introduced by Suykens [4]; it uses equality constraints instead of inequality constraints, together with a least squares error term, in order to obtain a linear set of equations in the dual space, which is computationally attractive.
Despite the high performance of SVM and LS-SVM, researchers seek to further improve them with ensemble techniques such as bagging or boosting [5, 6, 7]. However, none of these methods considers the degree of importance of the output of each component SVM or LS-SVM, which plays a significant role in the classification. To deal with this problem, we propose an LS-SVM ensemble method based on a fuzzy-integral fusion technique. The proposed method consists of three phases. First, we use the boosting technique to construct the component LS-SVMs; in boosting, the training samples for each individual LS-SVM are chosen according to a probability distribution over the samples that is updated in relation to the error. Second, we obtain a probabilistic output model for each individual component LS-SVM. Finally, we combine the component predictions based on the fuzzy integral, in which the relative importance of the different individual component LS-SVMs is considered. The fuzzy integral nonlinearly combines objective evidence, in the form of a fuzzy membership function, with a subjective evaluation of the worth of the individual LS-SVMs with respect to the decision. The experimental results confirm the superiority of the presented method over the majority voting technique.
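As a toy illustration (with made-up numbers, not data from this paper) of why ignoring component importance can hurt, compare plain majority voting with an importance-weighted fusion of three components' probabilistic outputs:

```python
import numpy as np

# Made-up outputs: P(class 1 | x) from three component classifiers.
probs = np.array([0.95, 0.40, 0.45])
# Assumed importance of each component (e.g., from validation accuracy).
importance = np.array([0.70, 0.15, 0.15])

# Majority voting: every component gets one equal vote.
votes_for_1 = np.sum(probs > 0.5)
majority = int(votes_for_1 > len(probs) / 2)   # only 1 of 3 votes -> class 0

# Importance-weighted fusion: the reliable component can prevail.
weighted = int(importance @ probs > 0.5)       # 0.7925 > 0.5 -> class 1
```

Here the two rules disagree: a fusion that weights components by their worth, as the fuzzy integral does in a more general nonlinear form, can side with a single highly reliable classifier.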
The rest of this paper is organized as follows. In Section 2, some basic notions of the LS-SVM are reviewed. In Section 3, probabilistic output models for LS-SVMs are provided. The boosting method for constructing the LS-SVM ensemble and the fuzzy integral for aggregating the LS-SVMs are described in Section 4. Section 5 presents experimental results on benchmark problems. Finally, a conclusion is drawn in Section 6.
Proceedings of the Fifth International Conference on Machine Learning and Cybernetics, Dalian, 13-16 August 2006
2. Least squares support vector machine

The LS-SVM classifier is obtained by minimizing

W(\mathbf{w}, b) = \frac{1}{2}\|\mathbf{w}\|^{2} + \frac{\gamma}{2}\sum_{i=1}^{l}\left(y_{i} - \mathbf{w}^{T}\varphi(\mathbf{x}_{i}) - b\right)^{2}, \qquad (2)

implementing a quadratic regularization of a sum-of-squares empirical risk. The conditions for optimality lead to the linear system

\begin{bmatrix} \Omega & \mathbf{1} \\ \mathbf{1}^{T} & 0 \end{bmatrix}\begin{bmatrix} \boldsymbol{\alpha} \\ b \end{bmatrix} = \begin{bmatrix} \mathbf{y} \\ 0 \end{bmatrix}, \qquad (3)

where \Omega = K + \gamma^{-1} I, K = \{k_{ij} = H(\mathbf{x}_{i}, \mathbf{x}_{j})\}_{i,j=1}^{l}, \mathbf{1} = (1, 1, \ldots, 1)^{T}, \mathbf{y} = (y_{1}, y_{2}, \ldots, y_{l})^{T}, and \boldsymbol{\alpha} = (\alpha_{1}, \alpha_{2}, \ldots, \alpha_{l})^{T}, with the RBF kernel H(\mathbf{x}, \mathbf{x}') = \exp\left(-\|\mathbf{x} - \mathbf{x}'\|^{2} / 2\sigma^{2}\right). The solution of this problem can be written as an expansion in terms of training patterns,

f(\mathbf{x}) = \sum_{i=1}^{l} \alpha_{i} H(\mathbf{x}, \mathbf{x}_{i}) + b. \qquad (4)
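A minimal numpy sketch (not code from the paper) of this training step: solve the linear system (3) for (alpha, b) with the RBF kernel, then evaluate the expansion (4). The hyperparameter names gamma and sigma are assumptions following common LS-SVM notation:

```python
import numpy as np

def rbf_kernel(X1, X2, sigma=1.0):
    # H(x, x') = exp(-||x - x'||^2 / (2 sigma^2))
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    # Assemble and solve the KKT system of Eq. (3).
    l = len(y)
    K = rbf_kernel(X, X, sigma)
    Omega = K + np.eye(l) / gamma            # Omega = K + gamma^{-1} I
    A = np.zeros((l + 1, l + 1))
    A[:l, :l] = Omega
    A[:l, l] = 1.0                           # column of ones
    A[l, :l] = 1.0                           # row 1^T
    rhs = np.concatenate([y, [0.0]])
    sol = np.linalg.solve(A, rhs)
    return sol[:l], sol[l]                   # alpha, b

def lssvm_decision(X_train, alpha, b, X, sigma=1.0):
    # f(x) = sum_i alpha_i H(x, x_i) + b, Eq. (4)
    return rbf_kernel(X, X_train, sigma) @ alpha + b
```

For example, with a large gamma the classifier nearly interpolates a small training set such as XOR, and the sign of f recovers the labels.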
3. Probabilistic outputs for LS-SVMs

3.1. Two-class case

Following Platt [9], the posterior probability is modeled by a sigmoid applied to the LS-SVM output,

P(y = 1 \mid \mathbf{x}) = \frac{1}{1 + \exp(A f(\mathbf{x}) + B)}, \qquad (5)

with parameters A and B. To estimate the best values of (A, B), any subset of the l training data (N_{+} of them with y_{i} = 1, and N_{-} of them with y_{i} = -1) can be used to minimize the negative log-likelihood

\min_{A,B}\; -\sum_{i=1}^{l}\left[t_{i}\log p_{i} + (1 - t_{i})\log(1 - p_{i})\right], \qquad (6)

where

p_{i} = \frac{1}{1 + \exp(A f_{i} + B)}, \quad f_{i} = f(\mathbf{x}_{i}),

and

t_{i} = \begin{cases} \dfrac{N_{+} + 1}{N_{+} + 2}, & \text{if } y_{i} = 1 \\[4pt] \dfrac{1}{N_{-} + 2}, & \text{if } y_{i} = -1 \end{cases} \qquad i = 1, 2, \ldots, l. \qquad (7)
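A hedged sketch of fitting (A, B) in Eqs. (5)-(7); plain gradient descent on the negative log-likelihood is used here for brevity, whereas Platt's original method uses a Newton-style (model-trust-region) optimizer:

```python
import numpy as np

def fit_sigmoid(f, y, lr=0.01, steps=5000):
    """Fit (A, B) of P(y=1|x) = 1 / (1 + exp(A f + B)) by gradient descent."""
    n_pos, n_neg = np.sum(y == 1), np.sum(y == -1)
    # Smoothed targets of Eq. (7)
    t = np.where(y == 1, (n_pos + 1.0) / (n_pos + 2.0), 1.0 / (n_neg + 2.0))
    A, B = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(A * f + B))   # Eq. (5)/(6)
        # Gradient of -sum[t log p + (1-t) log(1-p)] w.r.t. A and B:
        # dNLL/dA = sum (p - t) * (-f),  dNLL/dB = sum (p - t) * (-1)
        gA = np.sum((p - t) * (-f))
        gB = np.sum((p - t) * (-1.0))
        A -= lr * gA
        B -= lr * gB
    return A, B
```

The problem is convex in (A, B), so this simple loop converges for reasonable learning rates; a fitted A is negative when larger decision values f indicate the positive class.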
3.2. Multi-class case

The K-class classification problem can be efficiently solved by partitioning the original problem into a set of K(K-1)/2 two-class problems. Given the observation \mathbf{x} and the class label y, we assume that the estimated pairwise class probabilities r_{ij} approximate u_{ij} = P(y = i \mid y = i \text{ or } j, \mathbf{x}). Since \sum_{j=1}^{K} P(y = j \mid \mathbf{x}) = 1, summing over the pairs that contain class i gives

\sum_{j: j \neq i} P(y = i \text{ or } j \mid \mathbf{x}) = (K - 2)\, P(y = i \mid \mathbf{x}) + 1. \qquad (8)

Using

r_{ij} \approx u_{ij} = \frac{P(y = i \mid \mathbf{x})}{P(y = i \text{ or } j \mid \mathbf{x})}, \qquad (9)

we can obtain

p_{i} = \frac{1}{\sum_{j: j \neq i} 1/r_{ij} - (K - 2)}. \qquad (10)
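The coupling rule of Eq. (10) can be sketched as follows; the pairwise matrix r in the usage example is made up, constructed to be exactly consistent with a chosen probability vector:

```python
import numpy as np

def couple_pairwise(r):
    """Recover class probabilities p_i from pairwise estimates r_ij, Eq. (10)."""
    K = r.shape[0]
    p = np.empty(K)
    for i in range(K):
        # sum over j != i of 1 / r_ij, then subtract (K - 2)
        s = sum(1.0 / r[i, j] for j in range(K) if j != i)
        p[i] = 1.0 / (s - (K - 2))
    return p
```

When each r_{ij} is built exactly as p_i / (p_i + p_j), the true probabilities are recovered exactly; with noisy pairwise estimates the formula gives an approximation.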
4. LS-SVM ensemble

In the boosting stage, we build a set of training samples for each component classifier out of the l whole samples; that is, each component LS-SVM is trained on a resampled version of the original training set, drawn according to the probability distribution that boosting maintains over the samples.
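The boosting construction described above might be sketched as follows (an AdaBoost-style resampling loop; the paper's exact weight-update rule is assumed rather than quoted, and train_fn stands for any component-learner training routine, e.g. an LS-SVM fit):

```python
import numpy as np

def boosted_training_sets(X, y, n_components, train_fn, rng=None):
    """Build component models on resampled training sets (AdaBoost-style)."""
    if rng is None:
        rng = np.random.default_rng(0)
    l = len(y)
    w = np.full(l, 1.0 / l)                  # initial uniform distribution
    models = []
    for _ in range(n_components):
        idx = rng.choice(l, size=l, p=w)     # resample by current weights
        model = train_fn(X[idx], y[idx])     # train one component classifier
        pred = model(X)
        err = float(np.sum(w[pred != y]))    # weighted training error
        err = min(max(err, 1e-10), 0.4999)   # keep the update well-defined
        beta = err / (1.0 - err)
        w = np.where(pred == y, w * beta, w) # shrink weights of correct samples
        w = w / w.sum()                      # renormalize the distribution
        models.append(model)
    return models
```

Misclassified samples thus keep relatively high weight and are more likely to appear in the next component's training set.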
For the fuzzy-integral aggregation, a g_{\lambda}-fuzzy measure g satisfies

g(A \cup B) = g(A) + g(B) + \lambda\, g(A)\, g(B) \qquad (11)

for all A, B \subseteq Y with A \cap B = \varnothing, and for some \lambda > -1. Because of the boundary condition g(Y) = 1, \lambda is determined by solving the polynomial equation

\lambda + 1 = \prod_{i=1}^{n} (1 + \lambda g^{i}).

Let A_{i} = \{y_{i}, y_{i+1}, \ldots, y_{n}\}. For g being a g_{\lambda}-fuzzy measure,

g(A_{1}) = g(\{y_{1}\}) = g^{1}, \qquad g(A_{i}) = g^{i} + g(A_{i-1}) + \lambda\, g^{i}\, g(A_{i-1}), \quad 1 < i \leq n. \qquad (12)

Suppose h(y_{1}) \geq h(y_{2}) \geq \cdots \geq h(y_{n}); then the (Sugeno) fuzzy integral e can be computed by

e = \max_{i}\left[\min\left(h(y_{i}), g(A_{i})\right)\right]. \qquad (13)
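Equations (11)-(13) can be sketched as follows; the fuzzy densities g^i used in the usage example are made-up values, not the paper's:

```python
import numpy as np

def solve_lambda(g, iters=200):
    """Root of prod(1 + lam*g_i) - lam - 1 = 0 with lam > -1, lam != 0."""
    g = np.asarray(g, dtype=float)
    if abs(g.sum() - 1.0) < 1e-12:
        return 0.0                           # densities already sum to one
    f = lambda lam: np.prod(1.0 + lam * g) - lam - 1.0
    # The nontrivial root lies in (-1, 0) if sum(g) > 1, in (0, inf) otherwise.
    lo, hi = (-1.0 + 1e-9, -1e-9) if g.sum() > 1.0 else (1e-9, 1e6)
    for _ in range(iters):                   # simple bisection
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def sugeno_integral(h, g):
    """Fuzzy integral e = max_i min(h(y_i), g(A_i)) of Eq. (13)."""
    h = np.asarray(h, dtype=float)
    g = np.asarray(g, dtype=float)
    order = np.argsort(-h)                   # sort so h(y_1) >= ... >= h(y_n)
    h, g = h[order], g[order]
    lam = solve_lambda(g)
    gA = g[0]                                # g(A_1) = g^1
    e = min(h[0], gA)
    for i in range(1, len(h)):
        gA = g[i] + gA + lam * g[i] * gA     # recursion of Eq. (12)
        e = max(e, min(h[i], gA))
    return e
```

The recursion ends with g(A_n) = g(Y) = 1, which serves as a built-in sanity check on the computed lambda.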
5. Experimental results

[Figure: architecture of the boosting LS-SVM ensemble. The Original Training Set is resampled into Training Set 1, Training Set 2, ..., Training Set n, which train the component classifiers LS-SVM 1, LS-SVM 2, ..., LS-SVM n; the component outputs are then aggregated.]

Classification accuracy (%):

Method                                  Benchmark 1   Benchmark 2
Single SVM                                  75.66         84.98
SVMs ensemble via majority voting           77.32         86.25
LS-SVMs ensemble via majority voting        78.24         87.32
LS-SVMs ensemble via fuzzy integral         83.65         89.98
and the LS-SVM ensemble via majority voting. The traditional SVM and LS-SVM ensemble methods based on majority voting only predict the label and cannot give posterior probabilities, so the majority-voting fusion strategy does not evaluate the degree of importance of each component classifier's output to the final decision. We consider this the main reason that the accuracy of the LS-SVM ensemble via the fuzzy-integral fusion strategy is higher than that of the other methods.
6. Conclusions