The optimal separating hyperplane is given by maximizing the margin, because a large margin makes the estimate reliable on the training set and also makes the estimate perform well on unseen examples. Thus we solve

$$\max_{\alpha} W(\alpha) = \max_{\alpha} \left( -\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_{i=1}^{m} \alpha_i \right) \qquad (9)$$

with constraints

$$\alpha_i \geq 0, \quad i = 1, \ldots, m.$$
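Equation (9) is a quadratic program in the multipliers $\alpha$. Below is a minimal numerical sketch (not the paper's implementation), assuming a small data matrix X of shape (m, d) and labels y in {-1, +1}; the equality constraint $\sum_i \alpha_i y_i = 0$, which arises when b is eliminated in the standard derivation, is included alongside $\alpha_i \geq 0$.

```python
import numpy as np
from scipy.optimize import minimize

def solve_dual(X, y):
    """Maximize W(alpha) from Eq. (9) by minimizing -W(alpha)."""
    m = X.shape[0]
    Q = (y[:, None] * y[None, :]) * (X @ X.T)  # Q_ij = y_i y_j (x_i . x_j)
    res = minimize(
        lambda a: 0.5 * a @ Q @ a - a.sum(),                 # -W(alpha)
        np.zeros(m),
        jac=lambda a: Q @ a - np.ones(m),
        bounds=[(0.0, None)] * m,                            # alpha_i >= 0
        constraints={"type": "eq", "fun": lambda a: a @ y},  # sum_i alpha_i y_i = 0
        method="SLSQP",
    )
    return res.x
```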
The second term of (8) shows that the solution vector has an expansion in terms of a subset of the training patterns, namely those patterns whose Lagrange multiplier $\alpha_i$ is non-zero. By the Karush-Kuhn-Tucker complementarity conditions, these training patterns are the ones for which

$$\alpha_i \left( y_i \left( (x_i \cdot w) + b \right) - 1 \right) = 0, \quad i = 1, \ldots, m \qquad (11)$$
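In code, (11) says that the support vectors are exactly the patterns whose multipliers are non-zero (up to a numerical tolerance), and that b can be recovered from any of them. A sketch continuing the hypothetical solve_dual above:

```python
def recover_hyperplane(X, y, alpha, tol=1e-6):
    """Identify the Support Vectors and rebuild (w, b) from them."""
    sv = alpha > tol                   # patterns with non-zero multipliers
    w = (alpha[sv] * y[sv]) @ X[sv]    # w = sum_i alpha_i y_i x_i
    # By Eq. (11), every SV satisfies y_i((x_i . w) + b) = 1, so b = y_i - (x_i . w);
    # averaging over all SVs is numerically more stable than using a single one.
    b = float(np.mean(y[sv] - X[sv] @ w))
    return w, b, sv
```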
and therefore they correspond precisely to the Support Vectors (SV). If the data is linearly separable, all the SVs lie on the margin.

For the non-separable case, non-negative slack variables $\xi_i$ and a penalty C on them are introduced, giving the Lagrangian

$$L(w, b, \xi, \alpha, \beta) = \frac{1}{2}(w \cdot w) + C \sum_{i=1}^{m} \xi_i - \sum_{i=1}^{m} \alpha_i \left( y_i \left( (x_i \cdot w) + b \right) - 1 + \xi_i \right) - \sum_{i=1}^{m} \beta_i \xi_i \qquad (15)$$

where $\alpha_i, \beta_i$ are the Lagrange multipliers. As before, Lagrangian duality enables the primal problem, (16), to be transformed to its dual problem. Minimizing the Lagrangian L with respect to w, b, and $\xi_i$, the dual problem retains the objective of (9), with the constraints tightened to $0 \leq \alpha_i \leq C$.
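In code, only the bounds change relative to the separable case; a sketch under the same assumptions and imports as solve_dual above:

```python
def solve_soft_margin_dual(X, y, C=1.0):
    """Soft-margin dual: same objective as Eq. (9), with each alpha_i capped at C."""
    m = X.shape[0]
    Q = (y[:, None] * y[None, :]) * (X @ X.T)
    res = minimize(
        lambda a: 0.5 * a @ Q @ a - a.sum(),
        np.zeros(m),
        bounds=[(0.0, C)] * m,                               # 0 <= alpha_i <= C
        constraints={"type": "eq", "fun": lambda a: a @ y},
        method="SLSQP",
    )
    return res.x
```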
The resulting decision function is

$$f(x) = \operatorname{sgn} \left( \sum_{\mathrm{SVs}} \alpha_i y_i K(x_i, x) + b \right) \qquad (20)$$

where the sum runs over the support vectors.
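A direct transcription of (20), assuming a kernel function k(x1, x2) such as those listed in Section 3, and the support-vector quantities produced by the hypothetical sketches above:

```python
def predict(x, X_sv, y_sv, alpha_sv, b, k):
    """Eq. (20): f(x) = sgn( sum over SVs of alpha_i y_i K(x_i, x) + b )."""
    s = sum(a * yi * k(xi, x) for a, yi, xi in zip(alpha_sv, y_sv, X_sv))
    return np.sign(s + b)
```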
3. Feature Space and Mapping Mechanism of Kernel Function

By the use of reproducing kernels, we can construct a mapping into a high dimensional feature space. The idea of the kernel function is to enable operations to be performed in the input space rather than in the potentially high dimensional feature space. Hence the inner product does not need to be evaluated in the feature space.
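As a toy illustration of this point (not from the paper): for the polynomial kernel $K(x, y) = (x \cdot y)^2$ on two-dimensional inputs, evaluating K in the input space gives exactly the inner product of explicit degree-2 feature maps, so those feature vectors never have to be formed.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D input: (x1^2, x2^2, sqrt(2)*x1*x2)."""
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2.0) * x[0] * x[1]])

x, y = np.array([1.0, 2.0]), np.array([3.0, 0.5])
assert np.isclose((x @ y) ** 2, phi(x) @ phi(y))  # K(x, y) = <phi(x), phi(y)>
```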
The following theory is based upon Reproducing Kernel Hilbert Spaces (RKHS). If K is a symmetric positive definite function which satisfies Mercer's conditions,
then the kernel represents a legitimate inner product in the feature space. Valid functions that satisfy Mercer's conditions include:
① Gaussian radial basis function: $K(x, y) = \exp \left( -\|x - y\|^2 / (2\sigma^2) \right)$

② Exponential radial basis function: $K(x, y) = \exp \left( -\|x - y\| / (2\sigma^2) \right)$

③ B splines: $K(x, y) = B_{2n+1}(x - y)$
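Minimal NumPy versions of the first two kernels, usable as the argument k of the predict sketch above ($\sigma$ is the usual width parameter):

```python
import numpy as np

def gaussian_rbf(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2))"""
    d = np.asarray(x, float) - np.asarray(y, float)
    return np.exp(-(d @ d) / (2.0 * sigma ** 2))

def exponential_rbf(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y|| / (2 sigma^2))"""
    d = np.asarray(x, float) - np.asarray(y, float)
    return np.exp(-np.sqrt(d @ d) / (2.0 * sigma ** 2))
```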
4. Conclusion

SVMs are an attractive approach to data modeling. They combine statistical learning theory with generalization control. The formulation results in a global quadratic optimization problem with convex constraints, which is readily solved by interior point methods. The kernel mapping provides a unifying framework for most of the commonly employed model architectures. A technique for choosing the kernel function and additional capacity control remains a subject for further research.