Neural Comp. Sign up for ETOCS
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 QUICK SEARCH:   [advanced]


     


This Article
Right arrow Full Text
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Wu, Q.
Right arrow Articles by Zhou, D.-X.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Wu, Q.
Right arrow Articles by Zhou, D.-X.
(Neural Computation. 2005;17:1160-1187.)
© 2005 The MIT Press


Letter

SVM Soft Margin Classifiers: Linear Programming versus Quadratic Programming

Qiang Wu

wu.qiang{at}student.cityu.edu.hk, Department of Mathematics, City University of Hong Kong, Kowloon, Hong Kong, China

Ding-Xuan Zhou

mazhou{at}cityu.edu.hk, Department of Mathematics, City University of Hong Kong, Kowloon, Hong Kong, China

Support vector machine (SVM) soft margin classifiers are important learning algorithms for classification problems. They can be stated as convex optimization problems and are suitable for a large data setting. Linear programming SVM classifiers are especially efficient for very large size samples. But little is known about their convergence, compared with the well-understood quadratic programming SVM classifier. In this article, we point out the difficulty and provide an error analysis. Our analysis shows that the convergence behavior of the linear programming SVM is almost the same as that of the quadratic programming SVM. This is implemented by setting a stepping-stone between the linear programming SVM and the classical 1-norm soft margin classifier. An upper bound for the misclassification error is presented for general probability distributions. Explicit learning rates are derived for deterministic and weakly separable distributions, and for distributions satisfying some Tsybakov noise condition.







HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
J COGNITIVE NEUROSCIENCE NEURAL COMPUTATION MIT PRESS JOURNALS
Copyright © 2005 by The MIT Press.