I know that an SVM is a binary classifier, and I would like to extend it to multi-class classification. What is the best, and perhaps the easiest, way to do this?
Code (MATLAB):

    u = unique(TrainLabel);
    N = length(u);
    if N > 2
        % Train one binary SVM per class: class u(itr) vs. all the others.
        for itr = 1:N
            newClass = double(TrainLabel == u(itr));  % 1 for class u(itr), 0 otherwise
            tst = double(TestLabel == u(itr));        % compare with the label value, not the loop index
            model = svmtrain(newClass, TrainVec, '-c 1 -g 0.00154');
            % Note: these outputs are overwritten on every iteration.
            [predict_label, accuracy, dec_values] = svmpredict(tst, TestVec, model);
        end
    end
How can this be improved?
There are many methods for multi-class classification. Two classic options, neither of which is SVM-specific, are:
One-vs-all (OVA) classification:
Suppose you have classes A, B, C, and D. Instead of doing a four-way classification, train four binary classifiers: A vs. not-A, B vs. not-B, C vs. not-C, and D vs. not-D. Then pick the positive class that’s “best” (e.g., furthest from the margin across all four runs). If none of the classifications is positive (i.e., they’re all not-X), pick the “opposite” of the class that’s worst (e.g., closest to the margin).
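Here is a minimal sketch of that decision rule with the LIBSVM MATLAB interface used in the question. The toy data, the `-g 0.5` setting, and the variable names `scores`, `best`, and `predicted` are my own assumptions for illustration; substitute your own `TrainVec`, `TrainLabel`, `TestVec`, and `TestLabel`. Taking the maximum decision value per row covers both cases above: the most confident positive wins, and if every classifier says "not-X", the least negative one wins.

```matlab
% Hypothetical toy data standing in for the question's variables
% (rows are samples, labels are numeric column vectors).
offsets    = [0 0; 0 1; 1 0; 1 1; 0.5 0.5];
TrainVec   = [offsets; offsets + 5; [offsets(:,1), offsets(:,2) + 5]];
TrainLabel = [ones(5,1); 2*ones(5,1); 3*ones(5,1)];
TestVec    = [0.4 0.6; 5.4 5.6; 0.4 5.6];
TestLabel  = [1; 2; 3];

u       = unique(TrainLabel);      % distinct class labels
N       = numel(u);
numTest = size(TestVec, 1);
scores  = zeros(numTest, N);       % scores(i,k): decision value of sample i for class u(k)

for k = 1:N
    binTrain = double(TrainLabel == u(k));   % 1 = class u(k), 0 = the rest
    model = svmtrain(binTrain, TrainVec, '-c 1 -g 0.5');
    % Placeholder labels: the true test labels are not needed to obtain
    % decision values (the accuracy LIBSVM prints here is meaningless).
    [~, ~, dec] = svmpredict(zeros(numTest, 1), TestVec, model);
    % LIBSVM orients decision values toward model.Label(1);
    % flip the sign when that label is the "rest" label 0.
    if model.Label(1) == 0
        dec = -dec;
    end
    scores(:, k) = dec;
end

[~, best] = max(scores, [], 2);    % most positive, or least negative if none is positive
predicted = u(best);
accuracy  = mean(predicted == TestLabel);
```

Unlike the loop in the question, this keeps the decision values from every binary classifier, so a single multi-class prediction can be made at the end.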
All-vs-all (AVA) classification:
Train a binary classifier for every possible pair of classes (A vs. B, A vs. C, and so on). Rank the classes by some factor (e.g., the number of times each is selected), and pick the best.
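A rough sketch of the pairwise scheme with simple vote counting, again using the LIBSVM MATLAB interface. The toy data, the `-g 0.5` setting, and the names `votes`, `winner`, and `predicted` are assumptions made for illustration only.

```matlab
% Hypothetical toy data (rows are samples, labels are numeric column vectors).
offsets    = [0 0; 0 1; 1 0; 1 1; 0.5 0.5];
TrainVec   = [offsets; offsets + 5; [offsets(:,1), offsets(:,2) + 5]];
TrainLabel = [ones(5,1); 2*ones(5,1); 3*ones(5,1)];
TestVec    = [0.4 0.6; 5.4 5.6; 0.4 5.6];

u       = unique(TrainLabel);
N       = numel(u);
numTest = size(TestVec, 1);
votes   = zeros(numTest, N);       % votes(i,k): pairwise contests won by class u(k) on sample i

for a = 1:N-1
    for b = a+1:N
        % Train on the samples of classes u(a) and u(b) only.
        idx   = (TrainLabel == u(a)) | (TrainLabel == u(b));
        model = svmtrain(TrainLabel(idx), TrainVec(idx, :), '-c 1 -g 0.5');
        % Placeholder labels; only the predicted label is used.
        winner = svmpredict(zeros(numTest, 1), TestVec, model);
        votes(:, a) = votes(:, a) + (winner == u(a));
        votes(:, b) = votes(:, b) + (winner == u(b));
    end
end

[~, best] = max(votes, [], 2);     % ties go to the class that comes first in u
predicted = u(best);
```

Note that LIBSVM already uses a one-against-one scheme internally when it is given more than two labels, so calling `svmtrain(TrainLabel, TrainVec, ...)` followed by `svmpredict` returns multi-class predictions directly, with no manual loop.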
Which works best has been contentious:
Duan and Keerthi have an empirical study that suggests a specific all-vs-all method, while Rifkin and Klautau argue for a one-vs-all scheme. There are even schemes where one learns error-correcting codes describing the class labels, instead of the labels themselves.
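To illustrate the error-correcting-code idea, here is a small decoding sketch. It assumes a fixed, hand-written code matrix rather than a learned one, and the observed bits are made up: each class gets a binary codeword, one binary classifier is trained per column, and a test sample is assigned to the class whose codeword is closest in Hamming distance to the classifiers' outputs.

```matlab
% Hypothetical 7-bit codewords for four classes (one row per class,
% one column per binary classifier). Any two rows differ in 4 bits,
% so one wrong classifier output can be corrected.
codes = [1 1 1 1 1 1 1;    % A
         0 0 0 0 1 1 1;    % B
         0 0 1 1 0 0 1;    % C
         0 1 0 1 0 1 0];   % D
classNames = {'A', 'B', 'C', 'D'};

% Made-up outputs of the 7 binary classifiers for one test sample:
% the codeword of C with its 6th bit flipped.
observed = [0 0 1 1 0 1 1];

% Hamming distance from the observed bits to each class codeword.
dist = sum(codes ~= repmat(observed, size(codes, 1), 1), 2);
[~, decoded] = min(dist);
disp(classNames{decoded});   % prints C
```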
Edit: What you really want, particularly for OVA, is the posterior probability of each class. For some methods, like Naive Bayes, that’s trivial to get out. SVMs typically don’t give you probabilities, but there are ways to compute them. See John Platt’s 1999 paper “Probabilistic Outputs for Support Vector Machines…”
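Platt's approach fits a sigmoid to the SVM decision value f, roughly P(y = 1 | f) = 1 / (1 + exp(A*f + B)), with A and B estimated from training data. With the LIBSVM MATLAB interface you do not have to fit this yourself: the `-b 1` option requests probability estimates (as far as I know, using a method in the same spirit). The sketch below uses assumed toy data and an assumed `-g 0.5` setting; `prob` and `predicted` are illustrative names.

```matlab
% Hypothetical toy data (rows are samples, labels are numeric column vectors).
offsets    = [0 0; 0 1; 1 0; 1 1; 0.5 0.5];
TrainVec   = [offsets; offsets + 5; [offsets(:,1), offsets(:,2) + 5]];
TrainLabel = [ones(5,1); 2*ones(5,1); 3*ones(5,1)];
TestVec    = [0.4 0.6; 5.4 5.6; 0.4 5.6];
numTest    = size(TestVec, 1);

% '-b 1' must be passed at training time AND at prediction time.
model = svmtrain(TrainLabel, TrainVec, '-c 1 -g 0.5 -b 1');
[predicted, ~, prob] = svmpredict(zeros(numTest, 1), TestVec, model, '-b 1');

% prob(i,k) is the estimated probability that sample i belongs to
% class model.Label(k); the column order follows model.Label.
[~, best] = max(prob, [], 2);
mostProbable = model.Label(best);
```

Each row of `prob` sums to 1, so the values can be compared across classes directly, which raw decision values from separately trained classifiers do not guarantee.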