I’m using libsvm, and I noticed that every time I call svmtrain() I create a new model; there seems to be no option to add data to an existing model. Is this possible? Am I just missing this feature in libsvm?
It sounds like you’re looking for an “incremental” or “online” learning algorithm. These algorithms let you update a classifier with new examples, without retraining the entire thing from scratch.
It’s definitely possible with support vector machines, though I believe libSVM doesn’t presently support it. It might be worth taking a look at other packages that do offer it, including:
- Gert Cauwenberghs’ 2000 NIPS paper (with code) http://www.isn.ucsd.edu/svm/incremental/
- Pegasos (which is available by itself or as part of dlib)
- SVM Heavy http://people.eng.unimelb.edu.au/shiltona/svm/
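
To give a feel for what "updating without retraining" means, here is a minimal sketch of a Pegasos-style online update for a linear SVM in plain Python. It is not the reference Pegasos code or the dlib API: the class name `OnlinePegasos`, the method `partial_fit`, and the default `lam` are made up for illustration, and the sketch leaves out the bias term and the optional projection step. It assumes labels in {-1, +1}.

```python
class OnlinePegasos:
    """Linear SVM updated one example at a time (Pegasos-style SGD step)."""

    def __init__(self, n_features, lam=0.01):
        self.w = [0.0] * n_features  # weight vector, no bias term
        self.lam = lam               # regularization strength (assumed value)
        self.t = 0                   # number of examples seen so far

    def decision(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def partial_fit(self, x, y):
        """Fold one labeled example (y is -1 or +1) into the existing model."""
        self.t += 1
        eta = 1.0 / (self.lam * self.t)   # decaying step size
        margin = y * self.decision(x)     # computed with the old weights
        shrink = 1.0 - eta * self.lam     # regularization shrinkage, equals 1 - 1/t
        self.w = [shrink * wi for wi in self.w]
        if margin < 1.0:                  # hinge loss is active for this example
            self.w = [wi + eta * y * xi for wi, xi in zip(self.w, x)]

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1


# Toy data: the label is the sign of x[0] + x[1].
first_batch = [([2.0, 1.0], 1), ([-1.0, -2.0], -1), ([1.5, 0.5], 1)]
second_batch = [([-2.0, -0.5], -1), ([0.5, 2.0], 1), ([-0.5, -1.5], -1)]

model = OnlinePegasos(n_features=2)
for x, y in first_batch:
    model.partial_fit(x, y)

# Later, new data arrives: keep updating the same model object.
for x, y in second_batch:
    model.partial_fit(x, y)

print(model.predict([1.0, 1.0]), model.predict([-1.0, -1.0]))
```

The point is only that the second batch is folded into the same `model` object instead of triggering a fresh training run. Exact incremental SVMs such as the Cauwenberghs approach work differently, by maintaining the dual solution as examples are added or removed.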
PS: @Bogdanovist: There’s a pretty extensive literature on this. kNN is obviously and trivially incremental. One could turn (some) Bayesian classifiers into incremental classifiers by storing counts instead of probabilities. STAGGER, AQ* and some (but not all) of the ID* family of decision tree algorithms are also incremental, off the top of my head.
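
As a rough illustration of the "store counts instead of probabilities" idea, here is a small sketch of a categorical naive Bayes classifier in plain Python. The class name `CountingNaiveBayes`, its `update` method, and the add-one smoothing default are my own choices for the example, not taken from any particular library.

```python
import math
from collections import defaultdict


class CountingNaiveBayes:
    """Categorical naive Bayes that keeps raw counts, so updates are cheap."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing constant (assumed)
        self.class_counts = defaultdict(int)     # label -> number of examples
        self.feature_counts = defaultdict(int)   # (label, position, value) -> count
        self.values_seen = defaultdict(set)      # position -> distinct values seen

    def update(self, features, label):
        """Add one example by incrementing counts; nothing is recomputed."""
        self.class_counts[label] += 1
        for i, v in enumerate(features):
            self.feature_counts[(label, i, v)] += 1
            self.values_seen[i].add(v)

    def predict(self, features):
        """Turn counts into smoothed log-probabilities only at query time."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for i, v in enumerate(features):
                k = len(self.values_seen[i]) or 1
                count = self.feature_counts.get((label, i, v), 0)
                score += math.log((count + self.alpha) / (n + self.alpha * k))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


nb = CountingNaiveBayes()
nb.update(("sunny", "hot"), "no")
nb.update(("rainy", "cool"), "yes")

# A new example shows up later: just bump the counts.
nb.update(("sunny", "cool"), "yes")

print(nb.predict(("rainy", "cool")), nb.predict(("sunny", "hot")))
```

Because the probabilities are derived from the counts at prediction time, adding an example never invalidates anything that was stored earlier.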