When does Naive Bayes perform better than SVM?

In a small text classification problem I was looking at, Naive Bayes performed as well as or better than an SVM, which confused me.

I was wondering what factors decide the triumph of one algorithm over the other. Are there situations where there is no point in using Naive Bayes over SVMs? Can someone shed light on this?

Answer

There is no single answer about which classification method is best for a given dataset. Different kinds of classifiers should always be considered in a comparative study on that dataset. The properties of the dataset may give you clues that favor some methods. However, it is still advisable to experiment with all of them, if possible.

Naive Bayes Classifier (NBC) and Support Vector Machine (SVM) both come with different options, including the choice of kernel function for each. They are both sensitive to parameter optimization (i.e. a different parameter selection can significantly change their output). So if you have a result showing that NBC performs better than SVM, this is only true for the selected parameters. With another parameter selection, you might find that SVM performs better.
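
To make the point about parameter selection concrete, here is a minimal sketch of tuning both classifiers before comparing them. It assumes scikit-learn is available; the toy corpus, the `tune` helper, and the parameter grids are invented for illustration and are not from the original question.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy corpus, made up for this sketch (6 documents per class).
docs = [
    "cheap pills buy now", "win money now", "free prize claim now",
    "buy cheap watches", "claim your free money", "win a free prize",
    "meeting agenda attached", "project meeting tomorrow", "see attached report",
    "agenda for the project", "report due tomorrow", "notes from the meeting",
]
labels = ["spam"] * 6 + ["ham"] * 6


def tune(classifier, param_grid):
    """Hypothetical helper: grid-search one classifier with 3-fold cross-validation."""
    pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", classifier)])
    search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
    search.fit(docs, labels)
    return search.best_params_, search.best_score_


# Each model gets its own grid; the values are arbitrary starting points.
nb_params, nb_score = tune(MultinomialNB(), {"clf__alpha": [0.01, 0.1, 1.0]})
svm_params, svm_score = tune(LinearSVC(random_state=0), {"clf__C": [0.01, 1.0, 100.0]})

print("Naive Bayes:", nb_params, round(nb_score, 3))
print("Linear SVM: ", svm_params, round(svm_score, 3))
```

Comparing the two best cross-validated scores is fairer than comparing the models at their default settings, although on a corpus this small the numbers themselves mean very little.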

In general, if the independence assumption of NBC is satisfied by the variables of your dataset and the degree of class overlap is small (i.e. a linear decision boundary is plausible), NBC would be expected to perform well. For some datasets, after optimization with wrapper feature selection for example, NBC may beat other classifiers. Even if it only achieves comparable performance, NBC will be more desirable because of its high speed.
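
As a rough illustration of the independence assumption and of why NBC is fast, here is a small multinomial Naive Bayes sketch in plain Python. The function names `train_nb` and `predict_nb`, the whitespace tokenization, and the four example documents are assumptions made for this sketch. Training is a single counting pass, and prediction just adds up per-word log-probabilities, which is exactly where the words are treated as independent given the class.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """One counting pass: documents per class and word counts per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def predict_nb(model, text, alpha=1.0):
    """Return the class with the highest log-posterior (up to a constant)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        log_prob = math.log(n_docs / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace-smoothed P(word | class); summing logs = independence
            p = (word_counts[label][word] + alpha) / (total_words + alpha * len(vocab))
            log_prob += math.log(p)
        scores[label] = log_prob
    return max(scores, key=scores.get)


# Made-up training data for illustration.
docs = ["cheap pills buy now", "win money now",
        "meeting agenda attached", "project meeting tomorrow"]
labels = ["spam", "spam", "ham", "ham"]
model = train_nb(docs, labels)

print(predict_nb(model, "buy cheap money"))   # spam
print(predict_nb(model, "meeting tomorrow"))  # ham
```

There is no iterative optimization here, in contrast to fitting an SVM, which is one reason NBC tends to train quickly.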

In summary, we should not prefer a classification method just because it outperforms others in one context, since it might fail severely in another. (This is normal in data mining problems.)

Attribution
Source: Link, Question Author: Legend, Answer Author: soufanom
