I work in a problem domain where people often report ROC-AUC or AveP (average precision). However, I recently found papers that optimize log loss instead, while yet others report hinge loss. While I understand how these metrics are calculated, I am having a hard time understanding the trade-offs between them and which is good for what exactly.

When it comes to ROC-AUC vs. Precision-Recall, this thread discusses how ROC-AUC maximization can be seen as using an optimization criterion that penalizes "ranking a true negative at least as large as a true positive" (assuming that higher scores correspond to positives). Another thread also provides a helpful discussion of ROC-AUC in contrast to Precision-Recall metrics.

However, for what type of problems would log loss be preferred over, say, ROC-AUC, AveP, or the hinge loss? Most importantly, what types of questions should one ask about the problem when choosing between these loss functions for binary classification?
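To make the comparison concrete, all four quantities can be computed by hand on a small set of scores (the labels and scores below are made up purely for illustration); in particular, this shows ROC-AUC as exactly the pairwise-ranking statistic described above.

```python
# Toy illustration (labels and scores assumed, not from the question).
import math

y_true = [1, 1, 1, 0, 0, 0]               # binary labels
p_hat  = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]   # predicted P(y = 1)

# ROC-AUC: fraction of (positive, negative) pairs in which the positive
# scores strictly higher; ties count as half a pair.
pos = [p for p, y in zip(p_hat, y_true) if y == 1]
neg = [p for p, y in zip(p_hat, y_true) if y == 0]
auc = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg) / (len(pos) * len(neg))

# AveP: mean precision at the rank of each true positive, scores sorted descending.
ranked = sorted(zip(p_hat, y_true), reverse=True)
hits, precisions = 0, []
for i, (p, y) in enumerate(ranked, start=1):
    if y == 1:
        hits += 1
        precisions.append(hits / i)
avep = sum(precisions) / len(precisions)

# Log loss: mean negative log-likelihood of the true labels.
log_loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, p_hat)) / len(y_true)

# Hinge loss: defined on margins with labels in {-1, +1}; here we map
# probabilities to crude margins f = 2p - 1 purely for illustration.
hinge = sum(max(0.0, 1 - (2 * y - 1) * (2 * p - 1))
            for y, p in zip(y_true, p_hat)) / len(y_true)

print(round(auc, 3), round(avep, 3), round(log_loss, 3), round(hinge, 3))
# → 0.889 0.917 0.437 0.633
```

Note that ROC-AUC and AveP depend only on the ordering of the scores, while log loss and hinge loss depend on their actual values; this is one root of the trade-offs asked about.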

**Answer**

The state-of-the-art reference on the matter is [1].

Essentially, it shows that all the loss functions you specify will converge to the Bayes classifier, with fast rates.

Choosing between these for finite samples can be driven by several different arguments:

- If you want to recover event probabilities (and not only classifications), then the logistic log-loss, or any other generalized linear model (probit regression, complementary log-log regression, …), is a natural candidate.
- If you are aiming only at classification, the SVM may be a preferred choice, since it targets only observations at the classification boundary and ignores distant observations, thus reducing sensitivity to how truthful the assumed linear model is.
- If you do not have many observations, then the advantage in the previous point (ignoring observations far from the boundary) may become a disadvantage.
- There may be computational differences: both in the stated optimization problem, and in the particular implementation you are using.
- Bottom line: you can simply try them all and pick the best performer.
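The contrast between the first two points can be sketched in pure Python (the 1-D data and hand-rolled gradient descent below are assumed for illustration, not from the answer): the same linear model fit under log loss returns probability estimates, while fit under hinge loss it returns only margins, and its fit is driven solely by points near the boundary.

```python
# Toy sketch: log loss vs. hinge loss for the same linear model f(x) = w*x + b.
import math

xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]             # {0, 1} labels; hinge uses {-1, +1}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Gradient descent on the logistic log loss.
w = b = 0.0
for _ in range(2000):
    gw = gb = 0.0
    for x, y in zip(xs, ys):
        err = sigmoid(w * x + b) - y      # dL/dz for the log loss
        gw += err * x
        gb += err
    w -= 0.1 * gw / len(xs)
    b -= 0.1 * gb / len(xs)

# Subgradient descent on the hinge loss for the same model.
v = c = 0.0
for _ in range(2000):
    gv = gc = 0.0
    for x, y in zip(xs, ys):
        t = 2 * y - 1                     # map {0, 1} -> {-1, +1}
        if t * (v * x + c) < 1:           # only margin violators contribute
            gv -= t * x
            gc -= t
    v -= 0.1 * gv / len(xs)
    c -= 0.1 * gc / len(xs)

# The log-loss fit yields probability estimates for each point ...
print([round(sigmoid(w * x + b), 2) for x in xs])
# ... while the hinge fit yields only signs / class decisions.
print([1 if v * x + c > 0 else 0 for x in xs])  # → [0, 0, 0, 0, 1, 1, 1, 1]
```

Once every point sits outside the margin, the hinge subgradient is zero and the fit stops changing, which is the "ignores distant observations" property in code form.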

[1] Bartlett, Peter L., Michael I. Jordan, and Jon D. McAuliffe. "Convexity, Classification, and Risk Bounds." Journal of the American Statistical Association 101, no. 473 (March 2006): 138–56. doi:10.1198/016214505000000907.

**Attribution**
*Source: Link, Question Author: Josh, Answer Author: JohnRos*