It is easy to find a package that calculates the area under the ROC curve, but is there a package that calculates the area under the precision-recall curve?
As of July 2016, the package PRROC works great for computing both ROC AUC and PR AUC.
Assuming you already have a vector of probabilities (called probs) computed with your model and the true class labels are in your data frame as df$label (0 and 1), this code should work:
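    install.packages("PRROC")  # only needed once
    library(PRROC)

    # Split the predicted probabilities by true class
    fg <- probs[df$label == 1]  # scores of the positive (foreground) class
    bg <- probs[df$label == 0]  # scores of the negative (background) class

    # ROC Curve
    roc <- roc.curve(scores.class0 = fg, scores.class1 = bg, curve = TRUE)
    plot(roc)

    # PR Curve
    pr <- pr.curve(scores.class0 = fg, scores.class1 = bg, curve = TRUE)
    plot(pr)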
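PS: The only disconcerting thing is that you pass scores.class0 = fg even though fg holds the scores for label 1, not 0. In PRROC, class 0 denotes the positive (foreground) class, so this is the intended usage.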
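To see why the argument order matters, here is a small sketch on simulated scores. The Beta-distributed scores and the names auc_intended and auc_swapped are illustrative assumptions, not part of the original answer. Swapping the two arguments should mirror the ROC curve, so the two areas should add up to roughly 1.

```r
library(PRROC)

set.seed(7)
# Simulated scores, purely illustrative: positives tend to score higher
fg <- rbeta(200, 4, 2)   # scores of label-1 (positive) examples
bg <- rbeta(800, 2, 4)   # scores of label-0 (negative) examples

# Intended usage: positive-class scores go into scores.class0
auc_intended <- roc.curve(scores.class0 = fg, scores.class1 = bg)$auc
# Arguments swapped by mistake
auc_swapped  <- roc.curve(scores.class0 = bg, scores.class1 = fg)$auc

auc_intended                 # well above 0.5 for these scores
auc_swapped                  # the mirror image, below 0.5
auc_intended + auc_swapped   # approximately 1
```

So if a model that looks reasonable reports a ROC AUC far below 0.5, a swapped fg and bg is worth checking first.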
Here are the example ROC and PR curves with the areas under them:
The color bar on the right of each plot shows the threshold probability at which each point on the curve is obtained.
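If you want the areas as numbers rather than reading them off the plots, the objects returned by roc.curve and pr.curve carry them as list elements. A minimal sketch, using simulated stand-ins for probs and the labels (the simulation itself is an assumption for illustration only):

```r
library(PRROC)

set.seed(42)
# Simulated stand-ins for df$label and probs (illustrative only)
label <- rbinom(1000, 1, 0.2)
probs <- ifelse(label == 1, rbeta(1000, 4, 2), rbeta(1000, 2, 4))

fg <- probs[label == 1]
bg <- probs[label == 0]

# curve = TRUE is only needed for plotting; the AUCs are returned either way
roc <- roc.curve(scores.class0 = fg, scores.class1 = bg)
pr  <- pr.curve(scores.class0 = fg, scores.class1 = bg)

roc$auc                 # area under the ROC curve
pr$auc.integral         # PR AUC computed by integration
pr$auc.davis.goadrich   # PR AUC using the Davis and Goadrich interpolation
```

The two PR values are usually close; which one to report is a judgment call, as long as it is stated.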
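Note that for a random classifier, ROC AUC will be close to 0.5 irrespective of the class imbalance. The PR AUC is trickier: its baseline depends on the class balance, and for a random classifier it is roughly the fraction of positive examples (see What is “baseline” in precision recall curve).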
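The baseline behaviour can be checked with a quick simulation. This is a rough sketch under assumed settings: the sample size, the prevalences, and the helper name random_auc are hypothetical choices, and the scores are uniform random numbers unrelated to the labels.

```r
library(PRROC)

set.seed(1)
n <- 20000

# Hypothetical helper: AUCs of a random scorer at a given class prevalence
random_auc <- function(prevalence) {
  label <- rbinom(n, 1, prevalence)
  probs <- runif(n)                 # random scores, independent of the label
  fg <- probs[label == 1]
  bg <- probs[label == 0]
  data.frame(
    prevalence = mean(label),       # observed fraction of positives
    roc_auc = roc.curve(scores.class0 = fg, scores.class1 = bg)$auc,
    pr_auc  = pr.curve(scores.class0 = fg, scores.class1 = bg)$auc.integral
  )
}

results <- do.call(rbind, lapply(c(0.5, 0.1, 0.01), random_auc))
print(results)
```

The roc_auc column should stay near 0.5 in every row, while pr_auc should track the prevalence column, which is why a PR AUC only means something relative to the fraction of positives in the data.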