Precision Recall Curves

Introduction

Precision-Recall (PR) curves are used to evaluate the performance of a binary classifier, particularly when the classes are heavily imbalanced. Just like the ROC curve, a PR curve gives us a graphical representation of a binary classifier's performance at different thresholds.

Before we move on, it is highly recommended that you have a grasp of what Precision and Recall are, which you can find here, and of the ROC-AUC curve, which you can find here.

We looked at the concept of thresholds in the ROC curve section. We use the same concept here, except that the graph plots precision against recall rather than the true positive rate (sensitivity, aka recall) against the false positive rate (1 - specificity).
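To make the threshold idea concrete, here is a minimal sketch in plain Python that computes one (precision, recall) pair per threshold. The function name `pr_points`, the labels, and the scores are all made up for illustration, and treating precision as 1.0 when nothing is predicted positive is an assumed convention, not something this page prescribes.

```python
def pr_points(y_true, scores, thresholds):
    """Return a list of (threshold, precision, recall) tuples."""
    points = []
    for t in thresholds:
        # Predict positive when the score is at or above the threshold.
        preds = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
        fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
        fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
        # Assumed convention: precision is 1.0 when there are no positive predictions.
        precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        points.append((t, precision, recall))
    return points


# Hypothetical imbalanced data: 2 positives among 8 samples.
y_true = [0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.9]

for t, p, r in pr_points(y_true, scores, thresholds=[0.25, 0.5, 0.8]):
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

Raising the threshold in this toy data trades recall for precision. Plotting these pairs with recall on the x-axis and precision on the y-axis gives the PR curve.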

Formulas Recap

The formulas for precision and recall are:

$$Precision = \frac{TP}{TP + FP}$$

$$Recall = \frac{TP}{TP + FN}$$
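As a quick worked example of the two formulas, the sketch below plugs in some made-up counts. The helper name `precision_recall` and the numbers are hypothetical, and the function assumes both denominators are non-zero.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts.

    Assumes tp + fp > 0 and tp + fn > 0, so neither division is undefined.
    """
    precision = tp / (tp + fp)  # share of positive predictions that were correct
    recall = tp / (tp + fn)     # share of actual positives that were found
    return precision, recall


# Made-up counts: 8 true positives, 2 false positives, 4 false negatives.
p, r = precision_recall(tp=8, fp=2, fn=4)
print(f"precision={p:.2f}, recall={r:.2f}")
```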

The perfect precision-recall curve looks like this.

1.png

You can see how this is essentially a mirror image of the perfect ROC curve, shown below: the perfect PR curve hugs the top-right corner, while the perfect ROC curve hugs the top-left corner.

2.png

Interpretation

Here we have two precision-recall curves. The orange one bends further towards the top-right corner than the blue one, which tells us that its performance is better than the blue curve's, no matter what the threshold is.

3.png

If you remember the F1 score section, you'll know that precision is how good the model is at not making false accusations, and recall is how good the model is at catching actual shoplifters.

We want both of these to be as high as possible.

Real-life cases often aren't as simple as the one we saw above. Let's take a look at something trickier.

Can you tell which classifier is better, the blue one or the orange one?

4.png

It's not as straightforward as the previous situation. In this region, the orange classifier bends further towards the top-right corner than the blue classifier and hence performs better.

5.png

In this region, however, the blue classifier performs better than the orange classifier.

6.png

There is no clear winner in this scenario. Depending on your situation and the thresholds that matter most to you, you may want to use either the blue classifier or the orange classifier.
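One way to make this choice less subjective is to fix the recall you need and compare the best precision each classifier can offer at or above it. The sketch below does that in plain Python. The helper `best_precision_at_recall` and the two sets of (precision, recall) points are hypothetical and are not read off the figures above.

```python
def best_precision_at_recall(curve, min_recall):
    """Highest precision among points whose recall is at least min_recall.

    `curve` is a list of (precision, recall) pairs. Returns 0.0 if no point
    reaches the requested recall (an assumed convention for this sketch).
    """
    candidates = [p for p, r in curve if r >= min_recall]
    return max(candidates) if candidates else 0.0


# Hypothetical (precision, recall) points for two curves that cross.
orange = [(0.95, 0.2), (0.90, 0.4), (0.60, 0.6), (0.40, 0.8), (0.30, 1.0)]
blue = [(0.80, 0.2), (0.75, 0.4), (0.70, 0.6), (0.55, 0.8), (0.35, 1.0)]

for min_recall in (0.4, 0.8):
    o = best_precision_at_recall(orange, min_recall)
    b = best_precision_at_recall(blue, min_recall)
    winner = "orange" if o > b else "blue" if b > o else "tie"
    print(f"recall >= {min_recall}: orange={o:.2f}, blue={b:.2f}, better={winner}")
```

With these made-up points, orange is preferable when a recall of 0.4 is enough, while blue is preferable when a recall of at least 0.8 is required, which mirrors the "no clear winner" situation described above.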