Balanced Accuracy
Introduction#
Balanced Accuracy is a metric that is used to evaluate the performance of a binary classification model. It is calculated as the arithmetic mean of the Sensitivity (true positive rate) and the Specificity (true negative rate).
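In formula form:

$$\text{Balanced Accuracy} = \frac{\text{Sensitivity} + \text{Specificity}}{2}$$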
Let's look at what sensitivity and specificity are.
Sensitivity#
Sensitivity is defined by this formula:
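$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

where $TP$ is the number of true positives and $FN$ is the number of false negatives.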
It is also known as the True Positive Rate, which has been explained in depth here.
Specificity#
Specificity is defined by this formula:
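$$\text{Specificity} = \frac{TN}{TN + FP}$$

where $TN$ is the number of true negatives and $FP$ is the number of false positives.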
Using the shoplifting analogy from the F1 section, specificity is the ratio of customers who were not shoplifters and were correctly classified as such by the model to all customers who were actually not shoplifters, regardless of whether the model classified them correctly or not.
The problem with F1#
The F1 score does not take the number of true negatives into account at all, so on a dataset where performance on the negative class matters, the F1 score fails to reflect it.
Balanced accuracy, on the other hand, does take the number of true negatives into account and is therefore more useful in such scenarios.
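To make the difference concrete, here is a minimal sketch in plain Python using made-up confusion-matrix counts: TP, FP, and FN are held fixed while only the number of true negatives changes, so F1 stays the same while balanced accuracy does not.

```python
# Compare F1 and balanced accuracy on hypothetical confusion-matrix counts.
# The counts below are made up purely for illustration.

def f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def balanced_accuracy(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return (sensitivity + specificity) / 2

# Same TP/FP/FN in both scenarios; only the true-negative count differs.
for tn in (2, 14):
    print(f"TN={tn:2d}  F1={f1(tp=2, fp=2, fn=2):.3f}  "
          f"balanced accuracy={balanced_accuracy(tp=2, fp=2, fn=2, tn=tn):.3f}")
```

With these counts, F1 is 0.5 in both cases, while balanced accuracy rises from 0.5 to about 0.69 as the number of correctly classified negatives grows.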