4. Performance Measures
Here we explore various functions that evaluate the performance of a classifier. We start by defining some notation:
- $C$ is the confusion matrix, where entry $C_{ij}$ is the number of objects that are in class $i$ but have been classified as class $j$,
- $m_i$ is the number of objects in class $i$: $m_i = \sum_j C_{ij}$,
- $m$ is the total sample size: $m = \sum_i m_i$,
- $k$ is the number of classes.
4.1. Naive Accuracy
One simple way to evaluate the overall performance of a classifier is to compute the naive accuracy rate, which is simply the total fraction of objects that have been correctly classified:

$$\frac{1}{m} \sum_{i=1}^{k} C_{ii}$$

This is implemented in the function naive_accuracy().
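As a minimal sketch of the idea (the interface of the library's actual naive_accuracy() may differ), this is just the trace of the confusion matrix divided by the total sample size:

```python
import numpy as np

def naive_accuracy_sketch(C):
    """Fraction of correctly classified objects: trace(C) / m."""
    C = np.asarray(C, dtype=float)
    return np.trace(C) / C.sum()

# Example: 3-class confusion matrix with 100 of 130 objects on the diagonal
C = np.array([[50,  2,  3],
              [10, 20,  5],
              [ 4,  6, 30]])
print(naive_accuracy_sketch(C))  # 100/130 ≈ 0.769
```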
4.2. Balanced Accuracy
However, if the dataset is imbalanced, this measure does not work well. A better approach is to use the posterior balanced accuracy. Let $A_i$ be the accuracy rate of class $i$, i.e. the (unknown) probability that an object of class $i$ is classified correctly.
Before running a classifier, we know nothing of its performance, so we can assume the accuracy rate follows a flat prior distribution. In particular, the Beta distribution with parameters $\alpha = \beta = 1$ (i.e. a uniform distribution) seems appropriate here:

$$A_i \sim \mathrm{Beta}(1, 1) = \mathcal{U}(0, 1)$$
Given an accuracy rate $A_i$ for each class $i$, the number of correct predictions in class $i$ will follow a Binomial distribution with $A_i$ as the probability of success:

$$C_{ii} \mid A_i \sim \mathrm{Binomial}(m_i, A_i)$$
In Bayesian terms, this is our likelihood. Now we know that the Beta distribution is the conjugate prior of the Binomial likelihood. Thus the posterior distribution of $A_i$ will also be Beta, with parameters:

$$\alpha_i = 1 + C_{ii}, \qquad \beta_i = 1 + m_i - C_{ii}$$
get_beta_parameters() is a helper function that extracts the Beta parameters from a confusion matrix.
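As an illustration only (the return format of the real get_beta_parameters() is an assumption here), such a helper reduces to reading off the diagonal and the row sums:

```python
import numpy as np

def get_beta_parameters_sketch(C):
    """Posterior Beta parameters (alpha_i, beta_i) for each class,
    assuming the Beta(1, 1) prior and Binomial likelihood above."""
    C = np.asarray(C, dtype=float)
    correct = np.diag(C)       # C_ii: correct predictions in class i
    m_i = C.sum(axis=1)        # m_i: number of objects in class i
    return 1 + correct, 1 + (m_i - correct)
```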
4.2.1. Convolution
One way to define the balanced accuracy $A$ is to take the average of the individual accuracy rates $A_i$:

$$A = \frac{1}{k} \sum_{i=1}^{k} A_i$$

We call $(A \mid C)$ the posterior balanced accuracy.
One nice feature of this measure is that it is a full probability distribution (instead of a simple point estimate). This allows us to construct credible intervals, etc. And even though there is no closed-form solution for the density function of $(A \mid C)$, we can still compute it by numerically convolving the $k$ Beta densities. This is implemented in convolve_betas().
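A rough sketch of how such a numerical convolution can be done (the grid, step size, and normalisation used by the real convolve_betas() are assumptions):

```python
import numpy as np
from scipy import stats

def convolve_betas_sketch(alphas, betas, dx=1e-3):
    """Approximate the pdf of the sum A_T = sum_i A_i by discretising
    each Beta pdf on a grid of step dx and convolving them in turn."""
    x = np.arange(0, 1 + dx, dx)                  # each A_i lives on [0, 1]
    pdf = stats.beta.pdf(x, alphas[0], betas[0])
    for a, b in zip(alphas[1:], betas[1:]):
        pdf = np.convolve(pdf, stats.beta.pdf(x, a, b)) * dx
    grid = np.arange(len(pdf)) * dx               # support of A_T: [0, k]
    return grid, pdf
```

Each discrete convolution is scaled by dx so that the result still approximates a density (i.e. integrates to one).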
4.2.2. Expected Balanced Accuracy
We have just approximated the density of the sum of $k$ Beta distributions. The next step is to present our results. One measure we can report is the expected value of the posterior balanced accuracy. This is implemented in balanced_accuracy_expected().
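Since expectation is linear, the posterior mean can also be read off the individual Beta posteriors directly, without the convolution. A sketch (whether balanced_accuracy_expected() works this way or integrates the convolved density is an assumption; the value is the same up to numerical error):

```python
import numpy as np

def balanced_accuracy_expected_sketch(alpha, beta):
    """Posterior mean of A = (1/k) sum_i A_i; by linearity this is the
    average of the Beta means alpha_i / (alpha_i + beta_i)."""
    alpha, beta = np.asarray(alpha, float), np.asarray(beta, float)
    return np.mean(alpha / (alpha + beta))
```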
4.2.3. Distribution of Balanced Accuracy
We can also construct an empirical distribution for the posterior balanced accuracy. First we need to compute the pdf of the sum of the Beta distributions, $A_T = \sum_i A_i$, evaluated on a subset $x$ of the domain. See beta_sum_pdf().
However, we are interested in the average of the accuracy rates, $A = \frac{1}{k} \sum_i A_i = \frac{1}{k} A_T$. We can rewrite the cdf of $A$ as:

$$F_A(a) = P(A \le a) = P(A_T \le k a) = F_{A_T}(k a)$$

Differentiating with respect to $a$, we get the pdf:

$$f_A(a) = k \, f_{A_T}(k a)$$

See beta_avg_pdf() for the implementation.
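In code this is a one-line change of variables on top of the convolution sketch above (again, an assumption about how beta_avg_pdf() is implemented):

```python
def beta_avg_pdf_sketch(alphas, betas, dx=1e-3):
    """pdf of A = A_T / k, using f_A(a) = k * f_{A_T}(k a)."""
    k = len(alphas)
    grid, pdf_sum = convolve_betas_sketch(alphas, betas, dx)  # pdf of A_T on [0, k]
    return grid / k, k * pdf_sum                              # pdf of A on [0, 1]
```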
To make a violin plot of the posterior balanced accuracy, we need to run a Monte Carlo simulation, which requires us to have the inverse cdf of $A$. beta_sum_cdf(), beta_avg_cdf(), and beta_avg_inv_cdf() are used to approximate the integral of the pdf using the trapezium rule.
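A sketch of the full sampling pipeline, building on the hypothetical helpers above (the real functions may interpolate the cdf differently):

```python
import numpy as np

def sample_balanced_accuracy_sketch(alphas, betas, n_samples=10_000, dx=1e-3):
    """Monte Carlo samples of A by inverse-transform sampling."""
    a_grid, pdf = beta_avg_pdf_sketch(alphas, betas, dx)
    # cdf via the trapezium rule, normalised to end exactly at 1
    cdf = np.concatenate([[0.0], np.cumsum((pdf[1:] + pdf[:-1]) / 2 * np.diff(a_grid))])
    cdf /= cdf[-1]
    # invert the cdf by linear interpolation: u -> a such that F(a) = u
    u = np.random.default_rng().uniform(size=n_samples)
    return np.interp(u, cdf, a_grid)
```

The resulting samples can be passed straight to a violin plot.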
4.3. Recall
We can also compute the recall of each class. The recall of class $i$ is defined as:

$$\mathrm{recall}_i = \frac{C_{ii}}{\sum_j C_{ij}} = \frac{C_{ii}}{m_i}$$

Intuitively, the recall measures a classifier's ability to find all the positive samples (and hence to minimise the number of false negatives). See recall().
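A minimal sketch, following the row convention of $C$ defined above (rows are true classes):

```python
import numpy as np

def recall_sketch(C):
    """Per-class recall: C_ii / m_i, where m_i is the i-th row sum."""
    C = np.asarray(C, dtype=float)
    return np.diag(C) / C.sum(axis=1)
```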
4.4. Precision
Another useful measure is the precision. The precision of class $i$ is defined as:

$$\mathrm{precision}_i = \frac{C_{ii}}{\sum_j C_{ji}}$$

Intuitively, the precision measures a classifier's ability to minimise the number of false positives. See precision().