4. Performance Measures¶
Here we explore various functions that evaluate the performance of a classifier. We start by defining some notation:
- \(C\) is the confusion matrix, where entry \(C_{ij}\) is the number of objects that belong to class \(i\) but have been classified as class \(j\),
- \(m_i\) is the number of objects in class \(i\): \(m_i = \sum_j C_{ij}\),
- \(m\) is the total sample size: \(m = \sum_i m_i\),
- \(k\) is the number of classes.
4.1. Naive Accuracy¶
One simple way to evaluate the overall performance of a classifier is to compute the naive accuracy rate, which is simply the total fraction of objects that have been correctly classified:

\[\text{accuracy} = \frac{1}{m} \sum_{i=1}^{k} C_{ii}\]

This is implemented in the function naive_accuracy().
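As a rough illustration (a sketch only, not the library's actual naive_accuracy(); the function name and the NumPy representation of \(C\) are assumptions):

```python
import numpy as np

def naive_accuracy_sketch(C):
    """Fraction of correctly classified objects: trace(C) / m."""
    C = np.asarray(C, dtype=float)
    return np.trace(C) / C.sum()
```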
4.2. Balanced Accuracy¶
However, if the dataset is imbalanced, this measure works poorly: a classifier that always predicts the majority class still scores a high naive accuracy. A better approach is to use the posterior balanced accuracy. Let \(A_i\) be the accuracy rate of class \(i\):

\[A_i = \frac{C_{ii}}{m_i}\]
Before running a classifier, we know nothing of its performance, so we can assume the accuracy rate follows a flat prior distribution. In particular, the Beta distribution with parameters \(\alpha = \beta = 1\) (i.e. a uniform distribution) is appropriate here:

\[A_i \sim \operatorname{Beta}(1, 1)\]
Given an accuracy rate \(A_i\) for each class \(i\), the number of correct predictions in class \(i\) will follow a Binomial distribution with \(A_i\) as the probability of success:

\[\big( C_{ii} \mid A_i \big) \sim \operatorname{Binomial}(m_i, A_i)\]
In Bayesian terms, this is our likelihood. Now, the Beta distribution is the conjugate prior of the Binomial likelihood, so the posterior distribution of \(A_i\) will also be Beta, with parameters:

\[\big( A_i \mid C \big) \sim \operatorname{Beta}(\alpha_i, \beta_i), \qquad \alpha_i = 1 + C_{ii}, \quad \beta_i = 1 + m_i - C_{ii}\]
get_beta_parameters() is a helper function that extracts the Beta parameters from a confusion matrix.
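A minimal sketch of what such a helper might look like, assuming the confusion matrix is a NumPy array and a flat Beta(1, 1) prior (the name is hypothetical):

```python
import numpy as np

def get_beta_parameters_sketch(C):
    """Posterior Beta parameters (alpha_i, beta_i) for each class."""
    C = np.asarray(C, dtype=float)
    correct = np.diag(C)   # C_ii: correct predictions in class i
    m = C.sum(axis=1)      # m_i: class sizes (row sums)
    return 1 + correct, 1 + (m - correct)
```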
4.2.1. Convolution¶
One way to define the balanced accuracy \(A\) is to take the average of the individual accuracy rates \(A_i\):

\[A = \frac{1}{k} \sum_{i=1}^{k} A_i\]
We call \(\big( A \mid C \big)\) the posterior balanced accuracy.
One nice feature of this measure is that it is a full probability distribution (instead of a simple point estimate), which allows us to construct confidence intervals and the like. Even though there is no closed-form expression for the density of \(\big( A \mid C \big)\), we can still compute it numerically by convolving the \(k\) Beta densities of the \(A_i\). This is implemented in convolve_betas().
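One possible numerical approach, sketched here under the assumption that each density is tabulated on a regular grid (this is not necessarily how convolve_betas() itself works):

```python
import numpy as np
from scipy import stats

def convolve_betas_sketch(alphas, betas, n=1024):
    """Approximate the density of sum_i A_i, A_i ~ Beta(alpha_i, beta_i),
    by repeated discrete convolution of the tabulated densities."""
    dx = 1.0 / n
    grid = np.linspace(0.0, 1.0, n + 1)
    pdf = stats.beta.pdf(grid, alphas[0], betas[0])
    for a, b in zip(alphas[1:], betas[1:]):
        # each convolution extends the support by one more unit interval
        pdf = np.convolve(pdf, stats.beta.pdf(grid, a, b)) * dx
    return pdf  # tabulated on a grid over [0, k] with spacing dx
```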
4.2.2. Expected Balanced Accuracy¶
We have just approximated the density of the sum of \(k\) Beta distributions. The next step is to present our results. One measure we can report is the expected value of the posterior balanced accuracy, which by linearity of expectation is simply the average of the individual posterior means:

\[\mathbb{E}\big[ A \mid C \big] = \frac{1}{k} \sum_{i=1}^{k} \frac{\alpha_i}{\alpha_i + \beta_i} = \frac{1}{k} \sum_{i=1}^{k} \frac{1 + C_{ii}}{2 + m_i}\]

This is implemented in balanced_accuracy_expected().
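A sketch of the corresponding computation, with the same hypothetical conventions as above:

```python
import numpy as np

def balanced_accuracy_expected_sketch(C):
    """E[A | C] = (1/k) * sum_i (1 + C_ii) / (2 + m_i)."""
    C = np.asarray(C, dtype=float)
    return np.mean((1 + np.diag(C)) / (2 + C.sum(axis=1)))
```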
4.2.3. Distribution of Balanced Accuracy¶
We can also construct an empirical distribution for the posterior balanced accuracy. First we need to compute the pdf of the sum of Beta distributions \(A_T = \sum_i A_i\), evaluated on a subset \(x\) of its domain. See beta_sum_pdf().
However, we're interested in the average of the accuracy rates, \(A = \dfrac{1}{k} \sum_i A_i = \dfrac{1}{k} A_T\). We can write the cdf of \(A\) in terms of the cdf of \(A_T\):

\[F_A(a) = P(A \le a) = P(A_T \le k a) = F_{A_T}(k a)\]

Differentiating with respect to \(a\), we get:

\[f_A(a) = k \, f_{A_T}(k a)\]
See beta_avg_pdf() for the implementation.
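A sketch of this change of variables, assuming the density of \(A_T\) has been tabulated on a regular grid over \([0, k]\) as in the convolution sketch above (the name is hypothetical):

```python
import numpy as np

def beta_avg_pdf_sketch(a, sum_pdf, k):
    """Density of A = A_T / k from a tabulated density of A_T:
    f_A(a) = k * f_{A_T}(k * a)."""
    grid = np.linspace(0.0, k, len(sum_pdf))
    return k * np.interp(k * np.asarray(a), grid, sum_pdf)
```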
To make a violin plot of the posterior balanced accuracy, we need to run a Monte Carlo simulation, which requires the inverse cdf of \(A\). beta_sum_cdf(), beta_avg_cdf(), and beta_avg_inv_cdf() approximate the integral of the pdf using the trapezium rule.
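A sketch of how the trapezium rule and linear interpolation might combine to give the inverse cdf, assuming the pdf of \(A\) has been tabulated on \([0, 1]\) (names are hypothetical):

```python
import numpy as np

def beta_avg_inv_cdf_sketch(avg_pdf):
    """Quantile function of A: integrate the tabulated pdf with the
    trapezium rule, then invert the cdf by interpolation."""
    a = np.linspace(0.0, 1.0, len(avg_pdf))
    steps = (avg_pdf[1:] + avg_pdf[:-1]) / 2 * np.diff(a)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]  # absorb discretisation error
    return lambda u: np.interp(u, cdf, a)

# Monte Carlo draws for the violin plot:
# samples = beta_avg_inv_cdf_sketch(avg_pdf)(np.random.rand(10_000))
```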
4.3. Recall¶
We can also compute the recall of each class. The recall of class \(i\) is defined as:

\[\text{recall}_i = \frac{C_{ii}}{\sum_j C_{ij}} = \frac{C_{ii}}{m_i}\]

Intuitively, the recall measures a classifier's ability to find all the positive samples of a class (and hence to minimise the number of false negatives). See recall().
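A one-line sketch, with the same hypothetical conventions as before:

```python
import numpy as np

def recall_sketch(C):
    """Per-class recall: diagonal divided by row sums m_i."""
    C = np.asarray(C, dtype=float)
    return np.diag(C) / C.sum(axis=1)
```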
4.4. Precision¶
Another useful measure is the precision. The precision of class \(i\) is defined as:

\[\text{precision}_i = \frac{C_{ii}}{\sum_j C_{ji}}\]

Intuitively, the precision measures a classifier's ability to minimise the number of false positives. See precision().
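And the analogous sketch for precision (hypothetical name, NumPy confusion matrix):

```python
import numpy as np

def precision_sketch(C):
    """Per-class precision: diagonal divided by column sums."""
    C = np.asarray(C, dtype=float)
    return np.diag(C) / C.sum(axis=0)
```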