Evaluations¶
MSAF includes the standard evaluation metrics used in MIREX. Here we describe how to evaluate the algorithms’ results and discuss each of these metrics, classified based on the subtask they aim to assess.
These metrics are computed using the external and fantastic framework mir_eval.
How To Evaluate Results¶
The module eval.py contains the following process
function that can be called once the desired algorithms have been run on a single file or dataset:
Evaluates the estimated results of the Segmentation dataset against the ground truth (human annotated data).
process (in_path[, boundaries_id, labels_id, ...]) |
Main process to evaluate algorithms’ results. |
The return value of this function is a dictionary (or a list of dictionaries, in case of collection mode) containing all of the available metrics for the evaluated subtask(s). The keys to this dictionary, with a description of each metric are found below.
Boundary Metrics¶
Boundary Metric | Description |
---|---|
D | Information Gain |
DevE2R | Median Deviation from Estimation to Reference |
DevR2E | Median Deviation from Reference to Estimation |
DevtE2R | Median Deviation from Estimation to Reference without first and last boundaries (trimmed) |
DevtR2E | Median Deviation from Reference to Estimation without first and last boundaries (trimmed) |
HitRate_0.5F | Hit Rate F-measure using 0.5 seconds window |
HitRate_0.5P | Hit Rate Precision using 0.5 seconds window |
HitRate_0.5R | Hit Rate Recall using 0.5 seconds window |
HitRate_3F | Hit Rate F-measure using 3 seconds window |
HitRate_3P | Hit Rate Precision using 3 seconds window |
HitRate_3R | Hit Rate Recall using 3 seconds window |
HitRate_t0.5F | Hit Rate F-measure using 0.5 seconds window without first and last boundaries (trimmed) |
HitRate_t0.5P | Hit Rate Precision using 0.5 seconds window without first and last boundaries (trimmed) |
HitRate_t0.5R | Hit Rate Recall using 0.5 seconds window without first and last boundaries (trimmed) |
HitRate_t3F | Hit Rate F-measure using 3 seconds window without first and last boundaries (trimmed) |
HitRate_t3P | Hit Rate Precision using 3 seconds window without first and last boundaries (trimmed) |
HitRate_t3R | Hit Rate Recall using 3 seconds window without first and last boundaries (trimmed) |
t_measure10 | T-Measures F-measure at 10 seconds window |
t_precision10 | T-Measures Precision at 10 seconds window |
t_recall10 | T-Measures Recall at 10 seconds window |
t_measure15 | T-Measures F-measure at 15 seconds window |
t_precision15 | T-Measures Precision at 15 seconds window |
t_recall15 | T-Measures Recall at 15 seconds window |
Label Metrics¶
Label Metric | Description |
---|---|
PWF | Pairwise Frame Clustering F-measure |
PWP | Pairwise Frame Clustering Precision |
PWR | Pairwise Frame Clustering Recall |
Sf | Normalized Entropy Scores F-measure |
So | Normalized Entropy Scores Precision |
Su | Normalized Entropy Scores Recall |