Evaluating Score Files

Now you have successfully run face recognition experiments, and the result is one or more score files. Usually, these score files are located in sub-directories of your --result-directory and are called scores-dev and scores-eval. This section describes how to interpret and evaluate these score files.

Interpreting Score Files

The scores in the score files are arranged in rows, one row per comparison. As an example, you might want to have a look at the file facereclib/tests/scripts/scores-nonorm-dev, which is used for testing purposes. Usually, each score was generated by comparing one probe image to one client model. Each row of the score file describes this model/probe pair with the following four elements:

  1. The client id of the model, i.e., the identity of the enrolled client in this model/probe pair.
  2. The client id of the probe, i.e., the identity shown in the probe image of this model/probe pair.
  3. The path of the probe image file, which is relative to the image database directory and without file extension.
  4. The score that was produced by comparing model and probe.

Hence, if the first two elements are identical, the score is a client (a.k.a. genuine, positive, true access, target) score; if they differ, it is an impostor (a.k.a. negative, non-target) score.
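
If you want to process a score file yourself, splitting it into client and impostor scores is straightforward. The following is a minimal Python sketch, assuming the four-column format described above; the file name is only an example:

  import numpy

  positives, negatives = [], []
  with open("scores-nonorm-dev") as score_file:
      for line in score_file:
          # each row contains: <model client id> <probe client id> <probe path> <score>
          model_id, probe_id, probe_path, score = line.split()
          if model_id == probe_id:
              positives.append(float(score))    # client (genuine) score
          else:
              negatives.append(float(score))    # impostor score

  positives, negatives = numpy.array(positives), numpy.array(negatives)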

Evaluation

Since all required information is available in the score file, you can use any tool (such as MATLAB) to evaluate it and compute error measures or generate plots. The FaceRecLib provides one generic script that evaluates score files, computes several error measures and generates various types of plots. For example, the plots shown in section Baseline Results are generated with this script. The script is called ./bin/evaluate.py and has the following command line options (see ./bin/evaluate.py --help for the shortcuts):

  • --dev-files: A list of files of the development set that will be evaluated.
  • --eval-files (optional): A list of files of the evaluation set. If given, please ensure that there is exactly one evaluation file for each development score file and that they are given in the same order.
  • --directory (optional): If given, the --dev-files and --eval-files are either absolute paths or relative to the given directory.
  • --roc (optional): If given, ROC curves (one for the development and one for the evaluation set) are computed from the score files and plotted into the given PDF file.
  • --det (optional): If given, DET curves (one for the development and one for the evaluation set) are computed from the score files and plotted into the given PDF file.
  • --cmc (optional): If given, CMC curves (one for the development and one for the evaluation set) are computed from the score files and plotted into the given PDF file. Please note that CMC plots are not valid for all databases.
  • --legends (optional): If given, these legends will be placed into the ROC, DET and CMC plots. Otherwise the file names will be used. Please ensure that there is exactly one legend for each development score file and that they are given in the correct order.
  • --criterion (optional): If given, a threshold is computed based on the EER or minimum HTER of each development set file and applied to both the development and the evaluation files. Both results are written to the console (a minimal sketch of this computation follows after this list).
  • --cllr (optional): If given, the Cllr and the minCllr are computed on both the development and the evaluation set. All results are written to the console.
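
To illustrate what --criterion computes, here is a minimal sketch of an EER-based threshold and the resulting HTER, assuming the numpy arrays positives and negatives from the parsing example above. This is not the FaceRecLib implementation (which relies on Bob's measure routines), only an illustration of the idea:

  import numpy

  def eer_threshold(negatives, positives):
      # sweep all observed scores and return the threshold at which the
      # false acceptance rate (FAR) and false rejection rate (FRR) are closest
      negatives, positives = numpy.asarray(negatives), numpy.asarray(positives)
      best_threshold, best_difference = None, numpy.inf
      for threshold in numpy.sort(numpy.concatenate((negatives, positives))):
          far = numpy.mean(negatives >= threshold)   # impostors accepted
          frr = numpy.mean(positives < threshold)    # clients rejected
          if abs(far - frr) < best_difference:
              best_threshold, best_difference = threshold, abs(far - frr)
      return best_threshold

  def hter(negatives, positives, threshold):
      # half total error rate of a score set at a fixed threshold
      negatives, positives = numpy.asarray(negatives), numpy.asarray(positives)
      far = numpy.mean(negatives >= threshold)
      frr = numpy.mean(positives < threshold)
      return (far + frr) / 2.0

The threshold is estimated on the development set scores only and then applied, via hter(), to both the development and the evaluation set scores.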

As usual, the --verbose (i.e., -v) option exists, and it is wise to use -vv.
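
A typical call could look like the following; the score file locations, legends and output file names are only examples, and the exact values accepted by --criterion should be checked with ./bin/evaluate.py --help:

  ./bin/evaluate.py \
    --directory results \
    --dev-files PCA/scores-nonorm-dev LGBPHS/scores-nonorm-dev \
    --eval-files PCA/scores-nonorm-eval LGBPHS/scores-nonorm-eval \
    --legends PCA LGBPHS \
    --roc ROC.pdf --det DET.pdf \
    --criterion EER \
    -vv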