.. Andre Anjos
.. vim: set fileencoding=utf-8 :
.. Mon 11 Apr 2011 10:07:37 CEST

===============
Score Toolkit
===============

The Toolkit is conceived for these purposes:

1. Plot the DET curve for a particular system
2. Check the consistency between score files w.r.t. the filenames scores refer
   to

.. _input-section:

Installation
------------

To install from the command line on a machine where you have write access to
the Python installation tree (e.g., on a Windows machine):

.. code-block:: sh

   $ easy_install trstk
   # or
   $ pip install trstk

If you don't have administrative rights on the Python installation directory,
you can create an isolated virtual environment using ``virtualenv``. Follow the
``virtualenv`` instructions to download it and create a virtual environment,
and then either ``easy_install`` or ``pip install`` this package.

Our PyPI page also contains a link to a Windows graphical installer.
Unfortunately, it does not install the package dependencies like the command
line installer does; you have to install them yourself. Here is the list of
dependencies:

* `NumPy`_
* `Matplotlib`_

Visit those webpages for more information.

Input
-----

Tools in this package accept score files in a single textual format. Each line
in the file refers to a single sample in the database being analyzed. Each line
is composed of 5 fields separated by spaces, in this order:

1. Claimed identity: a string that defines the claimed identity of the subject
   being analyzed
2. Model label: contains a label/reference to the data used to make the model
   (filename d used to make the model)
3. Real identity: a string that defines the real identity of the subject being
   analyzed (i.e. the output of the classification)
4. Test label: contains a label/reference to the data used to do the testing
   (filename d of the test file)
5. Score: a floating-point value representing the score

Individual fields **cannot contain spaces**.
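
The five-field layout described above can be read with a few lines of Python.
The following is only a minimal sketch and not the parser shipped with the
Toolkit: the names ``ScoreEntry`` and ``parse_score_lines`` are hypothetical,
and skipping blank lines is an assumption made here for convenience.

```python
from typing import Iterable, List, NamedTuple


class ScoreEntry(NamedTuple):
    """One line of a score file (hypothetical container, not a Toolkit type)."""

    claimed_id: str
    model_label: str
    real_id: str
    test_label: str
    score: float


def parse_score_lines(lines: Iterable[str]) -> List[ScoreEntry]:
    """Parse score-file lines, raising ValueError on the first malformed line."""
    entries = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 5:
            # A space inside any field shows up here as an extra field.
            raise ValueError(
                f"line {lineno}: expected 5 fields, got {len(fields)}"
            )
        claimed, model, real, test, raw_score = fields
        try:
            score = float(raw_score)
        except ValueError:
            raise ValueError(
                f"line {lineno}: score {raw_score!r} is not a number"
            ) from None
        entries.append(ScoreEntry(claimed, model, real, test, score))
    return entries
```

Passing an open file object works as well, for instance
``parse_score_lines(open("test.scores"))``, since files iterate line by line.
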
Failing to comply with this format will make the tools emit syntax errors
pointing to the location in the file where problems seem to occur. Here is a
valid example score file:

.. code-block:: text

   02463 02463d547 02463 02463d653 0.623265
   02463 02463d547 02463 02463d655 0.920861
   02463 02463d547 02463 02463d657 0.938942
   02463 02463d547 02463 02463d659 0.743715
   02463 02463d547 02463 02463d661 0.397660
   02463 02463d547 02463 02463d663 0.615722
   02463 02463d547 02463 02463d665 0.613291
   02463 02463d547 02463 02463d667 0.543184
   02463 02463d547 02463 02463d669 0.829777
   02463 02463d547 02463 02463d671 0.869681
   02463 02463d547 02463 02463d673 0.806394
   02463 02463d547 02463 02463d675 1.007791
   02463 02463d547 04200 04200d75 0.257423

Here is an invalid example score file:

.. code-block:: text
   :linenos:

   Bob_Jones bob-file-001 Bob_Jones bob-file-004 -37.643410
   Susan Smith susan-file-001 Susan Smith susan-file-001 -33.393433
   Joe joe-file-030 Joe joe-file-001 -72.295616

In this case, line 2 above will fail because the real identity and the claimed
identity fields contain spaces. Lines 1 and 3 do conform to the proposed scheme
and will be parsed without problems.

Multi-modality Input
====================

If you have multiple modalities, you should build one text file per modality,
along the lines explained before. The order of the *tags* within each file
should be respected.

Example:

*Hypothetical face verification experiment output*:

.. code-block:: text

   02463 02463d547 02463 02463d675 1.007791
   02463 02463d547 04200 04200d75 0.257423
   02463 02463d547 04201 04201d435 0.315074
   02463 02463d547 04201 04201d437 0.347413
   02463 02463d547 04201 04201d439 0.296383
   02463 02463d547 04201 04201d443 0.371881
   02463 02463d547 04201 04201d445 0.260964

*Hypothetical speech verification experiment output*:

.. code-block:: text

   02463 02463d547 02463 02463d675 0.9932
   02463 02463d547 04200 04200d75 0.0027
   02463 02463d547 04201 04201d435 0.0144
   02463 02463d547 04201 04201d437 0.0159
   02463 02463d547 04201 04201d439 0.1250
   02463 02463d547 04201 04201d443 0.0031
   02463 02463d547 04201 04201d445 0.0002

A set of working examples is included in the ``example`` directory of this
package.

.. _dependence-section:

Dependencies
------------

To properly run the software in this package you must have the following
packages installed:

* `Python`_: the scripting language used for the programs
* `Matplotlib`_: used for plotting
* `Sphinx`_: only if you need to *recompile* the documentation

.. _usage-section:

Usage
-----

We describe a few scenarios for using the Toolkit in specific cases. In Section
:ref:`api-section` we exemplify how to create your own scripts that can re-use
the readout functionality available in the kit.

Example 1: Plotting a DET Curve
===============================

The following command will plot a single DET curve for a given input score
file:

.. code-block:: sh

   $ plotDET.py test.scores

This command should produce a single plot in a PDF file named ``det.pdf``,
calculated using the contents of the input score file ``test.scores``. The plot
title will be empty. You can change the output filename and its type (besides
PDF, we support ``.png`` and ``.jpg`` files) or add a plot title like this:

.. code-block:: sh

   $ plotDET.py --title="My Test DET" --output=test.png test.scores

You can plot a series of overlaid DET curves in the following manner:

.. code-block:: sh

   $ plotDET.py --title="My Test DET" --output=overlayed.pdf \
       --label=devel development.scores --label=test test.scores

This command will produce a single plot in a PDF file, with the overlaid DET
curves generated using each of the score files given as input parameters. A
legend will be drawn at a convenient location in the plot using the labels for
each of the curves as determined by your input.
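
For readers who want to see what goes into a DET curve, the sketch below
derives false-acceptance and false-rejection rates from score-file rows. It is
an illustration only and not the code used by ``plotDET.py``. The function
names ``split_scores`` and ``det_points`` are hypothetical, and the sketch
assumes that a trial is genuine when the claimed and real identities match and
that higher scores mean stronger support for the claim.

```python
def split_scores(rows):
    """Split (claimed, model, real, test, score) rows into two score lists."""
    genuine, impostor = [], []
    for claimed, _model, real, _test, score in rows:
        # Assumption: matching claimed and real identities mark a genuine trial.
        (genuine if claimed == real else impostor).append(float(score))
    return genuine, impostor


def det_points(genuine, impostor):
    """Return (threshold, FAR, FRR) triples, one per distinct score value."""
    if not genuine or not impostor:
        raise ValueError("need at least one genuine and one impostor score")
    points = []
    for threshold in sorted(set(genuine) | set(impostor)):
        # Assumption: a claim is accepted when its score is >= threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        frr = sum(s < threshold for s in genuine) / len(genuine)
        points.append((threshold, far, frr))
    return points
```

A DET plot draws the false-rejection rate against the false-acceptance rate,
conventionally on normal-deviate axes. That axis scaling is left out of the
sketch.
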
By default the program generates black-and-white plots, but it can be
instructed to produce coloured plots using the ``--colour`` option (see the
``plotDET.py --help`` message).

Example 2: Checking score set consistency
=========================================

You can check the consistency between two (or more) score sets that are
supposed to provide scores for multiple biometric modalities using the
``checkModalities.py`` script. This tool will compare two input files and will
stop on the first error it finds:

.. code-block:: sh

   $ checkModalities.py faceverif.scores speechverif.scores

If you sort all files before calling the program, huge score files can be
checked much faster, because the sorting step within the program is skipped.
You can use the ``sort`` and ``uniq`` Unix utilities to sort all score files
before calling ``checkModalities.py``, like this:

.. code-block:: sh

   $ sort my-scores.txt | uniq > sorted-scores.txt
   $ sort other-scores.txt | uniq > other-sorted-scores.txt
   $ checkModalities.py --sorted sorted-scores.txt other-sorted-scores.txt

.. Place your references here:
.. _Python: http://www.python.org
.. _Matplotlib: http://matplotlib.sourceforge.net
.. _Sphinx: http://sphinx.pocoo.org
.. _PyPI: http://pypi.python.org/pypi
.. _NumPy: http://numpy.scipy.org/
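
The kind of comparison described in Example 2 can be sketched in Python as
follows. This is not the implementation of ``checkModalities.py``. It assumes
that two score files are consistent when they contain the same set of tags,
taken here to be the first four fields of each line, and the names ``tags`` and
``first_mismatch`` are hypothetical.

```python
from itertools import zip_longest


def tags(lines):
    """Return the sorted, de-duplicated 4-field tags found in score lines."""
    return sorted({tuple(line.split()[:4]) for line in lines if line.strip()})


def first_mismatch(lines_a, lines_b):
    """Return (index, tag_a, tag_b) for the first difference, or None."""
    # zip_longest pads the shorter list with None, so a missing tag in either
    # file is reported as a mismatch too.
    for index, (a, b) in enumerate(zip_longest(tags(lines_a), tags(lines_b))):
        if a != b:
            return index, a, b
    return None
```

Sorting and de-duplicating inside ``tags`` mirrors the ``sort | uniq`` step
shown above, and returning at the first difference mirrors the
stop-on-first-error behaviour.
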