Running Experiments

For running experiments with a defined setup, you should use the bin/palmveinverify.py script directly.

In the following sections the available command line arguments are listed. Some arguments have a long version starting with -- and a short one starting with a single -. In this section, only the long names of the arguments are listed; please refer to bin/palmveinverify.py --help (or short: bin/palmveinverify.py -h) for the abbreviations.

Required Command Line Arguments

To run a palmvein recognition experiment using the PalmveinRecLib, you have to tell the bin/palmveinverify.py script which database, preprocessing, features, and algorithm should be used. To use this script, you have to specify at least these command line arguments (see also the --help option):

  • --database: The database to run the experiments on, and which protocol to use.
  • --preprocessing: The data preprocessing and its parameters.
  • --features: The features to extract and their options.
  • --tool: The recognition algorithm and all its required parameters.

There is another command line argument that is used to separate the resulting files of different experiments. Please specify a descriptive name for your experiment so that you can remember how the experiment was run:

  • --sub-directory: A descriptive name for your experiment.
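The required arguments can also be assembled programmatically. The following is a minimal sketch, not part of the PalmveinRecLib: it only builds and prints the command line. The resource names verapalm, palmvein-preprocessor, and lbp-linearbinarypatterns are taken from the examples in section Managing Resources; the --tool value and the experiment name are hypothetical placeholders that you need to replace with a registered resource and a name of your choice.

```python
import shlex


def build_command(database, preprocessing, features, tool, sub_directory):
    """Return the argument list for bin/palmveinverify.py with all required options."""
    return [
        "bin/palmveinverify.py",
        "--database", database,
        "--preprocessing", preprocessing,
        "--features", features,
        "--tool", tool,
        "--sub-directory", sub_directory,
    ]


command = build_command(
    database="verapalm",
    preprocessing="palmvein-preprocessor",
    features="lbp-linearbinarypatterns",
    tool="my-tool",                # hypothetical placeholder, not a known resource
    sub_directory="lbp-baseline",  # hypothetical experiment name
)

# Print a shell-ready command line; pass `command` to subprocess.run to execute it.
print(" ".join(shlex.quote(part) for part in command))
```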

Managing Resources

The PalmveinRecLib is designed in a way that makes it very easy to select the setup of your experiments. Basically, you can specify your algorithm and its configuration in three different ways:

  1. You choose one of the registered resources. Just call bin/resources.py or bin/palmveinverify.py --help to see which kinds of resources are currently registered. Of course, you can also register a new resource. How this is done is detailed in section Registering your Code as a Resource.

    Example:

    $ bin/palmveinverify.py --database verapalm
    
  2. You define a configuration file or choose one of the existing configuration files that are located in PalmveinRecLib/configurations and its sub-directories. To learn how to define a new configuration file, please read section Adding Configuration Files.

    Example:

    $ bin/palmveinverify.py --preprocessing palmvein-preprocessor
    
  3. You put the constructor call of the class directly on the command line. Since parentheses are special characters in the shell, you usually have to enclose the constructor call in quotes. If you, e.g., want to extract MC-MaximumCurvature features, just add the corresponding constructor call to your command line.

    Example:

    $ bin/palmveinverify.py --features lbp-linearbinarypatterns
    

Of course, you can mix these ways of defining command line options.
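To illustrate why a constructor call (option 3) needs quoting, the following sketch uses Python's shlex.quote to produce a shell-safe argument. The class PalmveinRecLib.preprocessing.PalmCrop and its parameters mask_h and mask_w are described in section Preprocessors; the parameter values are made up, and whether the class accepts exactly this call is an assumption.

```python
import shlex

# Hypothetical constructor call; the parameter values are for illustration only.
constructor_call = "PalmveinRecLib.preprocessing.PalmCrop(mask_h=4, mask_w=40)"

# shlex.quote wraps the string in single quotes because it contains
# parentheses and a space, which the shell would otherwise interpret.
quoted = shlex.quote(constructor_call)

print("bin/palmveinverify.py --preprocessing " + quoted)
```

When you type the command by hand, enclosing the constructor call in single quotes has the same effect.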

For several databases, preprocessors, feature types, and recognition algorithms the PalmveinRecLib provides configuration files. They are located in the PalmveinRecLib/configurations directories. Each configuration file contains the required information for the part of the experiment, and all required parameters are preset with a suitable default value. Many of these configuration files with their default parameters are registered as resources, so that you don’t need to specify the path.

Since the default values might not be optimized or adapted to your problem, you can modify the parameters according to your needs. The simplest way is to pass the constructor call directly on the command line (i.e., use option 3). If you want to keep the parameters for later, you will probably want to write another configuration file. In this case, just copy one of the existing configuration files to a directory of your choice, adapt it, and pass the file location to the bin/palmveinverify.py script.

In the following, we will provide a detailed explanation of the parameters of the existing Databases, Preprocessors, Feature Extractors, and Recognition Algorithms.

Databases

Currently, all implemented databases are taken from Bob. To define a common API for all of the databases, the PalmveinRecLib defines the wrapper classes PalmveinRecLib.databases.DatabaseBob, PalmveinRecLib.databases.DatabaseBobZT, and PalmveinRecLib.databases.DatabaseFileList for these databases. The parameters of these wrapper classes are:

Required Parameters

  • name: The name of the database, in lowercase letters without special characters. This name will be used as a default sub-directory to separate resulting files of different experiments.
  • database = bob.db.<DATABASE>(original_directory=...): One of the image databases provided by Idiap on GitHub. Please set the original_directory and, if required, the original_extension parameter in the constructor of that database.
  • protocol: The name of the protocol that should be used. If omitted, the protocol Default will be used (which might not be available in all databases, so please specify).

Optional Parameters

These parameters can be used to reduce the number of training images. Usually, there is no need to specify them, but they can help in case your algorithm requires too much memory:

  • all_files_option: The options to the database query that will extract all files.
  • extractor_training_options: Special options that are passed to the query, e.g., to reduce the number of images in the extractor training.
  • projector_training_options: Special options that are passed to the query, e.g., to reduce the number of images in the projector training.
  • enroller_training_options: Special options that are passed to the query, e.g., to reduce the number of images in the enroller training.

Implemented Database Interfaces

Here we list the database interfaces that are currently available in the PalmveinRecLib. Clicking on a database name opens one configuration file of the database; the link in angle brackets (<>) leads to the documentation of the corresponding bob.db database package. If your image_directory differs from the one specified in the file, please change the directory accordingly to be able to use the database.

For more information, please also read the FaceRecLib documentation.

Preprocessors

Currently, all preprocessors that are defined in PalmveinRecLib operate on palmvein images and are, hence, used for palmvein recognition.

Palmvein Cropping Parameters

  • mask_h: Height of the cropping palm mask.
  • mask_w: Width of the cropping palm mask.
  • padding_offset: The offset of the padding array that is applied around the palmvein image.
  • padding_threshold: The pixel value of this padding array. Set it to 0.2 for uncontrolled (low quality) palmvein databases and to 0 for controlled (high quality) palmvein databases. The default is 0.2.
  • preprocessing: The pre-processing applied to the palmvein image before palm contour extraction. By default equal to ‘None’.
  • palmcontour: The algorithm used to localize the palm contour. Options: ‘leemaskMatlab’ - Implementation based on [LLP09], ‘leemaskMod’ - Modification based on [LLP09] for uncontrolled images introduced by the author, and ‘konomask’ - Implementation based on [KUU02].
  • postprocessing: The post-processing applied to the palmvein image after the palm contour extraction. Options: ‘None’, ‘HE’ - Histogram Equalization, ‘HFE’ - High Frequency Emphasis Filtering [ZTXL09], ‘CircGabor’ - Circular Gabor Filters [ZY09].
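The effect of padding_offset and padding_threshold can be pictured with a small pure-Python sketch. It is an illustration of the parameter descriptions above, assuming that the padding is a constant-valued border of padding_offset pixels on every side; it is not the implementation used by the library, and the function name pad_image is hypothetical.

```python
def pad_image(image, padding_offset, padding_threshold=0.2):
    """Surround a 2D image (a list of rows) with a constant-valued border.

    padding_offset is the border width in pixels on each side and
    padding_threshold is the pixel value used to fill the border.
    """
    padded_width = len(image[0]) + 2 * padding_offset
    side = [padding_threshold] * padding_offset

    # Rows above the image consist of padding values only.
    padded = [[padding_threshold] * padded_width for _ in range(padding_offset)]
    # Image rows get padding on the left and on the right.
    for row in image:
        padded.append(side + list(row) + side)
    # Rows below the image consist of padding values only.
    padded.extend([padding_threshold] * padded_width for _ in range(padding_offset))
    return padded


# A tiny 1x2 "image" padded by one pixel with the default value 0.2.
for row in pad_image([[1.0, 1.0]], padding_offset=1):
    print(row)
```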

Preprocessor Classes

  • PalmveinRecLib.preprocessing.PalmCrop: Crops the palmvein image to the desired resolution, localizes the palm contour, and generates the palm mask region from which features are extracted.

Note

Currently, the pre-processing is fixed to ‘None’ by default.

Feature Extractors

Several different kinds of features can be extracted from the preprocessed data. Here is the list of classes that perform feature extraction, together with their parameters.

  • PalmveinRecLib.features.lbp: Extracts Local Binary Patterns features [MD13] from the preprocessed data.

TODO

Recognition Algorithms

A variety of recognition algorithms is also implemented in the PalmveinRecLib. All palm recognition algorithms are based on the PalmveinRecLib.tools.Tool base class. This base class has parameters that some of the algorithms listed below share. These parameters mainly deal with how to compute a single score when more than one feature is provided for the model or for the probe.

Here is a list of the most important algorithms and their parameters: TODO

Parallel Execution of Experiments

By default, all jobs of the palmvein recognition tool chain run sequentially on the local machine. To speed up the processing, some jobs can be parallelized with GridTK, either on the SGE grid or using multi-processing on the local machine. For this purpose, there is another option:

  • --grid: The configuration file for the grid execution of the tool chain.

Note

The current SGE setup is specialized for the SGE grid at Idiap. If you have an SGE grid outside Idiap, please contact your administrator to check if the options are valid.

The SGE setup is defined in a way that makes it easy to parallelize data preprocessing, feature extraction, feature projection, model enrollment, and scoring jobs. Additionally, if the training of the extractor, projector, or enroller has special requirements (like more memory), this can be specified as well.

Several configuration files can be found in the PalmveinRecLib/configurations/grid directory. All of them are based on the PalmveinRecLib.utils.GridParameters class. Here are the parameters that you can set:

  • grid: The type of the grid configuration; currently “sge” and “local” are supported.
  • number_of_preprocessing_jobs: Number of parallel preprocessing jobs.
  • number_of_extraction_jobs: Number of parallel feature extraction jobs.
  • number_of_projection_jobs: Number of parallel feature projection jobs.
  • number_of_enrollment_jobs: Number of parallel enrollment jobs (when development and evaluation sets are enabled, both sets will be split separately).
  • number_of_scoring_jobs: Number of parallel scoring jobs (when development and evaluation sets are enabled, or ZT-norm is computed, more scoring jobs will be generated).

If the grid parameter is set to 'sge' (the default), jobs will be submitted to the SGE grid. In this case, the SGE queue parameters might be specified, either using one of the pre-defined queues (see PalmveinRecLib/configurations/grid) or using a dictionary of key/value pairs that are sent to the grid during submission of the jobs:

  • training_queue: The queue that is used in any of the training (extractor, projector, enroller) steps.
  • ..._queue: The queue for the ... step.

If the grid parameter is set to local, all jobs will be run locally. In this case, the following parameters for the local submission can be modified:

  • number_of_parallel_processes: The number of parallel processes that will be run on the local machine.
  • scheduler_sleep_time: The interval in which the local scheduler should check for finished jobs and execute new jobs; the sleep time is given in seconds.

In this case, the number_of_..._jobs parameters are ignored, and number_of_parallel_processes is used for all of them.
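The rule for local execution can be sketched as follows. This is a plain-Python illustration of the documented behavior, not code from the library: the configuration is modeled as a dictionary instead of a PalmveinRecLib.utils.GridParameters object, the function name is hypothetical, and the fallback value of 1 for missing entries is an assumption.

```python
STEPS = ("preprocessing", "extraction", "projection", "enrollment", "scoring")


def effective_parallelism(config):
    """Return the number of parallel jobs per step for a grid configuration."""
    # 'sge' is documented as the default grid type.
    if config.get("grid", "sge") == "local":
        # Locally, the per-step job counts are ignored and one shared
        # process count is used for every step.
        processes = config.get("number_of_parallel_processes", 1)
        return {step: processes for step in STEPS}
    # On the SGE grid, every step uses its own number_of_..._jobs value.
    return {step: config.get("number_of_%s_jobs" % step, 1) for step in STEPS}


local_config = {
    "grid": "local",
    "number_of_parallel_processes": 4,
    "number_of_scoring_jobs": 10,  # ignored for local execution
}
print(effective_parallelism(local_config))
```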

Note

The parallel execution of jobs on the local machine is currently in BETA status and might be unstable. If any problems occur, please file a new bug at http://github.com/idiap/gridtk/issues.

When calling the bin/palmveinverify.py script with the --grid ... argument, the script will submit all the jobs, taking care of the dependencies between them. If the jobs are sent to the SGE grid (grid = "sge"), the script will exit immediately after the job submission. Otherwise, the jobs will be run locally in parallel and the script will exit after all jobs are finished.

In either case, the script writes a database file that you can monitor using the bin/jman command. Please refer to bin/jman --help or the GridTK documentation to see the command line arguments of this tool. By default, the database file is named submitted.sql3, but you can change the name (and its path) using the argument:

  • --submit-db-file

Command Line Arguments to change Default Behavior

In addition to the required command line arguments discussed above, there are several options to modify the behavior of the PalmveinRecLib experiments. One set of command line arguments changes the directory structure of the output. By default, the results of the recognition experiment will be written to the directory /idiap/user/<USER>/<DATABASE>/<EXPERIMENT>/<SCOREDIR>/<PROTOCOL>, while the intermediate (temporary) files are by default written to /idiap/temp/<USER>/<DATABASE>/<EXPERIMENT> or /scratch/<USER>/<DATABASE>/<EXPERIMENT>, depending on whether the --grid argument is used or not, respectively:

  • <USER>: The Unix username of the person executing the experiments.
  • <DATABASE>: The name of the database. It is read from the database configuration.
  • <EXPERIMENT>: A user-specified experiment name (see the --sub-directory argument above).
  • <SCOREDIR>: Another user-specified name (--score-sub-directory argument below), e.g., to specify different options of the experiment.
  • <PROTOCOL>: The protocol which is read from the database configuration.
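As a sketch of how these placeholders combine, the following function composes the default result and temporary directories. It only mirrors the layout described in this section and is not taken from the library; the function name and the example values (user jdoe, experiment lbp-baseline, score directory scores) are hypothetical.

```python
import posixpath


def default_directories(user, database, experiment, score_dir, protocol, use_grid):
    """Compose the default (result, temp) directories described in this section."""
    result = posixpath.join(
        "/idiap/user", user, database, experiment, score_dir, protocol
    )
    # Temporary files go to /idiap/temp when --grid is used, else to /scratch.
    temp_root = "/idiap/temp" if use_grid else "/scratch"
    temp = posixpath.join(temp_root, user, database, experiment)
    return result, temp


result_dir, temp_dir = default_directories(
    user="jdoe",
    database="verapalm",
    experiment="lbp-baseline",
    score_dir="scores",
    protocol="Default",
    use_grid=False,
)
print(result_dir)
print(temp_dir)
```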

These default directories can be overridden using the following command line arguments, which expect relative or absolute paths:

  • --temp-directory
  • --result-directory (for compatibility reasons also --user-directory can be used)

Re-using Parts of Experiments

If you want to re-use parts of previous experiments, you can specify the directories (which are relative to the --temp-directory, but you can also specify absolute paths):

  • --preprocessed-data-directory
  • --features-directory
  • --models-directories

or even a trained extractor, projector, or enroller (i.e., the result of the extractor, projector, or enroller training):

  • --extractor-file
  • --enroller-file

For that purpose, it is also useful to skip parts of the tool chain. To do that you can use:

  • --skip-preprocessing
  • --skip-extractor-training
  • --skip-extraction
  • --skip-enroller-training
  • --skip-enrollment
  • --skip-score-computation
  • --skip-concatenation

Note that, by default, files that already exist are not re-created anyway. To enforce the re-creation of the files, you can use the:

  • --force

argument, which of course can be combined with the --skip... arguments (in which case the skip takes precedence). To run just a sub-selection of the tool chain, you can also use the:

  • --execute-only

argument, which takes a list of options out of: preprocessing, extractor-training, extraction, projector-training, projection, enroller-training, enrollment, score-computation, or concatenation.
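The step selection can be pictured with the following sketch. It is an illustration only, not the script's implementation: the function name is hypothetical, the steps are assumed to run in the order listed above, and the way skipped steps interact with --execute-only (a skipped step is dropped even if it is also selected) is an assumption.

```python
TOOL_CHAIN = [
    "preprocessing", "extractor-training", "extraction",
    "projector-training", "projection", "enroller-training",
    "enrollment", "score-computation", "concatenation",
]


def steps_to_run(skip=(), execute_only=None):
    """Return the tool chain steps left after applying skip and execute-only."""
    unknown = (set(skip) | set(execute_only or ())) - set(TOOL_CHAIN)
    if unknown:
        raise ValueError("unknown steps: %s" % ", ".join(sorted(unknown)))
    if execute_only is None:
        selected = list(TOOL_CHAIN)
    else:
        # Keep the tool chain order, regardless of the order given by the user.
        selected = [step for step in TOOL_CHAIN if step in execute_only]
    return [step for step in selected if step not in skip]


print(steps_to_run(skip=["preprocessing", "extraction"]))
print(steps_to_run(execute_only=["enrollment", "score-computation"]))
```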

Sometimes you just want to try different scoring functions. In this case, you could simply specify a:

  • --score-sub-directory

With this option, no feature or model is recomputed (unless you use the --force option); only new scores are computed.

Other Arguments

By default, the algorithms are set up to execute quietly, and only errors are reported. To change this behavior, you can use the

  • --verbose

argument several times to increase the verbosity level to show:

  1. Warning messages
  2. Informative messages
  3. Debug messages

When running experiments locally, my personal preference is verbose level 2, which can be enabled by --verbose --verbose, or using the short version of the argument: -vv.
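A repeatable flag like this is commonly implemented with a counting argument. The sketch below shows the idea with Python's argparse and logging modules; it is an assumption about how such a flag can be mapped to log levels, not the actual code of bin/palmveinverify.py.

```python
import argparse
import logging

# Index 0 is the quiet default (errors only); each -v moves one level up.
LEVELS = [logging.ERROR, logging.WARNING, logging.INFO, logging.DEBUG]


def verbosity_to_level(count):
    """Map the number of --verbose flags to a logging level (capped at DEBUG)."""
    return LEVELS[min(count, len(LEVELS) - 1)]


parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="count", default=0)

# Equivalent to passing --verbose --verbose on the command line.
args = parser.parse_args(["-vv"])
level = verbosity_to_level(args.verbose)
print(logging.getLevelName(level))
```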

Finally, there is the:

  • --dry-run

argument that can be used for debugging purposes or to check that your command line is correct. When this argument is used, the experiment is not actually executed; only the steps that would have been executed are printed to the console.

Note

Usually it is a good choice to use the --dry-run option before submitting jobs to the SGE grid, just to make sure that all jobs would be submitted correctly and with the correct dependencies.
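This habit can be automated with a small wrapper that performs the dry run first and only then the real submission. The sketch below is hypothetical and not part of the PalmveinRecLib; in particular, it assumes that the script signals a problem during the dry run with a non-zero exit status, which you should verify for your installation.

```python
import subprocess


def submit_after_dry_run(command, run=subprocess.call):
    """Run `command` with --dry-run first; run it for real only if that succeeds.

    `command` is the argument list of the script call and `run` is the
    function used to execute it (it must return the exit status).
    Returns True only if both the dry run and the real run exit with status 0.
    """
    if run(command + ["--dry-run"]) != 0:
        # Assumed convention: a non-zero exit status means the dry run failed.
        return False
    return run(command) == 0
```

The run parameter exists so that the wrapper can be tried out with a stand-in function before it is pointed at the real script.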