Running Experiments using Specialized Scripts
Testing Configurations of an Algorithm
Sometimes, the configuration of an algorithm depends heavily on the database or even on the employed protocol.
Additionally, configuration parameters depend on each other.
The FaceRecLib provides a relatively simple setup that allows testing different configurations in the same task.
For this, the ./bin/parameter_test.py script can be employed.
This script executes a configurable series of experiments, reusing data as far as possible.
The Configuration File
The most important parameter of the ./bin/parameter_test.py script is --configuration-file.
This configuration file specifies which parameters of which parts of the algorithm will be tested.
An example configuration file can be found in the test scripts: facereclib/tests/scripts/parameter_Test.py.
The configuration file is a regular Python file, which can contain the following variables:
1. preprocessor =
2. feature_extractor =
3. tool =
4. replace =
5. requirement =
6. imports =
The variables 1. to 3. usually contain constructor calls for classes of Preprocessors, Feature Extractors and Recognition Algorithms, but registered Resources can be used as well.
Any of the constructor parameters of these classes can be replaced by a placeholder.
By default, placeholders start with a # character, followed by a digit or character.
The variables 1. to 3. can also be overridden by the command line options --preprocessing, --features and --tool of the ./bin/parameter_test.py script.
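A minimal configuration file might look like the following sketch. The class and parameter names are only illustrative (they are not prescribed by the script), and the constructor calls are written as strings so that the # placeholders can be substituted before the calls are evaluated:

# Illustrative sketch only -- replace the class names and parameters with the
# Preprocessor, Feature Extractor and Recognition Algorithm you actually want to test.
# The placeholders #1, #2 and #5 are filled in from the 'replace' variable.
preprocessor      = "facereclib.preprocessing.FaceCrop(cropped_image_size = (#1, #2))"
feature_extractor = "facereclib.features.LGBPHS(block_size = 10)"
tool              = "facereclib.tools.LGBPHS(distance_function = #5)"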
The replace variable has to be a dictionary.
In it, you define which values each placeholder key should take, and in which step of the tool chain the replacement should happen.
The steps are 'preprocessing', 'extraction', 'projection', 'enrollment' and 'scoring'.
For each step, you can specify which placeholder should be replaced by which values.
To be able to differentiate the results later on, each replacement value is bound to a directory name.
The final structure looks roughly like this:
replace = {
    step1 : {                     # e.g. 'preprocessing'
        '#a' : {
            'Dir_a1' : 'Value_a1',
            'Dir_a2' : 'Value_a2'
        },
        '#b' : {
            'Dir_b1' : 'Value_b1',
            'Dir_b2' : 'Value_b2'
        }
    },
    step2 : {                     # e.g. 'extraction'
        '#c' : {
            'Dir_c1' : 'Value_c1',
            'Dir_c2' : 'Value_c2'
        }
    }
}
Of course, more than two values can be given per placeholder. Additionally, tuples of placeholders can be defined, in which case the full tuple is always replaced in one shot. Continuing the above example, it is possible to add:
...
    step3 : {                     # e.g. 'projection'
        ('#d', '#e') : {
            'Dir_de1' : ('Value_d1', 'Value_e1'),
            'Dir_de2' : ('Value_d2', 'Value_e2')
        }
    }
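Tied to the illustrative configuration sketch above, a concrete (purely illustrative) replace variable that tests two image resolutions via the tuple ('#1', '#2') could look like this; the values are substituted textually into the constructor-call strings:

replace = {
    'preprocessing' : {
        ('#1', '#2') : {
            'res-64x80' : ('64', '80'),
            'res-32x40' : ('32', '40')
        }
    }
}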
Note that all possible combinations of the configuration parameters are tested, which might result in a huge number of executed experiments.
Some combinations of parameters might not make any sense.
In this case, a set of requirements on the parameters can be defined, using the requirement variable.
A requirement is any string, possibly containing placeholders, that can be evaluated using Python's eval function:
requirement = ['#a > #b', '2*#c != #a', ...]
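Conceptually, for each tested parameter combination the placeholders in a requirement are replaced by their current values and the resulting expression is evaluated; combinations for which any requirement does not hold are skipped. A minimal sketch of this idea (not the actual code of the script):

def satisfies(requirement, assignment):
    # Replace each placeholder (e.g. '#a') by its current value, then evaluate the expression.
    expression = requirement
    for placeholder, value in assignment.items():
        expression = expression.replace(placeholder, str(value))
    return bool(eval(expression))

# For example, with #a = 4 and #b = 2 the first requirement above holds:
assert satisfies('#a > #b', {'#a': 4, '#b': 2})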
Finally, if any of the classes or variables needs a certain Python module to be imported (other than facereclib), it has to be declared in the imports variable.
If you, e.g., test which scipy distance function works best for your features, please add the imports (and don't forget facereclib in case you use its tools):
imports = ['scipy', 'facereclib']
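For instance, if the hypothetical #5 placeholder of the tool sketched above is filled with scipy distance functions, the replacement values reference scipy, so the corresponding module has to appear in imports (depending on your scipy version, you may need to list the sub-module explicitly):

replace = {
    'scoring' : {
        '#5' : {
            'euclidean' : 'scipy.spatial.distance.euclidean',
            'chebyshev' : 'scipy.spatial.distance.chebyshev'
        }
    }
}

imports = ['scipy.spatial.distance', 'facereclib']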
A complete working example, where the image resolution and LGBPHS distance function are tested, can be found in facereclib/tests/scripts/parameter_Test.py.
Further Command Line Options
The ./bin/parameter_test.py script has a further set of command line options.
- The --database and the --protocol options define which database and (optionally) which protocol should be used.
- The --sub-directory option is similar to the one of the ./bin/faceverify.py script, see Required Command Line Arguments.
- The --preprocessing, --features and --tool options can be used to override the preprocessor, feature_extractor and tool fields of the configuration file (in which case the configuration file does not need to contain these variables).
- The --grid option selects the SGE configuration (if not specified, all experiments are run sequentially on the local machine).
- The --preprocessed-data-directory option can be used to select a directory of previously preprocessed data. It should not be used in combination with testing different preprocessing parameters.
- The --grid-database-directory option can be used to select another directory, where the submitted .sql3 files will be stored.
- The --write-commands option selects a directory into which the executed commands are written (this is useful in case some experiments fail and need to be rerun).
- The --dry-run flag should always be used before the final execution, to check that the experiment definition works as expected.
- The --skip-when-existent flag will only execute the experiments that have not yet finished (i.e., for which the resulting score files have not been produced yet).
- Finally, additional options can be passed directly to the ./bin/faceverify.py script. These options have to be put after a -- separator, as in the example below.
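For illustration, a first (dry) run of the test configuration shipped with the FaceRecLib might be started like this; the sub-directory name is arbitrary, and --verbose is simply passed on to ./bin/faceverify.py:

$ ./bin/parameter_test.py --configuration-file facereclib/tests/scripts/parameter_Test.py --database banca --sub-directory parameter_test --dry-run -- --verbose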
Evaluation of Results
To evaluate a series of experiments, a special script iterates through all the results and computes EER on the development set and HTER on the evaluation set, for both the nonorm and the ztnorm directories. Simply call:
$ ./bin/collect_results.py --directory [result-base-directory] --sort
This will iterate through all result files found in [result-base-directory] and sort the results according to the EER on the development set (the sorting criterion can be modified using the --criterion option).
Databases with Special Evaluation Protocols
Some databases provide special evaluation protocols which require a more complicated experiment design. For these databases, different scripts are provided. These databases are:
The LFW Database
For the Labeled Faces in the Wild (LFW) database, there is another script, ./bin/faceverify_lfw.py, that runs the experiments strictly following the LFW protocols, computing the classification performance on view1 and/or view2.
The final result of an LFW experiment is, hence, a text file (--result-file) containing the single results for view1 and the 10 folds fold1 ... fold10 of view2, as well as the final average and standard deviation over all folds.
In principle, the ./bin/faceverify.py script could be used as well, but it would not report the classification performance.
The parameters of the ./bin/faceverify_lfw.py script are mostly identical to those of the ./bin/faceverify.py script, as explained in Running non-baseline Experiments.
A few exceptions are that the default database is lfw and that the parts belonging to the ZT score normalization are missing.
Additionally, instead of the --protocol option, a --views option is available, which by default runs only view1; an illustrative call is shown below.
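An illustrative call that runs both views and writes the results to a text file might look as follows (the arguments in angle brackets are placeholders, and the exact value syntax of --views is an assumption; check the script's --help for details):

$ ./bin/faceverify_lfw.py --preprocessing <preprocessor> --features <extractor> --tool <algorithm> --views view1 view2 --result-file lfw_results.txt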
The GBU Database
There is another script, ./bin/faceverify_gbu.py, that executes experiments on the Good, Bad, and Ugly (GBU) database.
In principle, most of the parameters from above can be used.
One exception is that the --models-directories option is replaced by a single --model-directory option.
When running experiments on the GBU database, the default GBU protocol (as provided by NIST) is used. Hence, training is performed on the special Training set, and experiments are executed using the Target set as models (using a single image for model enrollment) and the Query set as probe.
The GBU protocol does not specify T-Norm models or Z-Norm probes, nor does it split off development and test sets. Hence, only a single score file is generated, which might later on be converted into an ROC curve using Bob functions, as sketched below.
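A minimal sketch of such a conversion, assuming a Bob version that provides bob.measure.load.split_four_column and bob.measure.plot.roc (the score file name and plot details are illustrative):

import bob
from matplotlib import pyplot

# Read negatives and positives from the four-column score file produced by the script.
negatives, positives = bob.measure.load.split_four_column('scores-gbu')

# Plot the ROC curve and save it to file; the number of points is arbitrary.
pyplot.figure()
bob.measure.plot.roc(negatives, positives, npoints = 100)
pyplot.grid(True)
pyplot.savefig('roc_gbu.pdf')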