Part 2: Generate client-independent and client-specific anti-spoofing scores

Some of the scripts needed to generate these scores are contained in this satellite package; others come from the package antispoofing.clientspec.

Generative client-independent scores

Generating the generative client-independent scores can be done in 5 steps. The values of the hyper-parameters (number of Gaussians) given in the commands below are optimized for the grandtest set of Replay-Attack. Please find a table at the end of the section for the parameter values optimized for other Replay-Attack protocols. Note that the models are created for features which are normalized using standard normalization and PCA reduced. The examples are given for LBP features, but generating the scores for the other features is analogous. You just have to change the number of Gaussians using the --gaussians parameter. You can always type --help after any of the commands to see all their available options. You can also type --help after the option replay to see the available options for the database.

The scripts used for this task belong to the package antispoofing.clientspec.

  1. Create model for Real Accesses (LBP features):

    $ ./bin/naive_modelling.py --featname lbp --gaussians 5 --modeltype real -n -r -c -j -o path_to_output_dir/lbp/dir-models/real path_to_features replay --protocol grandtest
    

    For the MOTION features, the parameter -e 0.995 should be set, denoting the energy kept during PCA reduction; the default values are used for the other features. The parameter --featname can be any custom name that you choose to give to your features, but pay attention to use it consistently throughout the calls of all the scripts. Don’t forget to change the protocol (--protocol) to the corresponding protocol of Replay-Attack that you want to use. Specifying several protocols is possible.

  2. Create model for Attacks (LBP features):

    $ ./bin/naive_modelling.py --featname lbp --gaussians 10 --modeltype attack -n -r -c -j -o path_to_output_dir/lbp/dir-models/attack path_to_features replay --protocol grandtest
    

    Again, for the MOTION features the parameter -e 0.995 should be set, denoting the energy kept during PCA reduction.

  3. Calculate likelihoods to the real access model:

    $ ./bin/naive_likelihood.py --featname lbp --gaussians 5 --modeldir path_to_output_dir/lbp/dir-models/real -o path_to_output_dir/lbp/dir-likelihoods/real path_to_features replay

  4. Calculate likelihoods to the attack model:

    $ ./bin/naive_likelihood.py --featname lbp --gaussians 10 --modeldir path_to_output_dir/lbp/dir-models/attack -o path_to_output_dir/lbp/dir-likelihoods/attack path_to_features replay

  5. Calculate likelihood ratios:

    $ ./bin/naive_likelihood_ratio.py --dirreal path_to_output_dir/lbp/dir-likelihoods/real/GMM-5 --dirattack path_to_output_dir/lbp/dir-likelihoods/attack/GMM-10 -o path_to_output_dir/lbp/likelihood_ratio/GMM-5/GMM-10/llr_real.vs.attack replay
    

    Generating the likelihood ratios for other features is analogous. You just need to change the number of Gaussians in the input and output folders to the corresponding values.
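
Conceptually, the five steps amount to normalizing and PCA-reducing the features, fitting one GMM per class, and taking the frame-wise difference of log-likelihoods. The sketch below illustrates this with scikit-learn as a stand-in for Bob's GMM tools; the variable names (train_real, train_attack, test_video_features) are placeholders, not part of the actual scripts:

    import numpy
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.mixture import GaussianMixture

    # Placeholder (num_frames x num_features) arrays of training features,
    # one row per frame (e.g. LBP histograms)
    train = numpy.vstack((train_real, train_attack))
    scaler = StandardScaler().fit(train)  # standard normalization
    # A float in (0, 1) keeps that fraction of the energy (cf. the -e option)
    pca = PCA(n_components=0.995).fit(scaler.transform(train))

    def preprocess(x):
        return pca.transform(scaler.transform(x))

    # Steps 1-2: one GMM per class (5 and 10 Gaussians for LBP, grandtest)
    gmm_real = GaussianMixture(n_components=5).fit(preprocess(train_real))
    gmm_attack = GaussianMixture(n_components=10).fit(preprocess(train_attack))

    # Steps 3-5: frame-wise log-likelihoods and their ratio for one video
    frames = preprocess(test_video_features)
    llr = gmm_real.score_samples(frames) - gmm_attack.score_samples(frames)
    # llr[i] > 0 means frame i looks more like a real access than an attack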

Type --help after the command to see all its available options.

After this, you will have scores for all the videos of Replay-Attack in the directory path_to_output_dir/lbp/likelihood_ratio/GMM-5/GMM-10/llr_real.vs.attack (or the analogous directory for the other features). The scores are written as an array, one score per frame, into an .hdf5 file named after the video.
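
If you need to inspect the scores programmatically, a minimal sketch using h5py follows; the file name is a made-up example, and the dataset key 'array' is Bob's usual default for saved arrays, which may differ in your setup:

    import h5py

    # Read the per-frame scores of one (hypothetical) video file
    path = ('path_to_output_dir/lbp/likelihood_ratio/GMM-5/GMM-10/'
            'llr_real.vs.attack/client001_session01.hdf5')
    with h5py.File(path, 'r') as f:
        scores = f['array'][:]  # one log-likelihood ratio per frame
    print(scores.shape, scores.mean())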

The optimized values (obtained via grid search) for the number of Gaussians for each of the protocols of Replay-Attack are given in the following table:

features        LBP           LBP-TOP       MOTION        HOG
protocol        real  attack  real  attack  real  attack  real  attack
grandtest       5     10      5     50      10    300     210   190
print           250   235     5     10      35    5       30    175
digital         15    35      10    15      100   115     235   105
video           5     20      5     30      10    60      120   110
print+digital   5     10      5     25      45    165     115   30
print+video     5     15      10    75      10    240     125   195
digital+video   5     10      5     30      100   295     290   165

Generative client-specific scores

Generating the client-specific scores can be done in 7 steps. The values of the hyper-parameters (number of Gaussians and relevance factor) given in the commands below are optimized for the grandtest set of Replay-Attack. Please find a table at the end of the section for the parameter values optimized for other Replay-Attack protocols. Note that the models are created for features which are normalized using standard normalization and PCA reduced. The examples are given for LBP features, but generating the scores for the other features is analogous. You just have to change the number of Gaussians using the --gaussians parameter and the relevance factor using the --rel parameter. You can always type --help after any of the commands to see all their available options. You can also type --help after the option replay to see the available options for the database.

The majority of the scripts used for this task belong to the package antispoofing.clientspec.

  1. Create model for Real Accesses. This step is exactly the same as step 1 of the previous section; just replace the number of Gaussians with the values optimized for the client-specific models, given in the table at the end of the section.

  2. Create model for Attacks. This step is exactly the same as step 2 of the previous section; just replace the number of Gaussians with the values optimized for the client-specific models, given in the table at the end of the section.

  3. Enroll clients from the Real Access model using MAP adaptation:

    $ ./bin/map_adapt_per_client.py --featname lbp --modelfile path_to_output_dir/lbp/dir-models/real/GMM-275.hdf5 -o path_to_output_dir/lbp/dir-map-models/TEST/GMM-275/reals.hdf5 --group test --rel 1 --clss enroll path_to_features replay
    

    This step needs to be run three times: for the training, development and test subsets. The above example shows how to run it for the test set. The class of samples used for the MAP adaptation is specified with the --clss parameter and needs to be the enrollment samples in this case. The output is an .hdf5 file where the MAP-adapted models are stored for each client of the particular subset.

    Generating the MAP models for the other features is analogous. Just change the number of Gaussians in the model filename and the output directory.

  4. Create cohort models from the Attack model using MAP adaptation. As before, use the script map_adapt_per_client.py from the package antispoofing.clientspec:

    $ ./bin/map_adapt_per_client.py --featname lbp --modelfile path_to_output_dir/lbp/dir-models/attack/GMM-25.hdf5 -o path_to_output_dir/lbp/dir-map-models/TRAIN/GMM-25/attacks.hdf5 --group train --rel 1 --clss attack path_to_features replay --protocol grandtest
    

    This step needs to be run only once, because the cohorts are created from the training set. The class of samples used for the MAP adaptation is specified with the --clss parameter and needs to be the attack samples in this case. Don’t forget to change the protocol (--protocol) to the corresponding protocol of Replay-Attack that you want to use. The output is an .hdf5 file where all the cohort models are stored.

    Generating the cohort models for the other features is analogous. Just change the number of Gaussians in the model filename and the output directory.

  5. Calculate likelihoods to real access client-specific models:

    $ ./bin/naive_likelihood_clientspecmodel.py --featname lbp --mapmodelfile path_to_output_dir/lbp/dir-map-models/TEST/GMM-275/reals.hdf5 -o path_to_output_dir/lbp/dir-likelihood-clientspec/GMM-275 --group test path_to_features replay
    

    This step needs to be run three times: for the training, development and test subsets. The above example shows how to run it for the test set. Generating the likelihoods for other features is analogous. Just change the number of Gaussians in the MAP model filename and the output directory.

  6. Calculate likelihoods to attack cohort models:

    $ ./bin/naive_likelihood_cohortmodels.py --featname lbp --cohortfile path_to_output_dir/lbp/dir-map-models/TRAIN/GMM-25/attacks.hdf5 -o path_to_output_dir/lbp/dir-likelihood-cohort/likelihood-cohort-all/GMM-25 --group test path_to_features replay
    

    This step needs to be run three times: for the training, development and test subsets. The above example shows how to run it for the test set. Generating the likelihoods for other features is analogous. Just change the number of Gaussians in the MAP model filename and the output directory. Note that you can specify the number N of cohorts to use when computing the likelihood, using the -s option. In that case, only the N cohorts with the highest likelihoods are taken into account.

  7. Calculate the likelihood ratio:

    $ ./bin/naive_likelihood_ratio.py --dirreal path_to_output_dir/lbp/dir-likelihood-clientspec/GMM-275 --dirattack path_to_output_dir/lbp/dir-likelihood-cohort/likelihood-cohort-all/GMM-25 -o path_to_output_dir/lbp/likelihood_ratio/GMM-275/GMM-25/llr_clientspec.vs.cohortall replay
    

    Generating the likelihood ratios for other features is analogous. You just need to change the number of Gaussians in the input and output folders to the corresponding values.

    After this, you will have scores for all the videos of Replay-Attack in the directory path_to_output_dir/lbp/likelihood_ratio/GMM-275/GMM-25/llr_clientspec.vs.cohortall (or the analogous directory for the other features). The scores are written as an array, one score per frame, into an .hdf5 file named after the video.
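
For reference, the MAP adaptation performed in steps 3 and 4 shifts the means of the class GMM towards the data of each client, with the relevance factor (--rel) controlling how far they move. Below is a minimal sketch of mean-only MAP adaptation in the style of Reynolds et al., assuming a fitted scikit-learn GaussianMixture as the prior model (the actual script uses Bob's implementation):

    import numpy

    def map_adapt_means(prior_gmm, client_data, relevance_factor=1.0):
        """Mean-only MAP adaptation of a fitted GaussianMixture."""
        # responsibility of each Gaussian for each frame: (frames x gaussians)
        resp = prior_gmm.predict_proba(client_data)
        n_k = resp.sum(axis=0)  # soft count of frames per Gaussian
        # weighted mean of the client data under each Gaussian
        x_bar = resp.T.dot(client_data) / numpy.maximum(n_k, 1e-10)[:, None]
        # interpolate between the client statistics and the prior means
        alpha = (n_k / (n_k + relevance_factor))[:, None]
        return alpha * x_bar + (1.0 - alpha) * prior_gmm.means_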

In addition to these steps, the likelihood to the attack models can be computed on a subset of the cohort models, sorted according to some criterion. They can be sorted statically or dynamically, using the scripts sort_cohort_models_kl.py and sort_cohort_models_reynolds.py, respectively. Then, the likelihood over the sorted cohorts can be computed using the script naive_likelihood_sorted_cohortmodels.py.

  1. Sorting the cohort models.

    To run the static sort, run:

    $ ./bin/sort_cohort_models_kl.py --featname lbp --mapmodelfile path_to_output_dir/lbp/dir-map-models/TEST/GMM-275/reals.hdf5 --cohortfile path_to_output_dir/lbp/dir-map-models/TRAIN/GMM-25/attacks.hdf5 -o path_to_output_dir/lbp/sorted_cohorts.hdf5 --gr test replay
    

    To run the dynamic sort, run:

    $ ./bin/sort_cohort_models_reynolds.py --featname lbp --mapmodelfile path_to_output_dir/lbp/dir-map-models/TEST/GMM-275/reals.hdf5 --cohortfile path_to_output_dir/lbp/dir-map-models/TRAIN/GMM-25/attacks.hdf5 --sf path_to_output_dir/lbp/sorted_cohorts.hdf5 -o path_to_output_dir/lbp/sorted_cohorts --gr test replay
    

    This step needs to be run three times: for the training, development and test subsets. The above examples show how to run it for the test set.

  2. Calculate likelihood to sorted cohort models:

    $ ./bin/naive_likelihood_sorted_cohortmodels.py --featname lbp --cohortfile path_to_output_dir/lbp/dir-map-models/TRAIN/GMM-25/attacks.hdf5 --sort_cohort 5 -o path_to_output_dir/lbp/dir-likelihood-cohort/sorted-cohort-likelihood/likelihood-cohort-5/GMM-25 --group test path_to_features replay
    

    This step needs to be run three times: for the training, development and test subsets. The parameter --sort_cohort controls the number of cohorts to be considered when computing the likelihood.

After these two steps, step 7 of the previous list can be repeated to compute the likelihood ratio.
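
The effect of restricting the likelihood to the N best cohorts can be sketched as follows; the aggregation (averaging the frame-wise log-likelihoods of the selected cohorts) is an illustrative assumption, and the actual scripts may combine them differently:

    import numpy

    def top_n_cohort_loglik(cohort_gmms, frames, n=5):
        """Frame-wise log-likelihood over the n best-matching cohort models
        (an illustration of what -s / --sort_cohort select)."""
        # (num_cohorts x num_frames) log-likelihood matrix
        ll = numpy.array([g.score_samples(frames) for g in cohort_gmms])
        # rank cohorts by their overall match to this video, keep the n best
        best = numpy.argsort(ll.sum(axis=1))[-n:]
        return ll[best].mean(axis=0)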

The optimized values (via grid search) for the number of Gaussians and the MAP relevance factor for each of the protocols of Replay-Attack are given in the following table:

features        LBP                LBP-TOP            MOTION             HOG
protocol        real  attack  rel  real  attack  rel  real  attack  rel  real  attack  rel
grandtest       275   25      1    295   100     5    10    45      5    295   55      1
print           160   20      1    300   210     1    70    10      1    30    105     1
digital         250   5       4    300   35      3    100   165     1    205   255     1
video           275   15      5    295   55      5    15    230     5    15    220     5
print+digital   275   20      1    295   60      5    50    100     1    295   85      2
print+video     280   15      3    240   80      5    15    90      5    25    80      1
digital+video   250   10      3    240   85      5    45    65      2    5     110     2

Discriminative client-independent scores

Generating the discriminative client-independent scores can be done in 2 steps.

  1. Training the SVM

    The training can be done using kernels, or using kernel approximation. The option --scikit specifies that scikit-learn will be used instead of Bob. You can always type --help after any of the commands to see all their available options. You can also type --help after the option replay to see the available options for the database.

    The command below shows how to do the training with RBF kernels after min-max normalization of the data:

    $ ./bin/svmtrain.py -v path_to_features -d path_to_machine --min-max-normalize --kernel-type RBF --scikit replay
    

    To do the SVM training using Chi2 kernel with Nystrom approximation, run:

    $ ./bin/svmtrain.py -v path_to_features -d path_to_machine --kernel-approx-type nystrom-chi2 replay
    
  2. Calculating the scores

    If the machine is trained using kernels, run:

    $ ./bin/svmeval.py -i path_to_features --svmdir path_to_machine -o path_to_scores --scikit -s replay
    

    If the machine is trained using approximation, run:

    $ ./bin/svmapprox_eval.py -i path_to_features --svmdir path_to_machine -o path_to_scores -s replay
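
For orientation, the two steps above correspond roughly to the following scikit-learn sketch (the backend that --scikit selects); train_features, train_labels and test_features are placeholders for the data the scripts load from path_to_features:

    from sklearn.preprocessing import MinMaxScaler
    from sklearn.svm import SVC

    # Min-max normalization followed by an RBF-kernel SVM (step 1)
    scaler = MinMaxScaler().fit(train_features)
    svm = SVC(kernel='rbf').fit(scaler.transform(train_features), train_labels)

    # Frame-wise scores as signed distances to the hyperplane (step 2)
    scores = svm.decision_function(scaler.transform(test_features))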
    

Discriminative client-specific scores

Generating the client-specific scores can be done in 2 steps.

The examples below are given for LBP features, but the procedure for the other features is analogous.

  1. Training the SVM

    The training can be done using kernels, or using kernel approximation. The option --scikit specifies that scikit-learn will be used instead of Bob. You can always type --help after any of the commands to see all their available options. You can also type --help after the option replay to see the available options for the database.

    The command below shows how to do the training with RBF kernels after standard normalization of the data:

    $ ./bin/svm_clientspec_train.py -f lbp -o path_to_machine --gr test --std-normalize --scikit path_to_features replay
    

    To do the SVM training using Chi2 kernel with Nystrom approximation, run:

    $ ./bin/svmapprox_clientspec_train.py -f lbp -o path_to_machine --gr test --std-normalize --kernel-approx-type nystrom-chi2 path_to_features replay
    

    These steps need to be run three times: for the training, development and test subsets. The above examples show how to run them for the test set.

  2. Calculating the scores

    To generate the client-specific scores, run one of the following two commands, depending on whether the SVM is trained using kernel or kernel approximation:

    $ ./bin/svm_clientspec_eval.py -f lbp --svmdir path_to_machine -o path_to_scores --gr test -s path_to_features replay
    $ ./bin/svmapprox_clientspec_eval.py -f lbp --svmdir path_to_machine -o path_to_scores --gr test -s path_to_features replay
    

    If you want to generate client-specific scores for zero-effort impostors, for which a wrong identity claim is presented to the client-specific anti-spoofing system, run one of the following two commands:

    $ ./bin/svm_clientspec_eval_impostors.py -f lbp --svmdir path_to_machine -o path_to_scores --gr test -s path_to_features replay
    $ ./bin/svmapprox_clientspec_eval_impostors.py -f lbp --svmdir path_to_machine -o path_to_scores --gr test -s path_to_features replay
    

    These steps need to be run three times: for the training, development and test subsets. The above examples show how to run them for the test set.
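
Conceptually, the client-specific variant trains one SVM per enrolled client instead of a single global machine. The sketch below illustrates this with the chi2 Nystrom approximation plus a linear SVM in scikit-learn; enrollment_data, attack_data, claimed_id and probe_frames are placeholders (note that the chi2 kernel requires non-negative features, which holds for LBP histograms):

    import numpy
    from sklearn.kernel_approximation import Nystroem
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    client_svms = {}
    for client_id, real in enrollment_data.items():  # {client: frames array}
        X = numpy.vstack((real, attack_data))
        y = numpy.hstack((numpy.ones(len(real)),
                          -numpy.ones(len(attack_data))))
        # chi2 Nystrom feature map + linear SVM, one machine per client
        client_svms[client_id] = make_pipeline(
            Nystroem(kernel='chi2', n_components=100), LinearSVC()).fit(X, y)

    # A probe video is scored against the machine of its claimed identity;
    # a zero-effort impostor is simply a probe with a wrong claimed_id
    scores = client_svms[claimed_id].decision_function(probe_frames)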

Computing the error rates

After the scores have been generated, you can use the script ./bin/score_evaluation_crossdb.py to compute the error rates. For example, to compute the error rates for the scores obtained using the client-specific SVM approach, call:

$ ./bin/score_evaluation_crossdb.py --devel-scores-dir path_to_scores --test_scores-dir path_to_scores --dev-set replay --test-set replay --attack-devel grandtest --attack-test grandtest --verbose

Type --help after the command to see all its available options. For cross-protocol evaluation, you can specify different protocols for setting the decision threshold and for the evaluation (use the --ad and --at parameters). In that case, the values of the --sd and --st parameters will most likely differ too.
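
A common evaluation protocol for such scores (and presumably what this script implements) is to fix a decision threshold on the development set, e.g. at the Equal Error Rate, and report the Half Total Error Rate on the test set. A self-contained sketch, with dev_real, dev_attack, test_real and test_attack as placeholder 1-D score arrays:

    import numpy

    def far_frr(real, attack, threshold):
        """Scores at or above the threshold are accepted as real accesses."""
        far = numpy.mean(attack >= threshold)  # false acceptance rate
        frr = numpy.mean(real < threshold)     # false rejection rate
        return far, frr

    # Threshold minimizing |FAR - FRR| (the EER criterion) on development
    candidates = numpy.concatenate((dev_real, dev_attack))
    thr = min(candidates,
              key=lambda t: abs(numpy.subtract(*far_frr(dev_real, dev_attack, t))))

    # HTER on the test set at the fixed development threshold
    far, frr = far_frr(test_real, test_attack, thr)
    hter = (far + frr) / 2.0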

Plotting the box plots

Here is an example of how to plot the box plots of the scores for each user, for the scores obtained using the client-specific GMM approach:

$ ./bin/scores_box_plot.py --devel-scores-dir path_to_scores --test_scores-dir path_to_scores --dev-set replay --test-set replay --attack-devel grandtest --attack-test grandtest --normalization --reject-outlier --verbose

Type --help after the command to see all its available options. It is recommended to always normalize the scores (--normalization option), rejecting outliers during the normalization (--reject-outlier option).
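
A minimal matplotlib sketch of such a per-user box plot follows, with a simple z-normalization that ignores outliers beyond three standard deviations (an assumption about what --normalization and --reject-outlier do; scores_per_client is a placeholder {client_id: score array} mapping):

    import numpy
    import matplotlib.pyplot as plt

    def normalize(scores, max_dev=3.0):
        # estimate mean/std on inliers only, then z-normalize all scores
        kept = scores[numpy.abs(scores - scores.mean()) < max_dev * scores.std()]
        return (scores - kept.mean()) / kept.std()

    clients = sorted(scores_per_client)
    plt.boxplot([normalize(scores_per_client[c]) for c in clients],
                labels=clients)
    plt.xlabel('client')
    plt.ylabel('normalized score')
    plt.show()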