Common Functionality in all Verification Databases

The verification database interface defines a common API that is shared by all verification databases. This section describes that common functionality.

The File class

Commonly, a verification database contains several files in a certain directory structure. This directory structure is usually rooted in a base directory, from which the files can be found using relative paths. The base directory is different for every user of the database, while the relative paths are identical for each user. Hence, the bob.db.verification.utils.File interface stores only relative paths, without the file name extension.

Every file has its own short, unique ID, by which it can be identified. In most cases, this file_id is of an integral type, though some databases use other hashable types such as str.

Finally, each File records the client identity in the database to which it belongs. Again, the client_id is usually of an integral type, but some databases use other types such as str.

The bob.db.verification.utils.File class has two functions. Since only relative paths without file name extension are stored, the bob.db.verification.utils.File.make_path() function generates a full file name by prepending the given base directory and appending the given file name extension. Similarly, the bob.db.verification.utils.File.save() function takes a given data object and saves it to the given directory with the given file name extension using the bob.io.base.save() function.
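
As a rough illustration of the path handling described above, the following sketch mimics how a relative path without extension might be expanded into a full file name. SketchFile is a hypothetical stand-in written for this example and is not the library class; the directory /data/atnt and the extension .pgm are made-up values.

```python
import os


class SketchFile:
    """Hypothetical stand-in for a database file entry (not the Bob class)."""

    def __init__(self, client_id, path, file_id):
        self.client_id = client_id  # identity the file belongs to
        self.path = path            # relative path, stored without extension
        self.id = file_id           # unique, hashable file id

    def make_path(self, directory=None, extension=None):
        # Prepend the base directory and append the extension, if given.
        return os.path.join(directory or "", self.path + (extension or ""))


f = SketchFile(client_id=1, path="s1/1", file_id=1)
full_name = f.make_path("/data/atnt", ".pgm")  # made-up directory and extension
```

Because only the relative part is stored, the same entry can be resolved against a different base directory on every machine.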

The Database

A bob.db.verification.utils.Database contains information about the whole database. For example, the base directory and the file name extension of the original data files can be specified in the bob.db.verification.utils.Database constructor.

The bob.db.verification.utils.Database provides a common interface to query lists of bob.db.verification.utils.File objects based on certain criteria. Some of these criteria depend on the database, but some are common to all verification databases. Database-dependent criteria can usually be specified by keyword arguments, which are passed directly to the bob.db.verification.utils.Database.objects() function of the derived class, see The objects function.

Each verification database defines different groups, typically a training set (usually called the world set) and a development set dev. Sometimes, an independent evaluation set eval is provided as well. The groups that are available for a database can be queried using the bob.db.verification.utils.Database.groups() function.

Additionally, each database defines at least one evaluation protocol, which comprises information on which file_id belongs to which group. All functions of the bob.db.verification.utils.Database interface accept a protocol, which might be None in case the database defines only a single protocol, or in case some file lists are identical for each protocol.

The files for the development (and evaluation) set are usually split into two different purposes. Some of the files are used to enroll models (aka templates, targets) for a given client, while other files are used to probe (aka test, query) some or all models to compute similarity scores. Each enrolled model has a specific model id, a list of which can be queried by the bob.db.verification.utils.Database.model_ids() method. For most databases, one model is enrolled for each client, and thus, the bob.db.verification.utils.File.client_id is identical to the model_id. In any case, you can use the bob.db.verification.utils.Database.get_client_id_from_model_id() function to obtain the client_id for a given model_id.

Functions returning File lists

Several functions of the bob.db.verification.utils.Database interface return lists of bob.db.verification.utils.File objects, which can be used for several tasks of a biometric recognition experiment. These lists are in general unordered, i.e., two subsequent calls to the same function might return the same list in a different order, and unique, i.e., no two files with the same file_id are returned.
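
Since the order of the returned lists is not guaranteed, an experiment that needs reproducible ordering can sort the files itself. The following is a minimal sketch; FileRecord and stable_order are hypothetical names introduced for this example, and sorting by id assumes that all ids are of one comparable type (e.g., all int).

```python
from collections import namedtuple

# Hypothetical minimal file record; real File objects carry more information.
FileRecord = namedtuple("FileRecord", ["id", "client_id", "path"])


def stable_order(files):
    """Return the files sorted by id, so repeated queries give the same order."""
    return sorted(files, key=lambda f: f.id)


first_call = [
    FileRecord(3, 2, "s2/1"),
    FileRecord(1, 1, "s1/1"),
    FileRecord(2, 1, "s1/2"),
]
second_call = list(reversed(first_call))  # same files, different order

same_order = stable_order(first_call) == stable_order(second_call)
```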

The simplest function, bob.db.verification.utils.Database.all_files(), returns a list of all files that fulfill the desired database-dependent criteria.

The training set contains a list of bob.db.verification.utils.File objects, which can be used to train a biometric recognition system. This file list can be obtained using the bob.db.verification.utils.Database.training_files() method. Again, database-dependent criteria can be specified using specialized keyword arguments.

Similarly, a list of all files (including the enrollment and probe files) for the dev or eval group can be queried with the bob.db.verification.utils.Database.test_files() method. The lists of enrollment and probe files can be obtained through bob.db.verification.utils.Database.enroll_files() and bob.db.verification.utils.Database.probe_files(), respectively.

Important

Both methods accept an optional model_id, but they use it differently. If the model_id is not specified (i.e., None), the files to enroll all models or all probe files are returned, respectively. When a model_id is passed to bob.db.verification.utils.Database.enroll_files(), only the files used to enroll the model with the given model_id are returned. In contrast, calling bob.db.verification.utils.Database.probe_files() with a model_id returns the list of probe files with which the model of the given model_id should be compared. In most protocols of most databases, all models are compared with all probe files and, hence, the model_id is ignored in bob.db.verification.utils.Database.probe_files().
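
The different meaning of model_id in the two methods can be sketched with a toy protocol. The dictionaries and the two module-level functions below are hypothetical and only imitate the behavior described above; they are not the library implementation.

```python
# Toy protocol data (hypothetical): which files enroll each model, and
# against which probe files each model is compared.
ENROLL = {1: ["s1/1", "s1/2"], 2: ["s2/1", "s2/2"]}
PROBE = {1: ["s1/3", "s2/3"], 2: ["s1/3", "s2/3"]}


def enroll_files(model_id=None):
    """Files used to enroll one model, or all enrollment files if None."""
    if model_id is None:
        return sorted(set(p for paths in ENROLL.values() for p in paths))
    return list(ENROLL[model_id])


def probe_files(model_id=None):
    """Probe files to compare with one model, or all probe files if None."""
    if model_id is None:
        return sorted(set(p for paths in PROBE.values() for p in paths))
    return list(PROBE[model_id])
```

In this toy protocol every model is compared with every probe file, so probe_files() returns the same list with or without a model_id, which corresponds to the common case mentioned above.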

The objects function

The most important function that needs to be implemented in each verification database is the bob.db.verification.utils.Database.objects() function. This function returns a list of objects, which are derived from the bob.db.verification.utils.File. The objects function has at least the following set of keyword parameters:

  • groups: to define different groups like world, dev or eval; accepts None, a single group or a tuple of groups
  • protocol: to define a name of an evaluation protocol; accepts only a single protocol, might also accept None
  • purposes: to define the purpose of the file; accepts at least one or both values 'enroll' (in fact, most of the databases still expect the British English spelling 'enrol') and 'probe', or can be None
  • model_ids: to limit the enroll or probe files to the given model ids

If the database does not need those parameters, they can simply be ignored, e.g., the protocol is ignored in bob.db.atnt.Database.objects() since only a single protocol is defined in the AT&T database. Other keyword parameters might be present as well. Commonly, the other keyword parameters limit only the training files since the development and evaluation files are strictly defined by the protocol.
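
The keyword parameters listed above can be illustrated with a small in-memory sketch. Entry, ENTRIES and the filtering logic are hypothetical and written for this example only. The sketch assumes one model per client (so the model id equals the client id) and applies model_ids only to enrollment files; a real database implements its own logic.

```python
from collections import namedtuple

# Hypothetical record type and data; a real database derives from File.
Entry = namedtuple("Entry", ["id", "client_id", "path", "group", "purpose"])

ENTRIES = [
    Entry(1, 1, "s1/1", "world", None),
    Entry(2, 2, "s2/1", "dev", "enroll"),
    Entry(3, 2, "s2/2", "dev", "probe"),
    Entry(4, 3, "s3/1", "dev", "enroll"),
]


def _as_tuple(value):
    """Accept None, a single value, or a tuple/list of values."""
    if value is None:
        return None
    if isinstance(value, (list, tuple)):
        return tuple(value)
    return (value,)


def objects(groups=None, protocol=None, purposes=None, model_ids=None):
    # protocol is accepted but ignored: this toy database has one protocol.
    groups, purposes, model_ids = map(_as_tuple, (groups, purposes, model_ids))
    result = []
    for entry in ENTRIES:
        if groups is not None and entry.group not in groups:
            continue
        if purposes is not None and entry.purpose not in purposes:
            continue
        # Assumption: model id == client id; only enrollment files are limited.
        if (model_ids is not None and entry.purpose == "enroll"
                and entry.client_id not in model_ids):
            continue
        result.append(entry)
    return result
```

For example, objects(groups="dev", purposes="enroll", model_ids=2) selects only the enrollment file of client 2 in this toy data.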

General functions

Some more generic functions concerning file names are defined in the Database interface as well. The bob.db.verification.utils.Database.file_names() method returns the list of file names for the given list of files, the base directory and the extension. When the original_directory and the original_extension were specified in the bob.db.verification.utils.Database constructor, the bob.db.verification.utils.Database.original_file_name() function returns the full path of the original data file for a given bob.db.verification.utils.File object, while bob.db.verification.utils.Database.original_file_names() does the same for a list of files. Both functions accept a parameter check_existence, which is True by default and controls whether the existence of the original data file is checked.
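
The effect of check_existence can be sketched as follows. This standalone function is a hypothetical imitation, not the library method; in particular, raising ValueError for a missing file is an assumption of this sketch, and the directory name is made up.

```python
import os


def original_file_name(relative_path, original_directory, original_extension,
                       check_existence=True):
    """Build the full path of an original data file; optionally verify it."""
    file_name = os.path.join(original_directory,
                             relative_path + original_extension)
    # Assumption of this sketch: a missing file is reported as a ValueError.
    if check_existence and not os.path.exists(file_name):
        raise ValueError("The file '%s' was not found" % file_name)
    return file_name


# With the check disabled, the name is built even if the file is absent.
name = original_file_name("s1/1", "/no/such/dir", ".pgm", check_existence=False)
```

Disabling the check is mainly useful when only the file names are needed, for instance to list them without having the original data at hand.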

The bob.db.verification.utils.Database.annotations() method returns a dictionary of annotations of any kind for the given bob.db.verification.utils.File, see Annotations. If no annotation is available for the given file, or the database does not define any annotations, None is returned.

In some special cases (like the bob.db.frgc.Database), a protocol requires that a single File object contains several actual data files. In this case, the bob.db.frgc.Database.provides_file_set_for_protocol() method returns True, while most other databases use the default implementation bob.db.verification.utils.Database.provides_file_set_for_protocol(), which returns False.

Functions for ZT score normalization

Several databases inside Bob provide a special subset of the training set that is used for score normalization, particularly ZT score normalization as described in [Auck00]. For a given protocol, these files can be obtained using the bob.db.verification.utils.ZTDatabase, which is derived from the bob.db.verification.utils.Database. All keyword arguments in the constructor of bob.db.verification.utils.ZTDatabase (i.e., original_directory and original_extension) are directly passed to the bob.db.verification.utils.Database constructor.

The ZT database adds three query functions. The method bob.db.verification.utils.ZTDatabase.t_model_ids() returns the list of model ids for T-Norm, where the list of files to enroll a T-Norm model of a given model id can be obtained with the bob.db.verification.utils.ZTDatabase.t_enroll_files() method. Finally, bob.db.verification.utils.ZTDatabase.z_probe_files() returns the list of probe files that are used for Z-Norm.
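
To show how the files returned by these functions are typically used, the following sketch normalizes a single raw score in the Z-Norm and the T-Norm style, i.e., by subtracting the mean of a set of cohort scores and dividing by their standard deviation. All score values are made up, the function name normalize is hypothetical, and the use of the population standard deviation is an assumption of this sketch; it does not reproduce Bob's own implementation.

```python
from statistics import mean, pstdev


def normalize(raw_score, cohort_scores):
    """Shift and scale a raw score by the mean and std of cohort scores."""
    return (raw_score - mean(cohort_scores)) / pstdev(cohort_scores)


# Hypothetical raw score for one (client model, probe file) comparison.
raw = 2.0
# Z-Norm cohort: the client model scored against the Z-Norm probe files.
model_vs_z_probes = [0.0, 1.0, -1.0, 0.0]
# T-Norm cohort: the probe file scored against the T-Norm models.
t_models_vs_probe = [0.5, 1.5, 0.5, 1.5]

z_score = normalize(raw, model_vs_z_probes)
t_score = normalize(raw, t_models_vs_probe)
```

ZT score normalization combines both steps; see [Auck00] for the exact procedure.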

[Auck00] Roland Auckenthaler, Michael Carey, and Harvey Lloyd-Thomas. Score Normalization for Text-Independent Speaker Verification Systems. Digital Signal Processing, pages 42-54, 2000.

SQLite Databases

Several database interfaces rely on SQLite to generate and query a local SQL database file, which stores information about the files, clients, protocols and, if applicable, annotations. To simplify the handling of SQLite queries, the bob.db.verification.utils.SQLiteDatabase is provided, which derives from bob.db.verification.utils.Database. In the constructor of the bob.db.verification.utils.SQLiteDatabase, simply specify the local SQLite database file. For technical reasons, the class derived from bob.db.verification.utils.File also needs to be given as a parameter. All other keyword arguments (i.e., original_directory and original_extension) are passed directly to the bob.db.verification.utils.Database constructor.

The most important function in this class is bob.db.verification.utils.SQLiteDatabase.query(), which is heavily used in derived classes to query objects from the local SQL database file. Commonly, such a query looks somewhat like:

self.query(File).join(...).filter().order_by(...)

to retrieve a list (in fact, an iterator) of File objects that fulfill your requirements. Internally, the function first checks that the database is valid using bob.db.verification.utils.SQLiteDatabase.is_valid(), i.e., that the local SQL database file exists and a session is opened for reading.
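
The join, filter and order steps of such a query can be illustrated without the database interface itself. The sketch below deliberately uses Python's standard sqlite3 module and plain SQL instead of the query() function, and the two-table schema with its column names is hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema.
connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE client (id INTEGER PRIMARY KEY, sgroup TEXT);
    CREATE TABLE file (id INTEGER PRIMARY KEY, client_id INTEGER, path TEXT);
    INSERT INTO client VALUES (1, 'world'), (2, 'dev');
    INSERT INTO file VALUES (1, 1, 's1/1'), (2, 2, 's2/2'), (3, 2, 's2/1');
""")

# Join files to their clients, filter by group, and order by file id.
rows = connection.execute(
    "SELECT file.id, file.path FROM file "
    "JOIN client ON file.client_id = client.id "
    "WHERE client.sgroup = ? ORDER BY file.id",
    ("dev",),
).fetchall()
connection.close()
```

Here rows holds the id and path of every file whose client belongs to the dev group, ordered by file id.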

Three additional methods, bob.db.verification.utils.SQLiteDatabase.files(), bob.db.verification.utils.SQLiteDatabase.paths() and bob.db.verification.utils.SQLiteDatabase.reverse(), exist in this database interface. All three are mainly used in the command line interface of the databases, using the ./bin/bob_dbmanage.py command set up in the bob.db.base.driver.Interface.

Annotations

Many biometric databases come with annotations for each original data file. For face biometrics, these annotations usually contain hand-labeled locations of several feature points in the face (so-called facial landmarks). Most commonly, at least the locations of the two eyes are annotated. For a given file object, the bob.db.verification.utils.Database.annotations() method will return a dictionary of those annotations, which might differ from database to database. Commonly, for face image biometrics the returned annotations look somewhat like:

{
  'reye' : (re_y, re_x),
  'leye' : (le_y, le_x),
  ...
}

where 'reye' and 'leye' refer to the right and left eye of the subject shown in the image, with left and right taken from the perspective of the subject. (re_y, re_x) contains the y and x coordinates of the right eye, and (le_y, le_x) likewise for the left eye.
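
As a small usage sketch, the distance between the eyes can be computed from such a dictionary. The coordinate values are made up and the function name inter_eye_distance is hypothetical; note the (y, x) order of each tuple.

```python
import math

# Hypothetical annotation values in (y, x) order, as described above.
# The subject's right eye appears on the left side of the image (smaller x).
annotations = {"reye": (120.0, 80.0), "leye": (120.0, 140.0)}


def inter_eye_distance(annots):
    """Euclidean distance between the two eye centers."""
    re_y, re_x = annots["reye"]
    le_y, le_x = annots["leye"]
    return math.hypot(le_y - re_y, le_x - re_x)


distance = inter_eye_distance(annotations)
```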

Annotations are stored in different formats. Some databases like the bob.db.banca.Database store annotations as part of the SQLite database, i.e., as bob.db.banca.Annotation.

Other databases read annotations from files, where usually one annotation file exists for each original data file. Since most of the file formats are consistent between databases, we here provide the function bob.db.verification.utils.read_annotation_file() that can read various types of annotation files. Currently, it supports three annotation_types:

  • 'eyecenter': Four coordinates of the eyes are stored in a single line in the file, in the order: re_x re_y le_x le_y

  • 'named': Each file contains a list of named annotations, one annotation per line.

    The names of the annotations are used as keys in the annotations dictionary (e.g., 'reye'). The value of an annotation is either two floats (i.e., each line contains 3 items: name name_x name_y), or a single string (e.g., gender female).

  • 'idiap': A special format for the 22pt facial image point annotations from Idiap.

    In addition to the 22 labeled landmarks, the 'reye' and 'leye' landmarks are estimated by averaging the coordinates of the corresponding inner and outer eye corners.

If your database stores annotations in a different way, e.g. as in bob.db.multipie.Database, you need to write your own annotation file IO.
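
A custom reader usually only needs a few lines. The following sketch parses the 'eyecenter' and 'named' layouts described above from strings; parse_eyecenter and parse_named are hypothetical helpers and not the library function, and the conversion of the x y order in the file to (y, x) tuples is an assumption made to match the dictionary layout shown earlier.

```python
def parse_eyecenter(line):
    """Parse 're_x re_y le_x le_y' into a dictionary of (y, x) tuples."""
    re_x, re_y, le_x, le_y = (float(value) for value in line.split())
    return {"reye": (re_y, re_x), "leye": (le_y, le_x)}


def parse_named(lines):
    """Parse lines of 'name x y' (two floats) or 'name value' (one string)."""
    annotations = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            name, x, y = parts
            # Assumption: store positions as (y, x), like 'reye' and 'leye'.
            annotations[name] = (float(y), float(x))
        elif len(parts) == 2:
            annotations[parts[0]] = parts[1]
    return annotations


eyes = parse_eyecenter("80 120 140 120")
named = parse_named(["reye 80 120", "leye 140 120", "gender female"])
```

Both calls yield the same eye positions; the 'named' variant additionally keeps the string-valued gender entry.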