Python API¶
This section contains the reference guide for the Python API of bob.db.base.
The db package contains simplified APIs to access data for various databases that can be used in Biometry, Machine Learning or Pattern Classification.
-
class
bob.db.base.
Database
¶ Bases:
object
Low-level Database API to be used within bob.
-
check_parameter_for_validity
(parameter, parameter_description, valid_parameters, default_parameter=None)[source]¶ Checks the given parameter for validity.
Ensures that the given parameter is in the set of valid parameters. If the parameter is None or empty, the value in default_parameter is returned if it is specified; otherwise a ValueError is raised.
This function returns the parameter after the check, or raises a ValueError.
Parameters:
- parameter – str The single parameter to be checked. Might be a string or None.
- parameter_description – str A short description of the parameter. This will be used to raise an exception in case the parameter is not valid.
- valid_parameters – [str] A list/tuple of valid values for the parameters.
- default_parameter – str or None The default parameter that will be returned in case parameter is None or empty. If omitted and parameter is empty, a ValueError is raised.
-
check_parameters_for_validity
(parameters, parameter_description, valid_parameters, default_parameters=None)[source]¶ Checks the given parameters for validity.
Checks that the given parameters are in the set of valid parameters. It also ensures that the parameters form a tuple or a list. If parameters is None or empty, the default_parameters are returned (if default_parameters is omitted, all valid_parameters are returned).
This function returns a tuple or list of parameters, or raises a ValueError.
Parameters:
- parameters – str, [str] or None The parameters to be checked. Might be a string, a list/tuple of strings, or None.
- parameter_description – str A short description of the parameter. This will be used to raise an exception in case the parameter is not valid.
- valid_parameters – [str] A list/tuple of valid values for the parameters.
- default_parameters – [str] or None The list/tuple of default parameters that will be returned in case parameters is None or empty. If omitted, all valid_parameters are used.
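The behaviour described for the two validity checks can be sketched in plain Python. The functions below are illustrative re-implementations written from the descriptions above, not the bob.db.base source; the group names used in the example are assumptions.

```python
def check_parameter_for_validity(parameter, parameter_description,
                                 valid_parameters, default_parameter=None):
    """Return a single validated parameter, or raise ValueError."""
    if parameter is None or parameter == "":
        if default_parameter is None:
            raise ValueError("The %s may not be empty; valid values are %s."
                             % (parameter_description, list(valid_parameters)))
        parameter = default_parameter
    if parameter not in valid_parameters:
        raise ValueError("Invalid %s '%s'; valid values are %s."
                         % (parameter_description, parameter,
                            list(valid_parameters)))
    return parameter


def check_parameters_for_validity(parameters, parameter_description,
                                  valid_parameters, default_parameters=None):
    """Return a list of validated parameters, or raise ValueError."""
    if parameters is None or len(parameters) == 0:
        # Fall back to the defaults, or to every valid value if none are given.
        if default_parameters is not None:
            parameters = default_parameters
        else:
            parameters = valid_parameters
    if isinstance(parameters, str):
        parameters = [parameters]  # a single string becomes a one-element list
    for p in parameters:
        if p not in valid_parameters:
            raise ValueError("Invalid %s '%s'; valid values are %s."
                             % (parameter_description, p,
                                list(valid_parameters)))
    return list(parameters)


# Hypothetical group names, used only for illustration.
groups = ("world", "dev", "eval")
selected = check_parameters_for_validity("dev", "group", groups)
everything = check_parameters_for_validity(None, "group", groups)
```

A database query method would typically call such a check first, so that a misspelled group name fails early with a readable message instead of silently returning an empty result.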
-
convert_names_to_highlevel
(names, low_level_names, high_level_names)[source]¶ Converts group names from a low-level to a high-level API.
This is useful, for example, when you want to return db.groups() for bob.bio.base. Your instance of the database should already have low_level_names and high_level_names initialized.
-
convert_names_to_lowlevel
(names, low_level_names, high_level_names)[source]¶ Same as convert_names_to_highlevel, but in reverse.
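A minimal sketch of the name conversion, assuming that low_level_names and high_level_names are equally long sequences aligned by position (the reference text does not state this explicitly). The helper convert_names and the example group names are hypothetical and not part of bob.db.base.

```python
def convert_names(names, from_names, to_names):
    """Map each name in `names` from one naming scheme to the other.

    Assumes `from_names` and `to_names` are aligned by position.
    """
    if names is None:
        return None
    mapping = dict(zip(from_names, to_names))
    if isinstance(names, str):
        return mapping.get(names)
    return [mapping[n] for n in names]


# Hypothetical naming schemes, for illustration only.
low_level_names = ("train", "validation", "test")
high_level_names = ("world", "dev", "eval")

# Low level -> high level, and back again.
high = convert_names(["train", "test"], low_level_names, high_level_names)
low = convert_names(high, high_level_names, low_level_names)
```

Swapping the last two arguments gives the reverse direction, which is why one helper is enough for this sketch.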
-
file_names
(files, directory, extension) → paths[source]¶ Returns the full path of the given File objects.
Parameters:
- files : [bob.db.base.File]
- The list of file objects to retrieve the file names for.
- directory : str
- The base directory, where the files can be found.
- extension : str
- The file name extension to add to all files.
Returns:
- paths : [str]
- The paths extracted for the files, in the same order.
-
original_file_name
(file)[source]¶ This function returns the original file name for the given File object.
Keyword parameters:
- file : bob.bio.base.database.BioFile or a derivative
- The File object for which the file name should be retrieved.
- Return value : str
- The original file name for the given File object.
-
original_file_names
(files) → paths[source]¶ Returns the full path of the original data of the given File objects.
Parameters:
- files : [bob.db.base.File]
- The list of file objects to retrieve the original data file names for.
Returns:
- paths : [str]
- The paths extracted for the files, in the same order.
-
sort
(files) → sorted[source]¶ Returns a sorted version of the given list of File objects (or other structures that define an ‘id’ data member). The files will be sorted according to their id, and duplicate entries will be removed.
Parameters:
- files : [bob.bio.base.database.BioFile]
- The list of files to be uniquified and sorted.
Returns:
- sorted : [bob.bio.base.database.BioFile]
- The sorted list of files, with duplicate BioFile.ids removed.
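The sorting and de-duplication by id can be sketched as follows. FakeFile is a hypothetical stand-in that only carries the id and path attributes, and sort_files is an illustrative helper, not the library implementation.

```python
class FakeFile:
    """Hypothetical stand-in for a File object; only `id` and `path` matter."""

    def __init__(self, file_id, path):
        self.id = file_id
        self.path = path


def sort_files(files):
    """Return the files ordered by id, keeping one entry per id."""
    unique = {}
    for f in files:
        unique.setdefault(f.id, f)  # keep the first file seen for each id
    return [unique[i] for i in sorted(unique)]


files = [FakeFile(3, "c"), FakeFile(1, "a"), FakeFile(3, "c"), FakeFile(2, "b")]
sorted_files = sort_files(files)
```

The input list is left untouched; the function returns a new list.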
-
-
class
bob.db.base.
File
(path, file_id=None)¶ Bases:
object
Abstract class that defines basic properties of File objects. Your file instance should have at least the self.id and self.path properties.
-
load
(directory=None, extension='.hdf5')[source]¶ Loads the data at the specified location and using the given extension. Override it if you need to load differently.
Keyword Parameters:
- directory
- [optional] If not empty or None, this directory is prefixed to the final file location.
- extension
- [optional] The extension of the filename; this controls the codec used to read the data.
-
make_path
(directory=None, extension=None)[source]¶ Wraps the current path so that a complete path is formed
Keyword Parameters:
- directory
- An optional directory name that will be prefixed to the returned result.
- extension
- An optional extension that will be suffixed to the returned filename. The extension normally includes the leading . character, as in .jpg or .hdf5.
Returns a string containing the newly generated file path.
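As a rough illustration of how a complete path is formed from a relative path, an optional directory and an optional extension, consider the sketch below. The free function make_path and the example stems are hypothetical; the real method lives on File objects and uses self.path.

```python
import os


def make_path(path, directory=None, extension=None):
    """Prefix `directory` and suffix `extension` to a relative file path."""
    full = path if not directory else os.path.join(directory, path)
    return full + (extension or "")


# Hypothetical file name stems, relative to the database root.
stems = ["subject01/session1/sample3", "subject02/session1/sample1"]

# One full path per stem, in the same order as the input.
full_paths = [make_path(stem, "/data", ".jpg") for stem in stems]
```

Applying the same function to every file in a list, as in the last line, mirrors what file_names is described as doing for a list of File objects.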
-
save
(data, directory=None, extension='.hdf5', create_directories=True)[source]¶ Saves the input data at the specified location and using the given extension. Override it if you need to save differently.
Keyword Parameters:
- data
- The data blob to be saved (normally a numpy.ndarray).
- directory
- [optional] If not empty or None, this directory is prefixed to the final file destination.
- extension
- [optional] The extension of the filename; this will control the type of output and the codec for saving the input blob.
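Both load and save are meant to be overridden when a database needs a different storage format. The sketch below shows the idea with JSON instead of HDF5. JsonFile is a hypothetical, self-contained stand-in that mimics the documented signatures; a real implementation would derive from bob.db.base.File instead.

```python
import json
import os
import tempfile


class JsonFile:
    """Hypothetical stand-in for a File subclass that stores data as JSON."""

    def __init__(self, path, file_id=None):
        self.path = path
        self.id = file_id

    def make_path(self, directory=None, extension=None):
        full = self.path if not directory else os.path.join(directory, self.path)
        return full + (extension or "")

    def save(self, data, directory=None, extension=".json",
             create_directories=True):
        path = self.make_path(directory, extension)
        if create_directories:
            # Create missing parent directories before writing.
            os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "w") as f:
            json.dump(data, f)

    def load(self, directory=None, extension=".json"):
        with open(self.make_path(directory, extension)) as f:
            return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    sample = JsonFile("subject01/sample3", file_id=1)
    sample.save([1.0, 2.5], directory=tmp)
    restored = sample.load(directory=tmp)
```

Keeping the same argument names and defaults as the base class means that code written against the generic File interface can call the overridden methods unchanged.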
-
-
class
bob.db.base.
SQLiteDatabase
(sqlite_file, file_class)¶ Bases:
bob.db.base.Database
This class can be used for handling SQL databases.
It opens an SQL database in read-only mode and keeps it open during the whole session.
Parameters:
- sqlite_file – str The file name (including full path) of the SQLite file to read or generate.
- file_class – a class instance The File class, which needs to be derived from bob.db.base.File. This is required to be able to query() the databases later on.
-
all_files
(**kwargs)[source]¶ Returns the list of all File objects that satisfy your query.
For possible keyword arguments, please check the implementation’s objects() method.
-
files
(ids, preserve_order=True)[source]¶ Returns a list of File objects with the given file ids.
Parameters:
- ids – list, tuple The ids of the objects in the database table “file”. This object should be a Python iterable (such as a tuple or list).
- preserve_order – bool If True (the default), the order of elements is preserved, but the execution time increases.
Returns: a list (that may be empty) of File objects.
Return type: list
-
paths
(ids, prefix=None, suffix=None, preserve_order=True)[source]¶ Returns the full file paths for the given file ids.
Parameters:
- ids – list, tuple The ids of the objects in the database table “file”. This object should be a Python iterable (such as a tuple or list).
- prefix – str or None The bit of path to be prepended to the filename stem.
- suffix – str or None The extension that will be appended to the filename stem.
- preserve_order – bool If True (the default), the order of elements is preserved, but the execution time increases.
Returns: A list (that may be empty) of the fully constructed paths given the file ids.
Return type: list
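The effect of preserve_order can be illustrated with the standard-library sqlite3 module. This is only a sketch under assumptions: the table layout (a “file” table with id and path columns), the sample rows and the free function paths are hypothetical, and the real class is not implemented this way. An SQL IN query makes no promise about row order, so the results are re-ordered afterwards to match the requested ids, which is the extra work the parameter description refers to.

```python
import os
import sqlite3

# Hypothetical in-memory database with a minimal "file" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (id INTEGER PRIMARY KEY, path TEXT)")
conn.executemany("INSERT INTO file VALUES (?, ?)",
                 [(1, "s1/a"), (2, "s1/b"), (3, "s2/c")])


def paths(ids, prefix=None, suffix=None, preserve_order=True):
    """Return full paths for the given ids, optionally in the order requested."""
    placeholders = ",".join("?" * len(ids))
    rows = conn.execute(
        "SELECT id, path FROM file WHERE id IN (%s)" % placeholders,
        list(ids),
    ).fetchall()
    if preserve_order:
        # Re-order the rows so that they follow the order of `ids`.
        by_id = dict(rows)
        rows = [(i, by_id[i]) for i in ids if i in by_id]
    result = []
    for _, stem in rows:
        full = os.path.join(prefix, stem) if prefix else stem
        result.append(full + (suffix or ""))
    return result


ordered = paths([3, 1], prefix="/data", suffix=".png")
unordered = paths([3, 1], preserve_order=False)
```

With preserve_order=False the same paths come back, but in whatever order the database returns them.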
-
reverse
(paths, preserve_order=True)[source]¶ Reverses the lookup from certain paths and returns a list of File objects.
Parameters:
- paths – [str] The filename stems to query for. This object should be a Python iterable (such as a tuple or list).
- preserve_order – bool If True (the default), the order of elements is preserved, but the execution time increases.
Returns: A list (that may be empty).
Return type: list
-
uniquify
(file_list)[source]¶ Sorts the given list of File objects and removes duplicates from it.
Parameters: file_list – [File] A list of File objects to be handled. Other objects can also be handled, as long as they are sortable.
Returns: A sorted copy of the given file_list with the duplicates removed.
Return type: list
-
bob.db.base.
read_annotation_file
(file_name, annotation_type)¶ This function provides default functionality to read annotation files.
Parameters:
- file_name – str The full path of the annotation file to read.
- annotation_type – str The type of the annotation file that should be read. The following annotation_types are supported:
  - eyecenter: The file contains a single row with four entries: re_x re_y le_x le_y
  - named: The file contains named annotations, one per line, e.g.: reye re_x re_y or pose 25.7
  - idiap: The file contains enumerated annotations, one per line, e.g.: 1 key1_x key1_y, and maybe some additional annotations like gender, age, ...
Returns: A Python dictionary with the keypoint name as key and the position (y,x) as value, and maybe some additional annotations.
Return type: dict
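To make the annotation formats more concrete, here is a small parser sketch for the eyecenter and named types. It works on a list of text lines rather than on a file, it does not cover the idiap type, and the function name parse_annotations as well as the dictionary keys reye and leye used for the eyecenter type are assumptions of this sketch, not confirmed library behaviour.

```python
def parse_annotations(lines, annotation_type):
    """Parse annotation lines into a dict; positions are stored as (y, x)."""
    annotations = {}
    if annotation_type == "eyecenter":
        # A single row with four entries: re_x re_y le_x le_y
        re_x, re_y, le_x, le_y = (float(v) for v in lines[0].split())
        # The key names 'reye' and 'leye' are an assumption of this sketch.
        annotations["reye"] = (re_y, re_x)
        annotations["leye"] = (le_y, le_x)
    elif annotation_type == "named":
        for line in lines:
            tokens = line.split()
            if len(tokens) == 3:
                # "name x y": the position is stored as (y, x)
                annotations[tokens[0]] = (float(tokens[2]), float(tokens[1]))
            elif len(tokens) == 2:
                # "name value", e.g. "pose 25.7"
                annotations[tokens[0]] = float(tokens[1])
    else:
        raise ValueError("Unsupported annotation type: %r" % annotation_type)
    return annotations


eyes = parse_annotations(["110 120 210 118"], "eyecenter")
named = parse_annotations(["reye 110 120", "pose 25.7"], "named")
```

Note the coordinate swap: the files list x before y, while the returned positions are (y, x).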
Database Handling Utilities¶
Some utilities shared by many of the databases.
-
bob.db.base.utils.
apsw_is_available
()[source]¶ Checks lock-ability for SQLite on the current file system
-
class
bob.db.base.utils.
SQLiteConnector
(filename, readonly=False, lock=None)[source]¶ Bases:
object
An object that handles the connection to SQLite databases.
-
APSW_IS_AVAILABLE
= False¶
-
-
bob.db.base.utils.
session
(dbtype, dbfile, echo=False)[source]¶ Creates a session to an SQLite database
-
bob.db.base.utils.
session_try_readonly
(dbtype, dbfile, echo=False)[source]¶ Creates a read-only session to an SQLite database.
If read-only sessions are not supported by the underlying sqlite3 python DB driver, then a normal session is returned. A warning is emitted in case the underlying filesystem does not support locking properly.
Raises: NotImplementedError – if the dbtype is not supported.
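The idea of a read-only connection can be tried out with the standard-library sqlite3 module alone. This sketch does not use the session object returned by session_try_readonly; it only shows how SQLite itself refuses writes when a database is opened with mode=ro. The file name and table contents are hypothetical.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    dbfile = os.path.join(tmp, "db.sql3")  # hypothetical file name

    # Create a small database with a normal read-write connection.
    rw = sqlite3.connect(dbfile)
    rw.execute("CREATE TABLE file (id INTEGER PRIMARY KEY, path TEXT)")
    rw.execute("INSERT INTO file VALUES (1, 's1/a')")
    rw.commit()
    rw.close()

    # Re-open the same file read-only through an SQLite URI.
    ro = sqlite3.connect(Path(dbfile).as_uri() + "?mode=ro", uri=True)
    rows = ro.execute("SELECT id, path FROM file").fetchall()
    try:
        ro.execute("INSERT INTO file VALUES (2, 's1/b')")
        write_failed = False
    except sqlite3.OperationalError:
        # SQLite rejects writes on a read-only connection.
        write_failed = True
    ro.close()
```

Reading works as usual, while the attempted INSERT raises sqlite3.OperationalError.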
-
bob.db.base.utils.
create_engine_try_nolock
(dbtype, dbfile, echo=False)[source]¶ Creates an engine connected to an SQLite database with no locks.
If engines without locks are not supported by the underlying sqlite3 python DB driver, then a normal engine is returned. A warning is emitted if the underlying filesystem does not support locking properly in this case.
Raises: NotImplementedError – if the dbtype is not supported.
-
bob.db.base.utils.
session_try_nolock
(dbtype, dbfile, echo=False)[source]¶ Creates a session to an SQLite database with no locks.
If sessions without locks are not supported by the underlying sqlite3 python DB driver, then a normal session is returned. A warning is emitted if the underlying filesystem does not support locking properly in this case.
Raises: NotImplementedError – if the dbtype is not supported.
-
bob.db.base.utils.
connection_string
(dbtype, dbfile, opts={})[source]¶ Returns a connection string for supported platforms
-
bob.db.base.utils.
resolved
(x)¶
-
bob.db.base.utils.
safe_tarmembers
(archive)[source]¶ Gets a list of safe members to extract from a tar archive
This list excludes:
- Full paths outside the destination sandbox
- Symbolic or hard links to outside the destination sandbox
Code came from a StackOverflow answer: http://stackoverflow.com/questions/10060069/safely-extract-zip-or-tar-using-python
Deploy it like this:
    import tarfile
    from bob.db.base.utils import safe_tarmembers

    ar = tarfile.open("foo.tar")
    ar.extractall(path="./sandbox", members=safe_tarmembers(ar))
    ar.close()
Parameters: archive (tarfile.TarFile) – An opened tar file for reading.
Returns: A list of tarfile.TarInfo objects that satisfy the security criteria imposed by this function, as denoted above.
Return type: list
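The two exclusion rules can be sketched with the standard library as follows. This is an independent illustration of the technique, not the code used by safe_tarmembers; the helper names is_within and safe_members, the sandbox argument and the archive contents are assumptions made for the example.

```python
import io
import os
import tarfile


def is_within(base, target):
    """Return True if `target` resolves to a location inside `base`."""
    base = os.path.realpath(base)
    target = os.path.realpath(target)
    return os.path.commonpath([base, target]) == base


def safe_members(archive, sandbox="."):
    """Yield only members whose path and link target stay inside `sandbox`."""
    for member in archive.getmembers():
        dest = os.path.join(sandbox, member.name)
        if not is_within(sandbox, dest):
            continue  # path escapes the sandbox, e.g. "../escape.txt"
        if member.issym():
            # Symbolic link targets are relative to the link's own directory.
            link = os.path.join(os.path.dirname(dest), member.linkname)
            if not is_within(sandbox, link):
                continue
        elif member.islnk():
            # Hard link targets are relative to the archive root.
            link = os.path.join(sandbox, member.linkname)
            if not is_within(sandbox, link):
                continue
        yield member


# Build a small archive in memory: one harmless file and two unsafe entries.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    for name in ("data/ok.txt", "../escape.txt"):
        info = tarfile.TarInfo(name)
        payload = b"x"
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    link = tarfile.TarInfo("data/link")
    link.type = tarfile.SYMTYPE
    link.linkname = "/etc/passwd"
    tar.addfile(link)

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    kept = [m.name for m in safe_members(tar, sandbox="./sandbox")]
```

Nothing is extracted here; the filter only inspects member names and link targets, so it can be run on untrusted archives before any file is written.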
Driver API¶
This module defines, among other less important constructions, a management interface that can be used by Bob to display information about the database and manage installed files.
-
class
bob.db.base.driver.
Interface
[source]¶ Bases:
object
Base manager for Bob databases
You should derive and implement an Interface object on every bob.db package you create.
-
name
()[source]¶ The name of this database
Returns: a Python-conforming name for this database. This must match the package name. If the package is named bob.db.foo, then this function must return foo.
Return type: str
-
files
()[source]¶ List of meta-data files for the package to be downloaded/uploaded
This function should normally return an empty list, except in case the database being implemented requires download/upload of metadata files that are not kept in its (git) repository.
Returns: A Python iterable with all metadata files needed. The paths listed by this method should correspond to full paths (not relative ones) w.r.t. the database package implementing it. This is normally achieved by using pkg_resources.resource_filename().
Return type: list
-
version
()[source]¶ The version of this package
Returns: The current version number defined in setup.py
Return type: str
-
type
()[source]¶ The type of auxiliary files you have for this database
Returns: A string defining the type of database implemented. You can return only two values on this function, either sqlite or text. If you return sqlite, then we append special actions such as dbshell on bob_dbmanage automatically for you. Otherwise, we don’t.
Return type: str
-
setup_parser
(parser, short_description, long_description)[source]¶ Sets up the base parser for this database.
Returns: a subparser, ready for you to add commands to.
-
add_commands
(parser)[source]¶ Adds commands to a given argparse.ArgumentParser
This method, effectively, allows you to define special commands that your database will be able to perform when called from the common driver, for example create or checkfiles.
You are not obliged to override this method. If you do, you will have the chance to establish your own commands. You don’t have to worry about stock commands such as files() or version(). They will be automatically hooked in depending on the values you return for type() and files().
Parameters: parser (argparse.ArgumentParser) – An instance of a parser that you can customize, i.e., call argparse.ArgumentParser.add_argument() on.
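The shape of a driver can be sketched with argparse alone. FooInterface below is a hypothetical, self-contained stand-in: a real driver would derive from bob.db.base.driver.Interface, and how the base class wires the parser through setup_parser is not shown here. The version string, the create command and its --recreate flag are made up for illustration.

```python
import argparse


class FooInterface:
    """Hypothetical stand-in for a driver of a package named bob.db.foo."""

    def name(self):
        return "foo"  # must match the last part of the package name

    def version(self):
        return "0.0.1"  # hypothetical; normally the version from setup.py

    def files(self):
        return []  # no metadata files kept outside the repository

    def type(self):
        return "text"  # either "sqlite" or "text"

    def add_commands(self, parser):
        # `parser` is assumed to be this database's own ArgumentParser.
        subparsers = parser.add_subparsers(dest="command")
        create = subparsers.add_parser(
            "create", help="(hypothetical) create the database")
        create.add_argument("--recreate", action="store_true",
                            help="(hypothetical) start from scratch")


interface = FooInterface()
parser = argparse.ArgumentParser(prog=interface.name())
interface.add_commands(parser)
args = parser.parse_args(["create", "--recreate"])
```

Only the database-specific commands are defined in add_commands; according to the description above, stock commands are hooked in by the framework based on the values returned by type() and files().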
-