Extending Python with C/C++ code¶
Bob relies heavily on a mixture of a user-friendly, easy-to-develop Python interface and a fast C++ implementation of identified bottlenecks. Typically, Bob’s packages include both a pure C++ library, which can be included and linked in pure C++ code, and Python bindings for that C++ code. Sometimes, even a C-API for the Python bindings is available. But let’s go step by step...
C or C++/Python Packages¶
Creating C++/Python bindings should be rather straightforward. Only a few adaptations are needed to get the C/C++ code compiled and added as an extension. For simplicity, we created an example package that includes a simple C++ extension. You can check it out by:
$ wget https://gitlab.idiap.ch/bob/bob.extension/raw/master/bob/extension/data/bob.example.extension.tar.bz2
$ tar -xjf bob.example.extension.tar.bz2
$ cd bob.example.extension
One difference to pure Python packages is that an additional file, requirements.txt, can be found in the root directory of the package.
In this file, all packages that are directly required to compile the C/C++ code in your package are listed.
For our example, this is only the bob.blitz package.
Indirectly required packages will be downloaded and installed automatically.
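To make the role of requirements.txt more concrete, here is a minimal sketch of how such a file could be turned into a list of package names. The helper parse_requirements is hypothetical and written only for illustration; it is not the implementation of bob.extension.utils.load_requirements, and the comment handling is an assumption.

```python
def parse_requirements(text):
    """Return the non-empty, non-comment entries of a requirements file.

    Hypothetical helper for illustration only; it is NOT the code behind
    bob.extension.utils.load_requirements.
    """
    packages = []
    for line in text.splitlines():
        # Assumption: everything after a '#' is a comment
        line = line.split("#", 1)[0].strip()
        if line:
            packages.append(line)
    return packages


# Contents of a requirements.txt that lists only bob.blitz, as in the example package
example = """
# packages needed to compile the C/C++ code
bob.blitz
"""

print(parse_requirements(example))
```

A list obtained this way has the same shape as the setup_packages and bob_packages lists used in the setup.py, so the three can simply be concatenated.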
The second big difference is in the setup.py.
To be able to import bob.extension and bob.blitz in the setup.py, we need to include some code:
setup_packages = ['bob.extension', 'bob.blitz']
bob_packages = []
from setuptools import setup, find_packages, dist
dist.Distribution(dict(setup_requires = setup_packages + bob_packages))
We keep setup_packages and bob_packages in separate variables since we will need them later.
bob_packages contains the list of bob packages that this extension directly depends on.
In our example, we only depend on bob.blitz, which is already listed in setup_packages, so we can leave the list empty.
As the second step, we need to add some lines in the header of the file to tell the setuptools system to compile our library with our Extension class:
# import the Extension class and the build_ext function from bob.blitz
from bob.blitz.extension import Extension, build_ext
# load the requirements.txt for additional requirements
from bob.extension.utils import load_requirements
build_requires = setup_packages + bob_packages + load_requirements()
In fact, we don’t use the Extension class from bob.extension, but the one from bob.blitz.extension, which is derived from it.
The difference is that bob.blitz.extension.Extension adds all header files and libraries for the Blitz++ library.
Third, we have to add an extension using the Extension class, by listing all C/C++ files that should be compiled into the extension:
# read version from version.txt file
version = open("version.txt").read().rstrip()
setup(
...
setup_requires = build_requires,
install_requires = build_requires,
...
ext_modules = [
Extension("bob.example.extension._library",
[
# the pure C++ code
"bob/example/extension/Function.cpp",
# the Python bindings
"bob/example/extension/main.cpp",
],
version = version,
bob_packages = bob_packages
),
... #add more extensions if you wish
],
...
)
These modifications will allow you to compile extensions that are linked against our core Python-C++ bridge bob.blitz (by default).
You can specify any other pkg-config module (for example, boost or opencv) using the packages parameter, and it will be linked in.
For boost packages, you might need to define which boost modules are required.
By default, when using boost you should at least add the system module, i.e., by:
setup(
...
ext_modules = [
Extension(
...
packages = ['boost'],
boost_modules = ['system'],
),
...
],
...
)
Other modules and options can be set manually using the standard options for Python extensions.
Most of the bob packages come with pure C++ code and Python bindings, where we commonly use the Python C-API for the bindings.
When your library compiles and links against the pure C++ code, you can simply use the bob_packages parameter as above.
This will automatically add the desired include and library directories, as well as the libraries and the required preprocessor options.
Note
Usually we provide one extension, version, that deals with versioning.
One example of such a version extension can be found in our example.
In our example, we have defined a small C++ function, which also shows the basic bridge between numpy.ndarray and its C++ counterpart, Blitz++.
Basically, there are two C++ files for our extension.
bob/example/extension/Function.cpp contains the pure C++ implementation of the function.
In bob/example/extension/main.cpp, we define the Python bindings to that function, including the creation of a complete Python module called _library.
Additionally, we give a short example of how to use our documentation classes provided in this module (see below for more details).
Finally, the function reverse from the module _library is imported into our module in the bob/example/extension/__init__.py file.
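For orientation, the behaviour of the bound function can be mirrored in a few lines of pure Python. The sketch below assumes only what the function documentation states, namely that reverse returns a copy of a 1D float array with reversed order of entries. The name reverse_reference is hypothetical, and the sketch works on plain lists rather than numpy.ndarray to stay self-contained.

```python
def reverse_reference(values):
    """Return a new list of floats with the entries of ``values`` reversed.

    Hypothetical pure-Python stand-in for the C++ ``reverse`` function;
    the input is left untouched, as a copy is returned.
    """
    return [float(v) for v in reversed(values)]


data = [1, 2, 3, 4, 5]
result = reverse_reference(data)

print(data, "reversed is", result)
```

A reference like this can be handy in unit tests, where the output of the C++ binding is compared against the pure-Python result.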
Note
In the bindings of the reverse function in bob/example/extension/main.cpp, we make use of some C++ defines that make life easier:
- We use a BOB_TRY and BOB_CATCH_FUNCTION block around the function call, as explained in Helper utilities.
  Warning
  By choosing debug = true in your buildout.cfg (which is the default, see below), the C++ exception handling will be disabled (in order to allow debuggers like gdb or gdb-python to handle these exceptions properly). As a result, any C++ exception will be handled by the default C++ exception handler, which reports the exception in the console and stops the program (including any running Python shells).
- We use a bob::extension::FunctionDoc to generate a proper function documentation in Python, as explained in Documenting your C/C++ Python Extension.
To compile your C++ Python bindings and the pure C++ libraries, you can follow the same instructions as shown above:
$ python bootstrap-buildout.py
...
$ ./bin/buildout
...
Note
By default, we compile the source code (of this and all dependent packages, both the ones installed as eggs and the ones developed using mr.developer) in debug mode.
If you want to change that, set the corresponding flag in the buildout.cfg to debug = False, and the compilation will be done with optimization flags and C++ exception handling enabled.
Now, we can use the script ./bin/reverse.py (that we have registered in the setup.py) to reverse a list of floats, using the C++ implementation of the reverse function:
$ ./bin/reverse.py 1 2 3 4 5
[1.0, 2.0, 3.0, 4.0, 5.0] reversed is [ 5. 4. 3. 2. 1.]
We can also see that the function documentation has made it into the module, too:
$ ./bin/python
>>> import bob.example.extension
>>> help(bob.example.extension)
and that we can list the version and the dependencies of our package:
>>> print (bob.example.extension.version)
0.0.1a0
>>> print (bob.example.extension.get_config())
...
Pure C/C++ Libraries Inside your Package¶
If you want to provide a library with pure C++ code in your package as well, you can use the bob.extension.Library class.
It will automatically compile your C/C++ code using CMake into a shared library that you can import in your own C/C++-Python bindings, as well as in other packages.
Again, a complete example can be downloaded via:
$ wget https://gitlab.idiap.ch/bob/bob.extension/raw/master/examples/bob.example.library.tar.bz2
$ tar -xjf bob.example.library.tar.bz2
$ cd bob.example.library
To generate a Library, simply add it to the list of ext_modules:
...
# import the Extension and Library classes and the build_ext function from bob.blitz
from bob.blitz.extension import Extension, Library, build_ext
...
setup(
ext_modules = [
# declare a pure C/C++ library just the same way as an extension
Library("bob.example.library.bob_example_library",
# list of pure C/C++ files compiled into this library
[
"bob/example/library/cpp/Function.cpp",
],
version = version,
bob_packages = bob_packages,
),
# all other extensions will automatically link against the Library defined above
Extension("bob.example.library._library",
# list of files compiled into this extension
[
# the Python bindings
"bob/example/library/main.cpp",
],
version = version,
bob_packages = bob_packages,
),
... #add more Extensions if you wish
],
cmdclass = {
'build_ext': build_ext
},
...
)
Again, we use the overloaded library class bob.blitz.extension.Library instead of bob.extension.Library, but the parameters are identical, and identical to the ones of bob.extension.Extension. To avoid later complications, you should follow the guidelines for libraries in bob packages:
- The name of the C++ library needs to be identical to the name of your package (replacing the ‘.’ by ‘_’).
  Also, the package name needs to be part of it.
  For example, to create a library for the bob.example.library package, it should be called bob.example.library.bob_example_library. In this way it is assured that the libraries are found by the bob_packages parameter (see above).
- All header files that your C++ library should export need to be placed in the directory bob/example/library/include/bob.example.library. Again, this is the default directory where bob_packages expects the includes to be. This is also the directory that is added to your own library and to your extensions, so you don’t need to specify it by hand.
- The include directory should contain a config.h file, which contains C/C++ preprocessor directives that define the current version of your C/C++ API. With this, we make sure that the version of the library that is linked into other packages is the expected one. One such file is again given in our bob.example.library example.
- To avoid conflicts with other functions, you should put all your exported C++ functions into an appropriate namespace.
  In our example, this should be something like bob::example::library.
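The naming guidelines above are purely mechanical, so they can be summarised in a short sketch. The helper library_layout is hypothetical and not part of bob.extension; it only derives the conventional library target, include directory and namespace from a package name.

```python
def library_layout(package):
    """Derive the conventional names for a bob package such as
    'bob.example.library'.

    Hypothetical helper for illustration; it only encodes the naming
    guidelines and does not touch the file system.
    """
    # Library name: the package name with '.' replaced by '_'
    library_name = package.replace(".", "_")
    return {
        # Full target name passed to Library(...): package + '.' + library name
        "target": package + "." + library_name,
        # Directory in which the exported header files are expected
        "include_dir": package.replace(".", "/") + "/include/" + package,
        # Suggested C++ namespace for exported functions
        "namespace": package.replace(".", "::"),
    }


layout = library_layout("bob.example.library")
for key, value in layout.items():
    print(key, "=", value)
```

For bob.example.library this yields exactly the names used in the example package, which makes it easy to check a new package against the guidelines.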
The newly generated Library will be automatically linked to all other Extensions in the package. No worries, if the library is not used in the extension, the linker should be able to figure that out...
You can also export your Python bindings to be used in other libraries. Unfortunately, this is an extremely tedious process and is not explained in detail here. As an example, you might want (or maybe not) to have a look into bob.blitz/bob/blitz/include/bob.blitz/capi.h.
Compiling your Library and Extension¶
As shown above, to compile your C++ Python bindings and the pure C++ libraries, you can follow the simple instructions:
$ python bootstrap-buildout.py
...
$ ./bin/buildout
...
This will automatically check out all required bob_packages and compile them locally.
Afterwards, the C++ code from this package will be compiled, using a newly created build directory for temporary output.
After compilation, this directory can be safely removed (re-compiling will re-create it).
To get the source code compiled using another build directory, you can define a BOB_BUILD_DIRECTORY environment variable, e.g.:
$ python bootstrap-buildout.py
...
$ BOB_BUILD_DIRECTORY=/tmp/build_bob ./bin/buildout
...
The C++ code of this package and the code of all other bob_packages will be compiled using the selected directory.
Again, after compilation this directory can be safely removed.
Note
For Idiapers, the Note from above applies again.
Another environment variable enables parallel compilation of C or C++ code.
Use BOB_BUILD_PARALLEL=X (where X is the number of parallel processes you want) to enable parallel building.
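If buildout is started from a Python script rather than from the shell, both variables can be placed into the environment of the child process. The following is a minimal sketch under that assumption. The helper buildout_environment is hypothetical, and the actual call to ./bin/buildout is only indicated in a comment so that the example runs anywhere.

```python
import os


def buildout_environment(build_dir=None, parallel=None, base=None):
    """Return a copy of the environment with the Bob build variables set.

    Hypothetical helper: ``base`` defaults to the current process
    environment and is never modified in place.
    """
    env = dict(os.environ if base is None else base)
    if build_dir is not None:
        env["BOB_BUILD_DIRECTORY"] = build_dir
    if parallel is not None:
        # Environment values must be strings
        env["BOB_BUILD_PARALLEL"] = str(parallel)
    return env


env = buildout_environment("/tmp/build_bob", 4, base={})
print(env)

# The environment could then be handed to the build, for example:
#   subprocess.check_call(["./bin/buildout"], env=buildout_environment("/tmp/build_bob", 4))
```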
Documenting your C/C++ Python Extension¶
This package also provides some functions that make it easy to generate proper Python documentation for your bound C/C++ functions.
For the API documentation of the package, please read C++ API of the Documentation classes.
One example of a function documentation can be found in the file bob/example/library/main.cpp, which you have downloaded above.
This documentation can be used after:
#include <bob.extension/documentation.h>
Function documentation¶
To generate a properly aligned function documentation, you can use the bob::extension::FunctionDoc:
bob::extension::FunctionDoc description(
"function_name",
"Short function description",
"Optional long function description"
);
Note
If you want to document a member function of a class, you should set the fourth (boolean) option to true. This is required since the default Python class member documentation is indented by four more spaces, which we need to balance:
bob::extension::FunctionDoc member_function_description(
"function_name",
"Short function description",
"Optional long function description",
true
);
Using this object, you can add several parts of the function that need documentation:
- description.add_prototype("variable1, variable2", "return1, return2"); can be used to add function definitions (i.e., ways to use your function). This function needs to be called at least once. If the function does not define a return value, it can be left out (in which case the default "None" is used).
- description.add_parameter("variable1, variable2", "datatype", "Variable description"); should be called for each variable that you have used in the prototypes.
- description.add_return("return1", "datatype", "Return value description"); should be called for each return value that you have used in the prototypes.
Note
All these functions return a reference to the object, so that you can use them in line, e.g.:
static auto description = bob::extension::FunctionDoc(...)
.add_prototype(...)
.add_parameter(...)
.add_return(...)
;
A complete working example, the function documentation of the reverse function in the bob.example.library package, would look like this:
static bob::extension::FunctionDoc reverse_doc = bob::extension::FunctionDoc(
"reverse",
"This is a simple example of bridging between blitz arrays (C++) and numpy.ndarrays (Python)",
"Detailed documentation of the function goes here."
)
.add_prototype("array", "reversed")
.add_parameter("array", "array_like (1D, float)", "The array to reverse")
.add_return("reversed", "array_like (1D, float)", "A copy of the ``array`` with reversed order of entries")
;
Finally, when binding your function, you can use:
- description.name() to get the name of the function
- description.doc() to get the aligned documentation of the function, properly indented and broken at 80 characters (by default). This call will check that all parameters and return values are documented, and add a .. todo:: directive if not.
- description.kwlist(index) to get the list of keyword arguments for the given prototype index that can be passed as the keywords parameter to the PyArg_ParseTupleAndKeywords() function.
In our example, the binding would look like:
PyMethodDef methods[] = {
...
{
reverse_doc.name(),
(PyCFunction)PyBobExampleLibrary_Reverse,
METH_VARARGS|METH_KEYWORDS,
reverse_doc.doc()
},
...
{NULL} // Sentinel
};
Sphinx directives like .. note::, .. warning:: or .. math:: will be automatically detected and aligned when they are used as one-line directives, e.g.:
"(more text)\n\n.. note:: This is a note\n\n(more text)"
Also, enumerations and listings (using the * character to define a list element) are handled automatically:
"(more text)\n\n* Point 1\n* Point 2\n\n(more text)"
Note
Please ensure that directives are surrounded by double \n characters (see example above) so that they are treated as separate paragraphs.
Otherwise, they will not be displayed correctly.
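Why the blank lines matter can be illustrated with a generic paragraph-based wrapper. The sketch below uses Python's textwrap module and is only an analogy: it is an assumption that the documentation classes work paragraph by paragraph in a comparable way, and the function wrap_paragraphs is hypothetical, not part of bob.extension.

```python
import textwrap


def wrap_paragraphs(doc, width=80):
    """Wrap every blank-line-separated paragraph of ``doc`` on its own.

    Illustration only: single newlines inside a paragraph are treated as
    ordinary whitespace, exactly as textwrap.fill does by default.
    """
    paragraphs = doc.split("\n\n")
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)


good = "(more text)\n\n.. note:: This is a note\n\n(more text)"
bad = "(more text)\n.. note:: This is a note\n(more text)"

print(wrap_paragraphs(good))
print("---")
print(wrap_paragraphs(bad))
```

With double \n characters the directive stays in its own paragraph. With single \n characters it is merged into the surrounding text, and Sphinx would no longer recognise it as a directive.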
Note
The .. todo:: directive does not seem to tolerate being broken at 80 characters.
If you want to use .. todo::, please call, e.g., description.doc(10000) to avoid line breaking.
Note
To increase readability, you might want to split your documentation lines, e.g.:
"(more text)\n"
"\n"
"* Point 1\n"
"* Point 2\n"
"\n"
"(more text)"
Leading white-spaces in the documentation string are handled correctly, so you can use several layers of indentation.
Class documentation¶
To document a bound class, you can use the bob::extension::ClassDoc to align and wrap your documentation.
Again, during binding you can use the functions description.name() and description.doc() as above.
Additionally, the class documentation has a function to add constructor definitions, which takes a bob::extension::FunctionDoc object.
The shortest way to get a proper class documentation is:
auto my_class_doc =
bob::extension::ClassDoc("class_name", "Short description", "Long Description")
.add_constructor(
bob::extension::FunctionDoc("class_name", "Constructor Description")
.add_prototype("param1", "")
.add_parameter("param1", "type1", "Description of param1")
)
;
Note
The second parameter "" in add_prototype prevents the output type (which otherwise defaults to "None") from being written.
Note
For constructor documentations, there is no need to declare them as member functions. This is done automatically for you.
Note
You can use the bob::extension::ClassDoc::kwlist() function to retrieve the kwlist of the constructor documentation.
Currently, bob::extension::ClassDoc allows highlighting member functions or variables at the beginning of the class documentation.
This highlighting is still under development and might not work as expected.
Possible speed issues¶
In order to speed up the loading time of the modules, you might want to reduce the amount of documentation that is generated (though I haven’t experienced any speed differences).
For this purpose, just compile your bindings using the "-DBOB_SHORT_DOCSTRINGS" compiler option, e.g., by simply defining an environment variable BOB_SHORT_DOCSTRINGS=1 before invoking buildout.
In either case, only the short descriptions will be returned as the doc string.