Bob Satellite Package Development and Maintenance¶
Note
This guide assumes that you have installed Bob following the instructions on https://www.idiap.ch/software/bob/install.
This tutorial explains how to build and distribute Python-based working environments for Bob. By following these instructions you will be able to:
- Download and install Bob packages to build a global or local working environment including Bob;
- Install Python packages to augment the capabilities of your virtual work environment, e.g., to include a new Python package for a specific purpose, covering functionality that does not necessarily exist in Bob or in any available Satellite Package;
- Implement your own satellite package including either pure Python code, a mixture of C/C++ and Python code, and even pure C/C++ libraries with clean C/C++ interfaces that might be used by other researchers;
- Distribute your work to others in a clean and organized manner.
These instructions heavily rely on the use of Python distutils and zc.buildout. One important advantage of using zc.buildout is that it does not require administrator privileges for setting up any of the above. Furthermore, you will be able to create distributable environments for each project you have. This is a great way to release code for laboratory exercises or for a particular publication that depends on Bob.
Note
The core of our strategy is based on standard tools for defining and deploying Python packages.
If you are not familiar with Python’s setuptools, distutils or PyPI, it can be beneficial to learn about those before you start.
Python’s Setuptools and Distutils are mechanisms to define and distribute Python code in a packaged format, optionally through PyPI, a web-based Python package index and distribution portal.
zc.buildout is a tool to deploy Python packages locally, automatically setting up and encapsulating your work environment.
Anatomy of a buildout Python package¶
The best way to create your package is to download one of the skeletons that are described in this tutorial and build on it, modifying what you need. Fire up a shell window and then do this:
$ wget https://gitlab.idiap.ch/bob/bob.extension/raw/master/bob/extension/data/bob.example.project.tar.bz2
$ tar -xjf bob.example.project.tar.bz2
$ cd bob.example.project
We now recommend that you read the file README.rst, situated at the root of the material you just downloaded. It is written in reStructuredText format (see also reStructuredText Primer).
It contains important information on other functionality such as document generation and unit testing, which will not be covered in this introductory material.
The anatomy of a minimal package should look like the following:
.
+-- MANIFEST.in   # extras to be installed, besides the Python files
+-- README.rst    # a description of the package, in reStructuredText format
+-- bootstrap-buildout.py  # stock script downloaded from zc.buildout's website
+-- buildout.cfg  # buildout configuration
+-- setup.py      # installation + requirements for this particular package
+-- version.txt   # the (current) version of your package
+-- doc           # documentation directory
|   +-- conf.py   # Sphinx configuration
|   +-- index.rst # Documentation starting point for Sphinx
+-- bob           # Python package (a.k.a. "the code")
|   +-- example
|   |   +-- project
|   |   |   +-- script
|   |   |   |   +-- __init__.py
|   |   |   |   +-- version.py
|   |   |   +-- __init__.py
|   |   |   +-- test.py
|   |   +-- __init__.py
|   +-- __init__.py
Our example that you just downloaded contains these files and a few extra ones useful for this tutorial.
Inspect the package so you are aware of its contents.
All files are in text format and should be heavily commented.
The most important file that requires your attention is setup.py.
This file contains the basic information for the Python package you will be creating.
It defines scripts the package provides and dependencies it requires for execution.
To customize the package to your needs, you will need to edit this file and modify it accordingly.
Before doing so, it is suggested you go through all of this tutorial so you are familiar with the whole environment.
The example package, as it is distributed, contains a fully working example.
In the remainder of this document, we explain how to set up setup.py and buildout.cfg so you can work in different operational modes, namely the most common development scenarios.
Pure-Python Packages¶
Pure-Python packages are the most common. They contain code that is written exclusively in Python. This contrasts with packages that are written in a mixture of Python and C/C++, which are explained in more detail below.
The package you downloaded above is a pure-Python example package and contains all elements to get you started. It defines a single library module called bob.example.project, which declares a simple script, called version.py, that prints out the version of the dependent library Blitz++/Python Arrays (bob.blitz). When you first unpack the package, you will not find any executable, as buildout needs to check all dependencies and install missing ones before you can execute anything. In particular, it inspects the setup.py file in the root directory of the package, which contains all the information required to build the package. All of it is contained in the setup function:
setup(
    name = 'bob.example.project',
    version = open("version.txt").read().rstrip(),
    ...
    packages = find_packages(),
    ...
    install_requires = [
        'setuptools',
        'bob.blitz'
    ],
    ...
    entry_points = {
        'console_scripts' : [
            'version.py = bob.example.project.script.version:main',
        ],
    },
)
In detail, it defines the name and the version of this package, which files belong to the package (those files are automatically collected by the find_packages function), other packages that we depend on, namespaces (see below) and console scripts.
The full set of options can be inspected in the Setuptools documentation.
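To make the console_scripts entry above more concrete, here is a minimal sketch of how such a declaration breaks down into a script name, a module path and a function name. The helper parse_entry_point is a hypothetical, hand-written function for illustration only; it is not the parser that setuptools itself uses.

```python
def parse_entry_point(spec):
    """Split 'name = package.module:function' into its three parts.

    Hypothetical helper for illustration; setuptools has its own parser.
    """
    # Everything before the first '=' is the name of the generated script.
    name, _, target = spec.partition("=")
    # The target is 'module.path:callable'.
    module, _, attr = target.partition(":")
    return name.strip(), module.strip(), attr.strip()


name, module, attr = parse_entry_point(
    "version.py = bob.example.project.script.version:main"
)
print(name)    # version.py
print(module)  # bob.example.project.script.version
print(attr)    # main
```

In other words, the generated ./bin/version.py script imports the module bob.example.project.script.version and calls its main function.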
To be able to use the package, we first need to build it. Here is how to go from nothing to everything:
$ python bootstrap-buildout.py
Creating directory '/home/user/bob.example.project/bin'.
Creating directory '/home/user/bob.example.project/parts'.
Creating directory '/home/user/bob.example.project/eggs'.
Creating directory '/home/user/bob.example.project/develop-eggs'.
Generated script '/home/user/bob.example.project/bin/buildout'.
$ ./bin/buildout
Getting distribution for 'bob.buildout'.
Got bob.buildout 2.0.0.
Getting distribution for 'zc.recipe.egg>=2.0.0a3'.
Got zc.recipe.egg 2.0.1.
Develop: '/home/user/bob.example.project/.'
...
Installing scripts.
Getting distribution for 'bob.extension'.
Processing bob.blitz-2.0.0.zip
...
Got bob.blitz 2.0.0.
...
Note
The Python interpreter used in the first line of the previous command set determines the Python interpreter that will be used for all scripts developed inside this package.
To build your environment around a different version of Python, just make sure to choose the interpreter you wish to use when running that first command.
If you just want to get things rolling, using python bootstrap-buildout.py will, in most cases, do the right thing.
Note
If you have installed an older version of Bob (i.e., Bob v1.x), you might need to uninstall it first; see https://www.idiap.ch/software/bob/install.
Warning
Using Bob 2.0 at Idiap
At Idiap, we provide a pre-installed version of the latest stable version of the packages. To use it, refer to: Using Bob at Idiap
Using buildout¶
Buildout (see Using zc.buildout) has set up your local environment with packages that it finds from different sources.
It is initialized by the buildout.cfg file, which is part of the package that you unzipped above.
Let’s have a look inside it:
; vim: set fileencoding=utf-8 :
[buildout]
parts = scripts
develop = .
eggs = bob.example.project
extensions = bob.buildout
newest = false
verbose = true
debug = false
[scripts]
recipe = bob.buildout:scripts
dependent-scripts = true
It is organized in several sections, which are indicated by [], where the default section [buildout] is always required.
Some of the entries need attention.
- The first entry is eggs. Here you can list all Python packages that should be installed, in addition to the ones specified in the install_requires section of setup.py (see above). These packages can include other Bob packages (like any database package such as bob.db.mobio), but also other Python packages such as gridtk. These packages will be available for use in your environment. At the very least, the current package needs to be in the eggs list.
- The extensions list includes all extensions that are required in the buildout process. By default, only bob.buildout is required, but more extensions can be added (see mr.developer below).
- The next entry is the develop list. Here you can list directories that contain Python packages, which will be built in exactly the order that you specify. With this option, you can tell buildout in which directories it should look for particular packages. Note that the develop packages are not automatically included in the eggs. Of course, you need to develop the current package, which is stored in ., i.e., the current directory.
The remaining options define how the packages are built.
For example, the debug flag defines how the C++ code in all the packages is built.
The verbose option controls the verbosity of the build.
When the newest flag is set to true, buildout will install the latest version of every package, even if an older version is already available.
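To get a feel for the section and option structure, the following sketch reads the buildout.cfg shown above with Python's standard configparser module. This is only an approximation: zc.buildout ships its own configuration parser, so configparser is used here as a stand-in, and the helper names read_buildout and as_list are hypothetical.

```python
import configparser

# The buildout.cfg from this tutorial, embedded as a string for the demo.
BUILDOUT_CFG = """\
; vim: set fileencoding=utf-8 :
[buildout]
parts = scripts
develop = .
eggs = bob.example.project
extensions = bob.buildout
newest = false
verbose = true
debug = false

[scripts]
recipe = bob.buildout:scripts
dependent-scripts = true
"""


def read_buildout(text):
    """Parse buildout-style text (stand-in for buildout's own parser)."""
    # Only '=' separates options from values; values may contain ':'.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(text)
    return parser


def as_list(value):
    """List-like options hold one entry per whitespace-separated token."""
    return value.split()


cfg = read_buildout(BUILDOUT_CFG)
print(cfg.sections())
print(as_list(cfg["buildout"]["eggs"]))
print(cfg["buildout"].getboolean("newest"))
```

List-valued options such as eggs or develop may span several indented lines in the file; splitting on whitespace, as as_list does, covers that case in this simplified model.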
Using mr.developer¶
One extension that is regularly used in most of Bob’s packages is mr.developer.
It can be used to automatically check out packages from git repositories and place them into the ./src directory.
It can be set up simply:
[buildout]
...
extensions = bob.buildout
             mr.developer
auto-checkout = *
develop = src/bob.blitz
          .

[sources]
bob.blitz = git https://gitlab.idiap.ch/bob/bob.blitz
...
A new section called [sources] appears, where the package information for mr.developer is initialized. For more details, please read its documentation.
Again, mr.developer does not automatically place the packages into the develop list (nor into the eggs list), so you have to do that yourself.
Running buildout¶
Finally, running buildout is a two-step process, which is detailed above: first python bootstrap-buildout.py, then ./bin/buildout.
The command line ./bin/buildout will actually run buildout and build your local environment.
All options in buildout.cfg can be overridden on the command line by specifying buildout:option=..., where option can be any entry in buildout.cfg.
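The effect of such command-line overrides can be pictured with a small sketch. The function apply_overrides and the dictionary layout are hypothetical and only model the idea that a section:option=value argument replaces the value read from buildout.cfg; this is not buildout's actual implementation.

```python
def apply_overrides(config, overrides):
    """Return a copy of config with 'section:option=value' items applied.

    Hypothetical model of command-line overrides, for illustration only.
    """
    merged = {section: dict(options) for section, options in config.items()}
    for item in overrides:
        target, equals, value = item.partition("=")
        section, colon, option = target.partition(":")
        if not equals or not colon:
            raise ValueError("expected section:option=value, got %r" % item)
        # The command-line value wins over the value from the file.
        merged.setdefault(section, {})[option] = value
    return merged


config = {"buildout": {"newest": "false", "debug": "false"}}
updated = apply_overrides(config, ["buildout:newest=true"])
print(updated["buildout"]["newest"])  # true
print(config["buildout"]["newest"])   # false (the original is untouched)
```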
When run, buildout performs the following steps:
1. It checks out the packages that you specified using mr.developer.
2. It develops all packages in the develop section.
3. It goes through the list of eggs and searches for matching packages in the following order:
   - In one of the already developed directories.
   - In the Python environment, e.g., packages installed with pip.
   - Online, i.e., in the find-links directory, or by default on PyPI.
4. It populates the ./bin directory with all the console_scripts that you have specified in setup.py. In our example, this is ./bin/version.py.
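The search order used for the eggs can be summarised in a few lines of Python. This is a simplified model with hypothetical names (locate_egg and the three example sets); the real lookup also takes version constraints and the newest flag into account.

```python
def locate_egg(name, developed, installed, online):
    """Return where a package would be taken from, or None if not found.

    Simplified model of the search order described above.
    """
    search_order = (
        ("develop", developed),       # directories listed under develop
        ("environment", installed),   # e.g. packages installed with pip
        ("online", online),           # find-links or, by default, PyPI
    )
    for label, source in search_order:
        if name in source:
            return label
    return None


# Hypothetical contents of each source, for illustration only.
developed = {"bob.example.project"}
installed = {"numpy", "setuptools"}
online = {"bob.blitz", "bob.extension", "numpy"}

print(locate_egg("bob.example.project", developed, installed, online))
print(locate_egg("numpy", developed, installed, online))
print(locate_egg("bob.blitz", developed, installed, online))
```

Note how numpy is taken from the environment even though it is also available online: the first matching source wins.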
Note
One thing to note in package development is that when you change the entry points in the setup.py of a package, you need to run ./bin/buildout again.
Your local environment¶
After buildout has finished, you should be able to execute ./bin/version.py:
$ ./bin/version.py
bob.blitz: 2.0.5 [api=0x0201] ([PATH]/eggs/bob.blitz-2.0.5-py2.7-linux-x86_64.egg)
* C/C++ dependencies:
- Blitz++: 0.10
- Boost: 1.55.0
- Compiler: {'version': '4.9.2', 'name': 'gcc'}
- NumPy: {'abi': '0x01000009', 'api': '0x00000009'}
- Python: 2.7.9
* Python dependencies:
- bob.extension: 2.0.7 ([PATH]/bob.example.project/eggs/bob.extension-2.0.7-py2.7.egg)
- numpy: 1.8.2 (/usr/lib/python2.7/dist-packages)
- setuptools: 15.1 ([PATH]/bob.example.project/eggs/setuptools-15.1-py2.7.egg)
Also, when using the newly generated ./bin/python script, you can access all packages that you have developed, including your own package:
$ ./bin/python
>>> import bob.example.project
>>> print (bob.example.project)
<module 'bob.example.project' from '[PATH]/bob/example/project/__init__.py'>
>>> print (bob.example.project.get_config())
bob.example.project: 0.0.1a0 ([PATH]/bob.example.project)
* Python dependencies:
- bob.blitz: 2.0.5 ([PATH]/eggs/bob.blitz-2.0.5-py2.7-linux-x86_64.egg)
- bob.extension: 2.0.7 ([PATH]/bob.example.project/eggs/bob.extension-2.0.7-py2.7.egg)
- numpy: 1.8.2 (/usr/lib/python2.7/dist-packages)
- setuptools: 15.1 ([PATH]/bob.example.project/eggs/setuptools-15.1-py2.7.egg)
Everything is now set up for you to continue the development of this package. Modify all required files to set up your own package name, description and dependencies. Start adding files to your library (or libraries) and, if you wish, make this package available in a place with public access to make your research public. We recommend using Gitlab or GitHub. Optionally, drop us a message about the availability of this package so we can add it to the growing list of Satellite Packages.
Python Package Namespace¶
We like to make use of namespaces to define combined sets of functionality that go well together.
Python package namespaces are explained in detail here, together with implementation details.
For Bob packages, we usually use the bob namespace, with several sub-namespaces such as bob.io, bob.ip, bob.learn, bob.db or (like here) bob.example.
In particular, if you are creating a database access API, please consider putting all of your package contents inside the namespace bob.db.<package>, therefore declaring two namespaces: bob and bob.db.
All standard database access APIs follow this strategy.
Just look at our currently existing database satellite packages for examples.
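The following self-contained sketch shows how two separately distributed packages can share one namespace. It uses the standard library's pkgutil.extend_path in the namespace's __init__.py; this is an assumption made for the demo, and the example project may declare its namespaces differently (check its bob/__init__.py). The namespace name demo_ns and the helper functions are hypothetical.

```python
import importlib
import os
import sys
import tempfile

# Content of the __init__.py that every distribution ships for the namespace.
NAMESPACE_INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_distribution(root, namespace, subpackage, body):
    """Create root/namespace/subpackage with the required __init__.py files."""
    package_dir = os.path.join(root, namespace, subpackage)
    os.makedirs(package_dir)
    with open(os.path.join(root, namespace, "__init__.py"), "w") as handle:
        handle.write(NAMESPACE_INIT)
    with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
        handle.write(body)


def demo():
    """Build two distributions sharing 'demo_ns' and import from both."""
    base = tempfile.mkdtemp()
    dist_a = os.path.join(base, "dist_a")
    dist_b = os.path.join(base, "dist_b")
    make_distribution(dist_a, "demo_ns", "io", "NAME = 'io'\n")
    make_distribution(dist_b, "demo_ns", "db", "NAME = 'db'\n")
    # Both distribution roots must be importable, as buildout arranges
    # for the eggs and developed packages of your environment.
    sys.path[:0] = [dist_a, dist_b]
    importlib.invalidate_caches()
    io_module = importlib.import_module("demo_ns.io")
    db_module = importlib.import_module("demo_ns.db")
    return io_module.NAME, db_module.NAME


print(demo())
```

Without the extend_path call, only the first demo_ns directory found on the path would be used and the import of demo_ns.db would fail.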
Creating Database Satellite Packages¶
Database satellite packages are special satellite packages that can hook into Bob’s database manager bob_dbmanage.py.
Except for this detail, they should look exactly like a normal package.
To allow the database to be hooked into bob_dbmanage.py, you must implement a concrete (non-abstract) Python class that inherits from bob.db.base.driver.Interface.
Your concrete implementation should then be declared in the setup.py file with a special bob.db entry point:
    # bob database declaration
    'bob.db': [
        'example = bob.db.example.driver:Interface',
    ],
At present, there is no formal design guide for databases. Nevertheless, it is considered a good practice to follow the design of currently existing database satellite packages. This should ease migration in case of future changes.
Documentation Generation and Unit Testing¶
If you intend to distribute your newly created package, please consider carefully documenting and creating unit tests for your package. Documentation is a great starting point for users and unit tests can be used to check functionality in unexpected circumstances such as variations in package versions.
Documentation¶
To write documentation, use the Sphinx Documentation Generator.
A template has been set up for you under the doc directory.
Get familiar with Sphinx and then unleash the writer in you.
Once you have edited both doc/conf.py and doc/index.rst, you can run the documentation generator by executing:
$ ./bin/sphinx-build -n doc sphinx
...
This example generates the output of the Sphinx processing in the directory sphinx.
You can find more options for sphinx-build using the -h flag:
$ ./bin/sphinx-build -h
...
Note
If the code you are distributing corresponds to the work described in a publication, don’t forget to mention it in your doc/index.rst file.
Unit Tests¶
Writing unit tests is an important asset for code that needs to run on different platforms and a great way to make sure all is OK. Unit tests are run with nose. To run the unit tests of your package, call:
$ ./bin/nosetests -sv
bob.example.library.test.test_reverse ... ok
----------------------------------------------------------------------
Ran 1 test in 0.253s
OK
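For readers new to nose, a test is simply a function whose name starts with test and that raises an exception (typically through assert) when something is wrong. The sketch below is a hypothetical example: the reverse function and its test are made up for illustration and are not the actual contents of the example package.

```python
def reverse(sequence):
    """Return the items of sequence in reverse order, as a list.

    Hypothetical function under test, for illustration only.
    """
    return list(reversed(sequence))


def test_reverse():
    # nose collects functions whose names start with 'test'.
    assert reverse([1, 2, 3]) == [3, 2, 1]
    assert reverse([]) == []
    assert reverse("ab") == ["b", "a"]


# The test can also be called directly; it returns silently on success.
test_reverse()
```

Placed in a module such as test.py inside your package, a function like test_reverse would be picked up by ./bin/nosetests automatically.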
Distributing Your Work¶
To distribute a package, we recommend you use PyPI.
The Hitchhiker’s Guide to Packaging contains details and good examples on how to achieve this.
In particular, you should edit your README.rst file to have a proper description of your package.
This file will be used to generate the front page of your package on PyPI and will, hence, be the first contact point of the world with your package.
Note
If you are writing a package to extend Bob, you might want to follow the README structure of all Bob packages.
The README.rst of this package (bob.extension) is a good example, including all the badges that show the current status of the package and the links to relevant information.
To make your life easier, we also provide a script that runs all the steps needed to publish your package.
Please read the following paragraphs to understand the steps performed by the ./bin/bob_new_version.py script, which is explained at the end of this section.
Version Numbering Scheme¶
We recommend you follow Bob’s version numbering scheme using a 3-tier string: M.m.p.
The value of M is a number starting at 1.
This number is changed in case of a major release that brings new APIs and concepts to the table.
The value of m is a number starting at 0.
Every time a new API is available (but no conceptual modifications are made to the platform), that number is increased.
Finally, the value of p represents the patch level, starting at 0.
Every time we need to post a new version of Bob that does not bring incompatible API modifications, that number is increased.
For example, version 1.0.0 is the first release of Bob.
Version 1.0.1 would be the first patch release.
Note
The numbering scheme for your package and Bob’s may look the same, but they should be totally independent of each other.
Bob may be on version 3.4.2 while your package, still compatible with that release, could be on 1.4.5.
You should state in your setup.py file which version of Bob your package is compatible with, using the standard notation defined for setuptools installation requirements for packages.
You may use version number extenders for alpha, beta, and candidate releases with the above scheme, by appending aN, bN or cN to the version number.
The value of N should be an integer starting at zero.
Python’s setuptools package will correctly classify package versions following this simple scheme.
For more information on package version numbers, consult Python’s PEP 386 (since superseded by PEP 440).
Here is a list of valid Python version numbers following this scheme:
0.0.1
0.1.0a35
1.2.3b44
2.4.99c32
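As a rough illustration of the scheme, the sketch below validates version strings of the form M.m.p with an optional aN, bN or cN suffix and turns them into tuples that sort in release order. The regular expression and the parse_version helper are hypothetical simplifications written for this tutorial; setuptools applies its own, more general rules.

```python
import re

# M.m.p followed by an optional a/b/c extender with its number N.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:([abc])(\d+))?$")


def parse_version(text):
    """Return a sortable tuple for an M.m.p[aN|bN|cN] version string."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError("not an M.m.p[aN|bN|cN] version: %r" % text)
    major, minor, patch, stage, number = match.groups()
    # A final release sorts after every pre-release of the same M.m.p,
    # so it gets a marker that compares greater than 'a', 'b' and 'c'.
    pre_release = (stage, int(number)) if stage else ("z", 0)
    return (int(major), int(minor), int(patch)) + pre_release


versions = ["1.2.3b44", "0.0.1", "2.4.99c32", "0.1.0a35"]
print(sorted(versions, key=parse_version))
print(parse_version("1.0.1a0") < parse_version("1.0.1"))
```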
Release Methodology for Satellite Packages¶
Here is a set of steps we recommend you follow when releasing a new version of your satellite package:
1. First decide on the new version number your package will get. If you are making a minor, API-preserving modification to an existing stable package (already published on PyPI), just increment the last digit of the version. Bigger changes may require that you signal them to users by changing the first digits of the version. Alpha, beta or candidate releases don’t need to have the main components of their version changed; just bump up the last digit. For example, 1.0.3a3 would become 1.0.3a4.
2. In case you are making an API modification to your package, you should consider whether you would like to branch your repository at this point. You don’t have to care about this detail with new packages, naturally.
   If required, branching will allow you to still make modifications (patches) to the old version of the code while developing on the master branch for the new release, in parallel. It is important to branch when you break functionality in existing code, for example to reach compatibility with an upcoming version of Bob. After a few major releases, your repository should look somewhat like this:
       ----> time
       initial commit
       o---------------o---------o-----o-----------------------> master
                       |         |     |
                       |         |     |   v2.0.0
                       |         |     +---x----------> 2.0
                       |         |
                       |         | v1.1.0  v1.1.1
                       |         +-x-------x------> 1.1
                       |
                       |   v1.0.0  v1.0.1a0
                       +---x-------x-------> 1.0
   The o’s mark the points at which you decided to branch your project. The x’s mark places where you decided to release a new version of your satellite package on PyPI. The -’s mark commits on your repository. Time flows from left to right.
   In this fictitious representation, the master branch continues under development, but one can see that older branches don’t receive much attention anymore.
   Here is an example of creating a branch on gitlab (many of our satellite packages are hosted there). Let’s create a branch called 1.1:
       $ git branch 1.1
       $ git checkout 1.1
       $ git push origin 1.1
3. When you decide to release something publicly, we recommend that you tag the version of the package in your repository, so you have a marker for the code you actually published on PyPI. Tagging on gitlab would go like this:
       $ git tag v1.1.0
       $ git push && git push --tags
   Notice that we prefix tag names with v.
4. Finally, after branching and tagging, it is time for you to publish your new package on PyPI. When the package is ready and you have tested it, just do the following:
       $ ./bin/python setup.py register #if you modified your setup.py or README.rst
       $ ./bin/python setup.py sdist --formats zip upload
   Note
   You can also check the .zip file that will be uploaded to PyPI before actually uploading it. Just call:
       $ ./bin/python setup.py sdist --formats zip
   and check what was put into the dist directory.
   Note
   To be able to upload a package to PyPI you have to register at the web page using a user name and password.
5. Announce the update on the relevant channels.
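The rule from the first step, bumping the last digit of the version, can be sketched as follows. The helper bump_last is hypothetical and only covers this simplest case; deciding when to change the first digits remains a judgment call.

```python
import re


def bump_last(version):
    """Increment the trailing number of a version string.

    Hypothetical helper: '1.0.3' becomes '1.0.4', '1.0.3a3' becomes '1.0.3a4'.
    """
    match = re.match(r"^(.*?)(\d+)$", version)
    if match is None:
        raise ValueError("version does not end with a number: %r" % version)
    prefix, last = match.groups()
    return prefix + str(int(last) + 1)


print(bump_last("1.0.3a3"))  # 1.0.3a4
print(bump_last("1.0.3"))    # 1.0.4
print(bump_last("1.0.9"))    # 1.0.10
```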
Upload Additional Documentation to PythonHosted.org¶
In case you have written additional Sphinx documentation in your satellite package that you want to share with the world, there is an easy way to push the documentation to PythonHosted.org. More detailed information is given here, which translates roughly into:
$ ./bin/python setup.py build_sphinx --source-dir=doc --build-dir=build/doc --all-files
$ ./bin/python setup.py upload_docs --upload-dir=build/doc/html
The link to the documentation will automatically be added to the PyPI page of your package. Usually it is a good idea to check the documentation after building and before uploading.
Change the Version of your Satellite Package¶
Understanding and following all the steps to publish (a new version of) your package takes quite some work, especially when you want to update the git repository and the version on PyPI at the same time. In total, five steps need to be performed, in the right order. These steps are:
- Adding a tag in your git repository, possibly after changing the version of your package.
- Running buildout to build your package.
- Register and upload your package at PyPI.
- Upload the documentation of your package to PythonHosted.org.
and, finally, to keep track of new changes:
- Switch to a new version number.
All these steps are combined in the ./bin/bob_new_version.py script.
This script needs to be run from within the root directory of your package.
By default, it will make an elaborate guess on the version that you want to upload.
Please run:
$ ./bin/bob_new_version.py --help
to see a list of options.
You can get detailed information about what the script would do by using the --dry-run option (a step that you should always consider before actually executing the script):
$ ./bin/bob_new_version.py -vv --dry-run
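To see how the five steps line up with the commands shown earlier in this guide, here is a sketch that only assembles the command lines without running anything, in the spirit of a dry run. It is not the implementation of bob_new_version.py: the function release_plan is hypothetical, and the last command (writing the next version into version.txt) is an assumption about how the version switch could be done by hand.

```python
def release_plan(version, next_version):
    """List the manual commands matching the five release steps.

    Hypothetical sketch; nothing is executed, the commands are only listed.
    """
    return [
        # 1. Tag the release in git and publish the tag.
        "git tag v%s" % version,
        "git push && git push --tags",
        # 2. Rebuild the package.
        "./bin/buildout",
        # 3. Register and upload the package at PyPI.
        "./bin/python setup.py register",
        "./bin/python setup.py sdist --formats zip upload",
        # 4. Build and upload the documentation.
        "./bin/python setup.py build_sphinx --source-dir=doc"
        " --build-dir=build/doc --all-files",
        "./bin/python setup.py upload_docs --upload-dir=build/doc/html",
        # 5. Switch to the next development version (assumed manual step).
        "echo %s > version.txt" % next_version,
    ]


for command in release_plan("1.1.0", "1.1.1a0"):
    print(command)
```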
Satellite Packages Available¶
Look here for our growing list of Satellite Packages.