Manual
++++++

.. contents ::

Differences To Other Software
=============================

BuildKit is really 4 pieces of software in one. It has code for

* creating a template from a directory structure that can be used to generate similar directory structures
* generating an empty Python package set up for use with ``pip``, ``sphinx`` and ``pypi`` with tests, doctests and packagable as a ``.deb`` file
* setting up packages to use the ``git-flow`` methodology and hosting those projects
* automatically generating ``.deb`` packages for a Python package and all its dependencies
* running Python tests and generating both Python ``.egg`` and source releases for a package
* setting up and managing a sophisticated test and release infrastructure using Debian packages, Debian apt repositories and KVM virtualisation

Other tools also do some of these things. For example, you may also be interested in:

``python-apt``

    Tools for managing built packages in an ``apt`` environment

``alien``

    Can automatically build a ``.deb`` from a Python package
    build with ``python setup.py bdist``.

Buildkit also used to do:

* building a sdist release, installing into a clean virtualenv and running tests in isolation
* maintaining a website with different versions of different packages' documentation available and links to the downloads

These features will be added back in a future release.

Buildkit Tutorial
=================

.. caution ::

   Buildkit only works on Ubuntu 10.04 LTS. Any other platform is untested and vitually guaranteed to break. Only use on Ubuntu 10.04.

   Also note that for the VM funtionality to work, you will need virtualisation CPU extensions. You can check you have the necessary support like this:

   ::
       $ sudo apt-get install kvm
       $ kvm-ok
       INFO: Your CPU supports KVM extensions
       INFO: /dev/kvm exists
       KVM acceleration can be used

    You can create a VM without KVM support but you won't be able to run it.

BuildKit is installed as a Debian package so the first thing you need to do is
to use it to create a debian package from the source.

Unzip the source distribution and change to the same directory as ``setup.py``.

Now install all the dependencies listed in
``buildkit_deb/DEBIAN/control.template``. At the time of writing you can do so
like this:

::

    sudo apt-get install ubuntu-vm-builder python-vm-builder gawk kvm sed findutils rsync apache2 reprepro gnupg wget dh-make devscripts build-essential fakeroot alien cdbs python-pip python-virtualenv subversion mercurial git-core apt-proxy kvm-pxe uml-utilities

When the email configuration pops up choose "Internet Site" then accept the
hostname suggested (or choose your own).

Then build the buildkit ``.deb`` files like this:

::

    PACKAGEVERSION=01
    mkdir -p dist/buildkit
    python -m buildkit.run pkg nonpython -p "$PACKAGEVERSION" -o dist/buildkit --deb buildkit_deb
    python -m buildkit.run pkg python -p "$PACKAGEVERSION" -o dist/buildkit --author-email james@pythonweb.org --deb .

You'll then get two ``.deb`` files in ``dist/buildkit`` which you can install:

::

    sudo dpkg -i dist/buildkit/*.deb

You'll most likely see this as part of the install:

::

    Creating key in /var/lib/buildkit/key with name BuildKit-Automatic-Packaging, email buildkit@example.com, passphrase buildkit and comment BuildKitkey ...
    Working to create /var/lib/buildkit/key ...
    gpg: directory `/home/james/.gnupg' created
    gpg: new configuration file `/home/james/.gnupg/gpg.conf' created
    gpg: WARNING: options in `/home/james/.gnupg/gpg.conf' are not yet active during this run
    gpg: keyring `/home/james/.gnupg/secring.gpg' created
    gpg: keyring `/home/james/.gnupg/pubring.gpg' created
    gpg: Generating a basic OpenPGP key for buildkit, THIS CAN TAKE A FEW MINUTES if there is not enough entropy ...
    gpg: skipping control `%no-protection' ()
    .+++++++++++++++.++++++++++..++++++++++++++++++++.++++++++++.++++++++++++++++++++++++++++++.++++++++++.++++++++++++++++++++..+++++.++++++++++>.++++++++++.....................................+++++

    Not enough random bytes available.  Please do some other work to give
    the OS a chance to collect more entropy!  (Need 280 more bytes)


You just need to do some other work for a minute or two. Perhaps type on the
keyboard or copy a file, check email etc. Eventually gpg will collect enough
entropy and generate you a key which it uses to sign your packages.

QUESTION: Does the buildkit install leave you at a root prompt by mistake?

The default install assumes you will be setting the "host.buildkit" hostname to
whichever system you will host your repository on and run VMs on. In this case
this will be localhost so edit ``/etc/hosts`` to add the "host.buildkit" hostname
to 127.0.0.1:

::

    127.0.0.1       localhost host.buildkit

At this point your repository will be running at http://host.buildkit and
apt-proxy will be installed and running at http://host.buildkit:9999/ . The
latter will give you an error about not enough slashes in the URL if you visit
it because it only expects to be visited with a full package path.

If you want git-flow support you'll now need to run:

::

    sudo buildkit-gitflow-installer

Check you have support for KVM:

::

    $ kvm-ok
    INFO: Your CPU supports KVM extensions
    INFO: /dev/kvm exists
    KVM acceleration can be used

You can create a VM without KVM support but you won't be able to run it. Here's
how you create one (the --proxy argument should be the IP address of the system
running apt-proxy, in this case your local machine):

::

    IP=`/sbin/ifconfig $NETWORK_DEVICE | grep 'inet addr:' | cut -d: -f2 | awk '{ print $1}' | grep -v "127.0.0.1" | grep -v "192.168.100."`
    sudo buildkit vm create --proxy $IP -o /var/lib/buildkit/vm/ 10

You can check that apt-proxy has been used like this:

::

    sudo ls /var/cache/apt-proxy/ubuntu/pool/main/

If the directory exists and is populated, the files from here will be used next
time you create a VM. The creation takes nearly as long though because files
are still pulled in over HTTP, just served from apt-proxy rather than direclty.
It does save bandwidth though.

In reality it is usually easier to just copy the ``.qcow2`` VM disk file to create
a new VM. Let's keep this one as a base VM:

::

    export IMAGE=`sudo ls /var/lib/buildkit/vm/buildkit10/ | awk '{print $0}' | grep -v "run.sh" | grep -v "disk.raw"`
    sudo cp -p /var/lib/buildkit/vm/buildkit10/${IMAGE} /var/lib/buildkit/vm/base.qcow2

You can always just copy the VM manually too, you just have to find out what
the image name is in the ``buildkit10`` directory.

Whenever you want a new VM you can then just run:

::

    sudo -u buildkit qemu-img convert -f qcow2 -O raw /var/lib/buildkit/vm/base.qcow2 /var/lib/buildkit/vm/new/disk.raw

This converts from the small .qcow2 file to a fresh ``disk.raw`` image.

Now let's start it (change eth1 for your network interface):

::

    sudo buildkit-vm-start eth1 qtap0 512M 1 /var/lib/buildkit/vm/buildkit10/disk.raw

Now you can connect from the host to the guest over SSH:

::

    ssh ubuntu@192.168.100.10

The  username and password for the VM are both ``ubuntu``. You can also use
``sudo -s`` with the password  ``ubuntu`` to get root access. You may want to
change the password with ``passwd``.


Example: Building and Testing the CKAN Package Install
======================================================

CKAN is an open source metadata catalogue that powers sites like data.gov.uk
and which uses buildkit for its package install. In this section we'll walk
through how to use buildkit to package it.

Setting up
----------

First you need to get the source code for the version you want to package:

::

    hg clone -r release-v1.5 https://bitbucket.org/okfn/ckan/

Next you need to install buildkit, either from source (as described above) or
from an apt-repository where it is hosted. Once it is installed you'll have an
apt repository running on your local machine as well as the ``buildkit``
command and the ability to boot virtual machines for testing. (You'll need to
build a base VM using the ``buildkit vm create`` command as described above).

The individual buildkit commands that are needed to build CKAN are specified in
the ``build.sh`` script so you should take a look at that.

Creating an apt repository
--------------------------

The ``build.sh`` script exports all the ``.deb`` files that are created to an
apt repository on your local machine that is hosted by Apache and set up as
part of the buildkit install. Before you can run the script you need to create
the repository that will be used:

::

    sudo -u buildkit buildkit repo clone /var/lib/buildkit/repo/base_lucid ckan-1.5

Check that there are no packages in the repository yet:

::

    sudo -u buildkit buildkit repo list /var/lib/buildkit/repo/ckan-1.5

There shouldn't be any output.

Now on to the packaging itself.

Packaging
---------

First edit ``build.sh`` to set the environment variables relevant to you.

Run the build (not as root) like this:

::

    ./build.sh

At the end of the build you'll be prompted for your password so that ``sudo``
can import the packages into the buildkit repository on your local machine to
serve.

You should end up with a set of packages the buildkit repository accessible
from your apt repository as well as a set in ``ckan/dist/buildkit``.

You can now test the build.

Testing
-------

If you've followed the buildkit tutorial and created a base VM, you can now
create a new virtual machine like so:

::

    sudo -u buildkit mkdir /var/lib/buildkit/vm/ckan
    sudo -u buildkit qemu-img convert -f qcow2 -O raw /var/lib/buildkit/vm/base.qcow2 /var/lib/buildkit/vm/ckan/disk.raw

After a few moments you can start your VM (tip: be sure to specify the correct network interface that the VM should use to access the internet, in this case I've used ``eth1``, yours might be ``eth0``).

::

    sudo buildkit-vm-start eth1 qtap0 1024M 4 /var/lib/buildkit/vm/ckan/disk.raw

Here I'm giving the VM 1024M and letting it use 4 CPUs. For a production CKAN
you should have at least 1.5Gb of RAM.

.. tip ::

    If a QEMU window appears but nothing happens after a few seconds it is
    likely your CPU doesn't support virtualisation extensions needed by KVM. Run
    the ``kvm-ok`` command mentioned earlier to check.

    If KVM isn't supported you could try using virtualbox instead. Start by
    installing VirtualBox:

    ::

        sudo apt-get install virtualbox-ose
        sudo rmmod kvm-intel
        # Or if you have an AMD machine:
        # sudo rmmod kvm-amd

    Then convert the disk image to a ``.vdi`` file:

    ::

        sudo -u buildkit qemu-img convert -f qcow2 -O vdi /var/lib/buildkit/vm/base.qcow2 /var/lib/buildkit/vm/ckan/disk.vdi

    Then use the interface to create a new Ubuntu 10.04 machine with this disk
    image as its base. The networking setup will be different if you use virtualbox
    and you'll need to edit the various ``/etc/hosts`` files yourself to be able to
    test your CKAN install, but if you are a virtualbox expert, it should be
    possible.

    See here for a port forwarding approach that is useful: http://jimmyg.org/blog/2008/ssh-to-a-debian-etch-virtual-machine-in-virtualbox.html

    The alternative is just to install CKAN onto your host machine for testing
    and not worry about VMs at all.

Assuming the ``buildkit-vm-start`` command worked you can now connect from the
host to the guest over SSH:

::

    ssh ubuntu@192.168.100.10

Or if you have installed buildkit as standard and not changed any network
settings you can use the ``default.vm.buildkit`` hostname that buildkit set up
for you when it was installed:

::

    ssh ubuntu@default.vm.buildkit

The  username and password for the VM are both ``ubuntu``. You can also use
``sudo -s`` with the password  ``ubuntu`` to get root access. You may want to
change the password with ``passwd``.

Optionally, you might want to install some common software at this point such
as vim, screen, elinks or any other software you commonly use:

::

    sudo apt-get update
    sudo apt-get install vim-nox screen elinks

If it has been a while since you created the base VM you may also want to
upgrade the core packages at this point:

::

    sudo apt-get update
    sudo apt-get upgrade -y

At this point you can install the ckan package from within the VM (or on your
local machine if you prefer). When you start the VM, the hostame
``host.buildkit`` is set up to point to the host server. The Apache
configuration for the host server is set up serve the apt repo from the
``host.buildkit`` server alias so the commands below will set up access the
host repo. The ``sudo`` password is ``ubuntu`` by default as already mentioned.
Run the commands now:

::

    sudo apt-get update
    sudo apt-get install -y wget
    echo "deb http://host.buildkit/ckan-1.5 lucid universe" | sudo tee /etc/apt/sources.list.d/okfn.list
    wget -qO- "http://host.buildkit/packages_public.key" | sudo apt-key add -
    sudo apt-get update
    sudo apt-get install -y ckan postgresql-8.4 solr-jetty

.. caution ::

    The last line in the commands above installs CKAN, the PostgreSQL database
    engine, and the Solr search index server. If you intend to connect to a PostgreSQL or
    Solr server that is running on a different machine you don't need to
    install them. In that case, when you run the ``ckan-create-instance`` command later,
    choose ``"no"`` as the third parameter to tell the install command not to
    set up or configure the PostgreSQL database for CKAN. You'll then need to perform any
    database creation and setup steps manually yourself.

If you ever want to upgrade CKAN you can run:

::

    sudo apt-get update
    sudo apt-get upgrade

Sometimes a new CKAN release comes with extra packages. This is considered by
Ubuntu to be a "dist upgrade". In this case run:

::

    sudo apt-get update
    sudo apt-get dist-upgrade

CKAN-specific instructions
--------------------------

In this section we'll look at preciesly how the rest of CKAN is set up. This
serves as a useful example of how you might design your own software to be set
up.

The install will whirr away, downloading over 180Mb of packages (on a clean
install) and take a few minutes, then towards the end you'll see this:

::

    Setting up solr-jetty (1.4.0+ds1-1ubuntu1) ...
     * Not starting jetty - edit /etc/default/jetty and change NO_START to be 0 (or comment it out).

You'll need to configure Solr for use with CKAN. You can do so like this:

::

    sudo ckan-setup-solr

This changes the Solr schema to support CKAN, sets Solr to start automatically
and then starts Solr. You shouldn't be using the Solr instance for anything
apart from CKAN because the command above modifies its schema.

You can now create CKAN instances as you please using the
``ckan-create-instance`` command. It takes these arguments:

Instance name

    This should be a short letter only string representing the name of the CKAN
    instance. It is used (amongst other things) as the basis for:

    * The directory structure of the instance in ``/var/lib/ckan``, ``/var/log/ckan``, ``/etc/ckan`` and elsewhere
    * The name of the PostgreSQL database to use
    * The name of the Solr core to use

Instance Hostname/domain name

    The hostname that this CKAN instance will be hosted at. It is
    used in the Apache configuration virutal host in
    ``/etc/apache2/sites-available/<INSTANCE_NAME>.common`` so that Apache can resolve
    requests directly to CKAN.

    If you install more than one CKAN instance you'll need to set different
    hostnames for each. If you ever want to change the hostname CKAN responds on
    you can do so by editing ``/etc/apache2/sites-available/<INSTANCE_NAME>.common`` and
    restarting apache with ``sudo /etc/init.d/apache2 restart``.

Local PostgreSQL support (``"yes"`` or ``"no"``)

    If you specify ``"yes"``, CKAN will also set up a local database user and
    database and create its tables, populating them as necessary and saving the
    database password in the config file. You would normally say ``"yes"`` unless
    you plan to use CKAN with a PostgreSQL on a remote machine.

For production use the second argument above is usually the domain name of the
CKAN instance, but in our case we are testing, so we'll use the default
hostname buildkit sets up to the server which is ``default.vm.buildkit`` (this
is automatically added to your host machine's ``/etc/hosts`` when the VM is
started so that it will resovle from your host machine - for more complex
setups you'll have to set up DNS entries instead).

Create a new instance like this:

::

    sudo ckan-create-instance std default.vm.buildkit yes

You'll need to specify a new instance name and different hostname for each CKAN
instance you set up.

You can now access your CKAN instance from your host machine as http://default.vm.buildkit/

.. tip ::

    More detailed CKAN instructions are available via the "Package Documentation"
    link at http://pypi.python.org/pypi/ckan/.

Potential Packaging Issues
==========================

There are some gotchas to be aware of with ``buildkit`` so far:

* The packaging process occasionally strips ``__init__.py`` files of all their
  content. It is therefore best to never have information in ``__init__.py``
  files which is why, for extensions, we now have plugins implemented in
  ``plugin.py`` rather than ``__init__.py``.
* Packaging sometimes strips our key directories, such as any named ``dist``,
  they just won't be present in the packaged version.

A future implementation of the packaging may be able to address these
deficiencies. I also have some ideas for other possible future CKAN
enhancements:

* Creating a new instance could also automatically restore from any latest
  dumps that existed for that instance
* When "conflict" functionality is used in the Python packaging, the code is copied
  directly into the main project. At the moment it is the packager's
  responsibility to ensure that the licenses of those conflicting modules are
  copied into the main license for the overall package. It would be nice if the
  packaging code either gave a warning about this or automatically added the
  licenses.

Other ideas:

* Make the buildkit-vm-create command part of the buildkit command
* Swap apt-proxy for something that also caches downoads from virutal machines
  (it currently gives bad header lines which seems to be a known, yet
  unresolved issue) so there is no caching of install packages used in the
  VMs.

More buildkit help
==================

More documentation to come, at the moment you can work out most of what you
need by browsing the online help starting at:

::

    buildkit --help