Packaging distributions

You can use distil to package distributions. With distutils or setuptools / distribute, you specify what will be packaged by passing arguments to the setup() function in setup.py. However, we are moving away from executable code and towards declarative metadata. Accordingly, distil describes how to package a distribution in a declarative metadata file, package.json, which it uses instead of setup.py. This metadata is described in Packaging metadata, but you can see what it looks like for most distributions on PyPI by using distil to download them:

$ distil download -d /tmp config
Downloading http://www.red-dove.com/config-0.3.7.tar.gz to /tmp/config-0.3.7
    31KB @ 461 KB/s 100 % Done: 00:00:00
Unpacking ... done.
$ cat /tmp/config-0.3.7/package.json
{
  "source": {
    "modules": [
      "config"
    ]
  },
  "version": "1",
  "metadata": {
    "maintainer": "Vinay Sajip",
    "name": "config",
    "license": "Copyright (C) 2004-2007 by Vinay Sajip. All Rights Reserved. See LICENSE for license.",
    "author": "Vinay Sajip",
    "home-page": "http://www.red-dove.com/python_config.html",
    "summary": "A hierarchical, easy-to-use, powerful configuration module for Python",
    "version": "0.3.7",
    "maintainer-email": "vinay_sajip@red-dove.com",
    "author-email": "vinay_sajip@red-dove.com",
    "description": "This module allows a hierarchical configuration scheme with support for mappings\nand sequences, cross-references between one part of the configuration and\nanother, the ability to flexibly access real Python objects without full-blown\neval(), an include facility, simple expression evaluation and the ability to\nchange, save, cascade and merge configurations. Interfaces easily with\nenvironment variables and command-line options. It has been developed on python\n2.3 but should work on version 2.2 or greater."
  }
}

This metadata is automatically generated from distributions which are on PyPI. If a particular distribution you download using distil comes without a package.json file, there could be a number of reasons for this:

  • The distribution has recently been uploaded to PyPI, and the processing machinery hasn’t got around to it yet. Try again in a few hours.
  • The processing machinery has failed to process the distribution on PyPI, which could be due to a bug in the automatic processing code or a bug in the distribution’s setup.py.
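
Because package.json is plain JSON, it can be inspected with nothing more than the Python standard library. The following is a minimal sketch, not part of distil: the helper names load_package_metadata and describe are hypothetical, and the loader simply returns None when the file is absent, which is what you would see in either of the cases listed above.

```python
import json
import os


def load_package_metadata(project_dir):
    """Return the parsed package.json from project_dir, or None if absent.

    Hypothetical helper for illustration; it is not part of distil.
    """
    path = os.path.join(project_dir, "package.json")
    if not os.path.exists(path):
        # No metadata was generated for this distribution.
        return None
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def describe(data):
    """Summarise the index metadata and the modules that would be packaged."""
    meta = data.get("metadata", {})
    modules = data.get("source", {}).get("modules", [])
    return "%s %s (modules: %s)" % (
        meta.get("name"), meta.get("version"), ", ".join(modules))
```

For the config download shown earlier, describe(load_package_metadata("/tmp/config-0.3.7")) would summarise the name, version and module list found in that file.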

Source distributions

You can use distil to build source distributions in .tar.gz, .tar.bz2 or .zip formats.

Let’s consider a simple distribution called frobozz. The source tree looks like this:

.
├── docs
│   ├── _build
│   ├── conf.py
│   ├── index.rst
│   ├── make.bat
│   ├── Makefile
│   ├── _static
│   └── _templates
├── frobozz.py
├── MANIFEST
├── README
└── setup.py

The setup.py looks like this:

from distutils.core import setup

setup(
    name='frobozz',
    version='0.1',
    py_modules=['frobozz'],
    author='Distlib User',
    author_email='distlib.user@dummy.org',
)

Let’s replace the setup.py with the equivalent package.json:

{
  "source": {
    "include": [
      "README"
    ],
    "modules": [
      "frobozz"
    ]
  },
  "version": 1,
  "metadata": {
    "version": "0.1",
    "name": "frobozz",
    "author-email": "distlib.user@dummy.org",
    "author": "Distlib User"
  }
}
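
The correspondence between the setup() keyword arguments and the package.json above can be sketched in Python. This is only an illustration under stated assumptions: it covers just the keys that appear in the frobozz example, the function name setup_kwargs_to_package_json is hypothetical, and the include list is passed in by hand because setup.py here has no equivalent argument. It is not how distil itself converts metadata.

```python
import json

# Mapping of setup() keywords to index-metadata keys, limited to the
# keys that appear in the frobozz example (an assumption of this sketch).
METADATA_KEYS = {
    "name": "name",
    "version": "version",
    "author": "author",
    "author_email": "author-email",
}


def setup_kwargs_to_package_json(kwargs, include=()):
    """Build a package.json-style dict from setup() keyword arguments."""
    metadata = {METADATA_KEYS[k]: v for k, v in kwargs.items()
                if k in METADATA_KEYS}
    source = {}
    if include:
        source["include"] = list(include)
    if "py_modules" in kwargs:
        # py_modules in setup() corresponds to source.modules in package.json.
        source["modules"] = list(kwargs["py_modules"])
    return {"source": source, "version": 1, "metadata": metadata}


if __name__ == "__main__":
    result = setup_kwargs_to_package_json(
        dict(name="frobozz", version="0.1", py_modules=["frobozz"],
             author="Distlib User", author_email="distlib.user@dummy.org"),
        include=["README"],
    )
    print(json.dumps(result, indent=2))
```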

If we’re in the root directory of the frobozz project, we can package it by simply issuing the command:

$ distil package
The following packages were built:
  frobozz-0.1.tar.gz

By default, a .tar.gz source archive is built in the dist directory. Let’s look at its contents:

$ tar tzvf dist/frobozz-0.1.tar.gz
drwxrwxr-x vinay/vinay       0 2013-03-21 12:46 frobozz-0.1/
-rw-rw-r-- vinay/vinay       0 2013-03-20 09:58 frobozz-0.1/README
-rw-rw-r-- vinay/vinay       0 2013-03-20 09:57 frobozz-0.1/frobozz.py
-rw-rw-r-- vinay/vinay     257 2013-03-21 12:40 frobozz-0.1/package.json

To build other formats, you can specify them in a --formats parameter:

$ distil package --formats=gztar,bztar,zip
The following packages were built:
  frobozz-0.1.tar.bz2
  frobozz-0.1.tar.gz
  frobozz-0.1.zip
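
The format names gztar, bztar and zip are also the identifiers accepted by shutil.make_archive in the Python standard library, so the effect of --formats can be approximated without distil. The sketch below is an analogy rather than distil's implementation: build_archives is a hypothetical helper, and it archives everything under the given directory instead of consulting package.json.

```python
import os
import shutil
import tempfile


def build_archives(src_root, base_dir, dest_dir, formats=("gztar",)):
    """Archive src_root/base_dir once per format; return the file names."""
    os.makedirs(dest_dir, exist_ok=True)
    built = []
    for fmt in formats:
        # make_archive appends the extension (.tar.gz, .tar.bz2, .zip) itself.
        base_name = os.path.join(dest_dir, base_dir)
        built.append(shutil.make_archive(base_name, fmt,
                                         root_dir=src_root, base_dir=base_dir))
    return sorted(os.path.basename(p) for p in built)


if __name__ == "__main__":
    # Throwaway tree standing in for an unpacked frobozz-0.1 project.
    work = tempfile.mkdtemp()
    tree = os.path.join(work, "frobozz-0.1")
    os.makedirs(tree)
    for name in ("README", "frobozz.py"):
        open(os.path.join(tree, name), "w").close()
    print(build_archives(work, "frobozz-0.1", os.path.join(work, "dist"),
                         formats=("gztar", "bztar", "zip")))
```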

Here is the complete help for distil’s package command:

$ distil help package
usage: distil package [-h] [--formats {gztar,bztar,zip,wheel}] [-d DESTDIR]
                      [DIR]

Create source distributions or wheels.

positional arguments:
  DIR                   The directory containing the software to package. If
                        not specified, the current directory is used.

optional arguments:
  -h, --help            show this help message and exit
  --formats {gztar,bztar,zip,wheel}
                        The formats to produce packages in.
  -d DESTDIR, --destination DESTDIR
                        Location to write the packaged distributions to.

Binary distributions

Currently, distil only supports building binary distributions using the Wheel format (see PEP 427).

The method for creating wheels is just the same as for source distributions:

$ distil package --formats=wheel
The following packages were built:
  /home/vinay/projects/frobozz/dist/frobozz-0.1-py27-none-any.whl
$ unzip -l dist/frobozz-0.1-py27-none-any.whl
Archive:  dist/frobozz-0.1-py27-none-any.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2013-03-21 17:50   frobozz.py
      314  2013-03-21 17:50   frobozz-0.1.dist-info/METADATA
       89  2013-03-21 17:50   frobozz-0.1.dist-info/WHEEL
      263  2013-03-21 17:50   frobozz-0.1.dist-info/RECORD
---------                     -------
      666                     4 files
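
The wheel filename itself carries information: PEP 427 defines it as name, version, an optional build tag, then the Python, ABI and platform tags, separated by hyphens. A minimal sketch of splitting a filename such as the one above into those fields follows. The names WheelName and parse_wheel_filename are hypothetical, and the code does only basic validation.

```python
import os
from collections import namedtuple

WheelName = namedtuple("WheelName", "name version build pyver abi arch")


def parse_wheel_filename(path):
    """Split a wheel filename into its PEP 427 components."""
    filename = os.path.basename(path)
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        # No build tag present.
        name, version, pyver, abi, arch = parts
        build = None
    elif len(parts) == 6:
        name, version, build, pyver, abi, arch = parts
    else:
        raise ValueError("unexpected number of fields in %r" % filename)
    return WheelName(name, version, build, pyver, abi, arch)
```

For frobozz-0.1-py27-none-any.whl this yields the name frobozz, version 0.1, no build tag, and the tags py27, none and any, i.e. a pure-Python wheel for Python 2.7 with no ABI or platform restriction.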

Building wheels for dependencies

When you build a wheel using the package command, only the package itself is built, not its dependencies. If you need to build dependencies of your package, use distil’s pip command (see below).

Using pip when building wheels

Sometimes, the distribution you want to build a wheel for uses custom code in setup.py to set things up correctly for the build. In such cases, you may need to use pip to build your wheel. For this purpose, distil provides the pip command. This works from requirements rather than source directories, and is intended to be used for PyPI-hosted dependencies of your package rather than for your package itself (use distil package for that).

When using distil pip, note that pip performs a customised installation, from which distil then builds the wheel using distlib. You should therefore run distil pip in a clean venv: pip won’t install any distribution that is already installed, and a clean venv has none, which minimises problems in your workflow. Here’s an example of running distil pip:

$ distil -e d2 pip Flask
Checking requirements for Flask (0.9) ... done.
Pipping Jinja2==2.6 ...
Pipping Werkzeug==0.8.3 ...
Pipping Flask==0.9 ...
The following wheels were built:
  Jinja2-2.6-py27-none-any.whl
  Werkzeug-0.8.3-py27-none-any.whl
  Flask-0.9-py27-none-any.whl

The wheels are written in the current directory by default.

You can also use requirements files, just as with distil install. Here is the complete help for distil’s pip command:

$ distil help pip
usage: distil pip [-h] [-r REQTFILE [REQTFILE ...]] [-d DESTDIR] [--no-deps]
                  [REQT [REQT ...]]

Build wheels using pip. Use this when distil's build logic doesn't work
because of code in setup.py which needs to be run.

positional arguments:
  REQT                  A requirement using a distribution on PyPI.

optional arguments:
  -h, --help            show this help message and exit
  -r REQTFILE [REQTFILE ...]
                        Get requirements from specified file(s)
  -d DESTDIR, --destination DESTDIR
                        Location to write the wheels to. Defaults to the
                        current directory.
  --no-deps             Don't build wheels for dependencies when building
                        wheels. The default behaviour is to build wheels for
                        all dependencies.

Packaging metadata

The metadata discussed in PEP 426, and its precursor PEPs, is a (potentially) small subset of the total metadata relating to a distribution. If we call the PEP 426 metadata “index metadata”, then the overall metadata for a distribution would comprise (the list may be incomplete):

  • The index metadata (PEP 426 et al)
  • Metadata about how to build a source distribution from a source tree
  • Metadata about how to build a binary distribution from a source tree/source distribution
  • Metadata about things a distribution exports for use by other distributions
  • Metadata used by installers to install the distribution

In the PEPs, we’re attempting to standardise the next revision of the first of these, but the other categories haven’t been considered at all (from a standardisation point of view). At present, in the distribute / setuptools / distutils world, they are provided by a mixture of MANIFEST.in files and a bunch of keyword arguments passed to setup(). This, coupled with the command-class design of distutils, has led to many ad hoc approaches to extending distutils where it fell short (monkey-patching, custom command classes, etc.), resulting in the present less-than-ideal situation.

A declarative approach is now generally considered better than setup.py: it allows for multiple, competing implementations which should be interoperable. The distutils2 approach was to focus on the declarative setup.cfg, and the new wheel format is also essentially declarative in nature.

A flat key-value structure for representing the other types of metadata doesn’t seem ideal. It might seem heretical, but backward compatibility aside, it’s not clear why we aren’t thinking about JSON as a metadata format. It has mature support in the stdlib, handles Unicode, and allows more meaningful structuring of the metadata.

Examples of JSON metadata can be seen in the following examples:

The index metadata is a small part of the overall metadata (it appears under the metadata key of the top-level JSON object). You will most likely find metadata for distributions of interest to you by using the URI scheme indicated by the above examples.

Using this type of metadata, distil can:

  • Build a source archive, from the metadata and the source tree, which is essentially the same as the original source archive.
  • Install into a virtualenv such that the venv layout after installation is identical to that following an installation with pip.

So, while the metadata schema used is provisional and can be improved, it brings across what can be brought across from setup(), such that it can be used to install software identically to pip for a large number of distributions currently on PyPI. Such a declarative solution can work, provided no custom code runs at installation time. Where setup.py calls code at installation time, the effects of that code clearly can’t be brought over into a declarative format, so it may not be possible to install affected distributions correctly. (For such distributions, distil offers the pip command to create wheels which can then be installed using distil’s install command. See Using pip when building wheels for more information.)

Another thing that structured metadata makes possible is relatively easy transformation into useful forms. For example, when distlib does dependency resolution, it can make effective use of selected metadata across all versions of a project: you can see all the versions of a project, their download URLs, digests and sizes, and the distribution dependencies, just by rearranging and aggregating the data.

Examples:

This allows distlib-using code to resolve dependencies without ever downloading a distribution, which pip has to do. The overall effect is more like RPM and apt-get: before any distribution is downloaded, you are told what other dependencies will be installed, and whether any existing distributions will get upgraded.
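
To make the idea concrete, here is a deliberately simplified sketch of resolving an install plan purely from aggregated metadata, with no downloads. Everything in it is an assumption made for illustration: the INDEX layout is invented rather than taken from distlib, the data is sample data, version specifiers are ignored (the newest version is always chosen), and version_key only understands purely numeric dotted versions.

```python
# Hypothetical aggregated metadata: project -> version -> requirement names.
# The layout is an assumption of this sketch; it is not distlib's format.
INDEX = {
    "Flask": {"0.8": ["Jinja2", "Werkzeug"], "0.9": ["Jinja2", "Werkzeug"]},
    "Jinja2": {"2.6": []},
    "Werkzeug": {"0.8.3": []},
}


def version_key(version):
    """Order purely numeric dotted versions such as '0.8.3'."""
    return tuple(int(part) for part in version.split("."))


def resolve(name, index, plan=None, seen=None):
    """Return [(project, version), ...] in install order, dependencies first."""
    plan = [] if plan is None else plan
    seen = set() if seen is None else seen
    if name in seen:
        return plan  # already scheduled (also guards against cycles)
    seen.add(name)
    versions = index[name]
    latest = max(versions, key=version_key)
    for requirement in versions[latest]:
        resolve(requirement, index, plan, seen)
    plan.append((name, latest))
    return plan


if __name__ == "__main__":
    for project, version in resolve("Flask", INDEX):
        print("%s==%s" % (project, version))
```

Because the whole plan is computed from the index alone, a tool built this way can report what will be installed before fetching anything.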

Schema for the extended JSON metadata

The schema for the extended metadata is given by the following sample JSON (not valid JSON, due to the comments). The example may not be exhaustive, but gives a flavour of what is covered by the metadata:

{
  "version": 1, # version of this schema
  "exports": {
    # equivalent to setuptools' entry points
    "frobozz.processors": [
      "name1 = frobozz.sub.package:do_nothing",
      "name2 = frobozz.sub.package:do_something"
    ],
    "scripts": {
      "console": [
        "frobozz = frobozz.cli:main"
      ],
      "gui": [
        "frobozzw = frobozz.gui:main"
      ]
    }
  },
  "requirements": {
    "install": [
      # list of requirements needed post-install
      "foo (>= 1.0)"
    ],
    "setup": [
      # list of requirements needed for setup
      "bar (>= 2.0)"
    ],
    "test": [
      # list of requirements needed for testing
      "nose (>= 1.2)"
    ],
    "extras": {
      # dict of requirements needed for extras
      # list of requirements keyed by extra name
      "i18n": [
        "Babel (>= 0.8)"
      ]
    }
  },
  "source": {
    "include-package-data": true,
    "data-files": [
        # a list of lists. Each entry in the outer list is
        # a directory followed by a list of data files in
        # that directory.
      [
        "dir1",
        [
          "file1_in_dir1.ext",
          "file2_in_dir1.ext"
        ]
      ],
      [
        "dir2",
        [
          "file1_in_dir2.ext",
          "file2_in_dir2.ext"
        ]
      ]
      # and so on
    ],
    "include": [
      # e.g. scripts
      "bin/script1",
      "bin/script2"
    ],
    "packages": [
      # Python packages
      "frobozz.foo",
      "frobozz.foo.bar",
      "frobozz.foo.bar.baz"
    ],
    "modules": [
      "mod1",
      "mod2",
      "mod3"
    ],
    "manifest": [
      "include data/global.dat",
      "include data/localedata/*.dat",
      "include doc/api/*.*",
      "include doc/*.html"
    ]
  },
  "extensions": {
    # C extensions
    "frobozz.foo.ext_one": {
      "extra_link_args": [],
      "swig_opts": [],
      "language": null,
      "define_macros": [],
      "extra_objects": [],
      "runtime_library_dirs": [],
      "libraries": [],
      "sources": [
        "foo/extension/module_one.c"
      ],
      "depends": [],
      "export_symbols": [],
      "extra_compile_args": [],
      "undef_macros": [],
      "include_dirs": [],
      "library_dirs": [],
      "name": "frobozz.foo.ext_one"
    },
    "frobozz.foo.ext_two": {
      "extra_link_args": [],
      "swig_opts": [],
      "language": null,
      "define_macros": [],
      "extra_objects": [],
      "runtime_library_dirs": [],
      "libraries": [],
      "sources": [
        "foo/extension/module_two.c"
      ],
      "depends": [],
      "export_symbols": [],
      "extra_compile_args": [],
      "undef_macros": [],
      "include_dirs": [],
      "library_dirs": [],
      "name": "frobozz.foo.ext_two"
    }
  },
  "scripts": [
    # list of scripts to install
    "bin/script1",
    "bin/script2"
  ],
  "metadata": {
    # index metadata (PKG-INFO / METADATA)
    "maintainer-email": "some.user@some.domain.com",
    "maintainer": "Some User",
    "name": "frobozz",
    "license": "MIT",
    "author": "Frobozz Developers",
    "home-page": "http://frobozz.com/",
    "summary": "An example project",
    "version": "1.3.0",
    "classifiers": [
      "Programming Language :: Python :: 2.6",
      "Programming Language :: Python :: 2.7"
    ],
    "author-email": "some.user@some.domain.com",
    "description": "An example description.\n"
  },
  "test": {
    # test specifications
    "test-suite": "frobozz.tests.suite",
    "test-runner": "frobozz.tests.runner"
  },
  "build": {
    # build specifications
    "use-2to3": true,
    "use-2to3-fixers": [
      "custom_fixers"
    ]
  }
}
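
Since the sample uses # comments, it cannot be fed to a JSON parser directly. The following sketch shows one way to strip such comments and then read an exports entry of the form "name = package.module:attr". Both helpers (strip_hash_comments and parse_export) are hypothetical and written for this illustration; real package.json files contain no comments, and distil does not need this step.

```python
import json


def strip_hash_comments(text):
    """Remove '#' comments that lie outside JSON string literals."""
    cleaned = []
    for line in text.splitlines():
        in_string = False
        escaped = False
        for i, ch in enumerate(line):
            if escaped:
                escaped = False
            elif ch == "\\":
                # A backslash only escapes the next character inside a string.
                escaped = in_string
            elif ch == '"':
                in_string = not in_string
            elif ch == "#" and not in_string:
                line = line[:i]
                break
        cleaned.append(line)
    return "\n".join(cleaned)


def parse_export(entry):
    """Split 'name = package.module:attr' into (name, module, attr)."""
    name, _, target = entry.partition("=")
    module, _, attr = target.strip().partition(":")
    return name.strip(), module, attr


# A small excerpt in the same commented style as the sample above.
SAMPLE = """
{
  "version": 1, # version of this schema
  "exports": {
    # equivalent to setuptools' entry points
    "scripts": {
      "console": ["frobozz = frobozz.cli:main"]
    }
  }
}
"""

data = json.loads(strip_hash_comments(SAMPLE))
print(parse_export(data["exports"]["scripts"]["console"][0]))
```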

You may find it useful to locate metadata for actual distributions on PyPI which you may be familiar with. You can start exploring here.

Creating an initial version of metadata for new projects

The distil init command creates an initial package.json file in a specified directory. You can provide some command-line default values, as shown in the complete command-line help for distil init:

$ distil help init
usage: distil init [-h] [--name NAME] [--projver PROJVER] [--author AUTHOR]
                   [--email EMAIL] [--home-page HOMEPAGE]
                   PATH

Create minimal metadata for a project that you can then add to during
development.

positional arguments:
  PATH                  A directory of a local project where the metadata file
                        is to be created.

optional arguments:
  -h, --help            show this help message and exit
  --name NAME           The name of the project.
  --projver PROJVER     The version of the project.
  --author AUTHOR       The author of the project.
  --email EMAIL         The author's email address.
  --home-page HOMEPAGE  The home page URL of the project.
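
As a rough picture of how these options could map onto metadata, the sketch below recreates the option set with argparse and builds a minimal dictionary from it. The contents of the file that distil init actually writes are not shown here, so the layout (modelled on the frobozz example earlier) and the key names chosen for each option are assumptions, as are the function names build_parser and initial_metadata.

```python
import argparse
import json


def build_parser():
    """Recreate the option set shown in the help text above."""
    parser = argparse.ArgumentParser(prog="init-sketch")
    parser.add_argument("--name")
    parser.add_argument("--projver")
    parser.add_argument("--author")
    parser.add_argument("--email")
    parser.add_argument("--home-page", dest="home_page")
    parser.add_argument("path", metavar="PATH")
    return parser


def initial_metadata(args):
    """Assumed layout, modelled on the frobozz example; omits unset values."""
    fields = {
        "name": args.name,
        "version": args.projver,
        "author": args.author,
        "author-email": args.email,
        "home-page": args.home_page,
    }
    metadata = {k: v for k, v in fields.items() if v is not None}
    return {"version": 1, "metadata": metadata}


if __name__ == "__main__":
    ns = build_parser().parse_args(
        ["--name", "frobozz", "--projver", "0.1", "."])
    print(json.dumps(initial_metadata(ns), indent=2))
```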