Extending Datapkg¶

Datapkg is designed to be easily extensible. Currently you can write your own implementations of:

Commands - extend datapkg command line interface with new commands

Indexes - add new Indexes with which datapkg can communicate

Distributions - add new Distribution types (for reading, writing or both)

(Package) Resource downloader - add support for downloading different types of resources

Uploader (via OFS) - upload to different storage backends

Commands¶

It is easy to add your own custom commands to the set of commands available from the datapkg command line interface.

To provide a new command named ‘mycommand’:

1. Create a new command class inheriting from datapkg.cli.Command. The class may be called anything you want; assume it is called ‘MyNewCommand’ and lives in the module mynewpackage.command.
2. In the setup.py of your new Python package (the one containing the new command), add an entry named ‘mycommand’ to the datapkg.cli entry point section:
[datapkg.cli]
mycommand = mynewpackage.command:MyNewCommand
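
For a setup.py that declares its entry points as a Python dict rather than an ini-style string, the same registration might look like the sketch below. The group and entry names are the ones used above; the variable name ENTRY_POINTS and the package name passed to setup() are placeholders chosen for illustration.

```python
# setup.py fragment (a sketch, not a complete setup script).
# Maps each entry point group to a list of "name = module:attribute" strings.
ENTRY_POINTS = {
    "datapkg.cli": [
        # "mycommand" is the name typed on the datapkg command line.
        "mycommand = mynewpackage.command:MyNewCommand",
    ],
}

# In a real setup.py this dict would be handed to setuptools, for example:
#
#     from setuptools import setup
#     setup(name="mynewpackage", entry_points=ENTRY_POINTS)
#
# The call is left commented out so the fragment can be read in isolation.
```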

Command Base Class¶

class datapkg.cli.Command¶

Base command class that all datapkg Commands should inherit from.

An inheriting class must provide a run method and can define the following class-level attributes (documented below):

name
summary
usage
min_args
max_args

max_args¶: Maximum number of args to the command (not used if set to None)

min_args¶: Minimum number of args to the command (not used if set to None)

name¶: The name of the command as used on the command line and in help

run(options, args)¶

This is the method inheriting classes should override to implement their command functionality.

Inheriting classes should not call the superclass implementation of this method; they should simply override it.

summary¶: One-line summary of this command (used when printing help)

usage¶: A multiline detailed description of the command
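
As an illustration of the attributes and the run method described above, here is a minimal sketch of a command class. So that it runs without datapkg installed, it defines a small stand-in Command class that mirrors only the documented attributes; in a real plugin you would inherit from datapkg.cli.Command instead. The greetings helper and the greeting behaviour are invented for the example.

```python
class Command(object):
    """Stand-in for datapkg.cli.Command so this sketch runs on its own.

    Assumption: it mirrors only the attributes documented above.
    """

    name = None
    summary = None
    usage = None
    min_args = None
    max_args = None

    def run(self, options, args):
        raise NotImplementedError


class MyNewCommand(Command):
    name = "mycommand"
    summary = "Greet each name given on the command line"
    usage = "mycommand NAME [NAME ...]\n\nPrint one greeting per NAME."
    min_args = 1      # at least one name is required
    max_args = None   # None means no upper limit is enforced

    def greetings(self, args):
        # Hypothetical helper: keeps the logic testable without capturing stdout.
        return ["Hello, %s!" % name for name in args]

    def run(self, options, args):
        # Overrides the base method outright, with no call to super.
        for line in self.greetings(args):
            print(line)
```

How datapkg constructs the command and what the options object contains are not covered here, so the sketch ignores options entirely.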

Index¶

To provide a new Index for datapkg to use (e.g. in datapkg search and datapkg download commands) you must:

Create a new Index class inheriting from datapkg.index.IndexBase (see below)

Add an entry point for your Index class in the [datapkg.index] section of your setup.py entry_points.
NB: the index will be available in datapkg commands (such as search) via the entry point name. E.g. if the entry point section looks like:
[datapkg.index]
mynewindex = mypackage.customindex:CustomIndex
then the index can be used in datapkg commands as follows:
$ datapkg search mynewindex:// {search-term}

Index Base class¶

class datapkg.index.base.IndexBase¶

Base class for Index objects. All Index implementations should implement the API defined here.

get(name)¶: Get package with name name.

has(name)¶: Check if package with name name is in Index.

list()¶: Return an iterator over all items in the Index

register(package)¶: Register package in the Index.

search(query)¶: Return an iterator over search results corresponding to query.

update(package)¶: Update package in the Index.
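
To make the API above concrete, the following sketch implements all six methods with an in-memory dict. It uses a stand-in IndexBase so that it runs without datapkg; a real Index would inherit from the datapkg base class. The Package class, the assumption that packages expose a name attribute, the no-argument constructor, and the substring-based search are all choices made for this example, not datapkg behaviour.

```python
class IndexBase(object):
    """Stand-in for the datapkg IndexBase class (assumption: API only)."""

    def get(self, name):
        raise NotImplementedError

    def has(self, name):
        raise NotImplementedError

    def list(self):
        raise NotImplementedError

    def register(self, package):
        raise NotImplementedError

    def search(self, query):
        raise NotImplementedError

    def update(self, package):
        raise NotImplementedError


class Package(object):
    """Hypothetical minimal package: just a name."""

    def __init__(self, name):
        self.name = name


class DictIndex(IndexBase):
    """Hypothetical Index that keeps packages in a dict keyed by name."""

    def __init__(self):
        self._packages = {}

    def register(self, package):
        self._packages[package.name] = package

    def update(self, package):
        # Only packages that are already registered can be updated.
        if package.name not in self._packages:
            raise KeyError(package.name)
        self._packages[package.name] = package

    def get(self, name):
        return self._packages[name]

    def has(self, name):
        return name in self._packages

    def list(self):
        return iter(self._packages.values())

    def search(self, query):
        # Case-insensitive substring match on the package name.
        needle = query.lower()
        return (p for p in self._packages.values() if needle in p.name.lower())
```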

Distributions¶

To provide a new Distribution (either for reading, writing or both) for datapkg to use you must:

Create a new Distribution class inheriting from datapkg.distribution.DistributionBase (see below)

Add an entry point for your Distribution class in the [datapkg.distribution] section of your setup.py entry_points.

Distribution Base class¶

class datapkg.distribution.DistributionBase(package=None)¶

classmethod load(path)¶

Load a Package object from a path to a package distribution.

Returns: the Distribution object.

stream(path)¶: Return a fileobj stream for material at path.

write(path, **kwargs)¶: Write this distribution to disk at path.
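
The three methods above could be implemented along the following lines. This is a sketch under several assumptions: DistributionBase here is a stand-in so the code runs without datapkg, the package is represented as a plain dict, the on-disk layout (a directory containing metadata.json) is invented, and stream treats its argument as an ordinary filesystem path.

```python
import json
import os


class DistributionBase(object):
    """Stand-in for datapkg.distribution.DistributionBase (assumption)."""

    def __init__(self, package=None):
        self.package = package

    @classmethod
    def load(cls, path):
        raise NotImplementedError

    def stream(self, path):
        raise NotImplementedError

    def write(self, path, **kwargs):
        raise NotImplementedError


class JsonDirDistribution(DistributionBase):
    """Hypothetical distribution: a directory holding a metadata.json file."""

    METADATA = "metadata.json"

    @classmethod
    def load(cls, path):
        # Read the metadata back and wrap it in a new distribution object.
        with open(os.path.join(path, cls.METADATA), encoding="utf-8") as f:
            return cls(package=json.load(f))

    def stream(self, path):
        # Assumption: path is a plain filesystem path; the caller closes the file.
        return open(path, "rb")

    def write(self, path, **kwargs):
        # Create the target directory if needed, then dump the metadata.
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, self.METADATA), "w", encoding="utf-8") as f:
            json.dump(self.package, f, indent=2)
```

A distribution that only supports reading or only supports writing could leave the other methods raising NotImplementedError.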

Resource Downloader¶

class datapkg.download.ResourceDownloaderBase¶

Base class for (package) resource downloaders which handle the downloading or accessing of (package) resources (i.e. files containing package data, APIs to package data etc).

To create a new resource downloader and have it used by datapkg:

1. Create a new class inheriting from datapkg.download.ResourceDownloaderBase

2. Add an entry point in the [datapkg.resource_downloader] entry_points section of your setup.py pointing to this class.

Many downloaders can be installed to handle different types of resources. Installed downloaders are called in turn, and the first one to match is used. The calling order is the order in which pkg_resources.iter_entry_points yields entries for the datapkg.resource_downloader entry point.

download(resource, dest_path)¶

Download the supplied resource.

Should be overridden (and not called) by inheriting classes.

This method should return True if and only if the class can handle (and therefore has handled) the downloaded resource and should return False otherwise (thereby allowing subsequent downloaders to be tried).
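
The True/False contract and the "first match wins" behaviour can be sketched as follows. The base class is a stand-in so the code runs without datapkg; the representation of a resource as a dict with a "url" key, the LocalFileDownloader class, and the download_with_first_match helper are all assumptions made for illustration. In datapkg itself the list of downloaders comes from the datapkg.resource_downloader entry point rather than being built by hand.

```python
import os
import shutil
from urllib.parse import urlparse
from urllib.request import url2pathname


class ResourceDownloaderBase(object):
    """Stand-in for datapkg.download.ResourceDownloaderBase (assumption)."""

    def download(self, resource, dest_path):
        raise NotImplementedError


class LocalFileDownloader(ResourceDownloaderBase):
    """Hypothetical downloader that handles file:// URLs by copying the file."""

    def download(self, resource, dest_path):
        parsed = urlparse(resource["url"])
        if parsed.scheme != "file":
            # Not ours: return False so the next downloader gets a chance.
            return False
        source = url2pathname(parsed.path)
        # shutil.copy accepts either a target directory or a target file path.
        shutil.copy(source, dest_path)
        return True


def download_with_first_match(downloaders, resource, dest_path):
    """Try each downloader in order; return the first one that handled it."""
    for downloader in downloaders:
        if downloader.download(resource, dest_path):
            return downloader
    return None  # no installed downloader could handle the resource
```

Returning False before doing any work keeps a downloader from interfering with resources it does not understand.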

Uploading¶

datapkg utilizes the pluggable blobstore library OFS (http://bitbucket.org/okfn/ofs).

To add a new storage backend just extend OFS and this new backend will be automatically available to datapkg.
