datapkg is designed to be easily extensible. At present you can write your own implementations of:
- Commands - extend datapkg command line interface with new commands
- Indexes - add new Indexes with which datapkg can communicate
- Distribution - add new Distribution types (either for reading or writing or both)
- (Package) Resource downloader - add support for downloading different types of resources
- Uploader (via OFS) - upload to different storage backends
It is easy to add your own custom commands to the set of commands available from the datapkg command line interface.
To provide a new command named ‘mycommand’:
1. Create a new command class inheriting from datapkg.cli.Command. The class may be called anything you want; assume it is called ‘MyNewCommand’ in the package mynewpackage.command.
2. In the setup.py of your new Python package (the one containing the new command), add an entry named ‘mycommand’ to the datapkg.cli entry points section:
    [datapkg.cli]
    mycommand = mynewpackage.command:MyNewCommand
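As a minimal sketch of the steps above, the class below shows the shape such a command might take. The `Command` base here is a local stand-in so that the example runs without datapkg installed; in a real plugin you would import `datapkg.cli.Command` instead. The `run(self, options, args)` signature and the `name` and `summary` attributes are assumptions made for illustration, so check the base class in your installed datapkg for the actual ones.

```python
class Command(object):
    """Stand-in for datapkg.cli.Command (assumption: a real plugin would
    use `from datapkg.cli import Command` instead of this class)."""

    def run(self, options, args):
        raise NotImplementedError


class MyNewCommand(Command):
    """Hypothetical command that greets each positional argument."""

    # Assumed class-level attributes; the real attribute names may differ.
    name = "mycommand"
    summary = "Print a greeting for each argument"

    def run(self, options, args):
        # Override run without calling super on it.
        lines = ["hello %s" % arg for arg in args]
        for line in lines:
            print(line)
        return lines
```

Once the entry point is registered, the command would be invoked under its entry point name (`mycommand`), not under the class name.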
Base command class that all datapkg Commands should inherit from.
An inheriting class must provide a run method and can define the following class-level attributes (documented below):
This is the method inheriting classes should override to implement their command functionality.
Inheriting classes should not call super to this method – they should just override it.
To provide a new Index for datapkg to use (e.g. in datapkg search and datapkg download commands) you must:
- Create a new Index class inheriting from datapkg.index.IndexBase (see below)
- Add an entry point for your Index class in the [datapkg.index] section of your setup.py entry_points.
NB: the index will be available in datapkg commands (such as search) via the entry point name. E.g. if the entry point section looks like:
    [datapkg.index]
    mynewindex = mypackage.customindex:CustomIndex
then the index can be used in datapkg commands as follows:
    $ datapkg search mynewindex:// {search-term}
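To illustrate how the entry point name ends up as the scheme of an index spec, here is a rough sketch. It is not datapkg's actual lookup code: `INDEX_REGISTRY`, `CustomIndex` and `index_for_spec` are hypothetical names, and the dict stands in for the installed `[datapkg.index]` entry points.

```python
from urllib.parse import urlsplit


class CustomIndex(object):
    """Hypothetical index class, standing in for a real IndexBase subclass."""


# Stand-in for the installed [datapkg.index] entry points: name -> class.
INDEX_REGISTRY = {"mynewindex": CustomIndex}


def index_for_spec(spec):
    """Return the index class whose entry point name matches the spec scheme."""
    scheme = urlsplit(spec).scheme
    try:
        return INDEX_REGISTRY[scheme]
    except KeyError:
        raise ValueError("no index registered for scheme %r" % scheme)
```

Under these assumptions, `index_for_spec("mynewindex://")` resolves to `CustomIndex`, while an unregistered scheme raises `ValueError`.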
Base class for Index objects, all Index implementations should implement the API defined here.
To provide a new Distribution (either for reading, writing or both) for datapkg to use you must:
- Create a new Distribution class inheriting from datapkg.distribution.DistributionBase (see below)
- Add an entry point for your Distribution class in the [datapkg.distribution] section of your setup.py entry_points.
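The registration step can be sketched as the `entry_points` argument you would pass to `setuptools.setup()`. The entry point name `mydist`, the module path `mypackage.customdist` and the class name `CustomDistribution` are hypothetical; only the `datapkg.distribution` group name comes from the text above.

```python
# Sketch of the entry_points mapping for a setup.py.
ENTRY_POINTS = {
    "datapkg.distribution": [
        # Format: "<entry point name> = <module path>:<class name>"
        "mydist = mypackage.customdist:CustomDistribution",
    ],
}


def setup_kwargs():
    """Keyword arguments to pass to setuptools.setup() in setup.py."""
    return {
        "name": "mypackage",  # hypothetical package name
        "version": "0.1",
        "packages": ["mypackage"],
        "entry_points": ENTRY_POINTS,
    }
```

In setup.py this would be used as `setup(**setup_kwargs())` after `from setuptools import setup`. The same mapping form applies to the other entry point groups by swapping the group name.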
Load a Package object from a path to a package distribution.
Returns: the Distribution object.
Base class for (package) resource downloaders which handle the downloading or accessing of (package) resources (i.e. files containing package data, APIs to package data etc).
To create a new resource downloader and have it used by datapkg:
1. Create a new class inheriting from datapkg.download.ResourceDownloaderBase
2. Add an entry point in the [datapkg.resource_downloader] entry_points section of your setup.py pointing to this class.
Many downloaders can be installed to handle different types of resources. Installed downloaders are called in turn, and the first one to match is used. The calling order is determined by the order in which pkg_resources.iter_entry_points yields entries for the datapkg.resource_downloader entry point.
Download the supplied resource.
Should be overridden (and not called) by inheriting classes.
This method should return True if and only if the class can handle (and therefore has handled) the downloaded resource and should return False otherwise (thereby allowing subsequent downloaders to be tried).
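The first-match behaviour described above can be sketched as follows. This is an illustration rather than datapkg's own dispatch code: the base class is a local stand-in for `datapkg.download.ResourceDownloaderBase`, the `download(self, resource, dest_dir)` signature and the dict-shaped resource are assumptions, and a plain list replaces the iteration over the `datapkg.resource_downloader` entry points.

```python
class ResourceDownloaderBase(object):
    """Stand-in for datapkg.download.ResourceDownloaderBase."""

    def download(self, resource, dest_dir):
        raise NotImplementedError


class FtpDownloader(ResourceDownloaderBase):
    """Hypothetical downloader that only handles ftp:// URLs."""

    def download(self, resource, dest_dir):
        if not resource["url"].startswith("ftp://"):
            return False  # not ours: let the next downloader try
        # ... a real downloader would fetch the file into dest_dir here ...
        return True


class CatchAllDownloader(ResourceDownloaderBase):
    """Hypothetical fallback that claims every resource."""

    def download(self, resource, dest_dir):
        return True


def download_with_first_match(downloaders, resource, dest_dir):
    """Try each downloader in order; return the first that reports True."""
    for downloader in downloaders:
        if downloader.download(resource, dest_dir):
            return downloader
    return None  # no installed downloader could handle the resource
```

Because the first downloader to return True wins, a catch-all downloader placed early in the order would shadow more specific ones, which is why returning False for unsupported resources matters.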
datapkg utilizes the pluggable blobstore library OFS (http://bitbucket.org/okfn/ofs).
To add a new storage backend just extend OFS and this new backend will be automatically available to datapkg.