Use cases for DataPkg (or: reasons to use it in the first place)
Not all of these use cases are implemented yet; they serve as a guide to what we are trying to do. The first two were the original use cases at the start of the project and were heavily inspired by Debian.
The steps involved:
$ datapkg index-add file:///....
$ datapkg update
$ datapkg search "military spending"
some-id Military Spending 1890-1914
some-id-2 Military Spending 1890-1914 (normalized)
$ datapkg install some-id
...
$ datapkg plot some-id
What data?
Normalize data:
- Cross country, then convert to a standard currency (e.g. US$, GBP)
- Exchange rates
- Cost of living
- Changes across time, then convert to real present value
[Plot two different data sources against each other.]
- [Government expenditure in different sectors?]
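The normalization items listed above could be sketched roughly as follows. This is only an illustration, not part of datapkg: the function names are hypothetical, and the exchange rate and price index figures are made up for the example rather than real historical data. Only the exchange-rate route is shown; a cost-of-living adjustment would need its own table.

```python
# Illustrative figures only: these rates and index values are invented
# for the sketch and are NOT real historical data.
RATES_TO_USD = {"GBP": 5.0, "USD": 1.0}    # hypothetical exchange rates
PRICE_INDEX = {1890: 50.0, 1914: 100.0}    # hypothetical price index
BASE_YEAR = 1914


def to_standard_currency(amount, currency, rates=RATES_TO_USD):
    """Convert an amount into the standard currency (here US$)."""
    return amount * rates[currency]


def to_real_value(amount, year, index=PRICE_INDEX, base_year=BASE_YEAR):
    """Re-express an amount from `year` in base-year prices."""
    return amount * index[base_year] / index[year]


def normalize(rows):
    """Normalize (year, currency, amount) rows across countries and time."""
    normalized = []
    for year, currency, amount in rows:
        in_usd = to_standard_currency(amount, currency)
        normalized.append((year, round(to_real_value(in_usd, year), 2)))
    return normalized
```

Calling `normalize([(1890, "GBP", 100.0)])` first converts the amount to US$ and then rescales it to base-year prices, so series from different countries and years end up on one comparable scale.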
Example code:
$ datapkg install pkg-a
$ datapkg install pkg-b
$ datapkg create merged
# manual merge
# e.g. PPP, GDP
$ datapkg register my-merged-package
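The "manual merge" step above is left open. One possible shape for it, assuming both packages contain CSV data that share `country` and `year` columns (an assumption made for this sketch, as are the sample rows and the helper name), is a simple join on those keys:

```python
import csv
import io


def merge_on_keys(rows_a, rows_b, keys):
    """Inner-join two lists of dict rows on the given key columns."""
    index = {tuple(row[k] for k in keys): row for row in rows_b}
    merged = []
    for row in rows_a:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            # Columns from rows_b are added to the matching row of rows_a.
            merged.append({**row, **match})
    return merged


# Hypothetical contents of the two installed packages (made-up values).
PKG_A = "country,year,gdp\nUK,1900,10\nFR,1900,8\n"
PKG_B = "country,year,ppp\nUK,1900,1.2\nDE,1900,0.9\n"

rows_a = list(csv.DictReader(io.StringIO(PKG_A)))
rows_b = list(csv.DictReader(io.StringIO(PKG_B)))
merged = merge_on_keys(rows_a, rows_b, keys=("country", "year"))
```

Only rows present in both sources survive the join; whether unmatched rows should be kept instead is a decision for whoever does the merge.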
Revisit basic discovery and usage of data from above.
- Install datapkg
- Search remote registry/repo for a package
- Download package on to local disk and unpack:
$ datapkg get [url|name] [path]
If specifying a name (using a Registry) then:
- get metadata from registry
- locate the distribution URL
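The name-versus-URL distinction above might look something like this in code. This is a sketch under assumptions: the in-memory `REGISTRY`, the `download_url` field and the `example.org` address are all hypothetical stand-ins, not the actual datapkg registry interface.

```python
from urllib.parse import urlparse

# Hypothetical in-memory registry: package name -> metadata.
REGISTRY = {
    "some-id": {
        "title": "Military Spending 1890-1914",
        "download_url": "http://example.org/some-id.tar.gz",  # made-up URL
    },
}


def resolve_distribution_url(url_or_name, registry=REGISTRY):
    """Return a distribution URL for either a URL or a registry name."""
    if urlparse(url_or_name).scheme in ("http", "https", "file"):
        return url_or_name                # already a URL, no lookup needed
    metadata = registry[url_or_name]      # get metadata from the registry
    return metadata["download_url"]       # locate the distribution URL
```

An unknown name raises `KeyError` here; a real client would presumably report that more gracefully.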
Basic steps:
- Discover what is at the URL: a tar.gz/zip file, a version-controlled repo, or a page with links (ask the user which one)
- Download the compressed distribution to a temp dir (with a progress bar)
- Unpack it to the destination path
Future: we may need to build/compile data
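For the simplest case, where the URL points directly at a tar.gz or zip file, the download and unpack steps could be sketched with the Python standard library as below. The function names are hypothetical, the progress display is deliberately crude, and the version-controlled repo and page-with-links cases are not handled.

```python
import os
import shutil
import tempfile
import urllib.request
from urllib.parse import urlparse


def report(block_num, block_size, total_size):
    """Crude text progress indicator, used as the urlretrieve reporthook."""
    if total_size > 0:
        done = min(block_num * block_size, total_size)
        print(f"\r{100 * done // total_size}%", end="", flush=True)


def fetch_and_unpack(url, dest_path):
    """Download an archive to a temp dir, then unpack it to dest_path."""
    filename = os.path.basename(urlparse(url).path)
    with tempfile.TemporaryDirectory() as tmp_dir:
        archive = os.path.join(tmp_dir, filename)
        urllib.request.urlretrieve(url, archive, reporthook=report)
        print()  # end the progress line
        # The archive format is inferred from the file extension.
        shutil.unpack_archive(archive, dest_path)
    return dest_path
```

The temporary directory is removed once unpacking finishes, so only the unpacked contents remain at the destination. Archives from untrusted sources should be checked before unpacking, since archive members can carry unexpected paths.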
- Explore package
- Package a CSV file
- Register the package to the remote repo.
- Upload the package distribution to the remote repo.
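As a rough idea of what "package a CSV file" might involve, the sketch below bundles a CSV and a small metadata file into a tar.gz distribution. The `metadata.json` layout and the `package_csv` helper are assumptions made for illustration; the actual datapkg package format may well differ. Registering and uploading are not covered.

```python
import json
import os
import shutil
import tempfile


def package_csv(csv_path, name, title, out_dir):
    """Bundle a CSV file plus minimal metadata into <out_dir>/<name>.tar.gz."""
    with tempfile.TemporaryDirectory() as staging:
        pkg_dir = os.path.join(staging, name)
        os.mkdir(pkg_dir)
        shutil.copy(csv_path, os.path.join(pkg_dir, "data.csv"))
        # Hypothetical metadata layout, not the real datapkg format.
        with open(os.path.join(pkg_dir, "metadata.json"), "w") as f:
            json.dump({"name": name, "title": title}, f, indent=2)
        # make_archive appends the ".tar.gz" suffix and returns the path.
        return shutil.make_archive(
            os.path.join(out_dir, name), "gztar",
            root_dir=staging, base_dir=name,
        )
```

The returned path is the distribution file that a later upload step would send to the remote repo.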