Input and output data

The datastore module provides an abstraction layer around data storage, allowing different methods of storing simulation/analysis results (local filesystem, remote filesystem, database, etc.) to provide a common interface.

The interface is built around three types of object: a DataStore may contain many DataItems, each of which is identified by a DataKey.

There is a single DataKey class. DataStore and DataItem are abstract base classes, and must be subclassed to provide different functionality.
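The relationship between the three object types can be illustrated with a small, self-contained sketch. It does not import Sumatra: `SketchKey`, `SketchStore`, `InMemoryStore` and the `put` method are hypothetical stand-ins invented for this illustration, and only the overall shape (an abstract store, concrete subclasses, items retrieved by key) follows the description above.

```python
import hashlib
from abc import ABC, abstractmethod


class SketchKey:
    """Hypothetical stand-in for DataKey: a path, a digest and some metadata."""

    def __init__(self, path, digest, **metadata):
        self.path = path
        self.digest = digest
        self.metadata = metadata


class SketchStore(ABC):
    """Hypothetical stand-in for the abstract DataStore base class."""

    @abstractmethod
    def get_content(self, key, max_length=None):
        """Return the bytes identified by key, optionally truncated."""


class InMemoryStore(SketchStore):
    """Toy concrete store that keeps contents in a dict indexed by path."""

    def __init__(self):
        self._items = {}

    def put(self, path, content):
        # Store the bytes and hand back a key that identifies them.
        self._items[path] = content
        digest = hashlib.sha1(content).hexdigest()
        return SketchKey(path, digest, size=len(content))

    def get_content(self, key, max_length=None):
        content = self._items[key.path]
        return content if max_length is None else content[:max_length]


store = InMemoryStore()
key = store.put("results/output.txt", b"42\n")
print(key.path, key.metadata["size"], store.get_content(key, max_length=2))
```

Because `SketchStore` declares an abstract method, it cannot be instantiated directly; only a subclass that supplies the method can, which mirrors the requirement that the base classes be subclassed.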

Base classes

class sumatra.datastore.DataKey(path, digest, creation, **metadata)

Identifies a DataItem, and may be used to retrieve a DataItem from a DataStore.

May also be used to store metadata (e.g. file size, mimetype) and be used as a proxy for the DataItem on a system where the actual data is not available.

path

a token used to retrieve a DataItem. For filesystem-based DataStores, this will be a relative path. For database-backed stores (none of which have been implemented yet) it could be a primary key or an object encapsulating a query.

digest

the SHA1 digest of the contents of the associated DataItem. This attribute is calculated on creation of the DataKey.

metadata

a dict containing metadata, such as file size and mimetype.
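As an illustration of the kind of values these attributes hold, the following sketch computes a SHA1 digest and a file size for a temporary file using only the standard library. The helper `sha1_of_file` and the `"size"` metadata key are assumptions made for this example, not part of the Sumatra API.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA1 digest of the file at path, reading it in chunks."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return sha1.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.dat")
    with open(path, "wb") as f:
        f.write(b"hello")
    digest = sha1_of_file(path)
    # "size" is a hypothetical metadata key chosen for this sketch.
    metadata = {"size": os.path.getsize(path)}

print(digest, metadata)
```

A digest stored alongside the path is what lets a key act as a proxy: two systems can compare digests to decide whether they refer to the same content without transferring the data itself.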

next()
class sumatra.datastore.base.DataItem

Base class for data item classes, which may represent files or database records.

digest

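The digest of the contents of this data item (compare DataKey.digest, which is the SHA1 digest of the associated DataItem).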

generate_key()

Generate a DataKey uniquely identifying this data item.

get_content(max_length=None)

Return the contents of the data item as a string.

If max_length is specified, return that number of bytes, otherwise return the entire content.

next()
save_copy(path)

Save a copy of the data to a local file.

If path is an existing directory, the data item path will be appended to it, otherwise path is treated as a full path including filename, either absolute or relative to the working directory.

Return the full path of the final file.
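The path-resolution rule described for save_copy() can be sketched as follows. This is not Sumatra's implementation: `resolve_copy_target` and `save_copy_sketch` are hypothetical helpers, and creating missing parent directories is an assumption added so that the example runs.

```python
import os
import shutil
import tempfile


def resolve_copy_target(item_path, path):
    """Apply the rule above: append item_path if path is an existing directory."""
    if os.path.isdir(path):
        return os.path.join(path, item_path)
    return path


def save_copy_sketch(source_file, item_path, path):
    """Copy source_file to the resolved target and return its full path."""
    target = os.path.abspath(resolve_copy_target(item_path, path))
    # Assumption for this sketch: create any missing parent directories.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copyfile(source_file, target)
    return target


tmp = tempfile.mkdtemp()
source = os.path.join(tmp, "source.txt")
with open(source, "w") as f:
    f.write("data\n")

dest_dir = os.path.join(tmp, "dest")
os.mkdir(dest_dir)

# Existing directory: the item path is appended to it.
copy_in_dir = save_copy_sketch(source, os.path.join("run1", "output.txt"), dest_dir)
# Anything else: treated as the full path including the file name.
copy_as_file = save_copy_sketch(
    source, os.path.join("run1", "output.txt"), os.path.join(tmp, "renamed.txt")
)

copies_exist = all(os.path.isfile(p) for p in (copy_in_dir, copy_as_file))
shutil.rmtree(tmp)
print(copy_in_dir, copy_as_file, copies_exist)
```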

sorted_content()

Return the contents of the data item, sorted by line.

class sumatra.datastore.base.DataStore

Base class for data storage abstractions.

contains_path(path)

Does the store contain a data item with the given path?

copy()
delete(*keys)

Delete the files corresponding to the given keys.

find_new_data(timestamp)

Find data items created or changed since the given timestamp.

generate_keys(*paths)

Given a number of “paths”, return a list of keys enabling the data at those paths to be retrieved from this store later.

get_content(key, max_length=None)

Return the contents of a file identified by a key.

If max_length is given, the return value will be truncated.

get_data_item(key)

Return the file that matches the given key.

next()
required_attributes = (u'find_new_data', u'get_data_item', u'delete')
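One plausible use of a required_attributes tuple is to check that a subclass supplies every method the interface depends on. The sketch below shows that idea in isolation; `StoreBase`, `missing_attributes`, `IncompleteStore` and `CompleteStore` are hypothetical names, and how Sumatra itself uses the tuple is not stated on this page.

```python
class StoreBase:
    """Hypothetical base class that lists the names a subclass must provide."""

    required_attributes = ("find_new_data", "get_data_item", "delete")

    @classmethod
    def missing_attributes(cls):
        """Return the required names that this class does not define."""
        return [name for name in cls.required_attributes if not hasattr(cls, name)]


class IncompleteStore(StoreBase):
    """Provides only one of the three required methods."""

    def delete(self, *keys):
        pass


class CompleteStore(StoreBase):
    """Provides all three required methods (with empty bodies)."""

    def find_new_data(self, timestamp):
        return []

    def get_data_item(self, key):
        return None

    def delete(self, *keys):
        pass


print(IncompleteStore.missing_attributes())
print(CompleteStore.missing_attributes())
```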

Storing data on the local filesystem

class sumatra.datastore.FileSystemDataStore(root)

Bases: sumatra.datastore.base.DataStore

Represents a locally-mounted filesystem. The root of the data store will generally be a subdirectory of the real filesystem.

root

The absolute path on the underlying file system to the root directory of the data store.

class sumatra.datastore.filesystem.DataFile(path, store, creation=None)

Bases: sumatra.datastore.base.DataItem

A file-like object that represents a file in a local filesystem.

path

path relative to the FileSystemDataStore root

full_path

absolute path on the underlying filesystem

size

file size in bytes

name

file name

extension

file extension

mimetype

if the mimetype cannot be guessed, this will be None
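The attributes above correspond closely to values that the standard library can derive from a root directory and a relative path. The following sketch shows one way to compute them; `describe_file` is a hypothetical helper, and details such as whether the extension includes the leading dot are assumptions rather than documented DataFile behaviour.

```python
import mimetypes
import os
import tempfile


def describe_file(root, path):
    """Derive DataFile-like attributes for the file at root/path."""
    full_path = os.path.abspath(os.path.join(root, path))
    # guess_type returns (type, encoding); type is None when it cannot be guessed.
    mimetype, _encoding = mimetypes.guess_type(full_path)
    return {
        "path": path,
        "full_path": full_path,
        "size": os.path.getsize(full_path),
        "name": os.path.basename(full_path),
        # Assumption: keep the leading dot, as os.path.splitext does.
        "extension": os.path.splitext(full_path)[1],
        "mimetype": mimetype,
    }


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "results"))
    for filename, content in (("output.txt", b"hello"), ("blob.zzunknown", b"\x00")):
        with open(os.path.join(root, "results", filename), "wb") as f:
            f.write(content)
    text_info = describe_file(root, os.path.join("results", "output.txt"))
    unknown_info = describe_file(root, os.path.join("results", "blob.zzunknown"))

print(text_info["name"], text_info["size"], text_info["mimetype"])
print(unknown_info["name"], unknown_info["mimetype"])
```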

Automatic archiving of data written to the local filesystem

class sumatra.datastore.ArchivingFileSystemDataStore(root, archive=u'.smt/archive')

Bases: sumatra.datastore.filesystem.FileSystemDataStore

Represents a locally-mounted filesystem that archives any new files created in it. The root of the data store will generally be a subdirectory of the real filesystem.

archive_store

Directory within which data will be archived.

class sumatra.datastore.archivingfs.ArchivedDataFile(path, store, creation=None)

Bases: sumatra.datastore.base.DataItem

A file-like object that represents a file inside a tar archive.

Mirroring data to a remote webserver

class sumatra.datastore.MirroredFileSystemDataStore(root, mirror_base_url)

Bases: sumatra.datastore.filesystem.FileSystemDataStore

Represents a locally-mounted filesystem whose contents are mirrored on a webserver, so that the files can be accessed via an HTTP URL.

mirror_base_url

URL to which the file path will be appended to obtain the final URL of a file
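The way a file URL is obtained from mirror_base_url can be sketched as a simple join. The helper `mirror_url` is hypothetical, and both the slash normalisation and the percent-encoding of the path are assumptions made for this example; the page only states that the file path is appended to the base URL.

```python
from urllib.parse import quote


def mirror_url(mirror_base_url, path):
    """Append a store-relative file path to the mirror's base URL."""
    # Assumption: ensure exactly one slash between base URL and path.
    base = mirror_base_url if mirror_base_url.endswith("/") else mirror_base_url + "/"
    # quote() keeps "/" unescaped by default and encodes spaces and the like.
    return base + quote(path.lstrip("/"))


print(mirror_url("http://example.com/data", "run1/output.txt"))
print(mirror_url("http://example.com/data/", "run 1/output.txt"))
```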

class sumatra.datastore.mirroredfs.MirroredDataFile(path, store, creation=None)

Bases: sumatra.datastore.base.DataItem

A file-like object that represents a file existing both on a local file system and on a webserver.