A client that oversees the implementation of the Pairtree FS specification version 0.1.
>>> from pairtree import PairtreeStorageClient
>>> store = PairtreeStorageClient(store_dir='data', uri_base="http://")
This will create the following on disc in a directory called 'data' if it doesn't already exist:
$ ls -R data/
data/:
pairtree_prefix  pairtree_root  pairtree_version0_1

data/pairtree_root:
Where
This directory conforms to Pairtree Version 0.1. Updated spec: http://www.cdlib.org/inside/diglib/pairtree/pairtreespec.html
Note: if 'data' *had* already existed and was a pairtree store, the uri_base would have been read from the prefix file and would override the one supplied above.
Also, if you try to create a store over a directory that already exists, but which isn't a pairtree store that it can recognise, it will raise a NotAPairtreeStoreException.
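The start-up behaviour described above can be sketched without the library. This is an illustration under stated assumptions, not the client's actual code: `init_store` and `NotAPairtreeStore` are hypothetical names, the recognition test (a `pairtree_root` directory plus a `pairtree_prefix` file) is a guess, and only the three file and directory names come from the listing above.

```python
import os
import tempfile


class NotAPairtreeStore(Exception):
    """Hypothetical stand-in for the library's NotAPairtreeStoreException."""


def init_store(store_dir, uri_base):
    """Create the store layout, or return the prefix of an existing store."""
    prefix_file = os.path.join(store_dir, "pairtree_prefix")
    root_dir = os.path.join(store_dir, "pairtree_root")
    version_file = os.path.join(store_dir, "pairtree_version0_1")

    if not os.path.exists(store_dir):
        # Fresh store: build the basic structure and record the prefix.
        os.makedirs(root_dir)
        with open(prefix_file, "w") as f:
            f.write(uri_base)
        with open(version_file, "w") as f:
            f.write("This directory conforms to Pairtree Version 0.1.\n")
        return uri_base

    # Existing directory: assumed to be a store only if both markers exist.
    if not (os.path.isdir(root_dir) and os.path.isfile(prefix_file)):
        raise NotAPairtreeStore(store_dir)
    with open(prefix_file) as f:
        # The stored prefix wins over the one passed in.
        return f.read().strip()


demo = os.path.join(tempfile.mkdtemp(), "data")
print(init_store(demo, "http://"))    # first call creates the layout
print(init_store(demo, "other://"))   # second call reads the stored prefix
```

Both calls print `http://`, because the second call finds an existing store and ignores the prefix it was given.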
Constructor
|
The identifier string is cleaned of characters that are expected to occur rarely in object identifiers but that would cause certain known problems for file systems. In this step, every UTF-8 octet outside the range of visible ASCII (94 characters with hexadecimal codes 21-7e) [ASCII] (Cerf, “ASCII format for network interchange,” October 1969.), as well as the following visible ASCII characters:
    "  hex 22      <  hex 3c      ?  hex 3f
    *  hex 2a      =  hex 3d      ^  hex 5e
    +  hex 2b      >  hex 3e      |  hex 7c
    ,  hex 2c
must be converted to their corresponding 3-character hexadecimal encoding, ^hh, where ^ is a circumflex and hh is two hex digits. For example, ' ' (space) is converted to ^20 and '*' to ^2a.
In the second step, the following single-character to single-character conversions must be done:
/ -> =
: -> +
. -> ,
These are characters that occur quite commonly in opaque identifiers but present special problems for filesystems. This step avoids requiring them to be hex encoded (hence expanded to three characters), which keeps the typical ppath reasonably short. Here are examples of identifier strings after cleaning and after ppath mapping:
id: ark:/13030/xt12t3
-> ark+=13030=xt12t3
-> ar/k+/=1/30/30/=x/t1/2t/3/
id: http://n2t.info/urn:nbn:se:kb:repos-1
-> http+==n2t,info=urn+nbn+se+kb+repos-1
-> ht/tp/+=/=n/2t/,i/nf/o=/ur/n+/nb/n+/se/+k/b+/re/po/s-/1/
id: what-the-*@?#!^!?
-> what-the-^2a@^3f#!^5e!^3f
-> wh/at/-t/he/-^/2a/@^/3f/#!/^5/e!/^3/f/
(From section 3 of the Pairtree specification)
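The two cleaning steps and the ppath mapping quoted above can be sketched in a few lines. This is a minimal illustration of the rules as quoted, not necessarily how the client implements them; the names `id_encode` and `ppath` and the `shorty_length` parameter are chosen here for the example.

```python
import re

# Step 1 targets: any octet outside visible ASCII (0x21-0x7e), plus the
# ten visible characters listed in the specification excerpt.
_HEX_ENCODE = re.compile(rb'[^\x21-\x7e]|["<?*=^+>|,]')

# Step 2: single-character swaps  / -> =   : -> +   . -> ,
_SWAP = bytes.maketrans(b"/:.", b"=+,")


def id_encode(identifier):
    """Clean an identifier: hex-encode first, then swap characters."""
    octets = identifier.encode("utf-8")
    step1 = _HEX_ENCODE.sub(lambda m: b"^%02x" % m.group(0)[0], octets)
    return step1.translate(_SWAP).decode("ascii")


def ppath(encoded, shorty_length=2):
    """Split a cleaned identifier into a path of shorties."""
    parts = [encoded[i:i + shorty_length]
             for i in range(0, len(encoded), shorty_length)]
    return "/".join(parts) + "/"


cleaned = id_encode("ark:/13030/xt12t3")
print(cleaned)         # ark+=13030=xt12t3
print(ppath(cleaned))  # ar/k+/=1/30/30/=x/t1/2t/3/
```

The order of the two steps matters: '=', '+' and ',' are hex-encoded in the first step, so any of those characters left in the cleaned string can only have come from the second step.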
|
This decodes a given identifier from its pairtree filesystem encoding, into its original form:
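Decoding reverses the two cleaning steps in the opposite order. The sketch below assumes the encoding rules quoted from section 3 of the specification; `id_decode` is a name used here for illustration and may not match the client's internals.

```python
import re

# Undo the single-character swaps:  = -> /   + -> :   , -> .
_UNSWAP = str.maketrans("=+,", "/:.")

_HEX_TRIPLE = re.compile(rb"\^([0-9a-fA-F]{2})")


def id_decode(encoded):
    """Turn a cleaned identifier back into its original form."""
    swapped = encoded.translate(_UNSWAP).encode("ascii")
    # Replace every ^hh triple with the octet it stands for, then
    # interpret the result as UTF-8.
    octets = _HEX_TRIPLE.sub(lambda m: bytes([int(m.group(1), 16)]), swapped)
    return octets.decode("utf-8")


print(id_decode("ark+=13030=xt12t3"))          # ark:/13030/xt12t3
print(id_decode("what-the-^2a@^3f#!^5e!^3f"))  # what-the-*@?#!^!?
```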
|
Internal - method for discovering the pairtree identifier for a given directory path. E.g. pairtree_root/fo/ob/ar/+/ --> 'foobar:'
|
Internal - walks a directory chain and builds a list of the directory shorties relative to the pairtree_root
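A rough sketch of collecting the shorties from a directory chain follows. The function name `shorties_from_path` and the `root_name` parameter are hypothetical; the client's internal method may work differently.

```python
import os


def shorties_from_path(path, root_name="pairtree_root"):
    """Return the directory names found below pairtree_root in a path."""
    parts = os.path.normpath(path).split(os.sep)
    if root_name not in parts:
        raise ValueError("%r is not inside %s" % (path, root_name))
    return parts[parts.index(root_name) + 1:]


chain = os.path.join("data", "pairtree_root", "fo", "ob", "ar", "+")
print(shorties_from_path(chain))  # ['fo', 'ob', 'ar', '+']
```

Joining the shorties gives the cleaned identifier ('foobar+' here), which still needs decoding to recover the original id.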
|
Internal - method for turning an identifier into a pairtree directory tree of shorties.
|
Internal - method for turning an identifier into a list of pairtree shortie directory names.
|
Initialise the store if the directory doesn't exist. Create the basic structure needed and write the prefix to disc. If the store directory exists, one of two things can happen:
|
Walk the store and build a list of the pairtree-conformant objects in it. This will return objects at 'split ends' and will function correctly as long as non-shortie directories are just that: non-shortie directories must have longer labels than the shorties, e.g.:
    ab -- cd -- ef -- foo.txt
           |     |
           |     ---- gh
           |           |
           |           ---- foo.txt
           |
           ---- e -- foo.txt
This method will return ['abcdef', 'abcde', 'abcdefgh'] as ids in this
store.
TODO: Need to make sure this corresponds to the pairtree spec. Currently, it ignores the possibility of a split end being 'shielded' by a /obj/ folder.
Returns a generator, not a plain list, since version 0.4.12.
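The walk described above can be approximated with `os.walk`. This sketch is an assumption about the approach rather than the client's code: `list_ids` is a hypothetical name, it yields cleaned (still encoded) ids, and like the method described it does not handle a split end shielded by an /obj/ folder.

```python
import os
import tempfile


def list_ids(pairtree_root, shorty_length=2):
    """Yield the cleaned id of every directory chain that holds an object."""
    for dirpath, dirnames, filenames in os.walk(pairtree_root):
        # Longer-named directories belong to an object, not to the tree.
        non_shorties = [d for d in dirnames if len(d) > shorty_length]
        # Prune them in place so the walk only follows shortie chains.
        dirnames[:] = [d for d in dirnames if len(d) <= shorty_length]
        if dirpath == pairtree_root:
            continue
        if filenames or non_shorties:
            relative = os.path.relpath(dirpath, pairtree_root)
            yield "".join(relative.split(os.sep))


# Rebuild the example layout from the diagram in a temporary directory.
demo_root = tempfile.mkdtemp()
for chain in (("ab", "cd", "ef"), ("ab", "cd", "ef", "gh"), ("ab", "cd", "e")):
    leaf = os.path.join(demo_root, *chain)
    os.makedirs(leaf, exist_ok=True)
    with open(os.path.join(leaf, "foo.txt"), "w") as f:
        f.write("part")

print(sorted(list_ids(demo_root)))  # ['abcde', 'abcdef', 'abcdefgh']
```

The result is sorted only to make the output stable; the generator itself yields ids in whatever order the filesystem walk produces.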
|
Internal - create an object. If the object already exists, raise an ObjectAlreadyExistsException.
|
List all the parts of the object with the given identifier (excluding shortie directories belonging to other objects). If path is supplied, the parts in that subdirectory are returned. If the subpath doesn't exist, an ObjectNotFoundException will be raised.
>>> store.list_parts('foobar:1', 'data/images')
[ 'image001.tif', 'image.... ]
|
Returns True or False depending on whether the path is a file or not. If the file doesn't exist, False is returned.
|
Returns True or False depending on whether the path is a subdirectory or not. If the path doesn't exist, False is returned.
|
Store a stream of bytes into a file within a pairtree object. Can be either a string of bytes, or a filelike object which supports bytestream.read(buffer_size) - useful for very large files.
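Accepting either a byte string or a file-like object could look roughly like the sketch below. The function name `write_bytestream`, its parameters, and the default buffer size are assumptions made for the example; only the `bytestream.read(buffer_size)` convention comes from the description above.

```python
import io
import os
import tempfile


def write_bytestream(target_path, bytestream, buffer_size=8192):
    """Write bytes, or the contents of a file-like object, to target_path."""
    with open(target_path, "wb") as out:
        if hasattr(bytestream, "read"):
            # File-like object: copy in chunks so large files never have
            # to fit in memory at once.
            while True:
                chunk = bytestream.read(buffer_size)
                if not chunk:
                    break
                out.write(chunk)
        else:
            out.write(bytestream)


demo_dir = tempfile.mkdtemp()
write_bytestream(os.path.join(demo_dir, "a.bin"), b"plain bytes")
write_bytestream(os.path.join(demo_dir, "b.bin"), io.BytesIO(b"x" * 20000))
print(os.path.getsize(os.path.join(demo_dir, "b.bin")))  # 20000
```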
|
Returns a filehandle for a file within a pairtree object. This is a "wb+" opened file and so can be appended to and obeys 'seek'.
>>> with store.get_appendable_stream('foobar:1','data/images', 'image001.tif') as stream:
...     # Do something with the stream handle
...     pass
stream is closed at the end of a with block.
|
Reads a file from a pairtree object. If streamable is set to True, this returns the filehandle for that file, which must be closed after use:
>>> with store.get_stream('foobar:1','data/images', 'image001.tif', True) as stream:
...     # Do something with the stream handle
...     pass
stream is closed at the end of a with block.
|
Delete a file from a pairtree object. Leaves no trace; be careful.
|
Delete a subpath from an object, optionally recursively. If the path is not empty (i.e. it has parts in it) and recursive is not True, a PathIsNotEmptyException is raised.
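The guard against deleting a non-empty path can be sketched as follows. `del_path` and `PathIsNotEmpty` here are hypothetical stand-ins written for this example, not the library's own method or exception class.

```python
import os
import shutil
import tempfile


class PathIsNotEmpty(Exception):
    """Hypothetical stand-in for the library's PathIsNotEmptyException."""


def del_path(dirpath, recursive=False):
    """Remove a directory, refusing non-empty ones unless recursive."""
    if os.listdir(dirpath) and not recursive:
        raise PathIsNotEmpty(dirpath)
    shutil.rmtree(dirpath)


demo = tempfile.mkdtemp()
with open(os.path.join(demo, "image001.tif"), "wb") as f:
    f.write(b"part")
try:
    del_path(demo)                 # refused: the directory has a part in it
except PathIsNotEmpty:
    del_path(demo, recursive=True)
print(os.path.exists(demo))        # False
```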
|
Deletes an object from the pairtree store, including any parts and subpaths. There is no undo...
|
Answers the question "Does object or object subpath/file 'xxxxxxx' exist?"
|
Inbuilt method to randomly generate an id, if one is not given to either get_object or create_object. Simply returns a random 14-digit (base 10) number; not fantastically useful, but it at least makes sure the id is unique in the store.
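One way to generate such an id is to draw random numbers until one is unused. This is a sketch under assumptions: `random_id` and the `existing_ids` argument are invented for the example, and excluding leading zeros is a choice made here, not something the description specifies.

```python
import random


def random_id(existing_ids, digits=14):
    """Return a random decimal id of the given length not already in use."""
    low = 10 ** (digits - 1)   # smallest number with that many digits
    high = 10 ** digits        # exclusive upper bound
    while True:
        candidate = str(random.randrange(low, high))
        if candidate not in existing_ids:
            return candidate


new_id = random_id({"12345678901234"})
print(len(new_id))  # 14
```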
|
Returns a pairtree object with the given identifier. By default, if no object with that identifier exists, it is created:
>>> bar = client.get_object('bar')
# the object with id 'bar' will be retrieved and created if necessary.
Setting the create_if_doesnt_exist flag to False will cause it to raise an exception if it cannot find an object.
>>> fake = client.get_object('doesnotexist', create_if_doesnt_exist=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.linux-i686/egg/pairtree/pairtree_client.py", line 231, in get_object
pairtree.storage_exceptions.ObjectNotFoundException
(Note that fake = client.get_object('doesnotexist', False) is equivalent to the above line.)
|
Creates a new object with the given identifier:
>>> bar = client.create_object('bar')
>>>
Note that reissuing that command will raise an ObjectAlreadyExistsException:
>>> bar = client.create_object('bar')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.linux-i686/egg/pairtree/pairtree_client.py", line 235, in create_object
pairtree.storage_exceptions.ObjectAlreadyExistsException
|
| Generated by Epydoc 3.0.1 on Wed Jun 2 18:26:53 2010 | http://epydoc.sourceforge.net |