Key transformers

Vlermv uses the magic transformer by default. You can swap it for one of the other included transformers or for a transformer of your own.

The magic transformer

vlermv.transformers.magic

When I first wrote Vlermv, the magic transformer was even more important to me than the dictionary interface. The magic transformer comes up with a meaningful filename for any given key, combining the features of the other transformers.

One main principle of the magic transformer is that slashes should be handled only by splitting into directories; they should not be replaced by other characters, as that can be confusing.

Aside from strings and string-like objects, you can use iterables of strings; these indices both refer to the file foo/bar/baz:

vlermv[('foo','bar','baz')]
vlermv[['foo','bar','baz']]

(This is like the tuple transformer.)

If you pass a relative path to a file, it will be broken up as you’d expect; that is, strings get split on slashes and backslashes.

vlermv['foo/bar/baz']
vlermv['foo\\bar\\baz']

(This is like the slash and backslash transformers.)

If you pass a URL, it will also get broken up in a reasonable way.

# /tmp/a-directory/http/thomaslevine.com/!/?foo=bar#baz
vlermv['http://thomaslevine.com/!/?foo=bar#baz']

# /tmp/a-directory/thomaslevine.com/!?foo=bar#baz
vlermv['thomaslevine.com/!?foo=bar#baz']
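
The splitting behaviour shown so far can be approximated in a few lines. This is only a sketch: split_string_key is a hypothetical helper, not part of Vlermv, and it assumes that treating '://', '/', and backslash as separators is enough to reproduce the examples above. The real magic transformer may handle more cases.

```python
import re


def split_string_key(key):
    """Hypothetical helper: split a string key into path elements.

    Splits on '://', forward slashes, and backslashes, then drops empty
    pieces. Illustration only; this is not the magic transformer's code.
    """
    return tuple(piece for piece in re.split(r'://|[/\\]', key) if piece)


# ('foo', 'bar', 'baz')
print(split_string_key('foo/bar/baz'))
print(split_string_key('foo\\bar\\baz'))

# ('http', 'thomaslevine.com', '!', '?foo=bar#baz')
print(split_string_key('http://thomaslevine.com/!/?foo=bar#baz'))

# ('thomaslevine.com', '!?foo=bar#baz')
print(split_string_key('thomaslevine.com/!?foo=bar#baz'))
```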

Dates and datetimes get converted to YYYY-MM-DD format.

import datetime

# /tmp/a-directory/2014-02-26
vlermv[datetime.date(2014,2,26)]
vlermv[datetime.datetime(2014,2,26,13,6,42)]

And you can mix these formats!

# /tmp/a-directory/http/thomaslevine.com/open-data/2014-02-26
vlermv[('http://thomaslevine.com/open-data', datetime.date(2014,2,26))]
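
To make the mixing rule concrete, here is a rough sketch of how nested keys could be flattened into one path. The function sketch_to_path is hypothetical and covers only strings, dates, datetimes, and iterables of those; it is meant to reproduce the examples above, not to describe how the magic transformer is actually written.

```python
import datetime
import re


def sketch_to_path(key):
    """Hypothetical helper: flatten a mixed key into a tuple of path elements."""
    # datetime.datetime is a subclass of datetime.date, so this branch
    # handles both and keeps only the YYYY-MM-DD part.
    if isinstance(key, datetime.date):
        return (key.strftime('%Y-%m-%d'),)
    # Strings are split on '://', slashes, and backslashes.
    if isinstance(key, str):
        return tuple(piece for piece in re.split(r'://|[/\\]', key) if piece)
    # Anything else is assumed to be an iterable of keys; flatten it in order.
    path = ()
    for element in key:
        path += sketch_to_path(element)
    return path


# ('http', 'thomaslevine.com', 'open-data', '2014-02-26')
print(sketch_to_path(('http://thomaslevine.com/open-data',
                      datetime.date(2014, 2, 26))))
```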

Key collisions

Because the magic transformer accepts a wide range of formats, it is possible that two keys will collide. For example, the two keys below map to the same filename.

datetime.date(2015, 1, 2)
(2015, 1, 2)

And you won’t be able to save both of the keys below.

'https://thomaslevine.com/'
'https://thomaslevine.com/index.html'

The former creates a file https/thomaslevine.com, and the latter expects this file to be a directory so that it may create the file https/thomaslevine.com/index.html.
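
The file-versus-directory conflict is a property of the filesystem rather than of Vlermv, so it can be reproduced with the standard library alone. The sketch below uses a temporary directory and two hypothetical helpers (try_to_save and demonstrate_collision); it does not call Vlermv at all.

```python
import os
import tempfile


def try_to_save(root, path_parts):
    """Create an empty file at root/path_parts, making parent directories."""
    target = os.path.join(root, *path_parts)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, 'w'):
        pass


def demonstrate_collision():
    """Return the name of the error raised by the second save, or None."""
    with tempfile.TemporaryDirectory() as root:
        # First key: https/thomaslevine.com is created as a regular file.
        try_to_save(root, ('https', 'thomaslevine.com'))
        try:
            # Second key: now https/thomaslevine.com would have to be a directory.
            try_to_save(root, ('https', 'thomaslevine.com', 'index.html'))
        except OSError as error:
            return type(error).__name__
    return None


print(demonstrate_collision())
```

The second save fails with an OSError subclass because a name cannot be both a file and a directory.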

If collisions become an issue for you, consider using a different directory structure. The magic transformer usually works pretty well, but it is intended only to be a good default.

Other transformers

In addition to the magic transformer, the following transformers are included in Vlermv.

vlermv.transformers.base64

File name is the base 64 encoding of the key.

vlermv.transformers.tuple

Key must be a tuple; the rightmost element becomes a file name, and the preceding elements are directories.

vlermv.transformers.simple

Key is used as the file name inside the vlermv directory. It must be a str without slashes. Writing to subdirectories is not allowed.

vlermv.transformers.slash

Like simple, except that slashes may be used to separate directories.

vlermv.transformers.backslash

Like simple, except that backslashes may be used to separate directories.
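
To give a feel for how small these transformers can be, here is a sketch in the spirit of the base64 transformer described above. It is not the code shipped with Vlermv. In particular, the choice of the URL-safe base 64 alphabet (so that encoded names never contain a slash) and of UTF-8 for encoding string keys are assumptions made for this example.

```python
import base64


def to_path(key):
    """Encode a str key as a single base 64 file name (sketch only)."""
    encoded = base64.urlsafe_b64encode(key.encode('utf-8')).decode('ascii')
    return (encoded,)


def from_path(path):
    """Decode a one-element path back into the original str key."""
    if len(path) != 1:
        raise ValueError('Expected a path with exactly one element.')
    return base64.urlsafe_b64decode(path[0].encode('ascii')).decode('utf-8')


path = to_path('https://thomaslevine.com/index.html')
print(path)
print(from_path(path))
```

Because the whole key is encoded into one file name, keys such as 'https://thomaslevine.com/' and 'https://thomaslevine.com/index.html' no longer compete for the same directory.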

Creating a transformer

A transformer converts keys to paths and paths to keys, where keys are the things that we use to index a Vlermv object and paths are Vlermv’s internal representation of file paths.

In this section I define a “key” and a “path” and then explain how to implement a transformer for translating between keys and paths.

Keys

In the following query, 234 is the key.

Vlermv('tmp')[234]

And in this one,

Vlermv('tmp')[('a', (1, 2))]

('a', (1, 2)) is the key.

Paths

Internally in Vlermv, paths get represented as tuples of directory and file names. Here are some examples of how the mapping works.

Vlermv tuple path          Ordinary string path
('./x', 'y', 'z')          x/y/z
('x', 'y', 'z')            x/y/z
('', 'x', 'y', 'z')        x/y/z
('/', 'usr', 'bin')        usr/bin

Aside from the basic conversion between strings and tuples, the main thing that is going on here is sandboxing the paths to be descendants of the vlermv directory; there is no path that you can specify that will let you read or write outside of the vlermv directory. Here are two examples that use the magic transformer.

vlermv['/foo/bar/baz'] # Saves to ./foo/bar/baz
vlermv['C:\\foo\\bar\\baz'] # Saves to ./c/foo/bar/baz
                            # (lowercase "c")

All paths are relative to the vlermv root, and absolute paths are converted to relative paths.

When tuple paths are created from file names in vlermv.Vlermv.keys or vlermv.Vlermv.items, they contain none of the elements '/', '.', or '..'. That is, they are normalized and relative. For example, a path ./a/b/c becomes ('a', 'b', 'c').

Empty paths

Some paths are not allowed. Attempting to use an empty path, a path that resolves to ./, or a relative path that points outside of the vlermv root will raise an error. Here are some more complex examples.

Vlermv tuple path            Ordinary string path
('a', '..', 'b', 'c')        b/c
('..', '..', 'bin', 'sh')    (Not allowed)
('/', '..')                  (Not allowed)
('./', 'd')                  d
('./',)                      (Not allowed)
('', '', '')                 (Not allowed)
tuple()                      (Not allowed)
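
The example rows in this section can be reproduced with a short normalization routine. The function below, normalize_path, is a hypothetical sketch and not Vlermv's internal implementation; it simply encodes the rules stated above: drop empty and '.' pieces, resolve '..' against earlier pieces, and refuse anything that ends up empty or climbs above the root.

```python
def normalize_path(path):
    """Hypothetical helper: turn a tuple path into a sandboxed relative string."""
    parts = []
    for element in path:
        # An element such as './x' or '/' may itself contain slashes.
        for piece in element.split('/'):
            if piece in ('', '.'):
                continue
            if piece == '..':
                if not parts:
                    raise ValueError('Path leaves the vlermv root: %r' % (path,))
                parts.pop()
            else:
                parts.append(piece)
    if not parts:
        raise ValueError('Empty path: %r' % (path,))
    return '/'.join(parts)


print(normalize_path(('./x', 'y', 'z')))        # x/y/z
print(normalize_path(('/', 'usr', 'bin')))      # usr/bin
print(normalize_path(('a', '..', 'b', 'c')))    # b/c
```

The paths marked "Not allowed" raise ValueError in this sketch; which exception Vlermv itself raises is not stated here.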

Transformer API

Now on to the transformer itself! A transformer is a Python object with the following methods.

vlermv.hypothetical_transformer.to_path(key:str) → path:tuple

Convert a key to a path.

vlermv.hypothetical_transformer.from_path(path:tuple) → key:str

Convert a path to a key.

For example, this is what the simple transformer looks like.

import os

error_msg = '''The default transformer cannot handle slashes (subdirectories);
try another transformer in vlermv.transformers.'''

def to_path(key):
    if '/' in key or '\\' in key or os.path.sep in key:
        raise ValueError(error_msg)
    return (key,)

def from_path(path):
    if len(path) != 1:
        raise ValueError(error_msg)
    return path[0]
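
As a second illustration, here is a sketch of a custom transformer written as a class instead of a module. DottedTransformer is a made-up example, not something included in Vlermv: it assumes keys are dotted names such as 'reports.2014.summary' and stores each component as one level of the directory tree.

```python
class DottedTransformer:
    """Hypothetical transformer: 'a.b.c' <-> ('a', 'b', 'c')."""

    def to_path(self, key):
        if not isinstance(key, str):
            raise ValueError('Keys must be str, not %r' % type(key))
        if '/' in key or '\\' in key:
            raise ValueError('Keys may not contain slashes: %r' % key)
        parts = tuple(key.split('.'))
        # Reject '', 'a..b', '.a', and similar keys with empty components.
        if any(part == '' for part in parts):
            raise ValueError('Keys may not have empty components: %r' % key)
        return parts

    def from_path(self, path):
        return '.'.join(path)


transformer = DottedTransformer()
print(transformer.to_path('reports.2014.summary'))
print(transformer.from_path(('reports', '2014', 'summary')))
```

Any object with these two methods fits the interface described above, so a module with two functions and an instance of a class are interchangeable.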

Tip: Keep your transformers simple

How much logic should you put inside a transformer, and how much logic should go in the outside code? I suggest that you make your transformers very small.

Transformers are called several layers down in Vlermv’s functions, and this can make them harder to debug than ordinary functions that are not associated with Vlermv. If you have to do especially complex key transformations, such as splitting up a filename into several parts to parse data from it, you should start by using an identity transformer and do the manipulations outside the transformer; turn your code into a transformer only after you trust that your implementation is correct and stable.
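
One way to build that trust is to exercise to_path and from_path as plain functions before handing them to Vlermv. The helper below, check_round_trip, is a hypothetical sketch: it only checks that each key maps to a tuple and survives the trip back, and it works on any object that has the two methods.

```python
from types import SimpleNamespace


def check_round_trip(transformer, keys):
    """Return the keys that do not survive to_path followed by from_path."""
    failures = []
    for key in keys:
        path = transformer.to_path(key)
        if not isinstance(path, tuple) or transformer.from_path(path) != key:
            failures.append(key)
    return failures


# An identity-style transformer: the key is the only path element.
identity = SimpleNamespace(
    to_path=lambda key: (key,),
    from_path=lambda path: path[0],
)

# A deliberately lossy transformer: lowercasing cannot be undone.
lossy = SimpleNamespace(
    to_path=lambda key: (key.lower(),),
    from_path=lambda path: path[0],
)

print(check_round_trip(identity, ['abc', 'ABC']))  # []
print(check_round_trip(lossy, ['abc', 'ABC']))     # ['ABC']
```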