3. Phyles API

3.1. API Overview

The phyles API has several functions and classes to facilitate the construction of medium-sized utilities. These functions and classes are divided into five categories:

3.2. Classes for Configurations and Schemata

3.3. Functions for Configurations and Schemata

3.4. Functions for Files and Directories

  • phyles.last_made

    returns the most recently created file in a directory

  • phyles.get_home_dir

    returns the user's home directory in a representation native to the host OS

  • phyles.get_data_path

    returns the absolute path to a data directory within a package

  • phyles.package_spec

    reads and returns the contents of a schema specification somewhere in a package as YAML text

  • phyles.prune

    recursively deletes files matching specified Unix-style patterns

  • phyles.zipdir

    uses python zipfile package to create a zip archive of a directory

3.5. Functions for User Interaction

3.6. Functions for a One-Size-Fits-All Runtime

  • phyles.set_up

    sets up the runtime with an argparse.ArgumentParser, loads a schema and validates a config with config override, and prepares state for graceful recovery from user error

  • phyles.run_main

    trivial try-except block for graceful recovery from anticipated types of user error

  • phyles.mapify

    function decorator that converts a function taking an arbitrary set of arguments into a function taking as its single argument a mapping object keyed with the names of the original arguments

3.7. API Details

phyles: A package to simplify authoring utilities. Copyright (C) 2013 James C. Stroud. All rights reserved.

class phyles.Schema(*args, **kwds)

An OrderedDict subclass that provides identity for schemata.

A Schema has an implicit structure and is created by the load_schema() or read_schema() functions. See the documentation in load_schema() for a detailed explanation.

Warning

Creating instances of Schema through its class constructor (i.e. '__init__') is not yet advised or supported and may break forward compatibility.

read_config(*args, **kwargs)

This is a wrapper for read_config() (see documentation therein).

Comparison of usage with read_config():

phyles.read_config(schema, config)
schema.read_config(config)

sample_config(*args, **kwargs)

This is a wrapper for sample_config() (see documentation therein).

Comparison of usage with sample_config():

phyles.sample_config(schema)
schema.sample_config()

validate_config(*args, **kwargs)

This is a wrapper for validate_config() (see documentation therein).

Comparison of usage with validate_config():

phyles.validate_config(schema, config)
schema.validate_config(config)

class phyles.Configuration(config=None)

An OrderedDict subclass that encapsulates configurations and also remembers the original input.

A Configuration has a specific structure and is created by the validate_config() or read_config() functions, usually by the latter. The Configuration class is exposed in the API purely for purposes of documentation.

Attributes:
original: the original config as an OrderedDict, which preserves the user's input while still allowing the values to be converted

>>> colors = {'red': 'ff0000',
...           'green': '00ff00',
...           'blue': '0000ff'}
>>> c = Configuration({'color': 'blue'})
>>> c['color'] = colors[c['color']]
>>> c['color']
'0000ff'
>>> c.original['color']
'blue'
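
The idea of remembering the original input while converting values in place can be sketched with a small OrderedDict subclass. This is only an illustration under stated assumptions, not the phyles implementation: the class name RememberingConfig is hypothetical, and the real Configuration should be obtained from validate_config() or read_config().

```python
from collections import OrderedDict


class RememberingConfig(OrderedDict):
    """Hypothetical stand-in: a mapping that keeps a copy of its initial items."""

    def __init__(self, config=None):
        super().__init__(config or {})
        # Snapshot taken before any conversion touches the values.
        self.original = OrderedDict(self)


colors = {'red': 'ff0000', 'green': '00ff00', 'blue': '0000ff'}
c = RememberingConfig({'color': 'blue'})
c['color'] = colors[c['color']]  # convert the value in place
print(c['color'], c.original['color'])
```

The snapshot is a separate OrderedDict, so later assignments to the configuration do not alter what the user originally supplied.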

Warning

Creating instances of Configuration through its class constructor (i.e. with phyles.Configuration()) is not yet advised or supported and may break forward compatibility.

phyles.read_schema(yaml_file, converters=None)

Loads the schema specified in the file named in yaml_file. This function simply opens and reads the yaml_file before sending its contents to load_schema() to produce the Schema.

Args:

yaml_file: name of a yaml file holding the specification

converters:
a dict of converters keyed by config entry names, as described in load_schema()
Returns:
a Schema as described in load_schema()

phyles.load_schema(spec, converters=None)

Creates a Schema from the specification, spec.

Note

If the schema specification is in a YAML file, then use phyles.read_schema(), which is a convenience wrapper around load_schema().

Args:

spec:

Can be YAML text, a list of 2-tuples with unique first elements, a mapping object (e.g. dict), or None (which is equivalent to an empty dict). If the schema is in a YAML file, then use phyles.read_schema(). The values of the items of spec are:

  1. converter
  2. example value
  3. help string
  4. default value (optional)

Example YAML specification as a complete YAML file:

%YAML 1.2
---
!!omap
- 'pdb model' : [str, my_model.pdb, null]
- 'reset b-facs' : 
      - float
      - -1
      - "New B factor (-1 for no reset)"
      - -1
- 'cell dimensions' : [get_cell, [200, 200, 200], null]

The same example as a python specification via a list of 2-tuples with unique first elements:

[('pdb model',
  ['str', 'my_model.pdb', None]),
 ('reset b-facs',
  ['float', -1, 'New B factor (-1 for no reset)', -1]),
 ('cell dimensions',
  [get_cell, [200, 200, 200], None])]

For completeness, the same example as a dict:

{'pdb model': ['str', 'my_model.pdb', None],
 'reset b-facs':
   ['float', -1, 'New B factor (-1 for no reset)', -1],
 'cell dimensions': [get_cell, [200, 200, 200], None]}

Note

The python structure (list of 2-tuples) of this example specification is simply the result of parsing the YAML with the PyYAML parser. Because of isomorphism between a list of 2-tuples with unique first elements, OrderedDicts, dicts, and other mapping types, the specification may take any of these forms.

The following YAML representation of a config conforms to the preceding schema specification:

pdb model : model.pdb
reset b-facs : 20
cell dimensions : [59, 95, 159]
converters:

A dict of callables keyed by converter name (which must match the converter names in spec). The callables convert values from the actual config.

Converters that correspond to several native python types and YAML types do not need to be explicitly specified. The names that these converters take in a schema specification and the corresponding python types produced by these converters are:

Note

Except where noted, these types are encoded according to the YAML types specification in a YAML representation of a config.

Returns:
A fully constructed schema in the form of a Schema. Most notably, the strings specifying the converters in the spec are replaced by the converters themselves.
>>> import phyles
>>> import textwrap
>>> def get_cell(cell):
...     return [float(f) for f in cell]
>>> converters = {'get_cell' : get_cell}
>>> y = '''
        %YAML 1.2
        ---
        !!omap
        - 'pdb model' : [str, my_model.pdb, null]
        - 'reset b-facs' : 
              - float
              - -1
              - "New B factor (-1 for no reset)"
              - -1
        - 'cell dimensions' : [get_cell, [200, 200, 200], null]
        '''
>>> phyles.load_schema(textwrap.dedent(y), converters=converters)
Schema([('pdb model',
         [<type 'str'>, 'my_model.pdb', None]),
        ('reset b-facs',
         [<type 'float'>, -1,
          'New B factor (-1 for no reset)', -1]),
        ('cell dimensions',
         [<function get_cell at 0x101d7cf50>,
          [200, 200, 200], None])])
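
The replacement of converter names by callables can be sketched in a few lines. The following is a simplified illustration, not the phyles implementation: resolve_converters and the BUILTIN_CONVERTERS table are hypothetical names, and the table holds only a handful of Python types chosen for the example.

```python
from collections import OrderedDict

# Assumption: a small, illustrative subset of built-in converter names.
BUILTIN_CONVERTERS = {'str': str, 'int': int, 'float': float,
                      'complex': complex}


def resolve_converters(spec, converters=None):
    """Return an OrderedDict in which converter names are replaced by callables."""
    table = dict(BUILTIN_CONVERTERS)
    table.update(converters or {})
    resolved = OrderedDict()
    # OrderedDict() accepts both a mapping and a list of 2-tuples.
    for name, item in OrderedDict(spec or {}).items():
        converter = item[0]
        if not callable(converter):
            converter = table[converter]  # KeyError for unknown names
        resolved[name] = [converter] + list(item[1:])
    return resolved


def get_cell(cell):
    return [float(f) for f in cell]


spec = [('pdb model', ['str', 'my_model.pdb', None]),
        ('reset b-facs', ['float', -1, 'New B factor (-1 for no reset)', -1]),
        ('cell dimensions', ['get_cell', [200, 200, 200], None])]
schema = resolve_converters(spec, {'get_cell': get_cell})
print(schema['reset b-facs'][0])
```
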
phyles.sample_config(schema)

Creates a sample config specification (returned as a str) from the schema, as described in read_schema().

Items with empty help strings (i.e. "") are not included in the sample config. These may be used for testing, etc.

Args:
schema: a Schema
Returns:
A str that is useful as a template config specification. Example values from the schema will be used. Additionally, the help strings will be inserted as reasonably formatted YAML comments.
>>> import phyles
>>> import textwrap
>>> def get_cell(cell):
...     return [float(f) for f in cell]
>>> converters = {'get_cell' : get_cell}
>>> y = '''
        !!omap
        - 'pdb model' : [str, my_model.pdb, null]
        - 'reset b-facs' : 
              - float
              - -1
              - "New B factor (-1 for no reset)"
              - -1
        - 'cell dimensions' : [get_cell, [200, 200, 200], null]
        '''
>>> schema = phyles.load_schema(textwrap.dedent(y),
...                             converters=converters)
>>> print phyles.sample_config(schema)
%YAML 1.2
---

pdb model : my_model.pdb

# New B factor (-1 for no reset)
reset b-facs : -1

cell dimensions : [200, 200, 200]
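
A rough sketch of how such a template could be assembled from a schema follows. It is an assumption-laden illustration rather than the phyles code: template_from_schema is a hypothetical name, and example values are written with plain str(), which is adequate for this schema but is not a general YAML serializer.

```python
from collections import OrderedDict


def template_from_schema(schema):
    """Build a template config as text from [converter, example, help, ...] items."""
    lines = ["%YAML 1.2", "---"]
    for name, item in schema.items():
        example, help_text = item[1], item[2]
        if help_text == "":
            continue  # entries with an empty help string are left out
        lines.append("")
        if help_text:
            lines.append("# %s" % help_text)
        lines.append("%s : %s" % (name, example))
    return "\n".join(lines)


schema = OrderedDict([
    ('pdb model', [str, 'my_model.pdb', None]),
    ('reset b-facs', [float, -1, 'New B factor (-1 for no reset)', -1]),
    ('cell dimensions', [list, [200, 200, 200], None]),
    ('hidden', [str, 'x', '']),
])
print(template_from_schema(schema))
```
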
phyles.validate_config(schema, config)

Takes a YAML specification for a configuration, config, and uses the schema (as described in load_schema()) for validation, which:

  1. checks for required config entries, raising a ConfigError if any are missing
  2. ensures that no unrecognized config entries are present, raising a ConfigError if any such entries are present
  3. ensures, through the use of converters, that the values given in the config specification are of the appropriate types and within accepted limits (if applicable), raising a ConfigError if any fail to convert
  4. uses the converters to turn values given in the configuration into values of the appropriate types (e.g. the YAML str '1+4j' is converted into the python complex number (1+4j) if the converter is 'complex')

Note

Why is conversion a part of validation? Conversion lets the end-user work with a minimal subset of the YAML vocabulary. In the complex number example above, the end-user only needs to know how complex numbers are usually represented (e.g. '1+4j') and not what gibberish like '!!python/object:__main__.SomeComplexClass' means, where to put it, how to specify its attributes, etc.
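
The four validation steps can be sketched with plain Python. This is a minimal illustration, not the phyles implementation: validate and the local ConfigError class are hypothetical, and the sketch assumes that an entry is optional exactly when its schema item carries a fourth (default) element.

```python
from collections import OrderedDict


class ConfigError(Exception):
    """Hypothetical error type used only by this sketch."""


def validate(schema, config):
    """Check config against schema and return the converted values."""
    unknown = set(config) - set(schema)
    if unknown:
        raise ConfigError("unrecognized entries: %s" % sorted(unknown))
    result = OrderedDict()
    for name, item in schema.items():
        converter = item[0]
        if name in config:
            raw = config[name]
        elif len(item) > 3:
            raw = item[3]  # assumption: a default makes the entry optional
        else:
            raise ConfigError("missing required entry: %r" % name)
        try:
            result[name] = converter(raw)
        except (TypeError, ValueError) as e:
            raise ConfigError("bad value for %r: %s" % (name, e))
    return result


def get_cell(cell):
    return [float(f) for f in cell]


schema = OrderedDict([
    ('pdb model', [str, 'my_model.pdb', None]),
    ('reset b-facs', [float, -1, 'New B factor (-1 for no reset)', -1]),
    ('cell dimensions', [get_cell, [200, 200, 200], None]),
])
cfg = validate(schema, {'pdb model': 'model.pdb',
                        'reset b-facs': 20,
                        'cell dimensions': [59, 95, 159]})
print(list(cfg.items()))
```

Because the converters do the type checking, a value such as 'abc' for 'reset b-facs' fails inside float() and surfaces as a ConfigError.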

Args:
config:
a mapping (e.g. OrderedDict or dict) of configuration entries
schema:
a Schema as described in load_schema()
Returns:
The converted config as a Configuration.
Raises:
ConfigError
>>> import phyles
>>> import textwrap
>>> import yaml
>>> def get_cell(cell):
...     return [float(f) for f in cell]
>>> converters = {'get_cell' : get_cell}
>>> y = '''
        %YAML 1.2
        ---
        !!omap
        - 'pdb model' : [str, my_model.pdb, null]
        - 'reset b-facs' : 
              - float
              - -1
              - "New B factor (-1 for no reset)"
              - -1
        - 'cell dimensions' : [get_cell, [200, 200, 200], null]
        '''
>>> schema = phyles.load_schema(textwrap.dedent(y),
...                             converters=converters)
>>> y = '''
        pdb model : model.pdb
        reset b-facs : 20
        cell dimensions : [59, 95, 159]
        '''
>>> cfg = yaml.safe_load(textwrap.dedent(y))
>>> cfg
{'cell dimensions': [59, 95, 159],
 'pdb model': 'model.pdb',
 'reset b-facs': 20}
>>> phyles.validate_config(schema, cfg)
Configuration([('pdb model', 'model.pdb'),
               ('reset b-facs', 20.0),
               ('cell dimensions', [59.0, 95.0, 159.0])])

phyles.read_config(schema, config_file)

Reads a YAML config file from the file named config_file and returns the config validated by schema.

Args:
config_file:

YAML file holding the config, for example:

pdb model : model.pdb
reset b-facs : 20
cell dimensions : [59, 95, 159]

schema: a Schema as described in load_schema()

Returns:
a Configuration

phyles.last_made(dirpath='.', suffix=None, depth=0)

Returns the most recently created file in dirpath. If suffix is provided, the newest of the files with the given suffix or suffixes is returned. This will recurse to depth, with dirpath being at depth 0 and its child directories being at depth 1, etc. Set depth to any value other than 0 or a positive integer (e.g. -1 or None) if recursion should be exhaustive.

Returns None if no files are found that match suffix.

The suffix parameter is either a single suffix (e.g. '.txt') or a sequence of suffixes (e.g. ['.txt', '.text']).
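
The search described above might be sketched as follows. This is an illustration, not the phyles code: newest_file is a hypothetical name, and the sketch uses os.path.getctime(), which reports creation time on Windows but the time of the last metadata change on most Unix systems.

```python
import os


def newest_file(dirpath='.', suffix=None, depth=0):
    """Return the path of the newest matching file under dirpath, or None."""
    if isinstance(suffix, str):
        suffix = (suffix,)
    # Anything other than 0 or a positive integer means "recurse fully".
    exhaustive = not (isinstance(depth, int) and depth >= 0)
    base = os.path.abspath(dirpath)
    newest, newest_time = None, None
    for root, dirs, files in os.walk(base):
        if root == base:
            level = 0
        else:
            level = os.path.relpath(root, base).count(os.sep) + 1
        if not exhaustive and level >= depth:
            dirs[:] = []  # stop os.walk from descending further
        for name in files:
            if suffix is not None and not name.endswith(tuple(suffix)):
                continue
            path = os.path.join(root, name)
            created = os.path.getctime(path)
            if newest_time is None or created > newest_time:
                newest, newest_time = path, created
    return newest


print(newest_file('.', suffix=['.txt', '.text'], depth=-1))
```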

phyles.wait_exec(cmd, instr=None)

Waits for cmd to execute and returns the output from stdout. The cmd follows the same rules as for python’s popen2.Popen3. If instr is provided, this string is passed to the standard input of the child process.

Except for the convenience of passing instr, this function is somewhat redundant with Python’s subprocess.call().
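
For comparison, the same behavior can be approximated with the subprocess module alone. This is a sketch under stated assumptions, not the phyles implementation: run_and_capture is a hypothetical name, and a str command is assumed to be meant for the shell.

```python
import subprocess
import sys


def run_and_capture(cmd, instr=None):
    """Run cmd, optionally feed instr to its stdin, and return its stdout as text."""
    proc = subprocess.Popen(
        cmd,
        shell=isinstance(cmd, str),  # assumption: str commands go through the shell
        stdin=subprocess.PIPE if instr is not None else None,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    out, _ = proc.communicate(instr)  # blocks until the child exits
    return out


# The child process upper-cases whatever arrives on its standard input.
child = "import sys; sys.stdout.write(sys.stdin.read().upper())"
print(run_and_capture([sys.executable, "-c", child], instr="abc"))
```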

phyles.doyn(msg, cmd=None, exc=<built-in function system>, outfile=None)

Uses the raw_input() builtin to ask the user a yes-or-no question. If cmd is provided, then the function specified by exc (default os.system()) will be called with the argument cmd.

If a file name for outfile is provided, then stdout will be directed to a file of that name.

phyles.banner(program, version, width=70)

Uses the program and version to print a banner to stderr. The banner will be printed at width (default 70).

Args:

program: str

version: str

width: int

phyles.usage(parser, msg=None, width=70, pad=4)

Uses the parser (argparse.ArgumentParser) to print the usage. If msg (which can be an Exception, str, etc.) is supplied then it will be printed as an error message, hopefully in a way that catches the user’s eye. The usage message will be formatted at width (default 70). The pad is used to add some extra space to the beginning of the error lines to make them stand out (default 4).

phyles.graceful(msg=None, width=70, pad=4)

Gracefully exits the program with an error message.

The msg, width and pad arguments are the same as for usage().

phyles.get_home_dir()

Returns the home directory of the account under which the python program is executed. The home directory is represented in a manner that is comprehensible by the host operating system (e.g. C:\something\ for Windows, etc.).

Adapted directly from K. S. Sreeram’s approach, message 393819 on c.l.python (7/18/2006). I treat this code as public domain.

phyles.get_data_path(env_var, package_name, data_dir)

Returns the path to the data directory. First it looks for the directory specified in the env_var environment variable and, if that directory does not exist, finds data_dir in one of the following places (in this order):

  1. The package directory (i.e. where the __init__.py is for the package named by the package_name parameter)
  2. If the package is a file, the directory holding the package file
  3. If frozen, the directory holding the frozen executable
  4. If frozen, the parent directory of the directory holding the frozen executable
  5. If frozen, the first element of sys.path

Thus, if the package were at /path/to/my_package, (i.e. with /path/to/my_package/__init__.py), then a very reasonable place for the data directory would be /path/to/my_package/package-data/.

The anticipated use of this function is within the package with which the data files are associated. For this use, the package name can be found with the global variable __package__, which for this example would have the value 'my_package'. E.g.:

pth = get_data_path('MYPACKAGEDATA', __package__, 'package-data')

This code is adapted from _get_data_path() from matplotlib __init__.py. Some parts of this code are most likely subject to the matplotlib license.

Note

The env_var argument can be ignored using phyles.Undefined because it’s guaranteed not to be in os.environ:

pth = get_data_path(Undefined, __package__, 'package-data')

Warning

The use of '__package__' for package_name will fail in certain circumstances. For example, if the value of __name__ is '__main__', then __package__ is usually None. In such cases, it is necessary to pass the package name explicitly.

pth = get_data_path(Undefined, 'my_package', 'package-data')
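
The first two lookup steps (environment variable, then package directory) can be sketched as below. This is a simplified illustration rather than the phyles code: find_data_dir is a hypothetical name, the frozen-executable cases are left out, and the standard library package json merely stands in for a real package.

```python
import importlib
import os


def find_data_dir(env_var, package_name, data_dir):
    """Prefer the directory named by env_var; otherwise look beside the package."""
    override = os.environ.get(env_var)
    if override and os.path.isdir(override):
        return override
    package = importlib.import_module(package_name)
    package_dir = os.path.dirname(os.path.abspath(package.__file__))
    return os.path.join(package_dir, data_dir)


print(find_data_dir('MYPACKAGEDATA', 'json', 'package-data'))
```
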
phyles.prune(patterns, doit=False)

Recursively deletes files matching the specified Unix-style patterns. The doit parameter must be explicitly set to True for the files to actually be deleted; otherwise, the function only logs, with logging.info(), what would be deleted. Raises a SystemExit if deletion of any file is unsuccessful (only when doit is True).

Example:

prune(['*~', '*.pyc'], doit=True)
Args:
  • patterns: list of unix style pathname patterns
  • doit: bool

Returns: None

Raises: SystemExit
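
The dry-run-by-default behavior might look like the sketch below. It is an illustration only, not the phyles implementation: prune_sketch is a hypothetical name, and both the root parameter and the returned list of matches are additions made so that the example is easy to try in a scratch directory.

```python
import fnmatch
import logging
import os


def prune_sketch(patterns, root='.', doit=False):
    """Walk root; delete (or, by default, only log) files matching any pattern."""
    matched = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if any(fnmatch.fnmatch(name, p) for p in patterns):
                path = os.path.join(dirpath, name)
                matched.append(path)
                if doit:
                    os.remove(path)
                else:
                    logging.info("would remove %s", path)
    return matched


# Dry run: nothing is deleted unless doit=True is passed explicitly.
print(prune_sketch(['*~', '*.pyc']))
```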

phyles.default_argparser()

Returns a default argparse.ArgumentParser with mutually exclusive template (-t, --template) and config (-c, --config) arguments added. It also has the override (-o, --override) option added, to override configuration items on the command line.

The argument to --override should be a valid YAML map (with the single exception that the outermost curly braces are optional for YAML flow style). Because YAML relies on syntactically meaningful whitespace, single quotes should surround the argument to --override.

The following examples execute a program called program, overriding opt1 and opt2 of the config in the file config.yml with foo and bar, respectively:

program -c config.yml -o 'opt1 : foo, opt2 : bar'
program -c config.yml -o '{opt1 : foo, opt2 : bar}'
program -c config.yml -o 'opt1 : foo\nopt2 : bar'

Note

The last example illustrates how YAML block style can be used with --override: a single backslash (\) escapes an n, which evaluates to a newline. In other words, the YAML that corresponds to this last example is:

opt1 : foo
opt2 : bar

Similarly, other escape sequences can also be used with --override. For example, the following overrides an option called sep, setting its value to the tab character:

program -c config.yml -o 'sep : "\t"'

Returns: argparse.ArgumentParser
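
A parser with the same command-line shape can be built directly with argparse. The sketch below is an approximation, not the parser that phyles returns: make_parser is a hypothetical name, and the choice of store_true for --template as well as the help strings are assumptions.

```python
import argparse


def make_parser():
    """Build a parser with exclusive -t/-c options and a separate -o option."""
    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group()
    group.add_argument('-t', '--template', action='store_true',
                       help='print a template config')
    group.add_argument('-c', '--config', metavar='FILE',
                       help='YAML config file')
    parser.add_argument('-o', '--override', metavar='YAML',
                        help='YAML map of config items to override')
    return parser


args = make_parser().parse_args(['-c', 'config.yml',
                                 '-o', 'opt1 : foo, opt2 : bar'])
print(args.config, args.override)
```

Supplying both -t and -c makes argparse print a usage error and exit, which is what the mutually exclusive group is for.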

phyles.package_spec(env_var, package_name, data_dir, specfile_name)

Reads and returns the contents of a schema specification somewhere in a package as YAML text (described in load_schema()).

This function pulls out all the stops to find the specification. It is best to try to give all of env_var, package_name, and data_dir if they are available to have the best chance of finding the path to the specification file. See get_data_path() for a full description.

Args:

The arguments env_var, package_name, and data_dir are identical to those required in get_data_path().

specfile_name: name of the schema specification file found within the package contents.

Returns:
A YAML string specifying the schema.

phyles.set_up(program, version, spec, converters=None, argparser=None)

Given the name of the program (program), the version string, the specification for the schema (spec; described in load_schema()), and converters for the schema (also described in load_schema()), this function:

  1. sets up the default argparser if the argparser keyword argument is None or not provided (see default_argparser())
  2. prints the template or banner as appropriate (see template() and banner())
  3. creates a schema and uses it to validate the config (see load_schema() and validate_config())
  4. overrides items in the config according to the command line option --override or -o (see default_argparser() for a description of --override)
  5. exits gracefully with usage() if any problems are found in the command line arguments or config
  6. returns a dict of the argparse.ArgumentParser, the parsed command line arguments, the Schema, and the Configuration with keys 'argparser', 'args', 'schema', 'config', respectively
Args:

program: the program name as a str

version: the program version as a str

spec: a schema specification as described in load_schema()

converters: a dict of converters as described in load_schema()

argparser: used as the argparse.ArgumentParser if provided; else the argparse.ArgumentParser returned by default_argparser() is used

Returns: a dict with the keys:

  1. 'argparser': argparse.ArgumentParser
  2. 'args': the parsed command line arguments as an argparse.Namespace
  3. 'schema': schema for the configuration as a Schema
  4. 'config': the configuration as a Configuration

phyles.run_main(main, config, catchall=<class 'phyles._phyles.DummyException'>)

Trivial convenience function that runs a main function within a try-except block. The main function should take as its sole argument the config, which is a mapping object that holds the configuration (usually a Configuration object). The catchall is an Exception or a tuple of Exceptions, which if caught will result in a graceful exit of the program (see graceful()).

Args:

main: a function, equivalent to the main function of a program

config: a mapping object, usually a Configuration, which is generally produced by validate_config() or read_config()

catchall: an Exception or a tuple of Exceptions
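
The pattern amounts to a guarded call. The following sketch shows the idea and is not the phyles implementation: run_guarded is a hypothetical name, and the error message format and the exit status of 1 are assumptions.

```python
import sys


def run_guarded(main, config, catchall=()):
    """Call main(config); turn anticipated exceptions into a clean exit."""
    try:
        return main(config)
    except catchall as e:
        # Only the anticipated exception types are reported this way;
        # anything else propagates with a normal traceback.
        sys.stderr.write("Error: %s\n" % e)
        sys.exit(1)


def main(config):
    return config['x'] * 2


print(run_guarded(main, {'x': 21}, catchall=(ValueError, IOError)))
```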

phyles.mapify(f)

Given a function f with an arbitrary set of arguments and keyword arguments, a new function is returned that instead takes as an argument a mapping object (e.g. a dict) that has as keys the original arguments and keyword arguments of f.

If f has “kwargs”, as in (def f(**kwargs):), then items in the mapping object that do not correspond to any arguments or defaults are included in kwargs. See the 'extra' key in the first example in the doctest below.

Args:
  • f: a function

Returns: a function

>>> amap = {'a': 1, 'b': 2, 'c': 3, 'd': 42,
...         'args': [7, 8],
...         'kwargs': {'bob':39, 'carol':36},
...         'extra': 99}
>>> @mapify
... def f(a, (b, c), d=2, *args, **kwargs):
...   print "a=%s  (b=%s  c=%s) d=%s" % (a, b, c, d)
...   print "args are: %s" % (args,)
...   print "kwargs are: %s" % (kwargs,)
...
>>> f(amap)
a=1  (b=2  c=3) d=42
args are: (7, 8)
kwargs are: {'bob': 39, 'carol': 36, 'extra': 99}
>>> @mapify
... def g(a):
...   print "a is: %s" % (a,)
... 
>>> g(amap)
a is: 1
>>> @mapify
... def h(y=4, *args):
...   print "y is: %s" % (y,)
...   print "args are: %s" % (args,)
... 
>>> h(amap)
y is: 4
args are: (7, 8)
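
The doctest above relies on Python 2 syntax (print statements and the tuple parameter (b, c)), neither of which exists in Python 3. A comparable decorator can be sketched for Python 3 with inspect.signature(). This is an illustration of the idea, not the phyles implementation: mapify_sketch is a hypothetical name, and tuple parameters are omitted.

```python
import functools
import inspect


def mapify_sketch(f):
    """Wrap f so that it takes one mapping keyed by f's parameter names."""
    params = inspect.signature(f).parameters

    @functools.wraps(f)
    def wrapper(mapping):
        args, kwargs = [], {}
        for name, p in params.items():
            if p.kind is p.VAR_POSITIONAL:
                args.extend(mapping.get(name, ()))
            elif p.kind is p.VAR_KEYWORD:
                kwargs.update(mapping.get(name, {}))
                # Keys that match no parameter are passed along as extras.
                kwargs.update((k, v) for k, v in mapping.items()
                              if k not in params)
            elif p.kind is p.KEYWORD_ONLY:
                if name in mapping:
                    kwargs[name] = mapping[name]
            elif name in mapping:
                args.append(mapping[name])
            elif p.default is not p.empty:
                args.append(p.default)
            else:
                raise TypeError("missing key for parameter %r" % name)
        return f(*args, **kwargs)

    return wrapper


@mapify_sketch
def f(a, b, d=2, *args, **kwargs):
    return a, b, d, args, kwargs


amap = {'a': 1, 'b': 2, 'd': 42, 'args': [7, 8],
        'kwargs': {'bob': 39, 'carol': 36}, 'extra': 99}
print(f(amap))
```
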
phyles.get_terminal_size()

Returns the width and height of the console. Works on Linux, OS X, Windows, and Cygwin (Windows).

Based on https://gist.github.com/jtriley/1108174 (originally retrieved from: http://goo.gl/CcPZh)

Returns: 2-tuple of int
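
On Python 3.3 and later, the standard library offers a similar query through shutil.get_terminal_size(). The snippet below is offered only as a point of comparison, not as what phyles does internally; the fallback of 80 by 24 is the value used when no terminal can be detected.

```python
import shutil

# Falls back to 80 columns by 24 lines when no terminal is attached.
size = shutil.get_terminal_size(fallback=(80, 24))
width, height = size.columns, size.lines
print(width, height)
```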

phyles.zipdir(basedir, archivename)

Uses python zipfile package to create a zip archive of the directory basedir and store the archive to archivename.

Virtually unmodified from http://goo.gl/Ty5k9 except that empty directories aren’t ignored.

Args:
  • basedir: directory to zip as str
  • archivename: name of zip archive as str

Returns: None
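
Archiving a directory without losing empty subdirectories can be sketched with zipfile and os.walk(). This is an illustration, not the phyles code: zip_directory is a hypothetical name, and the sketch assumes that the archive is written outside of basedir and that entries should be stored relative to the parent of basedir.

```python
import os
import zipfile


def zip_directory(basedir, archivename):
    """Zip basedir into archivename, keeping empty directories."""
    basedir = os.path.abspath(basedir)
    parent = os.path.dirname(basedir)
    with zipfile.ZipFile(archivename, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(basedir):
            rel_root = os.path.relpath(root, parent)
            if not dirs and not files:
                # A trailing slash marks a directory entry in the archive.
                zf.writestr(rel_root.replace(os.sep, '/') + '/', '')
            for name in files:
                zf.write(os.path.join(root, name),
                         os.path.join(rel_root, name))
```

A call such as zip_directory('project', 'project.zip') would then store entries under the top-level name project/.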

phyles.basic_logger(name, level=20)

Returns an instance of logging.Logger named name with level of level (defaults to logging.INFO). The format of the messages is “%(levelname)s: %(message)s”.
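
An equivalent logger can be configured with the logging module directly. The sketch below is an approximation, not the phyles implementation: make_logger is a hypothetical name, and the use of a StreamHandler plus the guard against adding a second handler are assumptions.

```python
import logging


def make_logger(name, level=logging.INFO):
    """Return a logger that emits 'LEVEL: message' lines."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate output on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger


log = make_logger('demo')
log.info("configuration loaded")
```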