The phyles API has several functions and classes to facilitate the construction of medium-sized utilities. These functions and classes are divided into four categories:
- phyles.Schema
encapsulates a schema and wraps phyles.sample_config, phyles.validate_config, and phyles.read_config for convenience
- phyles.Configuration
encapsulates a configuration, remembering values before any conversion
- phyles.read_schema
makes a schema from a specification in a YAML file
- phyles.load_schema
makes a schema from a specification in YAML text, a mapping, or a list of 2-tuples with unique keys
- phyles.sample_config
produces a sample config from a schema
- phyles.validate_config
validates a config file with a schema
- phyles.read_config
reads a YAML config file and validates the config with a schema
- phyles.last_made
returns the most recently created file in a directory
- phyles.get_home_dir
returns the user's home directory in a representation native to the host OS
- phyles.get_data_path
returns the absolute path to a data directory within a package
- phyles.package_spec
reads and returns the contents of a schema specification somewhere in a package as YAML text
- phyles.prune
recursively deletes files matching specified Unix-style patterns
- phyles.zipdir
uses python zipfile package to create a zip archive of a directory
- phyles.wait_exec
waits for a command to execute via a system call and returns the output from stdout; slightly more convenient than popen2.Popen3
- phyles.doyn
queries user for yes/no input from raw_input() and can execute an optional command with phyles.wait_exec
- phyles.banner
prints a banner for the program to stdout
- phyles.usage
uses an argparse.ArgumentParser to print usage and can print an optional error message, if provided
- phyles.default_argparser
returns a default argparse.ArgumentParser with mutually exclusive template, config, and override arguments added
- phyles.get_terminal_size
returns the terminal size as a (width, height) tuple (works with Linux, OS X, Windows, Cygwin)
- phyles.set_up
sets up the runtime with an argparse.ArgumentParser, loads a schema and validates a config with config override, and prepares state for graceful recovery from user error
- phyles.run_main
trivial try-except block for graceful recovery from anticipated types of user error
- phyles.mapify
function decorator that converts a function taking any arbitrary set of arguments into a function taking as a single argument a mapping object keyed with the names of the original arguments
phyles: A package to simplify authoring utilities. Copyright (C) 2013 James C. Stroud All rights reserved.
An OrderedDict subclass that provides identity for schemata.
A Schema has an implicit structure and is created by the load_schema() or read_schema() functions. See the documentation in load_schema() for a detailed explanation.
Warning
Creating instances of Schema through its class constructor (i.e. '__init__') is not yet advised or supported and may break forward compatibility.
This is a wrapper for read_config() (see documentation therein).
Comparison of usage with read_config():
phyles.read_config(schema, config)
schema.read_config(config)
This is a wrapper for sample_config() (see documentation therein).
Comparison of usage with sample_config():
phyles.sample_config(schema)
schema.sample_config()
This is a wrapper for validate_config() (see documentation therein).
Comparison of usage with validate_config():
phyles.validate_config(schema, config)
schema.validate_config(config)
An OrderedDict subclass that encapsulates configurations and also remembers the original input.
A Configuration has a specific structure and is created by the validate_config() or read_config() functions, usually by the latter. The Configuration class is exposed in the API purely for purposes of documentation.
A Configuration remembers the original user input while still allowing values to be converted:
>>> colors = {'red': 'ff0000',
... 'green': '00ff00',
... 'blue': '0000ff'}
>>> c = Configuration({'color': 'blue'})
>>> c['color'] = colors[c['color']]
>>> c['color']
'0000ff'
>>> c.original['color']
'blue'
Warning
Creating instances of Configuration through its class constructor (i.e. with phyles.Configuration()) is not yet advised or supported and may break forward compatibility.
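The idea of remembering the original input can be illustrated with a minimal sketch. The class name RememberingConfig is hypothetical and this is not the phyles implementation; it only assumes that a snapshot of the values is taken at construction time. It is written for Python 3, unlike the Python 2 doctests in this documentation.

```python
from collections import OrderedDict


class RememberingConfig(OrderedDict):
    """Hypothetical stand-in illustrating the idea behind Configuration."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Snapshot of the values as first supplied, before any conversion.
        self.original = OrderedDict(self)


colors = {'red': 'ff0000', 'green': '00ff00', 'blue': '0000ff'}
c = RememberingConfig({'color': 'blue'})
c['color'] = colors[c['color']]   # convert the value in place
print(c['color'])                 # prints 0000ff
print(c.original['color'])        # prints blue
```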
Loads the schema specified in the file named in yaml_file. This function simply opens and reads the yaml_file before sending its contents to load_schema() to produce the Schema.
yaml_file: name of a yaml file holding the specification
Creates a Schema from the specification, spec.
Note
If the schema specification is in a YAML file, then use phyles.read_schema(), which is a convenience wrapper around load_schema().
Args:
- spec:
Can either be YAML text, a list of 2-tuples with unique first elements, a mapping object (e.g. dict), or None (which would be equivalent to an empty dict). If the schema is in a YAML file, then use phyles.read_schema(). The values of the items of spec are:
- converter
- example value
- help string
- default value (optional)
Example YAML specification as a complete YAML file:
%YAML 1.2
---
!!omap
- 'pdb model' : [str, my_model.pdb, null]
- 'reset b-facs' :
  - float
  - -1
  - "New B factor (-1 for no reset)"
  - -1
- 'cell dimensions' : [get_cell, [200, 200, 200], null]
The same example as a python specification via a list of 2-tuples with unique first elements:
[('pdb model', ['str', 'my_model.pdb', None]),
 ('reset b-facs', ['float', -1, 'New B factor (-1 for no reset)', -1]),
 ('cell dimensions', [get_cell, [200, 200, 200], None])]
For completeness, the same example as a dict:
{'pdb model': ['str', 'my_model.pdb', None],
 'reset b-facs': ['float', -1, 'New B factor (-1 for no reset)', -1],
 'cell dimensions': [get_cell, [200, 200, 200], None]}
Note
The python structure (list of 2-tuples) of this example specification is simply the result of parsing the YAML with the PyYAML parser. Because of isomorphism between a list of 2-tuples with unique first elements, OrderedDicts, dicts, and other mapping types, the specification may take any of these forms.
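The equivalence between these forms can be checked with plain Python containers. The following is only an illustration using the standard library (Python 3), not phyles code; the abbreviated two-item specification is taken from the example above.

```python
from collections import OrderedDict

# A specification as a list of 2-tuples with unique first elements.
pairs = [
    ('pdb model', ['str', 'my_model.pdb', None]),
    ('reset b-facs', ['float', -1, 'New B factor (-1 for no reset)', -1]),
]

as_odict = OrderedDict(pairs)   # ordered mapping built from the pairs
as_dict = dict(pairs)           # plain mapping built from the pairs

# Converting back to pairs recovers the original list, and both
# mappings hold the same items.
print(list(as_odict.items()) == pairs)
print(as_dict == as_odict)
```

Both comparisons print True because the first elements are unique, so no item is lost when the pairs become a mapping.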
The following YAML representation of a config conforms to the preceding schema specification:
pdb model : model.pdb
reset b-facs : 20
cell dimensions : [59, 95, 159]
- converters:
A dict of callables keyed by converter name (which must match the converter names in spec). The callables convert values from the actual config.
Converters that correspond to several native python types and YAML types do not need to be explicitly specified. The names that these converters take in a schema specification and the corresponding python types produced by these converters are:
- map: dict
- dict: dict (encoded as a YAML dict)
- omap: collections.OrderedDict
- odict: collections.OrderedDict (alias for “omap”)
- pairs: list of 2-tuples
- set: set
- seq: list
- list: list (encoded as a sequence, see list())
- tuple: tuple (encoded as a sequence, see tuple())
- bool: bool
- float: float
- int: int
- long: long (encoded as a YAML int)
- complex: complex (encoded as a sequence of 0 to 2, or as a string representation, e.g. '3+2j'; see complex())
- str: str
- unicode: unicode
- timestamp: datetime.datetime (encoded as a YAML timestamp)
- slice: slice (encoded as a sequence of 1 to 3, see slice())
Note
Except where noted, these types are encoded according to the YAML types specification in a YAML representation of a config.
>>> import phyles
>>> import textwrap
>>> def get_cell(cell):
...     return [float(f) for f in cell]
...
>>> converters = {'get_cell' : get_cell}
>>> y = '''
...     %YAML 1.2
...     ---
...     !!omap
...     - 'pdb model' : [str, my_model.pdb, null]
...     - 'reset b-facs' :
...       - float
...       - -1
...       - "New B factor (-1 for no reset)"
...       - -1
...     - 'cell dimensions' : [get_cell, [200, 200, 200], null]
...     '''
>>> phyles.load_schema(textwrap.dedent(y), converters=converters)
Schema([('pdb model',
[<type 'str'>, 'my_model.pdb', None]),
('reset b-facs',
[<type 'float'>, -1,
'New B factor (-1 for no reset)', -1]),
('cell dimensions',
[<function get_cell at 0x101d7cf50>,
[200, 200, 200], None])])
Creates a sample config specification (returned as a str) from the schema, as described in read_schema().
Items with empty help strings (i.e. "") are not included in the sample config. These may be used for testing, etc.
>>> import phyles
>>> import textwrap
>>> def get_cell(cell):
...     return [float(f) for f in cell]
...
>>> converters = {'get_cell' : get_cell}
>>> y = '''
...     !!omap
...     - 'pdb model' : [str, my_model.pdb, null]
...     - 'reset b-facs' :
...       - float
...       - -1
...       - "New B factor (-1 for no reset)"
...       - -1
...     - 'cell dimensions' : [get_cell, [200, 200, 200], null]
...     '''
>>> schema = phyles.load_schema(textwrap.dedent(y),
...                             converters=converters)
>>> print phyles.sample_config(schema)
%YAML 1.2
---
pdb model : my_model.pdb
# New B factor (-1 for no reset)
reset b-facs : -1
cell dimensions : [200, 200, 200]
Takes a YAML specification for a configuration, config, and uses the schema (as described in load_schema()) for validation, which:
- checks for required config entries, raising a ConfigError if any are missing
- ensures that no unrecognized config entries are present, raising a ConfigError if any such entries are present
- ensures, through the use of converters, that the values given in the config specification are of the appropriate types and within accepted limits (if applicable), raising a ConfigError if any fail to convert
- uses the converters to turn values given in the configuration into values of the appropriate types (e.g. the YAML str '1+4j' is converted into the python complex number (1+4j) if the converter is 'complex')
Note
Why is conversion a part of validation? Conversion facilitates the end-user’s working with a minimal subset of the YAML vocabulary. In the complex number example above, the end-user only needs to know how complex numbers are usually represented (e.g. '1+4j') and not what gibberish like '!!python/object:__main__.SomeComplexClass' means, where to put it, how to specify its attributes, etc.
>>> import phyles
>>> import textwrap
>>> import yaml
>>> def get_cell(cell):
...     return [float(f) for f in cell]
...
>>> converters = {'get_cell' : get_cell}
>>> y = '''
...     %YAML 1.2
...     ---
...     !!omap
...     - 'pdb model' : [str, my_model.pdb, null]
...     - 'reset b-facs' :
...       - float
...       - -1
...       - "New B factor (-1 for no reset)"
...       - -1
...     - 'cell dimensions' : [get_cell, [200, 200, 200], null]
...     '''
>>> schema = phyles.load_schema(textwrap.dedent(y),
...                             converters=converters)
>>> y = '''
...     pdb model : model.pdb
...     reset b-facs : 20
...     cell dimensions : [59, 95, 159]
...     '''
>>> cfg = yaml.load(textwrap.dedent(y))
>>> cfg
{'cell dimensions': [59, 95, 159],
'pdb model': 'model.pdb',
'reset b-facs': 20}
>>> phyles.validate_config(schema, cfg)
Configuration([('pdb model', 'model.pdb'),
('reset b-facs', 20.0),
('cell dimensions', [59.0, 95.0, 159.0])])
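The validation steps listed for validate_config() can be sketched with plain dicts. This is not the phyles implementation: validate_sketch and SketchConfigError are hypothetical names, the schema values are assumed to be lists of the form [converter, example, help] with an optional fourth default element, and an entry is assumed to be required when it has no default. The sketch is written for Python 3.

```python
class SketchConfigError(Exception):
    """Hypothetical error type standing in for ConfigError."""


def validate_sketch(schema, config):
    """Validate and convert config against schema (illustrative only)."""
    # Reject entries that the schema does not know about.
    unknown = [key for key in config if key not in schema]
    if unknown:
        raise SketchConfigError("unrecognized entries: %s" % unknown)
    result = {}
    for key, spec in schema.items():
        converter = spec[0]
        if key in config:
            raw = config[key]
        elif len(spec) > 3:
            raw = spec[3]            # assumed: fourth element is the default
        else:
            raise SketchConfigError("missing required entry: %r" % key)
        try:
            result[key] = converter(raw)   # conversion doubles as validation
        except (TypeError, ValueError) as err:
            raise SketchConfigError("cannot convert %r: %s" % (key, err))
    return result


schema = {
    'reset b-facs': [float, -1, 'New B factor (-1 for no reset)', -1],
    'offset': [complex, '1+4j', 'A complex offset'],
}
validated = validate_sketch(schema, {'offset': '1+4j'})
print(validated)
```

Here 'reset b-facs' falls back to its default and is converted to a float, while the string '1+4j' is converted to a python complex number.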
Reads a YAML config file from the file named config_file and returns the config validated by schema.
config_file: YAML file holding the config, for example:
pdb model : model.pdb
reset b-facs : 20
cell dimensions : [59, 95, 159]
schema: a Schema as described in load_schema()
Returns the most recently created file in dirpath. If provided, the newest of the files with the given suffix or suffixes is returned. This will recurse to depth, with dirpath being at depth 0 and its children directories being at depth 1, etc. For exhaustive recursion, set depth to any value other than 0 or a positive integer (e.g. -1 or None).
Returns None if no files are found that match suffix.
The suffix parameter is either a single suffix (e.g. '.txt') or a sequence of suffixes (e.g. ['.txt', '.text']).
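The depth and suffix behavior described above might be sketched as follows. The function last_made_sketch is a hypothetical illustration for Python 3, not the phyles code, and it assumes that "most recently created" can be approximated with os.path.getctime() (which on Unix reports the last metadata change rather than true creation time).

```python
import os


def last_made_sketch(dirpath='.', suffix=None, depth=0):
    """Return the newest file under dirpath, or None (illustrative only)."""
    if isinstance(suffix, str):
        suffix = [suffix]                      # accept a single suffix
    # Anything other than 0 or a positive integer means "recurse fully".
    exhaustive = not (isinstance(depth, int) and depth >= 0)
    base = os.path.abspath(dirpath)
    base_level = base.rstrip(os.sep).count(os.sep)
    newest, newest_time = None, None
    for root, dirs, files in os.walk(base):
        level = root.rstrip(os.sep).count(os.sep) - base_level
        if not exhaustive and level >= depth:
            dirs[:] = []                       # stop descending at depth
        for name in files:
            if suffix is not None and not name.endswith(tuple(suffix)):
                continue
            path = os.path.join(root, name)
            made = os.path.getctime(path)
            if newest_time is None or made > newest_time:
                newest, newest_time = path, made
    return newest
```

For example, last_made_sketch('.', suffix=['.txt', '.text'], depth=None) would search the whole tree below the current directory.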
Waits for cmd to execute and returns the output from stdout. The cmd follows the same rules as for python’s popen2.Popen3. If instr is provided, this string is passed to the standard input of the child process.
Except for the convenience of passing instr, this function is somewhat redundant with python’s subprocess.call().
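For comparison, the same behavior (run a command, optionally feed it standard input, return its standard output) might look like this with the standard subprocess module in Python 3. The name wait_exec_sketch is hypothetical, and unlike the function documented above it assumes cmd is given as a list of arguments rather than following the popen2.Popen3 rules.

```python
import subprocess
import sys


def wait_exec_sketch(cmd, instr=None):
    """Run cmd, wait for it to finish, and return its stdout as text."""
    completed = subprocess.run(
        cmd,
        input=instr,                 # sent to the child's stdin if given
        stdout=subprocess.PIPE,
        universal_newlines=True,     # work with str rather than bytes
    )
    return completed.stdout


# A child python process that upper-cases whatever it reads from stdin.
child = 'import sys; sys.stdout.write(sys.stdin.read().upper())'
out = wait_exec_sketch([sys.executable, '-c', child], instr='hello')
print(out)   # prints HELLO
```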
Uses the raw_input() builtin to ask the user a yes or no question. If cmd is provided, then the function specified by exc (default os.system()) will be called with the argument cmd.
If a file name for outfile is provided, then stdout will be directed to a file of that name.
Uses the program and version to print a banner to stderr. The banner will be printed at width (default 70).
program: str
version: str
width: int
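The exact appearance of the phyles banner is not described here, so the following is only a guess at the general idea: a program name and version centered within width columns and written to stderr. The function name banner_lines and the layout are assumptions made for illustration (Python 3).

```python
import sys


def banner_lines(program, version, width=70):
    """Build banner lines of exactly width columns (layout is assumed)."""
    rule = '=' * width
    title = ('%s v.%s' % (program, version)).center(width)
    return [rule, title, rule]


for line in banner_lines('myprog', '1.0'):
    sys.stderr.write(line + '\n')
```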
Uses the parser (argparse.ArgumentParser) to print the usage. If msg (which can be an Exception, str, etc.) is supplied then it will be printed as an error message, hopefully in a way that catches the user’s eye. The usage message will be formatted at width (default 70). The pad is used to add some extra space to the beginning of the error lines to make them stand out (default 4).
Gracefully exits the program with an error message.
The msg, width and pad arguments are the same as for usage().
Returns the home directory of the account under which the python program is executed. The home directory is represented in a manner that is comprehensible by the host operating system (e.g. C:\something\ for Windows, etc.).
Adapted directly from K. S. Sreeram’s approach, message 393819 on c.l.python (7/18/2006). I treat this code as public domain.
Returns the path to the data directory. First it looks for the directory specified in the env_var environment variable and if that directory does not exist, finds data_dir in one of the following places (in this order):
- The package directory (i.e. where the __init__.py is for the package named by the package_name parameter)
- If the package is a file, the directory holding the package file
- If frozen, the directory holding the frozen executable
- If frozen, the parent directory of the directory holding the frozen executable
- If frozen, the first element of sys.path
Thus, if the package were at /path/to/my_package, (i.e. with /path/to/my_package/__init__.py), then a very reasonable place for the data directory would be /path/to/my_package/package-data/.
The anticipated use of this function is within the package with which the data files are associated. For this use, the package name can be found with the global variable __package__, which for this example would have the value 'my_package'. E.g.:
pth = get_data_path('MYPACKAGEDATA', __package__, 'package-data')
This code is adapted from _get_data_path() from matplotlib __init__.py. Some parts of this code are most likely subject to the matplotlib license.
Note
The env_var argument can be ignored using phyles.Undefined because it’s guaranteed not to be in os.environ:
pth = get_data_path(Undefined, __package__, 'package-data')
Warning
The use of '__package__' for package_name will fail in certain circumstances. For example, if the value of __name__ is '__main__', then __package__ is usually None. In such cases, it is necessary to pass the package name explicitly.
pth = get_data_path(Undefined, 'my_package', 'package-data')
Recursively deletes files matching the specified unix style patterns. The doit parameter must be explicitly set to True for the files to actually get deleted, otherwise, it just logs with logging.info() what would happen. Raises a SystemExit if deletion of any file is unsuccessful (only when doit is True).
Example:
prune(['*~', '*.pyc'], doit=True)
Returns: None
Raises: SystemExit
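The dry-run behavior controlled by doit can be sketched with the standard library. This is not the phyles implementation: prune_sketch is a hypothetical name, the root parameter is an assumption added so the sketch can be pointed at a specific directory, and it returns the matched paths (rather than None) purely to make the result easy to inspect. It is written for Python 3.

```python
import fnmatch
import logging
import os


def prune_sketch(patterns, doit=False, root='.'):
    """Delete (or just report) files matching Unix-style patterns."""
    matched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pat) for pat in patterns):
                path = os.path.join(dirpath, name)
                matched.append(path)
                if doit:
                    try:
                        os.remove(path)
                    except OSError as err:
                        raise SystemExit("could not delete %s: %s" % (path, err))
                else:
                    # Dry run: only report what would be deleted.
                    logging.info("would delete %s", path)
    return matched
```

Calling prune_sketch(['*~', '*.pyc']) without doit=True leaves every file in place and only logs the candidates.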
Returns a default argparse.ArgumentParser with mutually exclusive template (-t, --template) and config (-c, --config) arguments added. It also has the override (-o, --override) option added, to override configuration items on the command line.
The argument to --override should be a valid YAML map (with the single exception that the outermost curly braces are optional for YAML flow style). Because YAML relies on syntactically meaningful whitespace, single quotes should surround the argument to --override.
The following examples execute a program called program, overriding opt1 and opt2 of the config in the file config.yml with foo and bar, respectively:
program -c config.yml -o 'opt1 : foo, opt2 : bar'
program -c config.yml -o '{opt1 : foo, opt2 : bar}'
program -c config.yml -o 'opt1 : foo\nopt2 : bar'
Note
The latter example illustrates how YAML block style can be used with --override: a single backslash (\) escapes an n, which evaluates to a so-called “newline”. In other words, the YAML that corresponds to this latter example is:
opt1 : foo
opt2 : bar
Similarly, other escape sequences can also be used with --override. For example, the following overrides an option called sep, setting its value to the tab character:
program -c config.yml -o 'sep : "\t"'
Returns: argparse.ArgumentParser
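A parser with this shape can be built with argparse alone. The sketch below is an approximation, not the phyles source: argparser_sketch is a hypothetical name, the help strings are invented, and it assumes that --template is a simple flag while --config and --override each take one value.

```python
import argparse


def argparser_sketch(prog=None):
    """Build a parser with exclusive -t/-c options plus -o (illustrative)."""
    parser = argparse.ArgumentParser(prog=prog)
    group = parser.add_mutually_exclusive_group()
    group.add_argument('-t', '--template', action='store_true',
                       help='print a template config and exit')
    group.add_argument('-c', '--config', metavar='CONFIG',
                       help='name of the YAML config file')
    # Not part of the exclusive group, so it can accompany --config.
    parser.add_argument('-o', '--override', metavar='YAML',
                        help='YAML map of config items to override')
    return parser


args = argparser_sketch('program').parse_args(
    ['-c', 'config.yml', '-o', 'opt1 : foo, opt2 : bar'])
print(args.config)     # prints config.yml
print(args.override)   # prints opt1 : foo, opt2 : bar
```

Passing both -t and -c to this parser makes argparse print a usage error and exit, which is what "mutually exclusive" means in practice.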
Reads and returns the contents of a schema specification somewhere in a package as YAML text (described in load_schema()).
This function pulls out all the stops to find the specification. It is best to try to give all of env_var, package_name, and data_dir if they are available to have the best chance of finding the path to the specification file. See get_data_path() for a full description.
The arguments env_var, package_name, and data_dir are identical to those required in get_data_path().
specfile_name: name of the schema specification file found within the package contents.
Given the name of the program (program), the version string, the specification for the schema (spec; described in load_schema()), and converters for the schema (also described in load_schema()), this function:
- sets up the default argparser if the argparser keyword argument is None or not provided (see default_argparser())
- prints the template or banner as appropriate (see template() and banner())
- creates a schema and uses it to validate the config (see load_schema() and validate_config())
- overrides items in the config according to the command line option --override or -o (see default_argparser() for a description of --override)
- exits gracefully with usage() if any problems are found in the command line arguments or config
- returns a dict of the argparse.ArgumentParser, the parsed command line arguments, the Schema, and the Configuration with keys 'argparser', 'args', 'schema', 'config', respectively
program: the program name as a str
version: the program version as a str
spec: a schema specification as described in load_schema()
converters: a dict of converters as described in load_schema()
argparser: used as the argparse.ArgumentParser if provided; else the argparse.ArgumentParser returned by default_argparser() is used
Returns: a dict with the keys:
- 'argparser': argparse.ArgumentParser
- 'args': the parsed command line arguments as an argparse.Namespace
- 'schema': schema for the configuration as a Schema
- 'config': the configuration as a Configuration
Trivial convenience function that runs a main function within a try-except block. The main function should take as its sole argument the config, which is a mapping object that holds the configuration (usually a Configuration object). The catchall is an Exception or a tuple of Exceptions, which if caught will result in a graceful exit of the program (see graceful()).
main: a function, equivalent to the main function of a program
catchall: an Exception or a tuple of Exceptions
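The pattern amounts to a small try-except wrapper. In this Python 3 sketch the name run_main_sketch and its exact signature are assumptions, and the graceful exit is reduced to an error message on stderr followed by sys.exit(1) rather than the behavior of graceful().

```python
import sys


def run_main_sketch(main, config, catchall=Exception):
    """Run main(config), exiting cleanly on anticipated errors."""
    try:
        return main(config)
    except catchall as err:
        # Anticipated user error: report it without a traceback.
        sys.stderr.write("Error: %s\n" % err)
        sys.exit(1)


def main(config):
    return config['x'] * 2


print(run_main_sketch(main, {'x': 21}, catchall=KeyError))   # prints 42
```

Exceptions outside catchall still propagate with a full traceback, which keeps genuine bugs visible.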
Given a function f with an arbitrary set of arguments and keyword arguments, a new function is returned that instead takes as an argument a mapping object (e.g. a dict) that has as keys the original arguments and keyword arguments of f.
If f has “kwargs”, as in (def f(**kwargs):), then items in the mapping object that do not correspond to any arguments or defaults are included in kwargs. See the 'extra' key in the first example in the doctest below.
Returns: a function
>>> amap = {'a': 1, 'b': 2, 'c': 3, 'd': 42,
... 'args': [7, 8],
... 'kwargs': {'bob':39, 'carol':36},
... 'extra': 99}
>>> @mapify
... def f(a, (b, c), d=2, *args, **kwargs):
... print "a=%s (b=%s c=%s) d=%s" % (a, b, c, d)
... print "args are: %s" % (args,)
... print "kwargs are: %s" % (kwargs,)
...
>>> f(amap)
a=1 (b=2 c=3) d=42
args are: (7, 8)
kwargs are: {'bob': 39, 'carol': 36, 'extra': 99}
>>> @mapify
... def g(a):
... print "a is: %s" % (a,)
...
>>> g(amap)
a is: 1
>>> @mapify
... def h(y=4, *args):
... print "y is: %s" % (y,)
... print "args are: %s" % (args,)
...
>>> h(amap)
y is: 4
args are: (7, 8)
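The doctest above relies on Python 2 syntax (tuple parameters and the print statement). A rough Python 3 approximation of the same idea can be written with inspect.signature(). The decorator mapify_sketch below is a hypothetical illustration, not the phyles implementation, and it does not support tuple parameters because Python 3 removed them.

```python
import functools
import inspect


def mapify_sketch(f):
    """Return a function that takes one mapping instead of f's arguments."""
    params = inspect.signature(f).parameters

    @functools.wraps(f)
    def wrapper(mapping):
        positional, keywords = [], {}
        consumed = set()
        accepts_kwargs = False
        for name, p in params.items():
            consumed.add(name)
            if p.kind is p.VAR_POSITIONAL:
                positional.extend(mapping.get(name, ()))
            elif p.kind is p.VAR_KEYWORD:
                accepts_kwargs = True
                keywords.update(mapping.get(name, {}))
            elif p.kind is p.KEYWORD_ONLY:
                if name in mapping:
                    keywords[name] = mapping[name]
            elif name in mapping:
                positional.append(mapping[name])
            elif p.default is not p.empty:
                positional.append(p.default)   # fill in the default
            else:
                raise TypeError("missing argument: %r" % name)
        if accepts_kwargs:
            # Items matching no parameter are passed through **kwargs.
            for key, value in mapping.items():
                if key not in consumed:
                    keywords[key] = value
        return f(*positional, **keywords)

    return wrapper


amap = {'a': 1, 'b': 2, 'd': 42, 'args': [7, 8],
        'kwargs': {'bob': 39}, 'extra': 99}


@mapify_sketch
def f(a, b, d=2, *args, **kwargs):
    return a, b, d, args, kwargs


@mapify_sketch
def h(y=4, *args):
    return y, args


print(f(amap))
print(h(amap))
```

As in the original doctest, the 'extra' item ends up in kwargs for f, while h ignores every item it has no parameter for and falls back to its default for y.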
Returns the width and height of the console; works on Linux, OS X, Windows, and Cygwin (Windows).
based on https://gist.github.com/jtriley/1108174 (originally retrieved from: http://goo.gl/CcPZh)
Returns: 2-tuple of int
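On Python 3.3 and later the standard library offers a comparable query through shutil.get_terminal_size(). The snippet below shows that alternative, not the phyles function; the fallback of 80 by 24 is an arbitrary choice used when no terminal is attached.

```python
import shutil

# Returns a named tuple with columns and lines attributes.
size = shutil.get_terminal_size(fallback=(80, 24))
width, height = size.columns, size.lines
print("terminal is %d columns by %d lines" % (width, height))
```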
Uses python zipfile package to create a zip archive of the directory basedir and store the archive to archivename.
Virtually unmodified from http://goo.gl/Ty5k9 except that empty directories aren’t ignored.
Returns: None
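Keeping empty directories requires writing an explicit directory entry, because zipfile otherwise records only files. The following Python 3 sketch shows one way to do that; zipdir_sketch is a hypothetical name, not the phyles code, and it assumes that archivename lies outside basedir so that the archive does not try to include itself.

```python
import os
import zipfile


def zipdir_sketch(basedir, archivename):
    """Zip basedir into archivename, keeping empty directories."""
    basedir = os.path.abspath(basedir)
    parent = os.path.dirname(basedir)
    with zipfile.ZipFile(archivename, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(basedir):
            rel_root = os.path.relpath(root, parent)
            if not dirs and not files:
                # A name ending in '/' is stored as a directory entry.
                zf.writestr(rel_root.replace(os.sep, '/') + '/', '')
            for name in files:
                zf.write(os.path.join(root, name),
                         os.path.join(rel_root, name))
```

Paths inside the archive are stored relative to the parent of basedir, so the archive unpacks into a single top-level directory.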
Returns an instance of logging.Logger named name with level of level (defaults to logging.INFO). The format of the messages is “%(levelname)s: %(message)s”.
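A logger configured this way can be assembled from the standard logging module. In this Python 3 sketch the name get_logger_sketch is hypothetical, and the use of a StreamHandler (which writes to stderr by default) is an assumption, since the destination of the messages is not stated above.

```python
import logging


def get_logger_sketch(name, level=logging.INFO):
    """Return a named logger using the 'LEVEL: message' format."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:          # avoid stacking duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger


log = get_logger_sketch('phyles-sketch')
log.info("configuration loaded")     # emits "INFO: configuration loaded"
```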