scripter: a tool for parallel execution of functions on many files

Licensed under Perl Artistic License 2.0

No warranty is provided, express or implied

Philosophy

scripter tries to make it easy to write scripts that parallelize tasks by first parsing filenames and options, and then executing an action (function) on the parsed filename objects.
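The parse-then-act pattern described above can be sketched with the standard library alone. This is only an illustration of the idea, not scripter's implementation: the class ParsedFilename, the function run, and the use of a thread pool are all assumptions made for this sketch, and scripter's own internals may differ.

```python
from concurrent.futures import ThreadPoolExecutor


class ParsedFilename(object):
    """Hypothetical stand-in for a parsed filename object."""

    def __init__(self, filename, **kwargs):
        self.input_file = filename


def action(filename_obj, **kwargs):
    # The action receives a parsed object plus all options as keywords.
    return "processed %s" % filename_obj.input_file


def run(filenames, **options):
    # Step 1: parse every filename into an object.
    parsed = [ParsedFilename(name, **options) for name in filenames]
    # Step 2: apply the action to each parsed object in parallel.
    # A thread pool is used here purely as a simple stand-in.
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(lambda obj: action(obj, **options), parsed))


results = run(["a.txt", "b.txt"], verbose=True)
```

The point of the split is that the action never has to deal with raw command-line strings: it only sees objects that have already been parsed.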

Setting up the Environment

The central class is Environment, which is generally imported and instantiated as follows:

import scripter
e = scripter.Environment(version=VERSION, doc=__doc__)

Passing the version and documentation (usually __doc__) is recommended so that users can use the expected “--help” and “--version” options.

Attaching a FilenameParser

It is usually necessary to set a FilenameParser, which acts on the filenames given at the command line. Your FilenameParser should inherit from scripter's FilenameParser class and should usually call the parent's __init__() method (either before or after your own setup; see the example below). You can use scripter’s FilenameParser directly if you don’t need to customize it much. Your custom FilenameParser must accept **kwargs in __init__(), or it will almost certainly fail to work, because all options given at the command line and by scripter are passed to both the FilenameParser and the action.
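The reason **kwargs matters can be seen in plain Python, without scripter. In this minimal sketch the classes StrictParser and TolerantParser and the options dictionary are hypothetical; they only stand in for a custom parser receiving every option as a keyword argument.

```python
class StrictParser(object):
    """Hypothetical parser that forgets **kwargs."""

    def __init__(self, filename, number_of_apples=5):
        self.number_of_apples = number_of_apples


class TolerantParser(object):
    """Hypothetical parser that accepts any extra options."""

    def __init__(self, filename, number_of_apples=5, **kwargs):
        # Named options are picked out; everything else lands in kwargs.
        self.number_of_apples = number_of_apples
        self.extra_options = sorted(kwargs)


# Stand-in for the full set of options passed along to the parser.
options = {"number_of_apples": 3, "verbose": True}

try:
    StrictParser("sample.txt", **options)
    strict_failed = False
except TypeError:
    # 'verbose' is not a parameter of StrictParser.__init__
    strict_failed = True

tolerant = TolerantParser("sample.txt", **options)
```

The strict version breaks as soon as any option it does not name is passed in, which is why accepting **kwargs is the safe default.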

Here is an example FilenameParser hooked to the Environment:

from scripter import FilenameParser

class ExampleFilenameParser(FilenameParser):
    def __init__(self, filename, number_of_apples=5, **kwargs):
        # Let the parent class set up its attributes first
        super(ExampleFilenameParser, self).__init__(filename, **kwargs)
        self.tree = [filename + '_%d.txt' % num for num in range(number_of_apples)]

e.set_filename_parser(ExampleFilenameParser)

Defining an action

The last thing you must do for the script to run is define the action and tell the Environment to execute it. Like the FilenameParser, the action should accept **kwargs or it will probably fail. Here is an example action that creates files in the output directory for the specified number_of_apples:

import os.path

def example_function(filename_obj, **kwargs):
    tree = filename_obj.tree
    input_file = filename_obj.input_file  # avoid shadowing the built-in input()
    output_dir = filename_obj.output_dir
    for f in tree:
        output_filename = os.path.join(output_dir, f)
        # Text mode, and the with block closes the file after writing
        with open(output_filename, 'w') as fh:
            fh.write('This apple came from %s' % input_file)

e.do_action(example_function) # this actually starts the script

Modifying the script options

You will probably want to accept additional values at the command line besides the scripter defaults. You can retrieve the argument parser from the Environment and modify it; see the argparse documentation for more information:

parser = e.argument_parser
parser.add_argument("--number-of-apples", type=int, nargs='?')

Options from the parser are converted into keywords that can be accessed either from the kwargs dictionary or by naming the keyword directly in the action definition or in the __init__() method of the FilenameParser (we did that above by including number_of_apples in our custom __init__() method).
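The option-to-keyword conversion follows standard argparse behavior and can be tried without scripter. In this minimal sketch the parser is a plain argparse.ArgumentParser rather than e.argument_parser, and the action is a hypothetical stand-in; how scripter forwards the parsed options internally is assumed, not confirmed.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--number-of-apples", type=int, nargs="?")

# argparse turns the dashes in the option name into underscores.
namespace = parser.parse_args(["--number-of-apples", "3"])
options = vars(namespace)


def action(filename_obj, number_of_apples=5, **kwargs):
    # The option arrives as an ordinary keyword argument.
    return number_of_apples


result = action("sample.txt", **options)
```

Naming the option in the signature, as done here, is usually more readable than looking it up in kwargs.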

scripter includes a number of default options that set various script parameters; running your script with “--help” lists them.

Dealing with errors/exceptions

If something goes wrong within scripter, it will usually raise a scripter.Usage exception. You may want to raise these yourself so that users see a short message instead of a Python traceback.

The decorator scripter.exit_on_Usage() is provided for allowing scripts to exit gracefully when errors occur. It is often a good idea to decorate the main action for this purpose:

from scripter import Usage, exit_on_Usage

@exit_on_Usage
def action(filename_obj, **kwargs):
    try:
        do_something()
    except NameError:
        # Call syntax works on both Python 2 and Python 3
        raise Usage('could not do something')

e.do_action(action)

This will cause the program to exit and tell the user “could not do something”.
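To make the behavior concrete, here is a minimal sketch of how a decorator of this kind could work. It is not scripter's implementation: the Usage class and exit_on_usage function below are hypothetical stand-ins, and writing to stderr and exiting with code 2 are assumptions chosen for illustration.

```python
import functools
import sys


class Usage(Exception):
    """Hypothetical stand-in for scripter.Usage."""


def exit_on_usage(func):
    """Hypothetical stand-in for scripter.exit_on_Usage."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Usage as err:
            # Show only the message, then exit without a traceback.
            sys.stderr.write("%s\n" % err)
            sys.exit(2)  # exit code chosen arbitrarily for this sketch

    return wrapper


@exit_on_usage
def action(filename_obj, **kwargs):
    raise Usage("could not do something")
```

Calling action("sample.txt") here prints the message and raises SystemExit instead of surfacing the Usage exception to the user.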

Logging

Use scripter.get_logger() inside the action for logging that is safe under parallel processing.
