2. The `InputReader` Class

class input_reader.InputReader(comment=[u'#'], case=False, ignoreunknown=False, default=None)

InputReader is a class that is designed to read in an input file and return the information contained based on rules supplied to the class using a syntax similar to what is used in the argparse module in the Python standard library.

InputReader accepts block-type, line-type and boolean-type keys, mutually exclusive groups, required keys, defaults, and more.

Parameters:

comment (str or list) – Defines what is a comment in the input block. This can be a single string or a list of strings. The default is ['#']. Optional.
case (bool) – Tells if the keys are case-sensitive or not. The default is False. Optional.
ignoreunknown (bool) – Ignore keys that are not defined. The default is False. Optional.
default – The default value given to any key that is created without a default of its own. Optional.

The InputReader class allows the user to define the keys to be read in from the input file, and also reads the input file to parse the data.

In the simplest use case, you may simply instantiate InputReader with no arguments, as in

from input_reader import InputReader
reader = InputReader()

The above instantiation assumes that the # character will be a comment in the input file, the input file is case-insensitive, unknown keys in the input file will cause the parsing to fail, and any key not found in the input file will be defaulted to None.

If an unknown parameter is passed to InputReader, a TypeError will be raised. If an inappropriate value is passed to a parameter, a ValueError will be raised.

2.1. `InputReader` options

There are four optional parameters to InputReader: case, comment, ignoreunknown, and default. The defaults for these are illustrated below:

# The following are equivalent
reader = InputReader(comment=['#'], case=False, ignoreunknown=False, default=None)
reader = InputReader()

Of course, the user may choose to change these default values.

2.1.1. comment

The comment option specifies the characters that will be interpreted as a comment by InputReader. As mentioned above, the default is #. You are free to choose as many characters as you wish. If you choose one character, it may be given as a string or as a single element list of strings as shown below:

# The following are equivalent
reader = InputReader(comment='%')
reader = InputReader(comment=['%'])

If you wish to allow multiple comment characters, you must place them in a list of strings:

# Multiple comment characters.  A comment need not be one character long
reader = InputReader(comment=['#', '//'])

No matter the definition, comments will work just as you might expect from Python: they can appear anywhere in the line, and only the characters from the comment marker onward are ignored.
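
The rule above can be illustrated with a few lines of plain Python. This is only a sketch of the described behaviour, not InputReader's actual implementation; the helper name strip_comment is hypothetical.

```python
def strip_comment(line, markers=('#',)):
    """Return `line` with everything from the earliest comment marker removed.

    Hypothetical helper: it only mimics the behaviour described in the text.
    """
    cut = len(line)
    for marker in markers:
        pos = line.find(marker)
        if pos != -1:
            cut = min(cut, pos)      # keep the earliest marker on the line
    return line[:cut].rstrip()


raw = ['red  # a color', 'blue // another', '// whole line', 'green']
stripped = [strip_comment(line, markers=('#', '//')) for line in raw]
```

A marker may be longer than one character, and a line that starts with a marker is reduced to an empty string.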

2.1.2. case

When case sensitivity is turned off (the default), all lines in the input file are converted to lower case, and all keys are converted to lower case. In general, it is best to let input files be case-insensitive, but there may be a reason this is not desirable. To turn on case-sensitivity, use

reader = InputReader(case=True)

This will cause the input file to be read in as given, and keys to be kept in the case that is given.

2.1.3. ignoreunknown

It is best not to assume your end-users will do everything correctly. For example, it is common to accidentally misspell a keyword. You would likely wish to alert the user of this error instead of continuing with the calculation and giving bad results. For this reason, the ignoreunknown key is defaulted to False. Any key that is not known to InputReader causes a ReaderError to be raised. If this is not desirable for your use-case, you can disable this with:

reader = InputReader(ignoreunknown=True)

2.1.4. default

If a key is defined, but does not appear in the input file, InputReader will assign a default value to it. By default this is None. This is useful because one can check that the key appeared in the input file with if inp.key is not None. However, it may be desirable to have a different default value, such as False, 0, or ''. To change the default value, use:

reader = InputReader(default=False)

Alternatively, you can request that keys not appearing in the input file be omitted from the Namespace. Accessing a non-existent key would then raise an AttributeError, but you can check that the key exists with if key in inp. To do this, use:

from input_reader import SUPPRESS
reader = InputReader(default=SUPPRESS)
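
The difference between the two styles of default can be seen without an input file. The sketch below uses argparse.Namespace from the standard library as a stand-in for the object returned by read_input(); that the two Namespace classes behave alike for these checks is an assumption based on the description above, and the key names are made up for illustration.

```python
from argparse import Namespace

# Stand-ins for parsed results in which 'offset' was absent from the input.
with_default = Namespace(title='plot', offset=None)  # like default=None
suppressed = Namespace(title='plot')                  # like default=SUPPRESS

# With default=None the attribute always exists, so test its value.
offset_given = with_default.offset is not None

# With SUPPRESS the attribute is missing, so test membership instead.
has_offset = 'offset' in suppressed

# getattr with a fallback avoids the AttributeError entirely.
fallback = getattr(suppressed, 'offset', 0.0)
```

Which style is preferable depends on whether None is itself a meaningful value for the key.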

2.2. `read_input()`

InputReader.read_input(filename)

Reads in the input from a given file using the supplied rules.

Parameters:	filename – The name of the file to read in, `StringIO` of input, or list of strings containing the input itself.
Return type:	`Namespace`: This class contains the read-in data; each key is stored as a member of the class.
Exception:	`ReaderError`: Any known errors will be raised with this custom exception.

read_input() does only one thing: parse an input file based on the key rules given to it. In the coming sections, we will discuss how to define these rules, but it will be helpful first to know how we can parse input files.

read_input() accepts one of three input types: a filename, a StringIO object, or a list of strings. Let’s say we want to parse the following input:

red
blue
green

This can be sent to the input reader in one of three ways. First, we can create a text file containing this data and parse the file itself:

from os import remove
from tempfile import mkstemp

user_input = mkstemp()[1]
with open(user_input, 'w') as ui:
    ui.write('red\n')
    ui.write('blue\n')
    ui.write('green\n')

# Code defining how to read input goes here #
inp = reader.read_input(user_input)
remove(user_input)
print inp

Alternatively, we could use the StringIO object:

from StringIO import StringIO

io = StringIO()
io.write('red\n')
io.write('blue\n')
io.write('green\n')

# Code defining how to read input goes here #
inp = reader.read_input(io)
print inp

Last, you can simply define a list of strings.

lst = ['red', 'blue', 'green']

# Code defining how to read input goes here #
inp = reader.read_input(lst)
print inp

The above three sets of code would all print:

Namespace(red=True, blue=True, green=True)

Of course, the most common use case is to parse a file, as it is unlikely your users will write Python code as their input file; however, these alternate modes of parsing are provided for easy testing of your key definitions.
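
To see why the three input forms are interchangeable, it helps to picture them all being reduced to the same list of lines before any key rules are applied. The sketch below shows one way that could be done; it is an illustration written for Python 3, the helper name to_lines is hypothetical, and it makes no claim about how read_input() is implemented internally.

```python
import io


def to_lines(source):
    """Return the input as a list of lines without trailing newlines.

    Hypothetical helper: accepts a file name, a StringIO-like object,
    or a list of strings.
    """
    if isinstance(source, str):                 # treat a str as a file name
        with open(source) as handle:
            return handle.read().splitlines()
    if hasattr(source, 'getvalue'):             # StringIO-like object
        return source.getvalue().splitlines()
    return [line.rstrip('\n') for line in source]   # list of strings


buf = io.StringIO()
buf.write('red\nblue\ngreen\n')
same = to_lines(buf) == to_lines(['red', 'blue', 'green'])
```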

2.3. `add_boolean_key()`

InputReader.add_boolean_key(keyname, action=True, **kwargs)

Add a boolean key to the input searcher.

Parameters:

keyname (str) – The name of the boolean-type key to search for.
action – What value to store if this key is found. The default is True.
required (bool) –
Indicates that not including keyname is an error. It makes no sense to include this for a boolean key. The default is False.

If keyname is part of a mutually exclusive group, it is best to set required for the group as a whole and not set it for the individual members of the group because you may get unforeseen errors.
default –
The value stored for this key if it does not appear in the input block. A value of None is equivalent to no default. It makes no sense to give a default and mark it required as well. If the class SUPPRESS is given instead of None, then this key will be removed from the namespace if it is not given. The default is None.

If keyname is part of a mutually exclusive group and the group has been given a dest value, it is best to set default for the group as a whole and not set it for the individual members of the group because you may get unforeseen errors.
dest (str) –
If dest is given, keyname will be stored in the returned Namespace as dest, not keyname. A value of None is equivalent to no dest. The default is None.

If keyname is part of a mutually exclusive group and the group has been given a dest value, do not set dest for individual members of the group.
depends (str) – Use depends to specify another key in the input file at the same input level (i.e. inside the same block or not in any block at all) that must also appear or a ReaderError will be raised. A value of None is equivalent to no depends. The default is None.
repeat (bool) – Determines if keyname can appear only once in the input file or several times. The default is False, which means that keyname can only appear once or an error will be raised. If repeat is True, the collected data will be returned in a list in the order in which it was read in.

A boolean key is a key in which the presence of the key triggers an action; there are no arguments to a boolean key in the input file. Typically, the presence of a boolean key makes that key True, and the absence will make it false.

Let’s say that you are defining an input file for a plotting program where distance is plotted versus time. Let’s say the unit choices for distance are meters, centimeters, kilometers, or millimeters, and the unit choices for time are seconds, minutes, and hours. We also want to know if the distance scale (y-axis) should start at zero or at the smallest distance value to plot. We might specify this set of boolean keys in this way:

reader = InputReader()
# The distance units
reader.add_boolean_key('millimeters')
reader.add_boolean_key('centimeters')
reader.add_boolean_key('meters')
reader.add_boolean_key('kilometers')
# The time units
reader.add_boolean_key('seconds')
reader.add_boolean_key('minutes')
reader.add_boolean_key('hours')
# Start at 0 on y-axis?
reader.add_boolean_key('zero_y_axis')

# Print some results
def print_result(inp):
    # Distance unit?
    if inp.meters:
        print "distance in meters"
    elif inp.centimeters:
        print "distance in centimeters"
    elif inp.kilometers:
        print "distance in kilometers"
    elif inp.millimeters:
        print "distance in millimeters"
    else:
        print "defaulting to distance in meters"
    # Time unit?
    if inp.seconds:
        print "time in seconds"
    elif inp.minutes:
        print "time in minutes"
    elif inp.hours:
        print "time in hours"
    else:
        print "defaulting to time in seconds"
    # Zero the y-axis
    if inp.zero_y_axis:
        print "y-axis starts at zero"
    else:
        print "y-axis starts at minimum distance point"
    print '-----'

# Sample inputs
print_result(reader.read_input(['meters', 'minutes', 'zero_y_axis']))
print_result(reader.read_input(['kilometers', 'hours']))
print_result(reader.read_input(['zero_y_axis']))

The above code would output

distance in meters
time in minutes
y-axis starts at zero
-----
distance in kilometers
time in hours
y-axis starts at minimum distance point
-----
defaulting to distance in meters
defaulting to time in seconds
y-axis starts at zero
-----

Of course, the above code does not forbid both meters and centimeters (for example) to be defined simultaneously; the solution to this problem will be discussed in the add_mutually_exclusive_group() section.

Note

The options default, depends, dest, required and repeat are common between add_boolean_key(), add_line_key(), add_block_key(), and add_regex_line() and therefore will be discussed together in the Common Options section.

2.3.1. action

Hint

action defaults to True, so the following two lines are equivalent:

reader.add_boolean_key('key')
reader.add_boolean_key('key', action=True)

The action option is what the key will be set to in the Namespace. By default it is True. However, in some scenarios it may be advantageous to set this to something other than a bool. You can even set it to a function. For example, let’s say that the input to our plotting program is given in seconds and meters. It would be advantageous then to use our boolean keys to define a function to convert from the input unit to the plotted unit.

reader = InputReader()
# The distance units
reader.add_boolean_key('meters', action=lambda x: 1.0 * x)
reader.add_boolean_key('centimeters', action=lambda x: 100.0 * x)
reader.add_boolean_key('kilometers', action=lambda x: 0.001 * x)
reader.add_boolean_key('millimeters', action=lambda x: 1000.0 * x)
# The time units
reader.add_boolean_key('seconds', action=lambda x: x / 1.0)
reader.add_boolean_key('minutes', action=lambda x: x / 60.0)
reader.add_boolean_key('hours', action=lambda x: x / 3600.0)

def print_results(inp, distance, time):
    # Distance unit?
    if inp.meters:
        print inp.meters(distance), 'meters'
    elif inp.centimeters:
        print inp.centimeters(distance), 'centimeters'
    elif inp.kilometers:
        print inp.kilometers(distance), 'kilometers'
    elif inp.millimeters:
        print inp.millimeters(distance), 'millimeters'
    else:
        print float(distance), 'meters'
    # Time unit?
    if inp.seconds:
        print inp.seconds(time), 'seconds'
    elif inp.minutes:
        print inp.minutes(time), 'minutes'
    elif inp.hours:
        print inp.hours(time), 'hours'
    else:
        print float(time), 'seconds'
    print '----'

# Supply 50 meters and 1800 seconds
print_results(reader.read_input(['centimeters', 'minutes']), 50, 1800)
print_results(reader.read_input(['kilometers', 'hours']), 50, 1800)
print_results(reader.read_input(['millimeters']), 50, 1800)

The above code would output

5000.0 centimeters
30.0 minutes
----
0.05 kilometers
0.5 hours
----
50000.0 millimeters
1800.0 seconds
----

Of course, the above example is still not quite satisfactory, because our conversion function is still in one of several variables. As a result there is a lot of needless code just to perform this conversion. It would be more convenient to have a single variable to place each group of boolean keys into. We will discuss how to do this in the dest subsection of the Common Options section.

2.4. `add_line_key()`

InputReader.add_line_key(keyname, type=<type 'str'>, glob={}, keywords={}, case=None, **kwargs)

Add a line key to the input searcher.

Parameters:

keyname (str) – The name of the key to search for.
type –
The data type to be read in for each positional argument, given as a list. The length of the list dictates how many arguments to look for. If this is an empty list or None, no positional arguments will be read in.

type may be one or more of:
- int
- float
- str
- None
- an explicit int (e.g. 4), float (e.g. 5.4) or str (e.g. "hello")
- a compiled regular expression object
If you give an explicit int, float or str, it is assumed that the value must equal what you gave. None means that the word "none" is what is expected. NOTE: If the entirety of type is None, (i.e. type=None), then no types are expected, and one of glob or keywords is required.

If you only wish to read in one argument, you may give the type(s) for that one argument directly (meaning not in a list). This will cause the returned value to be the value itself, not a 1-length list.

For each value, you may give a tuple of types to indicate more than one type is valid for that argument position. NOTE: It is very important that type choices for each argument are given as tuples, and that the list passed to type is an actual list (as opposed to a tuple), because these are treated differently.

The default value is str.
glob (dict) –
glob is a dict giving information on how to read in a glob of arguments. Globs are read in after the positional arguments. If there are no positional arguments, the whole line is globbed. glob is not valid with keywords. The glob dict accepts only four keys:

len

Must be one of '*', '+', or '?'. '*' is a zero or more glob, '+' is an at least one or more glob, and '?' is a zero or one glob.

type

Indicates the data type the glob must be. This may be any one of the types presented for positional arguments. If this is omitted, then str is assumed.

join

join will join the globbed values as a space-separated str and thus return a single str instead of a list. This is useful for reading in sentences or titles. The default is False if len is '*' or '+' and True if len is '?'.

default

In the case that no glob is given this is what will be put into the glob. If there is no default, nothing is put into the glob.

By default this is an empty dict.
keywords (nested dict) –
keywords is a nested dictionary indicating key-value pairs that can be read in. Each key in the dictionary is a keyword to look for, and the value for that key is another dictionary with the keys type and default. If an empty dictionary or None is given, the defaults of str and SUPPRESS will be chosen, respectively. Like positional arguments, you may give as many types as you wish per keyword.

By default this is an empty dict.
case (bool) – States if this particular key is case-sensitive. Note that this applies only to the arguments of keyname; keyname itself uses the case-sensitivity default of the current level. By default, case is determined by the global value set when initializing the class.
required (bool) –
Indicates that not including keyname is an error. It makes no sense to give a default and mark it required as well. The default is False.

If keyname is part of a mutually exclusive group, it is best to set required for the group as a whole and not set it for the individual members of the group because you may get unforeseen errors.
default –
The value stored for this key if it does not appear in the input block. A value of None is equivalent to no default. It makes no sense to give a default and mark it required as well. If the class SUPPRESS is given instead of None, then this key will be removed from the namespace if it is not given. The default is None.

If keyname is part of a mutually exclusive group and the group has been given a dest value, it is best to set default for the group as a whole and not set it for the individual members of the group because you may get unforeseen errors.
dest (str) –
If dest is given, keyname will be stored in the returned Namespace as dest, not keyname. A value of None is equivalent to no dest. The default is None.

If keyname is part of a mutually exclusive group and the group has been given a dest value, do not set dest for individual members of the group.
depends (str) – Use depends to specify another key in the input file at the same input level (i.e. inside the same block or not in any block at all) that must also appear or a ReaderError will be raised. A value of None is equivalent to no depends. The default is None.
repeat (bool) – Determines if keyname can appear only once in the input file or several times. The default is False, which means that keyname can only appear once or an error will be raised. If repeat is True, the collected data will be returned in a list in the order in which it was read in.

The line key is likely to be the workhorse of your input file. It allows you to concisely specify a key and its parameters in a flexible way. There are a lot of things to think about when it comes to line keys, so we’ll take it slowly.

Note

The options default, depends, dest, required and repeat are common between add_boolean_key(), add_line_key(), add_block_key(), and add_regex_line() and therefore will be discussed together in the Common Options section.

2.4.1. type

Hint

type defaults to str, so the following two lines are equivalent:

reader.add_line_key('key')
reader.add_line_key('key', type=str)

2.4.1.1. Specifying one type

The type key specifies the Python types that are to be read in on the line key. The allowed types are int, float, str, None, an explicit int (e.g. 4), an explicit float (e.g. 5.4), an explicit str (e.g. "hello"), or a compiled regular expression object. The default for type is str.

Continuing with our plotting program, we need to specify the style of the lines on the plot (i.e. solid, dashed, dotted, etc.). The user may optionally specify an offset for the time data, as well as the total number of data points to plot. We could specify the above requirements as

reader = InputReader()
reader.add_line_key('linestyle', type=str)
reader.add_line_key('offset', type=float)
reader.add_line_key('numpoints', type=int)

user_input = ['numpoints 10',
              'linestyle dashed',
              'offset 100']

inp = reader.read_input(user_input)
print isinstance(inp.linestyle, str), isinstance(inp.offset, float), isinstance(inp.numpoints, int)
print inp

The above code would output

True True True
Namespace(numpoints=10, linestyle='dashed', offset=100.0)

This is great, but you may notice a huge flaw: nothing is preventing the user from giving something silly like lobster as the linestyle. It would be better to limit what the user may give for the linestyle. Also, we should provide a way to specify plotting of all given points. We can do this by providing a tuple of choices:

from input_reader import InputReader, ReaderError
reader = InputReader()
reader.add_line_key('linestyle', type=('solid', 'dashed', 'dotted'))
reader.add_line_key('offset', type=float)
reader.add_line_key('numpoints', type=(int, 'all'))

# Some examples of what won't work
try:
    inp = reader.read_input(['numpoints 10.5'])
except ReaderError as e:
    print str(e)
try:
    inp = reader.read_input(['linestyle lobster'])
except ReaderError as e:
    print str(e)
try:
    inp = reader.read_input(["offset '100'"])
except ReaderError as e:
    print str(e)

# Now things that do work
user_input = ['numpoints all',
              'linestyle dashed',
              'offset 150.3']

inp = reader.read_input(user_input)
print isinstance(inp.numpoints, int), isinstance(inp.numpoints, str)
print inp

The above code would output

...expected one of "all" or int, got "10.5"
...expected one of "dashed", "dotted" or "solid", got "lobster"
...expected float, got "'100'"
False True
Namespace(numpoints='all', linestyle='dashed', offset=150.3)

Attention

It is important that you provide a tuple of choices, not a list, as these two object types are interpreted differently by the type option. This will be illustrated in the Specifying multiple types subsection.

Warning

When giving a tuple of type choices and one of those choices is a str, it is important that you give the str last. This is because str acts as a catch-all (i.e. str matches everything). Given the line "key 4.5", type=(float, str) will return the float 4.5, but type=(str, float) will return the str "4.5".
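
The ordering effect in this warning is easy to reproduce if you imagine the choices being tried from left to right until one accepts the value. The sketch below does exactly that; it is an assumption about the general approach rather than a copy of InputReader's matching code, and the helper name first_match is hypothetical.

```python
def first_match(token, choices):
    """Return `token` converted by the first entry of `choices` that accepts it.

    Hypothetical helper that tries the choices strictly left to right.
    """
    for choice in choices:
        if choice is None:                  # None stands for the word "none"
            if token.lower() == 'none':
                return None
        elif isinstance(choice, type):      # int, float or str: try to convert
            try:
                return choice(token)
            except ValueError:
                continue
        elif token == str(choice):          # explicit value such as 'all' or 4
            return choice
    raise ValueError('no choice in %r accepts %r' % (choices, token))


as_float = first_match('4.5', (float, str))   # float is tried first
as_str = first_match('4.5', (str, float))     # str accepts anything
```

Because str never fails to convert, anything listed after it in the tuple can never be reached.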

It is valid for a user to specify "none". It makes sense that the user may not want an offset, and can give "none" as a value (of course one could just specify 0 but that wouldn’t teach us anything). The variable will be set to None:

reader = InputReader()
reader.add_line_key('offset', type=(float, None))

inp = reader.read_input(['offset 50.0'])
print inp.offset # 50.0
inp = reader.read_input(['offset none'])
print inp.offset is None # True

There are always times when you may want more specificity than the native Python types provide, but it is impractical to specify all possibilities. For this purpose, you may also give a compiled regular expression object as a type. For details on regular expressions, see the documentation for the re module. (This doesn’t really apply to our plotting program, so here is an arbitrary example):

import re
reader = InputReader()
reader.add_line_key('red', type=re.compile(r'silly\d+\w*thing'))

# OK
inp = reader.read_input(['red silly5314whatisthisthing'])
print inp.red

# Error
try:
    inp = reader.read_input(['red silly542.0notsogood'])
except ReaderError as e:
    print str(e)

The above code would output

silly5314whatisthisthing
...expected regex(silly\d+\w*thing), got "silly542.0notsogood"

2.4.1.2. Specifying multiple types

Continuing with our plotting program, we need to specify the color and shape of the data point markers on our plot, as well as the size of the marker (an integer). We should also be able to specify the color and size of the connecting lines. To give multiple data entries for a single line key, you must provide a list to type. Our definitions would be changed as follows:

reader = InputReader()
colors = ('green', 'red', 'blue', 'orange', 'black', 'violet')
reader.add_line_key('linestyle', type=[('solid', 'dashed', 'dotted'), colors, int])
reader.add_line_key('pointstyle', type=[('circles', 'squares', 'triangles'), colors, int])

user_input = ['linestyle solid black 2',
              'pointstyle squares blue 3']

inp = reader.read_input(user_input)
print inp
print inp.pointstyle[1]

# Some examples of what won't work
try:
    inp = reader.read_input(['linestyle dashed green'])
except ReaderError as e:
    print str(e)
try:
    inp = reader.read_input(['pointstyle squares red 5 extra'])
except ReaderError as e:
    print str(e)

The above code would output

Namespace(linestyle=('solid', 'black', 2), pointstyle=('squares', 'blue', 3))
blue
...expected 3 arguments, got 2
...expected 3 arguments, got 4

The parameters are read in the order in which they were defined. For this reason, parameters defined using the type option will be referred to as positional parameters.

Attention

The tuple vs. list distinction is very important for the type option; a tuple is used to define parameter choices, and a list is used to define multiple parameters. It is not legal to have a list inside of a list for the type object.
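
One way to keep the two roles apart is to read the list as "one entry per position" and each tuple as "any one of these at that position". The following sketch encodes that reading in plain Python. It is illustrative only: parse_positional and convert are hypothetical names, and only types and literal strings are handled.

```python
def convert(token, choices):
    """Return `token` as the first of `choices` (types or literals) that fits."""
    for choice in choices:
        if isinstance(choice, type):
            try:
                return choice(token)
            except ValueError:
                continue
        elif token == choice:
            return choice
    raise ValueError('unexpected value %r' % token)


def parse_positional(tokens, spec):
    """Match one token per list entry; a tuple entry lists the allowed choices."""
    if len(tokens) != len(spec):
        raise ValueError('expected %d arguments, got %d'
                         % (len(spec), len(tokens)))
    return tuple(
        convert(token, entry if isinstance(entry, tuple) else (entry,))
        for token, entry in zip(tokens, spec)
    )


colors = ('green', 'red', 'blue', 'orange', 'black', 'violet')
spec = [('solid', 'dashed', 'dotted'), colors, int]
parsed = parse_positional('solid black 2'.split(), spec)
```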

Warning

If you are specifying only one parameter, it is important to realize that the following are not equivalent:

reader.add_line_key('key1', type=str)
reader.add_line_key('key2', type=[str])

The former will store a str in the key attribute of the Namespace, whereas the latter will store a single-element list of a str. Let’s say that our input was ['key1 fish', 'key2 fish']. Our result would be:

print inp.key1    # fish
print inp.key2    # ["fish"]
print inp.key2[0] # fish

Obviously, this distinction will affect how you access the input data.

Note

Each of the parameters in the list follows the rules for a single type discussed in the Specifying one type subsection.

Hint

If you do not wish to define any type parameters, you can give None. This may be useful when using the glob or keywords options.

2.4.2. case

Hint

case defaults to the value given when InputReader was created (False unless you changed it), so with a default reader the following two lines are equivalent:

reader.add_line_key('key')
reader.add_line_key('key', case=False)

The case option makes the parameters to the given line key case-sensitive. This does not make the keyword itself case-sensitive, just the parameters. The most obvious use-case would be for file names or labels. In our plotting program, we need to specify the file name that contains the raw data:

reader = InputReader()
reader.add_line_key('rawdata', case=True)

inp = reader.read_input(['rawdata /path/to/RAW/Data.txt'])

print inp.rawdata # /path/to/RAW/Data.txt
# If case=False, this would return /path/to/raw/data.txt

Obviously, this only affects str types. This has no effect on compiled regular expressions because case-sensitivity is determined at compile-time for regular expressions.

2.4.3. glob

Hint

glob defaults to {}, so the following two lines are equivalent:

reader.add_line_key('key')
reader.add_line_key('key', glob={})

Note

The options glob and keywords are mutually exclusive.

There are often times when you may not know the number of parameters the user will give. For this purpose, the option glob has been provided. With glob, it is possible to specify that a variable number of parameters will be given, with the options being:

* - Zero or more parameters

+ - One or more parameters

? - Zero or one parameters

glob must be given as a dict; the key len specifies one of the above three variable length specifiers. Two other important keys are type, which follows the same rules as discussed in subsection Specifying one type, and default, which is the default value assigned if the glob is not included. Like the type option, if the type key for glob is omitted, the default is str. The glob values are appended to the end of the tuple for the given key.
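
The three length specifiers amount to a simple count check on whatever is left over after the positional parameters have been consumed. The sketch below spells that check out. The function name split_glob and the wording of the error messages are assumptions made for illustration, not part of the library.

```python
def split_glob(tokens, n_positional, length):
    """Split tokens into (positional, globbed) and check the glob count.

    Hypothetical helper: `length` is one of '*', '+' or '?'.
    """
    if length not in ('*', '+', '?'):
        raise ValueError("len must be '*', '+' or '?'")
    if len(tokens) < n_positional:
        raise ValueError('expected at least %d arguments, got %d'
                         % (n_positional, len(tokens)))
    positional, globbed = tokens[:n_positional], tokens[n_positional:]
    if length == '+' and not globbed:
        raise ValueError("'+' requires at least one globbed value")
    if length == '?' and len(globbed) > 1:
        raise ValueError('expected at most %d arguments, got %d'
                         % (n_positional + 1, len(tokens)))
    return positional, globbed


positional, globbed = split_glob(['circles', 'violet', '3'], 2, '?')
```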

Thinking about our plotting program, we might prefer to not force the user to specify the size of the lines and points and default them to 1 if not included. This could be coded as:

reader = InputReader()
colors = ('green', 'red', 'blue', 'orange', 'black', 'violet')
reader.add_line_key('linestyle', type=[('solid', 'dashed', 'dotted'), colors],
                    glob={'len':'?', 'type':int, 'default':1})
reader.add_line_key('pointstyle', type=[('circles', 'squares', 'triangles'), colors],
                    glob={'len':'?', 'type':int, 'default':1})

# We choose the default size for linestyle
inp = reader.read_input(['linestyle dotted red',
                         'pointstyle circles violet 3'])

# The glob values are appended to the end of the tuple
print inp

try:
    inp = reader.read_input(['linestyle solid black 4 extra'])
except ReaderError as e:
    print str(e)

The above code would output

Namespace(linestyle=('dotted', 'red', 1), pointstyle=('circles', 'violet', 3))
...expected at most 3 arguments, got 4

There is a fourth key to the glob option, and it is join. Join causes all the globbed parameters to be joined together into a single space-separated string. The default value is False. join is useful when reading things like titles. For example, to allow the user to specify a title for the plot, we would use the following code:

reader = InputReader()
reader.add_line_key('title', type=None, case=True, glob={'len':'*', 'join':True, 'default':''})

inp = reader.read_input(['title The Best Plot EVER!!!'])

print inp.title # The Best Plot EVER!!!
# If join=False, this would be ('The', 'Best', 'Plot', 'EVER!!!')

Please note that in the above code, we used type=None. This means there are no required positional parameters, and only glob parameters will be read.

Note

When the glob option is used, the parameters will always be stored as a tuple with the glob parameters appended to the end of the positional parameters. There is one exception to this rule. If type equals None and join equals True, then the parameters will be returned as the joined string, similar to if type equals str and glob is omitted.
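
The rule in this note, including its one exception, can be written down compactly. The sketch below assumes the globbed values are strings, as in the title example; the function name assemble is hypothetical and the code is an illustration of the rule as stated, not the library's implementation.

```python
def assemble(positional, globbed, join=False):
    """Combine positional and globbed string values as the note describes."""
    if join:
        joined = ' '.join(globbed)
        if not positional:           # type=None: the joined string is the result
            return joined
        return tuple(positional) + (joined,)
    return tuple(positional) + tuple(globbed)


title = assemble([], ['The', 'Best', 'Plot', 'EVER!!!'], join=True)
words = assemble([], ['The', 'Best', 'Plot', 'EVER!!!'], join=False)
```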

What if it was possible to read in raw data from multiple files? There are two ways we might choose to code this:

# Method 1
reader = InputReader()
reader.add_line_key('rawdata', case=True, type=str, glob={'len':'*'}) # Remember that str is the default type
inp = reader.read_input(['rawdata file1.txt file2.txt file3.txt'])
print inp.rawdata

# Method 2
reader = InputReader()
reader.add_line_key('rawdata', case=True, type=None, glob={'len':'+'})
inp = reader.read_input(['rawdata file1.txt file2.txt file3.txt'])
print inp.rawdata

The above code would output

('file1.txt', 'file2.txt', 'file3.txt')
('file1.txt', 'file2.txt', 'file3.txt')

OK, so what’s the difference between type=str, glob={'len':'*'} and type=None, glob={'len':'+'}? In the above example, nothing. However, if join were True, then method 1 above would return ('file1.txt', 'file2.txt file3.txt'), because the positional str is kept separate from the joined glob, whereas method 2 would return 'file1.txt file2.txt file3.txt'. This is an obscure use-case, but it may be important for you in the future.

Warning

If you set a default value for glob, the default will only be set if the keyname actually appears in the input file.

2.4.4. keywords

Hint

keywords defaults to {}, so the following two lines are equivalent:

reader.add_line_key('key')
reader.add_line_key('key', keywords={})

Note

The options glob and keywords are mutually exclusive.

There are times when a line key should offer optional parameters with more flexibility than can be offered by the glob option. The keywords option provides this flexibility. Each parameter specified in the keywords option is accessed through some keyword name; for this reason, we will refer to these parameters as named parameters.

keywords must be given as a dict of nested dicts. Each named parameter in the keywords dict has a dict containing two possible keys: type and default. The rules for type are the same as described in the Specifying one type subsection, and default can be anything. If not specified, the default values of type and default are str and SUPPRESS, respectively. The named parameters can appear in any order, as long as they come after the positional parameters.
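
To make the nested-dict layout concrete, here is a small sketch that reads name=value tokens against such a specification. It is a simplified illustration: parse_keywords is a hypothetical name, each type is assumed to be a plain callable such as int or str, and a named parameter without a default is simply left out of the result, mimicking the SUPPRESS behaviour described above.

```python
def parse_keywords(tokens, spec):
    """Parse 'name=value' tokens against {name: {'type': ..., 'default': ...}}."""
    # Start from the defaults; names without a default stay absent.
    result = dict((name, rule['default'])
                  for name, rule in spec.items() if 'default' in rule)
    for token in tokens:
        name, sep, raw = token.partition('=')
        if not sep:
            raise ValueError('Error reading keyword argument "%s"' % token)
        if name not in spec:
            raise ValueError('Unknown keyword: "%s"' % name)
        result[name] = spec[name].get('type', str)(raw)
    return result


spec = {'color': {'type': str, 'default': 'black'},
        'size': {'type': int, 'default': 1},
        'label': {}}
named = parse_keywords(['size=3'], spec)
```

The order of the tokens does not matter here, which is the property that distinguishes named parameters from positional ones.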

Going back to our plotting program, we might agree that the way we have set up the linestyle and pointstyle keys is not optimal. What if the user wants to accept a default value for the color, but change the size? To get around this, we will use the keywords option:

reader = InputReader()
colors = ('green', 'red', 'blue', 'orange', 'black', 'violet')
reader.add_line_key('linestyle', type=('solid', 'dashed', 'dotted'),
                    keywords={'color':{'type':colors,'default':'black'},
                              'size' :{'type':int,   'default':1}})
reader.add_line_key('pointstyle', type=('circles', 'squares', 'triangles'),
                    keywords={'color':{'type':colors,'default':'black'},
                              'size' :{'type':int,   'default':1}})

inp = reader.read_input(['linestyle solid size=3',
                         'pointstyle circles color=green'])

print inp.linestyle
print inp.pointstyle[-1]['color'], inp.pointstyle[-1]['size']

# Error!
try:
    inp = reader.read_input(['linestyle solid lobster=red'])
except ReaderError as e:
    print str(e)
# Another error!
try:
    inp = reader.read_input(['linestyle solid color red'])
except ReaderError as e:
    print str(e)

The above code would output

('solid', {'color': 'black', 'size': 3})
green 1
...Unknown keyword: "lobster"
...Error reading keyword argument "color"

This code illustrates three important points about named parameters.

The parameters are returned as a tuple, with the positional parameters first and the dict containing the named parameters appended to the end. This means that the named parameters will always be the last element of the tuple, so you may access them using the [-1] notation.

Note

When the keywords option is used, the parameters will always be stored as a tuple with the keywords parameter dict appended to the end of the positional parameters. There is one exception to this rule: if the option type equals None, the parameters will be returned as only the keywords parameter dict.
The value of the parameter must be separated from the key by =. There are no exceptions. If they are separated by a space or anything else, an error will be raised. Be sure you make this clear to your users!
Unknown keys will raise a ReaderError.
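
To make the last two points concrete, here is a minimal sketch of how trailing tokens might be split into named parameters. It is not the library's actual parser: the helper name split_named is hypothetical, and it raises ValueError where InputReader would raise a ReaderError.

```python
def split_named(tokens, known):
    """Split 'name=value' tokens into a dict (illustrative only)."""
    named = {}
    for token in tokens:
        name, sep, value = token.partition('=')
        if not sep or not value:
            # No '=' was found, e.g. 'color red' arrives as two bare tokens
            raise ValueError('Error reading keyword argument "%s"' % name)
        if name not in known:
            raise ValueError('Unknown keyword: "%s"' % name)
        named[name] = value
    return named

# 'size=3' is accepted; the value is still a string at this stage
ok = split_named(['size=3'], known=('color', 'size'))
print(ok)
```

Because the line is split on whitespace before the named parameters are examined, a value separated from its name by a space can never be paired with it again, which is why = is mandatory.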

Let’s take a look at what happens when no default is given. All along, we have forgotten to specify a name for the file of our plot! We will define this now. The user will give a filename, then an optional image format and compression:

reader = InputReader()
formats = ('pdf', 'png', 'jpg', 'svg', 'bmp', 'eps')
compress = ('zip', 'tgz', 'tbz2')
reader.add_line_key('output', case=True, type=str,
                    keywords={'format':{'type':formats},
                              'compression':{'type':compress}})

print reader.read_input(['output filename format=png compression=zip'])
print reader.read_input(['output filename format=png'])
print reader.read_input(['output filename compression=tgz'])
print reader.read_input(['output filename'])

The above code would output

Namespace(output=('filename', {'compression': 'zip', 'format': 'png'}))
Namespace(output=('filename', {'format': 'png'}))
Namespace(output=('filename', {'compression': 'tgz'}))
Namespace(output=('filename', {}))

Since the default default is SUPPRESS, named parameters not appearing in the input are omitted from the dict. This means that if no named parameters are given, an empty dict is returned.
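
Because omitted named parameters are simply missing from the dict, code that consumes the result should not index the dict directly. The sketch below works on tuples copied from the output above; the helper describe and its fallback_format argument are hypothetical application-level choices, not part of InputReader.

```python
# Values copied from the example output above
results = [
    ('filename', {'compression': 'zip', 'format': 'png'}),
    ('filename', {'format': 'png'}),
    ('filename', {}),
]

def describe(output, fallback_format='pdf'):
    """Unpack an 'output' value, tolerating missing named parameters."""
    name = output[0]
    named = output[-1]  # the named-parameter dict is always last
    fmt = named.get('format', fallback_format)
    compression = named.get('compression')  # None when omitted
    return name, fmt, compression

summaries = [describe(r) for r in results]
print(summaries)
```

Using dict.get() keeps the application-level fallback in one place instead of scattering KeyError handling through the program.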

Warning

If you set a default value for keywords, the default will only be set if the keyname actually appears in the input file.

2.5. `add_block_key()`¶

InputReader.add_block_key(keyname, end=u'end', case=None, ignoreunknown=None, **kwargs)¶

Add a block key to the input searcher.

Parameters:

keyname (str) – The name of the key to search for.
end (str) – The str used to signify the end of this block. The default is 'end'.
case (bool) – States if this particular key is case-sensitive. Note that this applies only to the subkeys of keyname; keyname itself uses the case-sensitivity default of the current level. By default, case is determined by the global value set when initializing the class.
ignoreunknown (bool) – Suppresses raising the ReaderError when an unknown key is found. By default, ignoreunknown is determined by the global value set when initializing the class.
required (bool) –
Indicates that not including keyname is an error. It makes no sense to give a default and mark it required as well. The default is False.

If keyname is part of a mutually exclusive group, it is best to set required for the group as a whole and not set it for the individual members of the group because you may get unforeseen errors.
default –
The value stored for this key if it does not appear in the input block. A value of None is equivalent to no default. It makes no sense to give a default and mark it required as well. If the class SUPPRESS is given instead of None, then this key will be removed from the namespace if it is not given. The default is None.

If keyname is part of a mutually exclusive group and the group has been given a dest value, it is best to set default for the group as a whole and not set it for the individual members of the group because you may get unforeseen errors.
dest (str) –
If dest is given, keyname will be stored in the returned Namespace as dest, not keyname. A value of None is equivalent to no dest. The default is None.

If keyname is part of a mutually exclusive group and the group has been given a dest value, do not set dest for individual members of the group.
depends (str) – Use depends to specify another key in the input file at the same input level (i.e. inside the same block or not in any block at all) that must also appear or a ReaderError will be raised. A value of None is equivalent to no depends. The default is None.
repeat (bool) – Determines if keyname can appear only once in the input file or several times. The default is False, which means that keyname can only appear once or an error will be raised. If repeat is True, the collected data will be returned in a list in the order in which it was read in.

Note

The options default, depends, dest, required and repeat are common between add_boolean_key(), add_line_key(), add_block_key(), and add_regex_line() and therefore will be discussed together in the Common Options section.

A block key is a way to group logically similar keys together. It is also useful to separate some keys from the others. Block keys have the form of:

block_start
    key
    anotherkey
    etc...
end

The indentation is optional.

Let’s say that we wanted to add the capability of including a legend in our plotting program. We decide to do this as a block key. We need to specify the location of the legend, the size of the legend, and if the legend should have a shadow. We might code this as follows:

reader = InputReader()
# Notice that we return the legend block object
legend = reader.add_block_key('legend')
# We add keys to the legend block object just as we have done before
legend.add_boolean_key('shadow')
legend.add_line_key('location', type=('upper_left', 'upper_right',
                                      'lower_left', 'lower_right'))
legend.add_line_key('size', type=int)

from textwrap import dedent
from StringIO import StringIO
user_input = StringIO()
user_input.write(dedent('''\
                        legend
                            shadow
                            location upper_right
                            size 3
                        end
                        '''))

inp = reader.read_input(user_input)
print inp
print inp.legend.location

The above code would output

Namespace(legend=Namespace(shadow=True, location='upper_right', size=3))
upper_right

Note that the block creates a Namespace held inside the legend attribute of the main Namespace. This makes it easy to access the keys within the block with the dot operator (shown for the location key).

2.5.1. end¶

By default, a block key is terminated by the word "end" (the default of the end option). However, it may make sense to use a different word. A perfect example would be a block inside of another block. The sub-block might use the word subend. Perhaps there is more than one thing to specify for the size parameter of the legend, and we need a sub-block for this:

reader = InputReader()
legend = reader.add_block_key('legend')
legend.add_boolean_key('shadow')
legend.add_line_key('location', type=('upper_left', 'upper_right',
                                      'lower_left', 'lower_right'))
size = legend.add_block_key('size', end='subend')
size.add_line_key('box', type=int)
size.add_line_key('font', type=int)

from textwrap import dedent
from StringIO import StringIO
user_input = StringIO()
user_input.write(dedent('''\
                        legend
                            shadow
                            location upper_right
                            size
                                box 2
                                font 5
                            subend
                        end
                        '''))

inp = reader.read_input(user_input)
print inp
print inp.legend.size.font

The above code would output

Namespace(legend=Namespace(shadow=True, location='upper_right', size=Namespace(box=2, font=5)))
5

We have a Namespace nested in a Namespace nested in a Namespace! There is no limit to the amount of nesting you can have, although your users may get irritated if it is arbitrarily complex.
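
When the nesting gets deep, it can help to flatten the result into dotted names for logging or debugging. The sketch below uses argparse.Namespace as a stand-in for the object returned by read_input(); it assumes the library's Namespace exposes its attributes through vars() in the same way, which has not been verified here. The helper flatten is hypothetical.

```python
from argparse import Namespace  # stand-in for the Namespace from read_input()

# Mirrors the structure printed in the example above
inp = Namespace(legend=Namespace(shadow=True, location='upper_right',
                                 size=Namespace(box=2, font=5)))

def flatten(ns, prefix=''):
    """Return a dict mapping dotted key paths to leaf values."""
    flat = {}
    for name, value in vars(ns).items():
        key = prefix + name
        if isinstance(value, Namespace):
            # Recurse into the nested block
            flat.update(flatten(value, key + '.'))
        else:
            flat[key] = value
    return flat

flat = flatten(inp)
print(sorted(flat))
```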

2.5.2. case¶

The case option for a block key is identical to the case option for the InputReader class except that it only applies to the keys inside the block.

2.5.3. ignoreunknown¶

The ignoreunknown option for a block key is identical to the ignoreunknown option for the InputReader class except that it only applies to the keys inside the block.

2.6. `add_regex_line()`¶

InputReader.add_regex_line(handle, regex, case=None, **kwargs)¶

Add a regular expression line to the input searcher. This searches the entire line based on the given regex.

NOTE: You may either pass a string that will be converted to a regular expression object, or a compiled regular expression object.

Parameters:

handle (str) – The name under which the resultant regex match object is stored in the namespace. This is required since there is technically no keyword.
regex (str, compiled re object) – The regular expression that is used to search each line.
case (bool) – Determines if the search of this line is case-sensitive. This only applies if a string is given as regex; you determine if the regex is case-sensitive or not if you compile it yourself. By default, case is determined by the global value set when initializing the class.
required (bool) –
Indicates that not including regex is an error. It makes no sense to give a default and mark it required as well. The default is False.

If regex is part of a mutually exclusive group, it is best to set required for the group as a whole and not set it for the individual members of the group because you may get unforeseen errors.
default –
The value stored for this key if it does not appear in the input block. A value of None is equivalent to no default. It makes no sense to give a default and mark it required as well. If the class SUPPRESS is given instead of None, then this key will be removed from the namespace if it is not given. The default is None.

If regex is part of a mutually exclusive group and the group has been given a dest value, it is best to set default for the group as a whole and not set it for the individual members of the group because you may get unforeseen errors.
dest (str) –
If dest is given, regex will be stored in the returned Namespace as dest, not handle. A value of None is equivalent to no dest. The default is None.

If regex is part of a mutually exclusive group and the group has been given a dest value, do not set dest for individual members of the group.
depends (str) – Use depends to specify another key in the input file at the same input level (i.e. inside the same block or not in any block at all) that must also appear or a ReaderError will be raised. A value of None is equivalent to no depends. The default is None.
repeat (bool) – Determines if regex can appear only once in the input file or several times. The default is False, which means that the regex can only appear once or an error will be raised. If repeat is True, the collected data will be returned in a list in the order in which it was read in.

Note

The options default, depends, dest, required and repeat are common between add_boolean_key(), add_line_key(), add_block_key(), and add_regex_line() and therefore will be discussed together in the Common Options section.

Sometimes it is necessary to accept user input that does not fit nicely into the categories discussed above. For these situations the regex line is offered. The regex line allows you to specify a regular expression that must match the input line. For details on how regular expressions work, see the documentation for the re module.

Let’s say that we want the user to be able to draw polygons on the plot. The user can specify a series of x,y points that define the vertices of the polygon. We could create a polygon block, and each line in the block is a vertex:

reader = InputReader()
polygon = reader.add_block_key('polygon')
polygon.add_regex_line('xypoint', r'(-?\d+\.?\d*) (-?\d+\.?\d*)', repeat=True)
# Another way to define the above is with the following three lines.  They are completely equivalent.
# import re
# reg = re.compile(r'(-?\d+\.?\d*) (-?\d+\.?\d*)')
# polygon.add_regex_line('xypoint', reg, repeat=True)

from textwrap import dedent
from StringIO import StringIO
user_input = StringIO()
user_input.write(dedent('''\
                        polygon
                            0 0
                            3.5 0
                            3.3 3.5
                            0 3.3
                        end
                        '''))

inp = reader.read_input(user_input)
for regex in inp.polygon.xypoint:
    print regex.group(0), regex.group(1), regex.group(2)

The above code would output

0 0 0 0
3.5 0 3.5 0
3.3 3.5 3.3 3.5
0 3.3 0 3.3

The repeat option will be discussed in subsection repeat; for now we will say that it allows a key to be repeated multiple times.

Even though the handle xypoint does not appear in the input file, we must specify it so that we have a name to access in the Namespace.

Note that the regex line cannot do any type checking for you. You will have to write your own post-processing to check that the types are correct and to parse the line so that the data is in a usable format.
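
As an illustration of such post-processing, the sketch below converts match objects into (x, y) float tuples. It uses only the standard re module and builds the match objects by hand from the same pattern and input lines as above, so it runs without InputReader; the helper name to_vertices is hypothetical.

```python
import re

# Same pattern as in the polygon example
point = re.compile(r'(-?\d+\.?\d*) (-?\d+\.?\d*)')

# Stand-in for inp.polygon.xypoint: one match object per input line
lines = ['0 0', '3.5 0', '3.3 3.5', '0 3.3']
matches = [point.match(line) for line in lines]

def to_vertices(matches):
    """Convert regex match objects into a list of (x, y) float tuples."""
    vertices = []
    for m in matches:
        # group(1) and group(2) are strings; float() does the type conversion
        vertices.append((float(m.group(1)), float(m.group(2))))
    return vertices

vertices = to_vertices(matches)
print(vertices)
```

float() raises a ValueError on malformed text, so a conversion like this also acts as the type check that the regex line itself cannot provide.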

2.6.1. case¶

The case option for the regex line is identical to the case option for the line key, with the exception that case only applies to regular expressions given to add_regex_line() as a string and not a compiled regular expression object. This is because the regular expression object has case-sensitivity built in at compile time.

2.7. Common Options¶

The methods add_boolean_key(), add_line_key(), add_block_key(), and add_regex_line() all share a set of options: dest, default, depends, required, and repeat. They will be discussed together here.

2.7.1. dest¶

The dest option specifies that a key will appear under a different name in the Namespace than it does in the input.

Let’s think back to our unit conversion example with the boolean keys in the action section. The result we came up with was not quite satisfactory because it required a lot of code to do relatively little work. It would be more convenient to have a single variable to place a group of boolean keys into. We might code this as follows

reader = InputReader()
# The distance units
reader.add_boolean_key('meters', action=lambda x: 1.0 * x, dest='distconv')
reader.add_boolean_key('centimeters', action=lambda x: 100.0 * x, dest='distconv')
reader.add_boolean_key('kilometers', action=lambda x: 0.001 * x, dest='distconv')
reader.add_boolean_key('millimeters', action=lambda x: 1000.0 * x, dest='distconv')
# The time units
reader.add_boolean_key('seconds', action=lambda x: x / 1.0, dest='timeconv')
reader.add_boolean_key('minutes', action=lambda x: x / 60.0, dest='timeconv')
reader.add_boolean_key('hours', action=lambda x: x / 3600.0, dest='timeconv')

inp = reader.read_input(['centimeters', 'minutes'])

# No matter what boolean key was used, the conversion function is under
# "distconv".  The original key names have been removed.
if 'centimeters' in inp:
    print "This will never print"

print inp.distconv(50), inp.timeconv(1800)
inp = reader.read_input(['kilometers', 'hours'])
print inp.distconv(50), inp.timeconv(1800)

# Error:
try:
    inp = reader.read_input(['meters', 'millimeters', 'hours'])
except ReaderError as e:
    print str(e)

The above code would output

5000.0 30.0
0.05 0.5
...The key "..." appears twice

As you will see later, it is often more advantageous to use dest in conjunction with add_mutually_exclusive_group() because it provides more protection against bad user input for keys that are grouped together.

2.7.2. default¶

This is the default value of a key. It defaults to None.

2.7.3. depends¶

The depends option specifies that in order for this key to be valid, another key must also appear in the input. For example, let’s say that we add the boolean key miles to the unit conversion list. In addition, we add nautical, but this only makes sense in the context of miles. Therefore, the key nautical depends on the miles key. This part of the code would be given as follows:

reader = InputReader()
reader.add_boolean_key('miles')
reader.add_boolean_key('nautical', depends='miles')

# This is fine
inp = reader.read_input(['miles', 'nautical'])
# Error!
try:
    inp = reader.read_input(['nautical'])
except ReaderError as e:
    print str(e)

The above code would output

...The key "nautical" requires that "miles" is also present, but it is not

2.7.4. required¶

This specifies that the given key is required to appear in the input file. This is not likely to be necessary for a boolean key, but may be necessary for a line or block key. For example, our rawdata line key accepts the file that contains the raw data to plot. Our plotting program would not be able to do anything without this line, so we make it required:

reader = InputReader()
reader.add_line_key('rawdata', case=True, required=True)
reader.add_boolean_key('dummy')

# This works as expected
inp = reader.read_input(['dummy', 'rawdata filename.txt'])
# Error!
try:
    inp = reader.read_input(['dummy'])
except ReaderError as e:
    print str(e)

The above code would output

...The key "rawdata" is required but not found

Hint

Don’t bother defining a default if required equals True.

2.7.5. repeat¶

By default, a key is only allowed to appear once; if it appears twice a ReaderError is raised. However, there are certain use cases when it makes sense to have a key repeat. If this is the case, you can set the repeat option to True. The values will be returned in a tuple, so you will have to be wary of this when extracting the data from the Namespace.

Instead of defining multiple rawdata files on one line as we did in the glob subsection, perhaps we would want to define multiple files on different lines:

reader = InputReader()
reader.add_line_key('rawdata', case=True, repeat=True, required=True)
inp = reader.read_input(['rawdata filename3.txt', 'rawdata filename6.txt', 'rawdata filename1.txt'])
print inp

reader = InputReader()
reader.add_line_key('rawdata', case=True, required=True)
try:
    inp = reader.read_input(['rawdata filename3.txt', 'rawdata filename6.txt', 'rawdata filename1.txt'])
except ReaderError as e:
    print str(e)

The above code would output

Namespace(rawdata=('filename3.txt', 'filename6.txt', 'filename1.txt'))
...The key "..." appears twice

The order of the tuple returned when repeat is True is the same as the order the keys appear in the input file.

2.8. `add_mutually_exclusive_group()`¶

InputReader.add_mutually_exclusive_group(dest=None, default=None, required=False)¶

Defines a mutually exclusive group.

Parameters:

dest (str) – Defines an alternate name for the key to be stored in rather than the keyname. Useful if you wish to access the data from the mutually exclusive group without having to search the names of all the keys in the group. It also removes the names of the keys in this group from the Namespace. NOTE: It is best not to set the dest value for members of the group (just the group itself), as it may result in undetected errors.
default – The default to use for the mutually exclusive group. This is only valid if dest is defined. This overwrites the defaults of the keys in this group. NOTE: It is best not to set the default value for members of the group (just the group itself), as it may result in undetected errors. If SUPPRESS is given then this group will not appear in the namespace if not found in input.
required (bool) – One of the members of this group is required to be in the input file. NOTE: It is best not to set the required status for members of the group (just the group itself), as it may result in the program flagging errors for keys in this group when there in fact is no error.

There are times when certain keys cannot appear with other keys in an input file. This is especially true of boolean keys. The add_mutually_exclusive_group() method allows you to declare a group of keys that may not appear in the input file together. Let’s look back to our unit conversion example.

reader = InputReader()
# The distance units
dunits = reader.add_mutually_exclusive_group()
dunits.add_boolean_key('millimeters')
dunits.add_boolean_key('centimeters')
dunits.add_boolean_key('meters')
dunits.add_boolean_key('kilometers')
# The time units
tunits = reader.add_mutually_exclusive_group()
tunits.add_boolean_key('seconds')
tunits.add_boolean_key('minutes')
tunits.add_boolean_key('hours')

# OK!
inp = reader.read_input(['meters', 'seconds'])
# Error!!!!
try:
    reader.read_input(['meters', 'millimeters', 'seconds'])
except ReaderError as e:
    print str(e)

The above code would output

...Only one of 'centimeters', 'kilometers', 'meters', or 'millimeters' may be included.

2.8.1. dest¶

We illustrated that we can use the dest option of the add_boolean_key() method to store keys under a different name in the Namespace. If your keys are part of a mutually exclusive group, you should let the group handle this for you. This will be cleaner and give better error messages:

reader = InputReader()
# The distance units
dunits = reader.add_mutually_exclusive_group(dest='distconv')
dunits.add_boolean_key('meters', action=lambda x: 1.0 * x)
dunits.add_boolean_key('centimeters', action=lambda x: 100.0 * x)
dunits.add_boolean_key('kilometers', action=lambda x: 0.001 * x)
dunits.add_boolean_key('millimeters', action=lambda x: 1000.0 * x)
# The time units
tunits = reader.add_mutually_exclusive_group(dest='timeconv')
tunits.add_boolean_key('seconds', action=lambda x: x / 1.0)
tunits.add_boolean_key('minutes', action=lambda x: x / 60.0)
tunits.add_boolean_key('hours', action=lambda x: x / 3600.0)

inp = reader.read_input(['centimeters', 'minutes'])

# No matter what boolean key was used, the conversion function is under
# "distconv".  The original key names have been removed.
if 'centimeters' in inp:
    print "This will never print"

print inp.distconv(50), inp.timeconv(1800)
inp = reader.read_input(['kilometers', 'hours'])
print inp.distconv(50), inp.timeconv(1800)

# Error:
try:
    inp = reader.read_input(['meters', 'millimeters', 'hours'])
except ReaderError as e:
    print str(e)

The above code would output

5000.0 30.0
0.05 0.5
...Only one of 'centimeters', 'kilometers', 'meters', or 'millimeters' may be included.

2.8.2. default¶

You may notice that the above code cannot handle the situation of a missing distance conversion or time conversion:

inp = reader.read_input(['minutes'])
try:
    print inp.distconv(50), inp.timeconv(1800)
except TypeError as e:
    print str(e) # 'NoneType' object is not callable

Obviously, we can work around this issue by having a default for the whole mutually exclusive group:

reader = InputReader()
# The distance units
dunits = reader.add_mutually_exclusive_group(dest='distconv', default=lambda x: 1.0 * x)
dunits.add_boolean_key('meters', action=lambda x: 1.0 * x)
dunits.add_boolean_key('centimeters', action=lambda x: 100.0 * x)
dunits.add_boolean_key('kilometers', action=lambda x: 0.001 * x)
dunits.add_boolean_key('millimeters', action=lambda x: 1000.0 * x)
# The time units
tunits = reader.add_mutually_exclusive_group(dest='timeconv', default=lambda x: x / 1.0)
tunits.add_boolean_key('seconds', action=lambda x: x / 1.0)
tunits.add_boolean_key('minutes', action=lambda x: x / 60.0)
tunits.add_boolean_key('hours', action=lambda x: x / 3600.0)

inp = reader.read_input(['minutes'])
print inp.distconv(50), inp.timeconv(1800)

The above code would output

50.0 30.0

2.8.3. required¶

An alternative to supplying a default is to simply make one of the keys in the mutually exclusive group required:

reader = InputReader()
# The distance units
dunits = reader.add_mutually_exclusive_group(dest='distconv', required=True)
dunits.add_boolean_key('meters', action=lambda x: 1.0 * x)
dunits.add_boolean_key('centimeters', action=lambda x: 100.0 * x)
dunits.add_boolean_key('kilometers', action=lambda x: 0.001 * x)
dunits.add_boolean_key('millimeters', action=lambda x: 1000.0 * x)
# The time units
tunits = reader.add_mutually_exclusive_group(dest='timeconv', required=True)
tunits.add_boolean_key('seconds', action=lambda x: x / 1.0)
tunits.add_boolean_key('minutes', action=lambda x: x / 60.0)
tunits.add_boolean_key('hours', action=lambda x: x / 3600.0)

try:
    inp = reader.read_input(['minutes'])
except ReaderError as e:
    print str(e)

The above code would output

...One and only one of 'centimeters', 'kilometers', 'meters', or 'millimeters' must be included.

Hint

Don’t bother defining a default if required equals True.

2.9. Putting it all together¶

Let’s now take the best of the above examples to make a full working input reader definition for the plotting program:

from input_reader import InputReader, ReaderError
reader = InputReader()

# Distance conversion booleans.  Default is meters.
dunits = reader.add_mutually_exclusive_group(dest='distconv', default=lambda x: 1.0 * x)
dunits.add_boolean_key('meters', action=lambda x: 1.0 * x)
dunits.add_boolean_key('centimeters', action=lambda x: 100.0 * x)
dunits.add_boolean_key('kilometers', action=lambda x: 0.001 * x)
dunits.add_boolean_key('millimeters', action=lambda x: 1000.0 * x)

# Time conversion booleans. Default is seconds.
tunits = reader.add_mutually_exclusive_group(dest='timeconv', default=lambda x: x / 1.0)
tunits.add_boolean_key('seconds', action=lambda x: x / 1.0)
tunits.add_boolean_key('minutes', action=lambda x: x / 60.0)
tunits.add_boolean_key('hours', action=lambda x: x / 3600.0)

# The raw data file(s)
reader.add_line_key('rawdata', case=True, repeat=True, required=True)

# Output file
formats = ('pdf', 'png', 'jpg', 'svg', 'bmp', 'eps')
compress = ('zip', 'tgz', 'tbz2', None)
reader.add_line_key('output', case=True, type=str, required=True,
                    keywords={'format':{'type':formats, 'default':'pdf'},
                              'compression':{'type':compress, 'default':None}})

# Line and point styles
colors = ('green', 'red', 'blue', 'orange', 'black', 'violet')
reader.add_line_key('linestyle', type=('solid', 'dashed', 'dotted'),
                    keywords={'color':{'type':colors,'default':'black'},
                              'size' :{'type':int,   'default':1}})
reader.add_line_key('pointstyle', type=('circles', 'squares', 'triangles'),
                    keywords={'color':{'type':colors,'default':'black'},
                              'size' :{'type':int,   'default':1}})

# Optional legend on the plot
legend = reader.add_block_key('legend')
legend.add_boolean_key('shadow')
legend.add_line_key('location', type=('upper_left', 'upper_right',
                                      'lower_left', 'lower_right'))
size = legend.add_block_key('size', end='subend')
size.add_line_key('box', type=int)
size.add_line_key('font', type=int)

# Optional polygon(s) to draw on the plot
polygon = reader.add_block_key('polygon', repeat=True)
polygon.add_regex_line('xypoint', r'(-?\d+\.?\d*) (-?\d+\.?\d*)', repeat=True)

Let’s say that we give the following input to the input reader:

# Plot in centimeters
centimeters
# ... and in hours
hours

# Read from two data files
rawdata /path/to/DATA.txt # absolute path
rawdata ../raw.txt        # relative path

# Output filename and format
output myplot format=png compression=zip

# Line and point styles
linestyle dashed
# Point style... make them big, and green!
pointstyle circles color=green size=8

# Note there is no legend or polygon in this input

This input file is passed to the input reader and the results are as follows:

# You should always wrap the read_input code in a try block
# to catch reader errors
try:
    inp = reader.read_input(user_input)
except ReaderError as e:
    import sys
    sys.exit(str(e))

# Let's take a look at the Namespace
print inp.rawdata
print inp.output
print inp.linestyle
print inp.pointstyle
print inp.legend, inp.polygon
print inp.distconv(400)
print inp.timeconv(7200)

The above code would output

('/path/to/DATA.txt', '../raw.txt')
('myplot', {'compression': 'zip', 'format': 'png'})
('dashed', {'color': 'black', 'size': 1})
('circles', {'color': 'green', 'size': 8})
None None
40000.0
2.0

2.10. `post_process()`¶

InputReader.post_process(namespace)¶

Perform post-processing of the data collected from the input file.

This is a “virtual” method: it does nothing and is intended to be re-implemented in a subclass.

Please see Subclassing InputReader for more details.

2.11. `input_file`¶

This is an attribute of InputReader that holds the input file given to the reader with comments removed.

2.12. `filename`¶

The name of the file passed to InputReader.

2.13. Gotchas¶

2.13.1. case-sensitivity¶

If you have set case to True in either the InputReader constructor or in a block key, the variables in the Namespace must be accessed with the same case that was given in the definition. Conversely, if case is False, the variables will be accessed with a lower-cased version. In the case = True version:

reader = InputReader(case=True)
reader.add_boolean_key('RED')
try:
    inp = reader.read_input(['red']) # Error, 'red' != 'RED'
except ReaderError:
    pass
inp = reader.read_input(['RED'])
print 'red' in inp # False
print 'RED' in inp # True

In the case = False version (default):

reader = InputReader()
reader.add_boolean_key('RED')
inp = reader.read_input(['red'])
print 'red' in inp # True
print 'RED' in inp # False
inp = reader.read_input(['RED'])
print 'red' in inp # True
print 'RED' in inp # False
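
The naming rule can be condensed into a small helper. This is a sketch of the rule as stated above, not the library's code: attr_name is a hypothetical function, and argparse.Namespace is used as a stand-in for the Namespace that read_input() returns.

```python
from argparse import Namespace  # stand-in; not the input_reader Namespace

def attr_name(keyname, case=False):
    """Return the attribute name a key is stored under, per the rule above."""
    return keyname if case else keyname.lower()

# Mimic the two results described above with a stand-in Namespace
insensitive = Namespace(**{attr_name('RED'): True})
sensitive = Namespace(**{attr_name('RED', case=True): True})

print('red' in insensitive)
print('RED' in sensitive)
```

Routing every attribute lookup through one helper like this keeps the rest of the program independent of the case setting chosen for the reader.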

2.13.2. Strings with spaces¶

InputReader does not let you use strings with spaces in them. This is because it is impossible (read: very difficult to implement) to parse each line without splitting it on whitespace first. If a key name or other given str had a space, it would be split and be difficult to detect, resulting in unforeseen parsing errors. For this reason, InputReader will raise an error if you attempt to give it a str containing spaces.

from input_reader import InputReader
reader = InputReader()
try: # ValueError is raised because a bad string value is given
    reader.add_boolean_key('hard to parse')
except ValueError:
    pass
try: # Same reason
    reader.add_line_key('red', type=('OK', 'NOT OK'))
except ValueError:
    pass
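
The underlying problem can be seen with plain string operations. The sketch below is only an illustration of whitespace splitting; it does not reproduce InputReader's internal parsing.

```python
key = 'hard to parse'           # a key name containing spaces
line = 'hard to parse yes'      # a hypothetical input line using that key

tokens = line.split()           # split on whitespace, as a line parser would
first = tokens[0]

# The first token is only 'hard', so it can never equal the full key name
matches_key = (first == key)
print(tokens)
print(matches_key)
```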

2.13.3. Regular expressions with spaces¶

For the same reasons as above, regular expression objects that might allow spaces will raise a ValueError. This includes not only regular expressions with an explicit space, but also those with the whitespace character ("\s") and the anything character ("."), as these may potentially match spaces.
