modgrammar.debugging – Debugging Modgrammar grammars

The modgrammar.debugging module contains classes and constants used for debugging modgrammar grammars, and in some cases the parser itself.

Invoking Debug Mode

Debugging mode can be turned on or off for each GrammarParser used, and debug output can be sent to any of a number of different places, depending on your requirements (including different output for different parsers). When creating a new parser (using parser()), debug mode is enabled by setting the debug parameter.

By default, debug is set to False, which disables all debugging. If it is set to True, debugging is enabled and all debug output is sent through the Python logging subsystem. By default, this is done using the “modgrammar” logger (i.e. by using logging.getLogger("modgrammar")).

Alternatively, if you want debug output to go to a different logger object, you can specify a logger name instead (e.g. debug="some.other.logger"). In this case, Modgrammar will use logging.getLogger() to look up the appropriate logger. You can also pass your own instance of logging.Logger directly.

Note

All debugging messages are sent to the specified logger at a log level of DEBUG, which logger objects do not display by default. Make sure that the log level of your logging hierarchy is set to display DEBUG messages somewhere, or you will not see any of the Modgrammar debug output.
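As a minimal sketch of what the note above asks for, the following uses only the standard logging module to make DEBUG records on the “modgrammar” logger visible. The helper name show_modgrammar_debug and the message format are illustrative choices, not part of Modgrammar, and the final log.debug() call merely stands in for output a parser would produce.

```python
import io
import logging


def show_modgrammar_debug(stream, logger_name="modgrammar"):
    """Route DEBUG-level records from the named logger to `stream`."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)           # let DEBUG records through the logger
    handler = logging.StreamHandler(stream)
    handler.setLevel(logging.DEBUG)          # ...and through the handler
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


# Stand-in for parser output: any DEBUG record on this logger is now displayed.
buffer = io.StringIO()
log = show_modgrammar_debug(buffer)
log.debug("example debug record")
print(buffer.getvalue(), end="")
```

In a real program you would typically pass sys.stderr (or an open file) instead of the in-memory buffer, and then create the parser with debug=True.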

If you don’t want to mess around with the Python logging framework and just want a quick-and-dirty way to print debugging information to a file or stream, you can also pass any open file-like object (descended from io.IOBase) as the argument to debug (e.g. debug=sys.stderr). Debug messages will be output to that file prefixed by “–”.

Finally, for advanced users, the debug parameter can also be set to an instance of the GrammarDebugger class (or a subclass), which allows entirely custom debugging.
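The accepted values for debug can be summarized with a small classifier. This is only a rough sketch of how the cases described above could be told apart; it is not Modgrammar's actual code, the function name describe_debug_argument is hypothetical, and it assumes that "file-like" means an io.IOBase instance. Anything unrecognized is treated here as a custom debugger object.

```python
import io
import logging


def describe_debug_argument(debug):
    """Return a short description of where debug output would go."""
    # The bool checks come first so that True/False are never
    # mistaken for any of the other accepted types.
    if debug is False:
        return "disabled"
    if debug is True:
        return "logger 'modgrammar'"
    if isinstance(debug, str):
        return "logger %r" % debug          # looked up via logging.getLogger()
    if isinstance(debug, logging.Logger):
        return "logger %r" % debug.name     # used as-is
    if isinstance(debug, io.IOBase):
        return "stream"                     # open file-like object
    return "custom debugger object"         # e.g. a GrammarDebugger instance


for value in (False, True, "some.other.logger", io.StringIO()):
    print(describe_debug_argument(value))
```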

Debugging Flags

If debugging is enabled, the debug_flags parameter of parser() can be used to specify what types of messages should be logged. This is specified as a bitmask of any of the following constants or-ed together:

modgrammar.debugging.DEBUG_TRY

Show “Trying” debug messages whenever entering a grammar for the first time at a given position.

modgrammar.debugging.DEBUG_RETRY

Show “Retrying” debug messages whenever re-entering a grammar to try for additional matches at a given position (if a previous successful match did not work with the larger grammar).

modgrammar.debugging.DEBUG_FAILURES

Show match failures in debug output.

modgrammar.debugging.DEBUG_SUCCESSES

Show match successes in debug output.

modgrammar.debugging.DEBUG_PARTIALS

Show “partial match” results in debug output. A partial match occurs when the parser needs more input (or EOF) to determine a match or failure.

modgrammar.debugging.DEBUG_WHITESPACE

Show information about skipped whitespace in debug output.

modgrammar.debugging.DEBUG_ALL

Show all types of debug messages (this is equivalent to specifying all of the above together).

Additionally, there are some debug flags which determine what to display based on the grammar being parsed:

modgrammar.debugging.DEBUG_TERMINALS

Some grammars marked as “terminals” actually have sub-grammars (which are normally just hidden from the user). Showing these can be confusing if one is not expecting them (and generally the grammars inside of terminals are fairly well tested already), so by default the debugger does not show information about sub-grammars of terminals. Supplying this debug option will cause it to output this additional information.

modgrammar.debugging.DEBUG_OR

By default, the debugger does not explicitly show OR() grammars as separate elements, but instead pretends that their subgrammars are called directly by the parent. This saves some space, usually loses no information, and usually matches the way the grammars are written when using the or-operator (“|”), so it is more intuitive. However, if you want to display OR() grammars explicitly in the debug output, you can supply this flag.

modgrammar.debugging.DEBUG_FULL

Turn on all special debugging options. This is equivalent to DEBUG_TERMINALS | DEBUG_OR.

The following constants are also provided for convenience:

modgrammar.debugging.DEBUG_DEFAULT

The default debugging flags if none are specified. This is equivalent to DEBUG_FAILURES | DEBUG_SUCCESSES | DEBUG_PARTIALS | DEBUG_WHITESPACE.

modgrammar.debugging.DEBUG_NONE

Disable all debugging flags. This produces no debugging output, but the debugger still performs many of its additional correctness checks, so it can be useful if you suspect something may be wrong with the grammar parser code and want to perform additional sanity checks.
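To illustrate how the flags above combine as a bitmask, here is a self-contained model using enum.IntFlag. The bit values are placeholders chosen for this sketch and are not taken from Modgrammar; only the relationships between the combined constants follow the descriptions above. The helper is_set is likewise hypothetical.

```python
import enum


class DebugFlag(enum.IntFlag):
    # Placeholder bit values for illustration only; the real constants
    # live in modgrammar.debugging and may use different numbers.
    TRY = 1
    RETRY = 2
    FAILURES = 4
    SUCCESSES = 8
    PARTIALS = 16
    WHITESPACE = 32
    TERMINALS = 64
    OR = 128


# Combined constants, built as described in the text above.
DEBUG_ALL = (DebugFlag.TRY | DebugFlag.RETRY | DebugFlag.FAILURES
             | DebugFlag.SUCCESSES | DebugFlag.PARTIALS | DebugFlag.WHITESPACE)
DEBUG_DEFAULT = (DebugFlag.FAILURES | DebugFlag.SUCCESSES
                 | DebugFlag.PARTIALS | DebugFlag.WHITESPACE)
DEBUG_FULL = DebugFlag.TERMINALS | DebugFlag.OR
DEBUG_NONE = DebugFlag(0)


def is_set(flags, flag):
    """True if every bit of `flag` is present in the `flags` bitmask."""
    return (flags & flag) == flag


# Default messages plus "Trying" messages, or-ed together.
chosen = DEBUG_DEFAULT | DebugFlag.TRY
print(is_set(chosen, DebugFlag.TRY), is_set(chosen, DebugFlag.RETRY))
```

With the real constants, the equivalent call would look like MyGrammar.parser(debug=True, debug_flags=DEBUG_DEFAULT | DEBUG_TRY), where MyGrammar is a hypothetical grammar class.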

The GrammarDebugger Class

class modgrammar.debugging.GrammarDebugger

GrammarDebugger objects are able to hook into the modgrammar parsing logic to inspect the parsing process as it progresses, perform validation checks, and output debugging information. They are normally created automatically based on the debug parameter provided to the parser() method when creating a new GrammarParser.

For advanced users, it is possible to create your own subclass of GrammarDebugger and pass that to parser() in order to perform customized debugging functions.

Attributes

Debugger objects have the following attributes, initialized based on the parameters provided to the constructor by parser():

GrammarDebugger.logger

A logging.Logger instance (or equivalent) which should be used for outputting debugging info.

GrammarDebugger.show_try

True or False, depending on whether DEBUG_TRY was set in the debug flags.

GrammarDebugger.show_retry

True or False, depending on whether DEBUG_RETRY was set in the debug flags.

GrammarDebugger.show_failures

True or False, depending on whether DEBUG_FAILURES was set in the debug flags.

GrammarDebugger.show_successes

True or False, depending on whether DEBUG_SUCCESSES was set in the debug flags.

GrammarDebugger.show_partials

True or False, depending on whether DEBUG_PARTIALS was set in the debug flags.

GrammarDebugger.show_whitespace

True or False, depending on whether DEBUG_WHITESPACE was set in the debug flags.

GrammarDebugger.debug_terminals

True or False, depending on whether DEBUG_TERMINALS was set in the debug flags.

GrammarDebugger.show_or

True or False, depending on whether DEBUG_OR was set in the debug flags.

Additionally, the default implementation of debug_wrapper() maintains the following properties during debugging:

GrammarDebugger.stack

A list of [(id(grammar), pos), grammar, lastmatch, [sub-element stacks]] entries, one for each level which has been descended into to reach this position. (The stack format is difficult to explain concisely. For detailed examples of how it works, it is probably best to look at the source of the GrammarDebugger methods, such as stack_summary().)

GrammarDebugger.seen

A set of (id(grammar), pos) tuples for all of the grammars and corresponding match positions which have been descended into to reach this point (this is used to quickly check for left-recursion).

GrammarDebugger.in_terminal

If this attribute is non-zero, indicates that the parser is currently processing subgrammars inside a grammar which is marked as a terminal, and therefore debugging info should not be printed unless debug_terminals is True.
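The role of the seen attribute can be sketched as follows. This is a simplified illustration of the idea, not the library's implementation: the function names and the exception class are hypothetical (the real debugger raises GrammarDefError through error_left_recursion()). Re-entering the same grammar at the same position while it is still being processed means no input was consumed in between, which indicates left-recursion.

```python
class LeftRecursionDetected(Exception):
    """Stand-in for the error the real debugger raises (GrammarDefError)."""


def enter_grammar(seen, grammar, pos):
    """Record that `grammar` is being tried at `pos`, or fail on a repeat."""
    key = (id(grammar), pos)
    if key in seen:
        # Same grammar, same position, still on the way down: left-recursion.
        raise LeftRecursionDetected("grammar re-entered at position %d" % pos)
    seen.add(key)
    return key


def leave_grammar(seen, key):
    """Forget the entry once the grammar has finished at that position."""
    seen.discard(key)


# Two placeholder objects standing in for grammar classes.
expr, term = object(), object()
seen = set()
enter_grammar(seen, expr, 0)
enter_grammar(seen, term, 0)
try:
    enter_grammar(seen, expr, 0)    # expr reached again without consuming input
    detected = False
except LeftRecursionDetected:
    detected = True
print(detected)
```

A set gives constant-time membership checks, which is why a separate seen structure is useful alongside the more detailed stack.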

Methods

GrammarDebugger.__init__(output=True, flags=DEBUG_DEFAULT)

Initialize the GrammarDebugger. output and flags correspond to the debug and debug_flags passed to parser().

While not strictly required, it is strongly recommended if you override this method that you call GrammarDebugger.__init__(self, output, flags) from your own __init__() method. This will make sure that all of the standard attributes are configured properly.

GrammarDebugger.debug_wrapper(parser_generator, grammar, pos, text)

This is the main method called by the parser to hook into the debugger. parser_generator is the generator object which was obtained by calling grammar's grammar_parse() method. The debug_wrapper() method is expected to wrap that generator and return a new generator to be used instead during parsing.

The default implementation maintains the stack, seen and in_terminal attributes automatically, checks all yielded results by calling check_result(), and then calls the various match_*() methods, as appropriate.
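The generator-wrapping pattern described above can be sketched in isolation. This is a generic illustration only: wrap_generator and on_result are hypothetical names, the toy generator stands in for what grammar_parse() would return, and the real debug_wrapper() does considerably more (stack bookkeeping, result checks, and the match_*() callbacks).

```python
def wrap_generator(inner, on_result):
    """Yield everything `inner` yields, reporting each result first.

    Values sent into the wrapper are forwarded to `inner`, so the
    wrapper can be used as a drop-in replacement for the original.
    """
    to_send = None
    while True:
        try:
            # send(None) on a fresh generator behaves like next().
            result = inner.send(to_send)
        except StopIteration:
            return
        on_result(result)           # hook point: inspect or validate the result
        to_send = yield result


def toy_parser():
    """Toy stand-in for a parser generator; yields three fake results."""
    for item in ("partial", "match", "failure"):
        yield item


observed = []
forwarded = list(wrap_generator(toy_parser(), observed.append))
print(forwarded)
```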

GrammarDebugger.check_result(grammar, pos, text, result)

Called by debug_wrapper() after every iteration of the parser function to validate that the result it returned was reasonable.

(The default version of this method performs a number of special sanity checks which are normally skipped for performance reasons, because the conditions they detect should theoretically never occur.)

GrammarDebugger.ws_skipped(grammar, pos, text, offset)

Called by the parser when whitespace has been automatically skipped between subgrammars (when grammar_whitespace_mode is 'optional' or 'required').

GrammarDebugger.ws_not_found(grammar, pos, text)

Called by the parser when whitespace between subgrammars was required (because grammar_whitespace_mode is 'required'), but the required whitespace was not found.

GrammarDebugger.error_left_recursion(stack)

Called by debug_wrapper() when left-recursion has been detected.

The default implementation will construct an appropriately descriptive error message based on the current stack, and then raise a GrammarDefError.

GrammarDebugger.match_try(grammar, pos, text, substack)

Called by debug_wrapper() before trying to match a grammar (at a particular position) for the first time.

GrammarDebugger.match_retry(grammar, pos, text, substack)

Called by debug_wrapper() before re-entering a grammar (at a particular position) to look for additional matches because a previously successful match did not work for the larger grammar.

GrammarDebugger.match_failed(grammar, pos, text, substack)

Called by debug_wrapper() after a grammar returns a match-failure result.

GrammarDebugger.match_partial(grammar, pos, text, substack, matched)

Called by debug_wrapper() after a grammar returns a partial-match result (it needs more input or EOF to determine whether it has a match or not).

GrammarDebugger.match_success(grammar, pos, text, substack, matched)

Called by debug_wrapper() after a successful match is returned.

GrammarDebugger.stack_summary(stack=None)

Accepts a stack (in the format of stack) and returns a string summary of the current state to display to the user.

If stack is not provided, defaults to using the current contents of the stack attribute.

GrammarDebugger.event(msg, pos=None, grammar=None)

Output a debugging “event”. This is called by the default implementations of the match_*() methods to actually output all debugging information.

The default implementation checks in_terminal and the grammar type against debug_terminals and show_or to determine whether to display the event, and then calls logger's debug() method to output the message, along with a stack_summary().
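The display decision described above could be expressed as a small predicate. This is one plausible reading of that description rather than the library's code; the function name should_display and the is_or_grammar parameter are assumptions made for this sketch.

```python
def should_display(in_terminal, debug_terminals, is_or_grammar, show_or):
    """Decide whether a debug event should be output.

    in_terminal     -- non-zero while inside the subgrammars of a terminal
    debug_terminals -- True if DEBUG_TERMINALS was requested
    is_or_grammar   -- True if the event concerns an OR() grammar (assumed input)
    show_or         -- True if DEBUG_OR was requested
    """
    if in_terminal and not debug_terminals:
        return False    # hide the internals of terminals by default
    if is_or_grammar and not show_or:
        return False    # hide explicit OR() elements by default
    return True


# An event inside a terminal is hidden unless debug_terminals is set.
print(should_display(1, False, False, False), should_display(1, True, False, False))
```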