This text is intended to help people who are already familiar with Pratt parsers start writing them with this library. If you want a full introduction to the concepts behind this kind of parser, go to the next section, the tutorial.
It’s suggested that your code follows this structure:
It is also suggested that you encapsulate definitions for a single instance of a parser inside a single closure (see below for an example). This way you can be sure that each parser instance has its own state and doesn’t interfere with other parsers.
Usually, you’ll want to put all binding power settings in a single place, using the bp_spec helper. For each level of precedence, write a single line listing all symbols of that binding power, followed by the binding power itself. Several lines may share the same binding power, and the lines don’t need to be ordered by binding power. PrattParse generally doesn’t place any restrictions on symbol ids, but helpers that take whitespace-separated lists of ids, such as bp_spec, only work correctly with ids that don’t contain whitespace.
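To make the format concrete, here is a minimal sketch of how a spec string of this shape could be turned into a mapping from symbol id to binding power. It is not PrattParse’s implementation: the function name parse_bp_spec is hypothetical, and the sketch only assumes the layout described above (ids first, binding power last, one precedence level per line). It also shows why ids must not contain whitespace, since each line is simply split on whitespace.

```python
def parse_bp_spec(spec):
    """Hypothetical helper, not part of PrattParse.

    Maps every symbol id in a bp_spec-style string to its binding power.
    """
    table = {}
    for line in spec.splitlines():
        fields = line.split()      # splitting on whitespace: ids cannot contain any
        if not fields:
            continue               # skip blank lines
        *ids, bp = fields          # the last field is the binding power
        for symbol_id in ids:
            table[symbol_id] = int(bp)
    return table


powers = parse_bp_spec('''
    LPAREN 100
    MUL DIV 20
    ADD SUB 10
''')
print(powers['MUL'], powers['ADD'])
```

Because every line is handled independently, the order of the lines and repeated binding powers make no difference to the resulting table.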
Take the following parser as a simple example. It handles integer literals, identifiers, the four arithmetic operators, parentheses and single-argument function calls:
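# Module layout assumed from the names used below (parser.ParserSkeleton, symbol.Helpers).
from prattparse import parser, symbol

def Parser():
    p = parser.ParserSkeleton()
    h = symbol.Helpers(p)

    # binding powers, one precedence level per line
    h.bp_spec('''
        LPAREN 100
        MUL DIV 20
        ADD SUB 10
    ''')

    # symbol definitions
    h.literals('NAME INT')
    h.infixes('MUL DIV ADD SUB')
    h.symbol('RPAREN')
    p.symbols.close()

    # custom actions
    @h.nud_of('LPAREN')
    def p_parens(self):
        # parenthesized expression: return the inner expression itself
        inner = p.expression()
        p.advance('RPAREN')
        return inner

    @h.led_of('LPAREN')
    def p_call(self, left):
        # function call: callee in .first, the single argument in .second
        self.first = left
        self.second = p.expression()
        p.advance('RPAREN')
        return self

    # return callable that's a parser for the "start" of the grammar
    return p.expression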
The two primary exports of the library are the ParserSkeleton and the Helpers. The former handles token consumption and exposes a minimal interface to the symbol table (the symbol(id, [bp]) function). The latter provides all kinds of convenience functions for defining symbols, their binding power, and actions.
For {pre, in, post}-fix operators, there are helpers for defining them, as well as pluralized versions that define several at once. There’s also a version with a _r suffix for infix operators which makes the operator right-associative. All of them store the (first) operand in a member called .first, and the infix operators store their second operand in a member called .second.
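The difference between left- and right-associative infix operators can be illustrated with a small standalone sketch. This is not PrattParse code: the Node class and the parse function are hypothetical, and the sketch assumes the common technique of parsing the right operand with a binding power reduced by one to get right associativity. Only the member names first and second follow the convention described above.

```python
class Node:
    """Hypothetical AST node; 'first' and 'second' mirror the members described above."""
    def __init__(self, symbol_id, first=None, second=None, value=None):
        self.symbol_id = symbol_id
        self.first = first
        self.second = second
        self.value = value


def parse(tokens, bp_table, right_assoc=()):
    """Toy Pratt loop: integer literals and infix operators only."""
    pos = 0

    def expression(rbp=0):
        nonlocal pos
        left = Node('INT', value=tokens[pos])   # nud of a literal
        pos += 1
        while pos < len(tokens) and bp_table[tokens[pos]] > rbp:
            op = tokens[pos]
            pos += 1
            bp = bp_table[op]
            # A right-associative operator parses its right operand with bp - 1,
            # so the same operator may bind again on the right-hand side.
            right = expression(bp - 1 if op in right_assoc else bp)
            left = Node(op, first=left, second=right)
        return left

    return expression()


def show(node):
    """Render the tree with explicit parentheses."""
    if node.symbol_id == 'INT':
        return str(node.value)
    return '(%s %s %s)' % (show(node.first), node.symbol_id, show(node.second))


bp_table = {'-': 10, '**': 30}
print(show(parse([1, '-', 2, '-', 3], bp_table)))                          # ((1 - 2) - 3)
print(show(parse([2, '**', 3, '**', 2], bp_table, right_assoc={'**'})))    # (2 ** (3 ** 2))
```

In both cases the operator node keeps its left operand in first and its right operand in second; only the grouping of repeated operators changes.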
Also, there’s a myriad of optional arguments for the parser skeleton and the helper. For the parser, the most frequently needed options are:
The keyword arguments are:
As a rule of thumb for what’s a helper method and what’s a parser method:
Of course, there are a few exceptions, such as p.symbols.close(). This may change later, but with the current architecture, it is a necessary and perhaps lesser evil.
Statement handling is a feature many, if not most, parsers will find useful, but it’s still an extension and thus lives in a separate module. You may know the technique of statement denotations, or std for short, from Crockford’s article. If you don’t, here’s a short summary:
PrattParse extends this idea slightly by generalizing the statement terminator part. Usually, the grammar requires some kind of statement terminator after most statements and after expressions used as statements. Our statement function automatically checks for one based on the boolean need_terminator attribute of the result (regardless of whether it came from expression or std).
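The terminator check can be sketched with a toy parser. Everything in this block is a hypothetical stand-in (ToyParser, Result, block_std, and the token format are not PrattParse’s API); it only illustrates the idea described above: try a statement denotation first, fall back to an expression, then require the terminator if the result’s need_terminator attribute is true.

```python
class Result:
    """Hypothetical parse result carrying the need_terminator flag."""
    def __init__(self, text, need_terminator):
        self.text = text
        self.need_terminator = need_terminator


class ToyParser:
    """Hypothetical stand-in for a parser; not PrattParse's ParserSkeleton."""
    def __init__(self, tokens, terminator, stds):
        self.tokens = list(tokens)
        self.pos = 0
        self.terminator = terminator
        self.stds = stds                      # token -> statement denotation

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self, expected=None):
        token = self.peek()
        if expected is not None and token != expected:
            raise SyntaxError('expected %r, got %r' % (expected, token))
        self.pos += 1
        return token

    def expression(self):
        # Toy rule: an expression is a single token and needs a terminator.
        return Result(self.advance(), need_terminator=True)

    def statement(self):
        std = self.stds.get(self.peek())
        result = std(self) if std else self.expression()
        # The check is the same regardless of where the result came from.
        if result.need_terminator:
            self.advance(self.terminator)
        return result


def block_std(p):
    # A statement such as a block that does not want a trailing terminator.
    p.advance('block')
    return Result('block', need_terminator=False)


p = ToyParser(['x', ';', 'block', 'y', ';'], ';', {'block': block_std})
texts = [p.statement().text for _ in range(3)]
print(texts)
```

Keeping the decision in an attribute of the result means a std can opt out of the terminator, as block_std does here, without the statement function knowing anything about individual statement kinds.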
To add statement handling, you can simply change the imports to from prattparse.statement import ParserSkeleton, Helpers. Then add the id of the statement terminator symbol as the first argument to the parser skeleton instantiation. The parser and helper work exactly like the basic ones; they just add a few methods. The parser simply gets the described statement function. The helper has a few new methods: