Fro Interface

Parser

Parser objects are immutable. Therefore, many Parser methods return a new parser instead of modifying the called parser.

For explanations of terminology like “chomp”, “chunk”, and “significant”, see Fro 101.

class Parser

An immutable parser.

__invert__()

Returns a new Parser that is equivalent to self but is insignificant.

Returns:an insignificant copy of the called parser
Return type:Parser

Example:

commap = fro.rgx(r",")
composition = fro.comp([~fro.intp, ~commap, fro.intp]).get()
composition.parse_str("2,3")  # evaluates to 3
__or__(func)

Returns a new Parser object that applies func to the values produced by self. The new parser has the same name and significance as self.

Parameters:func (Callable[[T],U]) – function applied to produced values
Returns:a new parser that maps produced values using func
Return type:Parser

Example:

parser = fro.intp | (lambda x: x * x)
parser.parse_str("4")  # evaluates to 16
__rshift__(func)

Returns a Parser object that unpacks the values produced by self and then applies func to them. Throws an error if the number of unpacked arguments does not match the number of arguments that func can take, or if the value produced by self is not unpackable. Equivalent to self | (lambda x: func(*x)).

The new parser has the same name and significance as self.

Parameters:func (Callable[?,U]) – function applied to unpacked produced values
Returns:a new parser that maps produced values using func
Return type:Parser

Example:

parser = fro.comp([fro.intp, r"~,", fro.intp]) >> (lambda x, y: x + y)
parser.parse_str("4,5")  # evaluates to 9
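
The relationship between the two operators can be sketched without Fro itself. The MiniParser class below is a hypothetical stand-in written for this illustration, not Fro's implementation; it only shows how `|` and `>>` can each return a new object while leaving the original untouched.

```python
class MiniParser:
    """Hypothetical stand-in for a parser; NOT Fro's actual implementation."""

    def __init__(self, produce):
        # produce: function from an input string to the produced value
        self._produce = produce

    def parse_str(self, text):
        return self._produce(text)

    def __or__(self, func):
        # Build a new object that maps the produced value; self is unchanged.
        return MiniParser(lambda text: func(self._produce(text)))

    def __rshift__(self, func):
        # Unpack the produced value before applying func.
        return self | (lambda value: func(*value))


# Illustrative parsers (assumptions of this sketch, not fro.intp or fro.comp).
intp = MiniParser(int)
pair = MiniParser(lambda text: tuple(int(part) for part in text.split(",")))

squared = intp | (lambda x: x * x)
total = pair >> (lambda x, y: x + y)
```

Here `squared` and `total` are new objects, and `intp` and `pair` behave exactly as they did before the operators were applied.
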
append(value)

Returns a parser that chomps with the called parser, chomps with the Parser represented by value, and produces the value produced by the called parser. The returned parser has the same name and significance as self.

Parameters:value (Union[Parser,str]) – parser to “append” to self
Returns:value appended to self
Return type:Parser
get()

Returns a Parser object that retrieves the sole element of the value produced by self, and throws an error if self produces a non-iterable value or an iterable value that does not have exactly one element. Equivalent to self >> (lambda x: x).

Returns:parser that unpacks the sole produced value
Return type:Parser

Example:

# Recall that comp(..) always produces a tuple, in this case a tuple with one value
parser = fro.comp([r"~\(", fro.intp, r"~\)"]).get()
parser.parse_str("(-3)")  # evaluates to -3
lstrip()

Returns a parser that is equivalent to self, but ignores and consumes any leading whitespace inside a single chunk. Equivalent to fro.comp([r"~\s*", self]).get(), but with the same name and significance as self.

Returns:a parser that ignores leading whitespace inside a single chunk
Return type:Parser

Example:

parser = fro.rgx(r"[a-z]+").lstrip()

# Will succeed, producing "hello". It's okay if there's no whitespace
parser.parse_str("hello")

# Will succeed, producing "world"
parser.parse_str("\nworld")

# Will succeed, producing "planet". Note that the leading whitespace is
# confined to a single chunk (even though this chunk is different than
# the chunk that "planet" appears in)
parser.parse(["  ", "planet"])

# Will fail, leading whitespace is across multiple chunks
parser.parse(["  ", "\tgalaxy"])
lstrips()

Returns a parser that is equivalent to self, but ignores and consumes any leading whitespace across multiple chunks.

Returns:a parser that ignores leading whitespace
Return type:Parser

Example:

parser = fro.rgx(r"[a-z]+").lstrips()

# Will succeed, producing "hello". It's okay if there's no whitespace
parser.parse_str("hello")

# Will succeed, producing "world"
parser.parse_str("\nworld")

# Will succeed, producing "planet". Note that the leading whitespace is
# confined to a single chunk (even though this chunk is different than
# the chunk that "planet" appears in)
parser.parse(["  ", "planet"])

# Will succeed, producing "galaxy". Unlike lstrip(), lstrips() can handle
# whitespace across multiple chunks
parser.parse(["  ", "\r\r", "\tgalaxy"])
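
The difference between lstrip() and lstrips() can be modelled with the standard re module. The sketch below is not Fro's implementation; the helper names skip_space and parse_word are hypothetical, and the chunk-rollover rule is an assumption chosen to reproduce the outcomes documented in the examples above.

```python
import re

WORD = re.compile(r"[a-z]+")
SPACE = re.compile(r"\s*")


def skip_space(chunks, i, j):
    """Consume whitespace inside chunk i only; roll over if the chunk is exhausted."""
    if i < len(chunks):
        j = SPACE.match(chunks[i], j).end()
        if j == len(chunks[i]):
            i, j = i + 1, 0
    return i, j


def parse_word(chunks, across_chunks):
    """Model of rgx(r"[a-z]+") with lstrip() (False) or lstrips() (True)."""
    i, j = skip_space(chunks, 0, 0)
    while across_chunks:
        # Keep skipping until no further whitespace can be consumed.
        nxt = skip_space(chunks, i, j)
        if nxt == (i, j):
            break
        i, j = nxt
    if i >= len(chunks):
        return None
    m = WORD.match(chunks[i], j)
    # Succeed only if the word reaches the end of the whole input.
    if m and m.end() == len(chunks[i]) and i == len(chunks) - 1:
        return m.group()
    return None
```

With across_chunks=False the whitespace skip runs once, so whitespace at the start of a second chunk is left in place and the word match fails.
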
maybe(default=None)

Returns a parser equivalent to self, but defaults to consuming none of the input string and producing default when self fails to chomp a string. See Fro 101 for an explanation of chomping.

Parameters:default (Any) – default value to produce
Returns:parser that defaults to consuming nothing and producing default instead of failing
Return type:Parser

Example:

parser = fro.comp([fro.rgx(r"ab+").maybe("a"), fro.intp])
parser.parse_str("abb3")  # evaluates to ("abb", 3)
parser.parse_str("87")  # evaluates to ("a", 87)
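
One way to picture maybe(..) is as a wrapper around a chomp function. In the sketch below, a "chomp" is modelled as a function from (text, position) to (value, new position), or None on failure; this representation and the helper names are assumptions made for illustration and are not taken from Fro's source.

```python
import re


def rgx(pattern):
    """Build a chomp function for a regular expression (illustrative only)."""
    compiled = re.compile(pattern)

    def chomp(text, pos):
        m = compiled.match(text, pos)
        return (m.group(), m.end()) if m else None

    return chomp


def maybe(chomp, default=None):
    """Wrap chomp so that failure produces default and consumes nothing."""

    def chomp_or_default(text, pos):
        result = chomp(text, pos)
        # On failure, produce the default and leave the position untouched.
        return result if result is not None else (default, pos)

    return chomp_or_default


ab = maybe(rgx(r"ab+"), "a")
```

Because the position is unchanged on failure, whatever parser comes next starts from the same place, which is why the "87" example above still parses the integer.
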
name(name)

Returns a parser equivalent to self, but with the given name.

Parameters:name (str) – name for new parser
Returns:a parser identical to this, but with specified name
Return type:Parser
parse(lines, loud=True)

Parse an iterable collection of chunks. Returns the produced value, or throws a FroParseError explaining why the parse failed (or returns None if loud is False).

Parameters:
  • lines (Iterable[str]) – iterable collection of chunks to parse
  • loud (bool) – if parsing failures should result in an exception
Returns:

Value produced by parse

parse_file(filename, encoding='utf-8', loud=True)

Parse the contents of a file with the given filename, treating each line as a separate chunk. Returns the produced value, or throws a FroParseError explaining why the parse failed (or returns None if loud is False).

Parameters:
  • filename – filename of file to parse
  • encoding – encoding of the file to parse
  • loud – if parsing failures should result in an exception
Returns:

value produced by parse

parse_str(string_to_parse, loud=True)

Attempts to parse string_to_parse. Treats the entire string string_to_parse as a single chunk. Returns the produced value, or throws a FroParseError explaining why the parse failed (or returns None if loud is False).

Parameters:
  • string_to_parse (str) – string to parse
  • loud – if parsing failures should result in an exception
Returns:

value produced by parse

prepend(value)

Returns a parser that chomps with the Parser represented by value, then chomps with self, and produces the value produced by self. The returned parser has the same name and significance as self.

Parameters:value (Union[Parser,str]) – parser to “prepend” to self
Returns:value prepended to self
Return type:Parser
rstrip()

Returns a parser that is equivalent to self, but ignores and consumes trailing whitespace inside a single chunk. Equivalent to fro.comp([self, r"~\s*"]).get(), but with the same name and significance as self.

Returns:parser that ignores trailing whitespace inside a single chunk
Return type:Parser

Example:

parser = fro.rgx(r"[a-z]+").rstrip()

# Will succeed, producing "hello". It's okay if there's no whitespace
parser.parse_str("hello")

# Will succeed, producing "world"
parser.parse_str("world\n")

# Will succeed, producing "planet". Note that the trailing whitespace is
# confined to a single chunk (even though this chunk is different than
# the chunk that "planet" appears in)
parser.parse(["planet", "    "])

# Will fail, trailing whitespace is across multiple chunks
parser.parse(["galaxy\t", "\r"])
rstrips()

Returns a parser that is equivalent to self, but ignores and consumes any trailing whitespace across multiple chunks.

Returns:parser that ignores trailing whitespace
Return type:Parser

Example:

parser = fro.rgx(r"[a-z]+").rstrips()

# Will succeed, producing "hello". It's okay if there's no whitespace
parser.parse_str("hello")

# Will succeed, producing "world"
parser.parse_str("world\n")

# Will succeed, producing "planet".
parser.parse(["planet", "   "])

# Will succeed, producing "galaxy". Unlike rstrip(), rstrips() can handle
# whitespace spread across multiple chunks
parser.parse(["galaxy\n\n", "  ", "\r\r"])
significant()

Returns a parser that is equivalent to self but is significant.

Returns:a significant copy of the called parser
Return type:Parser
strip()

Returns a parser that is equivalent to self, but ignores and consumes leading and trailing whitespace inside a single chunk. self.strip() is equivalent to self.lstrip().rstrip().

Returns:parser that ignores leading and trailing whitespace inside a single chunk
Return type:Parser

Example:

parser = fro.rgx(r"[a-z]+").strip()

# This will succeed, producing "abc". All whitespace is inside a single chunk.
parser.parse(["  abc  \t"])

# This will also succeed, producing "abc". All leading whitespace is inside
# a single chunk, as is all trailing whitespace (even though those chunks
# are different!)
parser.parse(["\n\n", "abc \t"])

# This will not succeed. Leading whitespace is spread across multiple chunks.
parser.parse(["\n\n", "\n abc\t\r"])
strips()

Returns a parser object that is equivalent to self, but ignores and consumes leading and trailing whitespace, across chunk boundaries. self.strips() is equivalent to self.lstrips().rstrips().

Returns:parser that ignores leading and trailing whitespace
Return type:Parser

Example:

parser = fro.rgx(r"[a-z]+").strips()

# This will succeed, producing "abc". All whitespace is inside a single chunk.
parser.parse(["  abc  \t"])

# This will also succeed, producing "abc".
parser.parse(["\n\n", "abc \t"])

# This will succeed, producing "abc". Unlike strip(), strips() can handle
# whitespace that spans multiple chunks.
parser.parse(["\n\n", "\n abc\t\r"])
unname()

Returns a copy of the called parser that does not have a name.

Returns:a copy of the called parser that does not have a name
Return type:Parser

Constructing Parsers

Parser objects should not be instantiated directly; instead Fro provides factory functions for constructing Parser instances.

Many of these factory functions input “parser-like” values, or more commonly collections of “parser-like” values. A Parser object is a parser-like value, which corresponds to itself. A string s is also a parser-like value, and it corresponds to fro.rgx(s). The decision to automatically cast strings to regular expression parsers is primarily intended to make client code using the Fro module more concise.

To mark a “parser-like” regular expression as insignificant, prepend it with a tilde (~). If you actually want a regular expression that begins with a tilde, escape it (e.g. r"\~..."). This rule only applies to strings that are used as “parser-like” values. It does not apply in other contexts where regular expressions are used, such as the argument of rgx(..).
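
The tilde rule can be sketched as a small helper. The function split_parser_like below is hypothetical and is not part of Fro; it only illustrates how a leading tilde might be separated from the regular expression, and why the escaped form r"\~..." still matches a literal tilde under the re module.

```python
import re


def split_parser_like(value):
    """Return (regex, is_significant) for a parser-like string (illustrative)."""
    if value.startswith("~"):
        # A leading tilde marks the regex as insignificant and is not part of it.
        return value[1:], False
    # Anything else, including an escaped tilde, is used as the regex verbatim.
    return value, True


comma = split_parser_like(r"~,")
literal_tilde = split_parser_like(r"\~x")
```

Since r"\~" is an escaped literal in re syntax, the second string matches text that starts with an actual tilde character.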

alt(parser_values, name=None)

Returns a parser that is the alternation of the parsers in parser_values.

More specifically, the returned parser chomps by successively trying to chomp with the parsers in parser_values, producing the value produced by the first successful chomp, and failing if none of the parsers in parser_values chomps successfully.

Parameters:
  • parser_values (Iterable[Union[Parser,str]]) – collection of parser values
  • name (str) – name of the created parser
Returns:

a parser that is the alternation of the parsers in parser_values

Return type:

Parser

Example:

parser = fro.alt([r"a*b*c*", r"[0-9]{3}", fro.intp])
parser.parse_str("aac")  # evaluates to "aac"
parser.parse_str("12")  # evaluates to 12
parser.parse_str("235")  # evaluates to "235"
parser.parse_str("abc123")  # fails
parser.parse_str("")  # evaluates to ""
parser.parse_str("1234")  # fails

# The last one is tricky. When r"a*b*c*" tries to chomp "1234", it fails to chomp.
# Then, when r"[0-9]{3}" tries to chomp "1234", it chomps off "123", leaving behind
# "4". This is the first successful chomp, so this is what the variable parser chomps.
# However, since the variable parser did not chomp the entire string "1234", it fails
# to parse it.
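
The "first successful chomp wins" rule described in the comment above can be modelled with plain regular expressions. The function alt_parse is a hypothetical helper for this sketch, not Fro's implementation, and it uses a simpler pair of patterns than the example above.

```python
import re


def alt_parse(patterns, text):
    """Ordered choice over regexes; returns the consumed text or None."""
    for pattern in patterns:
        m = re.match(pattern, text)
        if m:
            # The first successful chomp wins; later alternatives are not tried,
            # so leftover input makes the whole parse fail.
            return m.group() if m.end() == len(text) else None
    return None


digits = [r"[0-9]{3}", r"[0-9]+"]
```

With these patterns, "1234" fails even though the second alternative could consume all of it, because the first alternative already chomped "123" successfully.
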
chain(func, name=None)

Given a function func which maps one parser to another, returns a parser value that is equivalent to a large number of successive calls to func.

Conceptually, the returned parser is equivalent to func(func(func(...))). During parsing, successive calls to func are made lazily on an as-needed basis.

Fro parsers parse top-down, so users of this function should take care to avoid left recursion. In general the parser func(parser) should consume input before delegating parsing to the parser argument.

Parameters:
  • func (Callable[[Parser],Union[Parser,str]]) – function from Parser to parser value
  • name – name for created parser
Returns:

lazily-evaluated infinite parser

Return type:

Parser

Example:

box = fro.BoxedValue(None)
def wrap(parser):
    openp = fro.rgx(r"[a-z]", name="open") | box.update_and_get
    closep = fro.thunk(lambda: box.get(), name="close")
    return fro.comp([~openp, parser.maybe(0), ~closep]) >> (lambda n: n + 1)
parser = fro.chain(wrap)
parser.parse_str("aa")  # evaluates to 1
parser.parse_str("ab")  # fails
parser.parse_str("aeiiea")  # evaluates to 3
parser.parse_str("aeiie")  # fails
comp(parser_values, sep=None, name=None)

Returns a parser that is the composition of the parsers in parser_values.

More specifically, the returned parser chomps by successively chomping with the parsers in parser_values, and produces a tuple of the values produced by parser_values. If sep is not None, then the returned parser will chomp with sep between each parser in parser_values (and discard the produced value).

Parameters:
  • parser_values (Iterable[Union[Parser,str]]) – collection of parser values to compose
  • sep (Union[Parser,str]) – separating parser to use between composition elements
  • name (str) – name for the parser
Returns:

a parser that is the composition of the parsers parser_values

Return type:

Parser

Example:

parser = fro.comp([r"ab?c+", r"~,", fro.intp])
parser.parse_str("abcc,4")  # evaluates to ("abcc", 4)
parser.parse_str("ac,-1")  # evaluates to ("ac", -1)
parser.parse_str("abc,0,")  # fails
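
A rough model of composition, including the sep argument and insignificant elements, is sketched below using only the re module. The function comp_parse is hypothetical and handles regex strings only, so it produces strings where fro.intp would produce integers; it is not Fro's implementation.

```python
import re


def comp_parse(parser_likes, text, sep=None):
    """Chomp each regex in order; return a tuple of the significant matches."""
    values, pos = [], 0
    for index, item in enumerate(parser_likes):
        if sep is not None and index > 0:
            m = re.compile(sep).match(text, pos)
            if not m:
                return None
            pos = m.end()  # the separator's value is discarded
        significant = not item.startswith("~")
        m = re.compile(item if significant else item[1:]).match(text, pos)
        if not m:
            return None
        pos = m.end()
        if significant:
            values.append(m.group())
    # The parse succeeds only if the entire input was consumed.
    return tuple(values) if pos == len(text) else None


with_tilde = [r"ab?c+", r"~,", r"-?[0-9]+"]
with_sep = [r"ab?c+", r"-?[0-9]+"]
```

Passing sep=r"," with the two-element list gives the same result as writing the insignificant r"~," element explicitly.
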
group_rgx(regex_string, name=None)

Returns a parser that consumes the regular expression regex_string, and produces a tuple of the groups of the corresponding match. Regular expressions should adhere to the syntax outlined in the re module. Also see the re module for a description of regular expression groups.

Parameters:
  • regex_string (str) – regular expression
  • name (str) – name for the parser
Returns:

parser that consumes the regular expression regex_string, and produces a tuple of the groups of the corresponding match.

Return type:

Parser

Example:

parser = fro.group_rgx(r"(x*)(y*)(z*)")
parser.parse_str("xxz")  # evaluates to ("xx", "", "z")
parser.parse_str("wxyz")  # fails
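
For readers less familiar with regular expression groups, the tuple above corresponds to what the standard re module returns from a match object's groups() method. The snippet uses re directly rather than Fro, and uses fullmatch as an approximation of parsing an entire string.

```python
import re

# A full match yields one entry per parenthesised group, in order.
match = re.fullmatch(r"(x*)(y*)(z*)", "xxz")
groups = match.groups()

# "w" is not covered by any group, so the string as a whole does not match.
no_match = re.fullmatch(r"(x*)(y*)(z*)", "wxyz")
```

Groups that match nothing, such as (y*) here, appear as empty strings rather than being omitted.
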
nested(open_regex_string, close_regex_string, reducer=<built-in method join of str object>, name=None)

Returns a Parser that parses well-nested sequences where the opening token is given by open_regex_string and the closing token given by close_regex_string.

The parser passes an iterator containing the chunks of content between the first opening token and final closing token into reducer, and produces the resulting value. The default behavior is to concatenate the chunks.

If there are overlapping opening and closing tokens, the token with the earliest start position wins, with ties going to opening tokens.

Parameters:
  • open_regex_string (str) – regex for opening tokens
  • close_regex_string (str) – regex for closing tokens
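  • reducer (Callable[[Iterable[str]],T]) – function from iterator of chunks to produced value
  • name (str) – name for the parser
Returns:

a parser that parses well-nested sequences delimited by the opening and closing tokens

Return type:

Parser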

Example:

parser = fro.nested(r"\(", r"\)")
parser.parse_str("(hello (there))")  # evaluates to "hello (there)"
parser.parse_str("(hello (there)")  # fails, no closing ) for the first (
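
Well-nestedness can be checked with a depth counter. The sketch below is a single-string model written for illustration (the function nested_content is hypothetical and ignores chunking and the reducer); placing the opening alternative first in the combined pattern mirrors the "ties go to opening tokens" rule.

```python
import re


def nested_content(open_rgx, close_rgx, text):
    """Return the text between the outermost tokens, or None if unbalanced."""
    token = re.compile(f"(?P<open>{open_rgx})|(?P<close>{close_rgx})")
    first = token.match(text)
    if not first or first.lastgroup != "open":
        return None
    depth, start = 1, first.end()
    for m in token.finditer(text, start):
        depth += 1 if m.lastgroup == "open" else -1
        if depth == 0:
            # Succeed only if the final closing token ends the input.
            return text[start:m.start()] if m.end() == len(text) else None
    return None  # ran out of input before the first opening token was closed
```
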
rgx(regex_string, name=None)

Returns a parser that parses strings that match the given regular expression, and produces the string it consumed. The regular expressions should adhere to the syntax outlined in the re module.

Parameters:
  • regex_string (str) – regex that parser should match
  • name (str) – name for the parser
Returns:

parser that parses strings that match the given regular expression

Return type:

Parser

Example:

parser = fro.rgx(r"abc+")
parser.parse_str("abccc")  # evaluates to "abccc"
parser.parse_str("abd")  # fails
seq(parser_value, reducer=<class 'list'>, sep=None, name=None)

Returns a parser that parses sequences of the values parsed by parser_value.

More specifically, the returned parser repeatedly chomps with parser_value until it fails, passes an iterator of the produced values as argument to reducer, and produces the resulting value. reducer defaults to producing a list of the produced values. If sep is not None, the returned parser chomps using sep between each parser_value chomp (and discards the produced value).

Parameters:
  • parser_value (Union[Parser,str]) – Parser-like value
  • reducer (Callable[[Iterable[str]],T]) – function from iterator of produced values to resulting value
  • sep (Union[Parser,str]) – separating parser to use between adjacent sequence elements
  • name (str) – name for the parser
Returns:

a parser that parses sequences of the values parsed by parser_value

Return type:

Parser

Example:

parser = fro.seq(fro.intp, sep=r",")
parser.parse_str("")  # evaluates to []
parser.parse_str("1")  # evaluates to [1]
parser.parse_str("1,2,3")  # evaluates to [1, 2, 3]
parser.parse_str("1,2,3,")  # fails
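
The repeat-until-failure behaviour, including why a trailing separator causes failure, can be modelled as follows. The function seq_parse is a hypothetical regex-only sketch, not Fro's implementation; it produces strings rather than integers and assumes item_rgx never matches the empty string.

```python
import re


def seq_parse(item_rgx, text, sep=None, reducer=list):
    """Repeatedly chomp item_rgx, with sep between items; None on failure."""
    item, values, pos = re.compile(item_rgx), [], 0
    separator = re.compile(sep) if sep is not None else None
    m = item.match(text, pos)
    while m:
        values.append(m.group())
        pos = m.end()
        if separator is not None:
            s = separator.match(text, pos)
            if not s:
                break
            # Assumption of this sketch: a separator only counts if another
            # item follows it, so a dangling separator is left unconsumed.
            m = item.match(text, s.end())
        else:
            m = item.match(text, pos)
    # The parse succeeds only if the entire input was consumed.
    return reducer(values) if pos == len(text) else None
```
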
thunk(func, name=None)

Given a function func, which takes no arguments and produces a parser value, returns a parser that, when chomping, calls func() and chomps with the resulting parser. This function is primarily intended for creating parsers whose behavior is dependent on some sort of external state.

Parameters:
  • func (Callable[[],Parser]) – function that takes no arguments and returns a parser value
  • name (str) – name for the parser
Returns:

a parser that parses with the parsers generated by func

Return type:

Parser

Example:

regex_box = fro.BoxedValue(r"ab*")
parser = fro.thunk(lambda: regex_box.get(), name="Boxed regex")
parser.parse_str("abb")  # evaluates to "abb"
parser.parse_str("aab")  # fails
regex_box.update(r"cd*")
parser.parse_str("cdddd")  # evaluates to "cdddd"
parser.parse_str("abb")  # fails
tie(func, name=None)

Given a function func, which maps one parser to another parser, returns a cyclic parser whose structure matches the parsers returned by func.

Conceptually, what happens is:

stub = some_placeholder
result = func(stub)
... # in result, replace all references to stub to instead point back to result

The parser tie(func) is equivalent to chain(func), except that tie(func) is a cyclic parser, whereas chain(func) is a lazily-evaluated infinite parser. This difference is relevant only when the corresponding parsers are dependent on external state. In other cases, it is more memory-efficient to use tie(func).

Fro parsers parse top-down, so users of this function should take care to avoid left recursion. In general the parser func(parser) should consume input before delegating parsing to the parser argument.

Since parsers are immutable, the only way to create a self-referencing parser is via tie(..).

Parameters:
  • func (Callable[[Parser],Parser]) – function for generating cyclic parser
  • name (str) – name for the parser
Returns:

a cyclic parser whose structure matches the parsers returned by func

Return type:

Parser

Example:

def func(parser):
    return fro.comp([r"~\(", parser.maybe(0), r"~\)"]) >> (lambda n: n + 1)
parser = fro.tie(func)
parser.parse_str("(())")  # evaluates to 2
parser.parse_str("(((())))")  # evaluates to 4
parser.parse_str("((()")  # fails
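
The self-reference that tie(..) sets up is analogous to an ordinary recursive function. The sketch below mirrors the parenthesis example with plain recursion; the functions depth and parse_depth are hypothetical and say nothing about how Fro builds cyclic parsers internally. Note that the opening parenthesis is consumed before the recursive call, which is what keeps the recursion from looping on the same position (the left-recursion concern mentioned above).

```python
def depth(text, pos=0):
    """Return (nesting depth, end position) for parentheses at pos, or None."""
    if pos >= len(text) or text[pos] != "(":
        return None
    # Input has been consumed, so it is now safe to recurse.
    inner = depth(text, pos + 1)
    # Counterpart of parser.maybe(0): on failure use 0 and consume nothing.
    n, pos = inner if inner is not None else (0, pos + 1)
    if pos >= len(text) or text[pos] != ")":
        return None
    return n + 1, pos + 1


def parse_depth(text):
    """Succeed only if the nested parentheses span the whole string."""
    result = depth(text)
    return result[0] if result and result[1] == len(text) else None
```
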
until(regex_str, reducer=<function <lambda>>, name=None)

Returns a parser that consumes all input until it encounters a match to the given regular expression, or the end of the input.

The parser passes an iterator of the chunks it consumed to reducer, and produces the resulting value. By default, the parser produces None. The parser does not consume the match when parsing, but only everything up until the match.

Parameters:
  • regex_str (str) – regex until which the parser will consume
  • reducer (Callable[[Iterable[str]],T]) – function from iterator of chunks to produced value
  • name (str) – name for the parser
Returns:

a parser that consumes all input until it encounters a match to regex_str or the end of the input

Return type:

Parser

Example:

untilp = fro.until(r"a|b",
                   reducer=lambda chunks: sum(len(chunk) for chunk in chunks),
                   name="until a or b")
parser = fro.comp([untilp, r"apples"], name="composition")
parser.parse(["hello\n","world\n", "apples"])  # evaluates to (12, "apples")
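
To see where the 12 comes from, the consume-until-match step can be modelled over a list of chunks. The function below is a hypothetical sketch, not Fro's implementation; in particular, whether Fro passes a partial (possibly empty) final chunk to the reducer is an assumption here, although it does not affect the sum in this case.

```python
import re


def until(regex_str, chunks, reducer):
    """Consume chunk text up to (not including) the first match of regex_str."""
    pattern, consumed = re.compile(regex_str), []
    for index, chunk in enumerate(chunks):
        m = pattern.search(chunk)
        if m:
            # Keep only the text before the match; the match is not consumed.
            consumed.append(chunk[:m.start()])
            return reducer(consumed), (index, m.start())
        consumed.append(chunk)
    # No match anywhere: everything up to the end of the input is consumed.
    return reducer(consumed), (len(chunks), 0)


value, stop = until(r"a|b", ["hello\n", "world\n", "apples"],
                    reducer=lambda parts: sum(len(part) for part in parts))
```

The two six-character chunks are consumed whole, and the match at the very start of "apples" leaves that chunk for the next parser in the composition.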

Built-in Parsers

For convenience, the Fro module provides several common parsers.

floatp

A parser that parses floating-point values from their string representations.

Type:Parser
intp

A parser that parses int values from their string representations.

Type:Parser
natp

A parser that parses non-negative integers (i.e. natural numbers) from their string representations.

Type:Parser
posintp

A parser that parses positive integers from their string representations.

Type:Parser

FroParseError

FroParseError exceptions are raised by the parse(..) family of methods upon parsing failures.

exception FroParseError

An exception for parsing failures

__str__()

A human-readable description of the error. Includes both the error messages and extra information describing the location of the error. Equivalent to to_str().

Returns:a human readable description
Return type:str
cause()

Returns the Exception that triggered this error, or None if this error was not triggered by another exception.

Returns:the exception that triggered this error
Return type:Exception
column(index_from=1)

Returns the column number where the error occurred, or more generally the index inside the chunk where the error occurred. Indices are indexed from index_from.

Parameters:index_from (int) – number to index column numbers by
Returns:column number of error
Return type:int
line(index_from=1)

Returns the line number where the error occurred, or more generally the index of the chunk where the error occurred. Indices are indexed from index_from.

Parameters:index_from (int) – number to index line numbers by
Returns:line number of error
Return type:int
messages()

Returns a non-empty list of Message objects which describe the reasons for failure.

Returns:a non-empty list of Message objects which describe the reasons for failure
Return type:List[FroParseError.Message]

to_str(index_from=1, filename=None)

Returns a readable description of the error, with indices starting at index_from, and with filename included if one is provided. Includes both the error messages and extra information describing the location of the error. This method is essentially a configurable version of __str__().

Parameters:
  • index_from (int) – number to index column/line numbers by
  • filename (str) – name of the file whose parse triggered the exception
Returns:

a readable description of the error

Return type:

str

class FroParseError.Message

Represents an error message describing a reason for failure

__str__()

A string representation of the message that includes both the content and parser name.

content()

The content of the error message

Returns:the content of the error message
Return type:str
name()

The name of the parser at which the message was generated, or None if all relevant parsers are unnamed.

Returns:name of parser where error occurred
Return type:str

BoxedValue

To facilitate creating parsers that depend on external state, the Fro module offers the BoxedValue class. For an example of its usage, see Example 3: XML.

class BoxedValue(value)

An updatable boxed value

__init__(value)

Initialize the box with value

Parameters:value – value for box to hold
get()

Return the box’s current value

Returns:current value
get_and_update(value)

Updates the box’s value, and returns the previous value.

Parameters:value – updated value for box to hold
Returns:The previously held value
update(value)

Update the box’s value

Parameters:value – updated value for box to hold
update_and_get(value)

Update the box’s value, and then return the updated value

Parameters:value – updated value for box to hold
Returns:The updated value
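
The difference between get_and_update(..) and update_and_get(..) is easiest to see side by side. The Box class below is a hypothetical stand-in that mirrors the documented method names; it is not Fro's BoxedValue, and it assumes update_and_get(..) returns the new value, as its name and description suggest.

```python
class Box:
    """Hypothetical stand-in mirroring the documented BoxedValue methods."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def update(self, value):
        self._value = value

    def get_and_update(self, value):
        # Swap in the new value and hand back the old one.
        previous, self._value = self._value, value
        return previous

    def update_and_get(self, value):
        # Store the new value and hand that same value back.
        self._value = value
        return value


box = Box("a")
previous = box.get_and_update("b")  # box now holds "b"
updated = box.update_and_get("c")   # box now holds "c"
```

Returning the stored value from update_and_get(..) is what lets it be used as a mapping function, as in the chain(..) example, where the matched text is both remembered and passed along.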