Fro Interface
Parser
Parser objects are immutable. Therefore, many Parser methods return a new parser instead of modifying the called
parser.
For explanations of terminology like “chomp”, “chunk”, and “significant”, see Fro 101.
-
class Parser
An immutable parser.
-
__invert__()
Returns a new Parser that is equivalent to self but is insignificant.
Returns: an insignificant copy of the called parser
Return type: Parser
Example:
commap = fro.rgx(r",")
composition = fro.comp([~fro.intp, ~commap, fro.intp]).get()
composition.parse_str("2,3") # evaluates to 3
-
__or__(func)
Returns a new Parser object that applies func to the values produced by self. The new parser has the same name and significance as self.
Parameters: func (Callable[[T], U]) – function applied to produced values
Returns: a new parser that maps produced values using func
Return type: Parser
Example:
parser = fro.intp | (lambda x: x * x)
parser.parse_str("4") # evaluates to 16
-
__rshift__(func)
Returns a Parser object that unpacks the values produced by self and then applies func to them. Throws an error if the number of unpacked values does not match the number of arguments that func can take, or if the value produced by self is not unpackable. Equivalent to self | (lambda x: func(*x)).
The new parser has the same name and significance as self.
Parameters: func (Callable[?, U]) – function applied to unpacked produced values
Returns: a new parser that maps produced values using func
Return type: Parser
Example:
parser = fro.comp([fro.intp, r"~,", fro.intp]) >> (lambda x, y: x + y)
parser.parse_str("4,5") # evaluates to 9
-
append(value)
Returns a parser that chomps with the called parser, then chomps with the Parser represented by value, and produces the value produced by the called parser. The returned parser has the same name and significance as self.
Parameters: value (Union[Parser, str]) – parser to “append” to self
Returns: value appended to self
Return type: Parser
-
get()
Returns a Parser object that retrieves the sole element of the value produced by self, and throws an error if self produces a non-iterable value or an iterable value that does not have exactly one element. Equivalent to self >> (lambda x: x).
Returns: parser that unpacks the sole produced value
Return type: Parser
Example:
# Recall that comp(..) always produces a tuple, in this case a tuple with one value
parser = fro.comp([r"~\(", fro.intp, r"~\)"]).get()
parser.parse_str("(-3)") # evaluates to -3
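The equivalence between `self >> func` and `self | (lambda x: func(*x))` described above can be checked in plain Python, without Fro. In the sketch below, `unpacking` is a hypothetical helper (not part of Fro) that adapts a function the way `>>` is described to. With a one-argument identity function it behaves as `get()` is documented to: exactly one element is accepted.

```python
def unpacking(func):
    """Hypothetical helper: adapt func so it takes one iterable and unpacks it."""
    return lambda value: func(*value)


add = unpacking(lambda x, y: x + y)
sole = unpacking(lambda x: x)  # mirrors the description of get()

print(add((4, 5)))   # 9
print(sole((-3,)))   # -3

try:
    sole((1, 2))     # two elements, but the function takes only one
except TypeError as error:
    print("rejected:", error)
```

The `TypeError` in the last call comes from ordinary Python argument unpacking; how Fro reports the same situation is not shown here.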
-
lstrip()
Returns a parser that is equivalent to self, but ignores and consumes any leading whitespace inside a single chunk. Equivalent to fro.comp([r"~\s*", self]).get(), but with the same name and significance as self.
Returns: a parser that ignores leading whitespace inside a single chunk
Return type: Parser
Example:
parser = fro.rgx(r"[a-z]+").lstrip()
# Will succeed, producing "hello". It's okay if there's no whitespace
parser.parse_str("hello")
# Will succeed, producing "world"
parser.parse_str("\nworld")
# Will succeed, producing "planet". Note that the leading whitespace is
# confined to a single chunk (even though this chunk is different than
# the chunk that "planet" appears in)
parser.parse([" ", "planet"])
# Will fail, leading whitespace is across multiple chunks
parser.parse([" ", "\tgalaxy"])
-
lstrips()
Returns a parser that is equivalent to self, but ignores and consumes any leading whitespace across multiple chunks.
Returns: a parser that ignores leading whitespace
Return type: Parser
Example:
parser = fro.rgx(r"[a-z]+").lstrips()
# Will succeed, producing "hello". It's okay if there's no whitespace
parser.parse_str("hello")
# Will succeed, producing "world"
parser.parse_str("\nworld")
# Will succeed, producing "planet". Here the leading whitespace is
# confined to a single chunk (even though this chunk is different than
# the chunk that "planet" appears in)
parser.parse([" ", "planet"])
# Will succeed, producing "galaxy". Unlike lstrip(), lstrips() can handle
# whitespace across multiple chunks
parser.parse([" ", "\r\r", "\tgalaxy"])
-
maybe(default=None)
Returns a parser equivalent to self, but defaults to consuming none of the input string and producing default when self fails to chomp a string. See Fro 101 for an explanation of chomping.
Parameters: default (Any) – default value to produce
Returns: parser that defaults to consuming nothing and producing default instead of failing
Return type: Parser
Example:
parser = fro.comp([fro.rgx(r"ab+").maybe("a"), fro.intp])
parser.parse_str("abb3") # evaluates to ("abb", 3)
parser.parse_str("87") # evaluates to ("a", 87)
-
name(name)
Returns a parser equivalent to self, but with the given name.
Parameters: name (str) – name for new parser
Returns: a parser identical to this one, but with the specified name
Return type: Parser
-
parse(lines, loud=True)
Parses an iterable collection of chunks. Returns the produced value, or throws a FroParseError explaining why the parse failed (or returns None if loud is False).
Parameters:
- lines (Iterable[str]) – iterable of chunks to parse
- loud (bool) – whether parsing failures should result in an exception
Returns: value produced by parse
-
parse_file(filename, encoding='utf-8', loud=True)
Parses the contents of the file with the given filename, treating each line as a separate chunk. Returns the produced value, or throws a FroParseError explaining why the parse failed (or returns None if loud is False).
Parameters:
- filename – filename of file to parse
- encoding – encoding of the file to parse
- loud – whether parsing failures should result in an exception
Returns: value produced by parse
-
parse_str(string_to_parse, loud=True)
Attempts to parse string_to_parse. Treats the entire string string_to_parse as a single chunk. Returns the produced value, or throws a FroParseError explaining why the parse failed (or returns None if loud is False).
Parameters:
- string_to_parse (str) – string to parse
- loud – whether parsing failures should result in an exception
Returns: value produced by parse
-
prepend(value)
Returns a parser that chomps with the Parser represented by value, then chomps with self, and produces the value produced by self. The returned parser has the same name and significance as self.
Parameters: value (Union[Parser, str]) – parser to “prepend” to self
Returns: value prepended to self
Return type: Parser
-
rstrip()
Returns a parser that is equivalent to self, but ignores and consumes trailing whitespace inside a single chunk. Equivalent to fro.comp([self, r"~\s*"]).get(), but with the same name and significance as self.
Returns: parser that ignores trailing whitespace inside a single chunk
Return type: Parser
Example:
parser = fro.rgx(r"[a-z]+").rstrip()
# Will succeed, producing "hello". It's okay if there's no whitespace
parser.parse_str("hello")
# Will succeed, producing "world"
parser.parse_str("world\n")
# Will succeed, producing "planet". Note that the trailing whitespace is
# confined to a single chunk (even though this chunk is different than
# the chunk that "planet" appears in)
parser.parse(["planet", " "])
# Will fail, trailing whitespace is across multiple chunks
parser.parse(["galaxy\t", "\r"])
-
rstrips()
Returns a parser that is equivalent to self, but ignores and consumes any trailing whitespace across multiple chunks.
Returns: parser that ignores trailing whitespace
Return type: Parser
Example:
parser = fro.rgx(r"[a-z]+").rstrips()
# Will succeed, producing "hello". It's okay if there's no whitespace
parser.parse_str("hello")
# Will succeed, producing "world"
parser.parse_str("world\n")
# Will succeed, producing "planet".
parser.parse(["planet", " "])
# Will succeed, producing "galaxy". Unlike rstrip(), rstrips() can handle
# whitespace spread across multiple chunks
parser.parse(["galaxy\n\n", " ", "\r\r"])
-
significant()
Returns a parser that is equivalent to self but is significant.
Returns: a significant copy of the called parser
Return type: Parser
-
strip()
Returns a parser that is equivalent to self, but ignores and consumes leading and trailing whitespace inside a single chunk. self.strip() is equivalent to self.lstrip().rstrip().
Returns: parser that ignores leading and trailing whitespace inside a single chunk
Return type: Parser
Example:
parser = fro.rgx(r"[a-z]+").strip()
# This will succeed, producing "abc". All whitespace is inside a single chunk.
parser.parse([" abc \t"])
# This will also succeed, producing "abc". All leading whitespace is inside
# a single chunk, as is all trailing whitespace (even though those chunks
# are different!)
parser.parse(["\n\n", "abc \t"])
# This will not succeed. Leading whitespace is spread across multiple chunks.
parser.parse(["\n\n", "\n abc\t\r"])
-
strips()
Returns a parser object that is equivalent to self, but ignores and consumes leading and trailing whitespace across chunk boundaries. self.strips() is equivalent to self.lstrips().rstrips().
Returns: parser that ignores leading and trailing whitespace
Return type: Parser
Example:
parser = fro.rgx(r"[a-z]+").strips()
# This will succeed, producing "abc". All whitespace is inside a single chunk.
parser.parse([" abc \t"])
# This will also succeed, producing "abc".
parser.parse(["\n\n", "abc \t"])
# This will succeed, producing "abc". Unlike strip(), strips() can handle
# whitespace that spans multiple chunks.
parser.parse(["\n\n", "\n abc\t\r"])
-
Constructing Parsers
Parser objects should not be instantiated directly; instead, Fro provides factory functions for constructing Parser
instances.
Many of these factory functions accept “parser-like” values, or more commonly collections of “parser-like” values.
A Parser object is a parser-like value, which corresponds to itself. A string s is also a parser-like value,
and it corresponds to fro.rgx(s). The decision to automatically cast strings to regular expression parsers is
primarily intended to make client code that uses the Fro module more concise.
To mark a “parser-like” regular expression as insignificant, prepend it with a tilde (~).
If you actually want a regular expression that begins with a tilde, escape it (e.g. r"\~...").
This rule only applies to strings that are used as “parser-like” values. It does not apply in other
contexts where regular expressions are used, such as the argument of rgx(..).
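The tilde rule can be pictured with a small standalone sketch. `split_parser_like` is a hypothetical helper written for illustration only; it is not part of Fro and says nothing about how Fro implements the rule. It only shows the reading described above: a leading `~` marks the string as insignificant and is not part of the regular expression, while an escaped tilde remains part of the expression.

```python
import re


def split_parser_like(value):
    """Hypothetical helper (not Fro's API): return (regex, is_significant)."""
    if value.startswith("~"):
        return value[1:], False   # leading tilde: insignificant, tilde dropped
    return value, True            # anything else: significant, regex unchanged


print(split_parser_like(r"~,"))       # (',', False)
print(split_parser_like(r"[0-9]+"))   # ('[0-9]+', True)

# An escaped tilde stays in the regex, where it matches a literal "~".
regex, significant = split_parser_like(r"\~abc")
print(significant, re.fullmatch(regex, "~abc") is not None)  # True True
```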
-
alt(parser_values, name=None)
Returns a parser that is the alternation of the parsers in parser_values.
More specifically, the returned parser chomps by successively trying to chomp with the parsers in parser_values, producing the value produced by the first successful chomp, and failing if none of the parsers in parser_values chomps successfully.
Parameters:
- parser_values (Iterable[Union[Parser, str]]) – collection of parser values
- name (str) – name of the created parser
Returns: a parser that is the alternation of the parsers in parser_values
Return type: Parser
Example:
parser = fro.alt([r"a*b*c*", r"[0-9]{3}", fro.intp])
parser.parse_str("aac") # evaluates to "aac"
parser.parse_str("12") # evaluates to 12
parser.parse_str("235") # evaluates to "235"
parser.parse_str("abc123") # fails
parser.parse_str("") # evaluates to ""
parser.parse_str("1234") # fails
# The last one is tricky. When r"a*b*c*" tries to chomp "1234", it fails to chomp.
# Then, when r"[0-9]{3}" tries to chomp "1234", it chomps off "123", leaving behind
# "4". This is the first successful chomp, so this is what the variable parser chomps.
# However, since the variable parser did not chomp the entire string "1234", it fails
# to parse it.
-
chain(func, name=None)
Given a function func which maps one parser to another, returns a parser that is equivalent to a large number of successive calls to func.
Conceptually, the returned parser is equivalent to func(func(func(...))). During parsing, successive calls to func are made lazily on an as-needed basis.
Fro parsers parse top-down, so users of this function should take care to avoid left recursion. In general the parser func(parser) should consume input before delegating parsing to the parser argument.
Parameters:
- func (Callable[[Parser], Union[Parser, str]]) – function from Parser to parser value
- name – name for created parser
Returns: lazily-evaluated infinite parser
Return type: Parser
Example:
box = fro.BoxedValue(None)
def wrap(parser):
    openp = fro.rgx(r"[a-z]", name="open") | box.update_and_get
    closep = fro.thunk(lambda: box.get(), name="close")
    return fro.comp([~openp, parser.maybe(0), ~closep]) >> (lambda n: n + 1)
parser = fro.chain(wrap)
parser.parse_str("aa") # evaluates to 1
parser.parse_str("ab") # fails
parser.parse_str("aeiiea") # evaluates to 3
parser.parse_str("aeiie") # fails
-
comp(parser_values, sep=None, name=None)
Returns a parser that is the composition of the parsers in parser_values.
More specifically, the returned parser chomps by successively chomping with the parsers in parser_values, and produces a tuple of the values produced by parser_values. If sep is not None, then the returned parser will chomp with sep between each parser in parser_values (and discard the produced value).
Parameters:
- parser_values (Iterable[Union[Parser, str]]) – collection of parser values to compose
- sep (Union[Parser, str]) – separating parser to use between composition elements
- name (str) – name for the parser
Returns: a parser that is the composition of the parsers in parser_values
Return type: Parser
Example:
parser = fro.comp([r"ab?c+", r"~,", fro.intp])
parser.parse_str("abcc,4") # evaluates to ("abcc", 4)
parser.parse_str("ac,-1") # evaluates to ("ac", -1)
parser.parse_str("abc,0,") # fails
-
group_rgx(regex_string, name=None)
Returns a parser that consumes the regular expression regex_string, and produces a tuple of the groups of the corresponding match. Regular expressions should adhere to the syntax outlined in the re module. Also see the re module for a description of regular expression groups.
Parameters:
- regex_string (str) – regular expression
- name (str) – name for the parser
Returns: parser that consumes the regular expression regex_string, and produces a tuple of the groups of the corresponding match
Return type: Parser
Example:
parser = fro.group_rgx(r"(x*)(y*)(z*)")
parser.parse_str("xxz") # evaluates to ("xx", "", "z")
parser.parse_str("wxyz") # fails
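For readers unfamiliar with regular expression groups, the standard `re` module shows the tuple that the example above refers to. This sketch uses only `re`, not Fro; `re.fullmatch` stands in for the requirement that the whole string be consumed.

```python
import re

pattern = re.compile(r"(x*)(y*)(z*)")

# Each parenthesised part of the pattern is one group of the match.
match = pattern.fullmatch("xxz")
print(match.groups())              # ('xx', '', 'z')

# "wxyz" cannot be matched in full because of the leading "w".
print(pattern.fullmatch("wxyz"))   # None
```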
-
nested(open_regex_string, close_regex_string, reducer=<built-in method join of str object>, name=None)
Returns a Parser that parses well-nested sequences where the opening token is given by open_regex_string and the closing token is given by close_regex_string.
The parser passes an iterator containing the chunks of content between the first opening token and the final closing token into reducer, and produces the resulting value. The default behavior is to concatenate the chunks.
If there are overlapping opening and closing tokens, the token with the earliest start position wins, with ties going to opening tokens.
Parameters:
- open_regex_string (str) – regex for opening tokens
- close_regex_string (str) – regex for closing tokens
- reducer (Callable[[Iterable[str]], T]) – function from iterator of chunks to produced value
- name (str) – name for the parser
Returns: a parser that parses well-nested sequences
Return type: Parser
Example:
parser = fro.nested(r"\(", r"\)")
parser.parse_str("(hello (there))") # evaluates to "hello (there)"
parser.parse_str("(hello (there)") # fails, no closing ) for the first (
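The matching rule described above (track nesting depth, earliest token wins, ties go to the opening token) can be sketched with the standard `re` module. `nested_content` is a hypothetical function written for this illustration; it works on a single string rather than on chunks, assumes both regexes match only non-empty tokens, and is not Fro's implementation.

```python
import re


def nested_content(text, open_regex, close_regex):
    """Return the text between the first opening token and its matching
    closing token, or None if text is not one well-nested sequence."""
    open_re, close_re = re.compile(open_regex), re.compile(close_regex)
    first = open_re.match(text)
    if first is None:
        return None
    depth, start, pos = 1, first.end(), first.end()
    while True:
        next_open = open_re.search(text, pos)
        next_close = close_re.search(text, pos)
        if next_close is None:
            return None  # ran out of closing tokens
        # The earliest token wins; ties go to the opening token.
        if next_open is not None and next_open.start() <= next_close.start():
            depth += 1
            pos = next_open.end()
            continue
        depth -= 1
        if depth == 0:
            # The final closing token must also end the input.
            if next_close.end() != len(text):
                return None
            return text[start:next_close.start()]
        pos = next_close.end()


print(nested_content("(hello (there))", r"\(", r"\)"))  # hello (there)
print(nested_content("(hello (there)", r"\(", r"\)"))   # None
```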
-
rgx(regex_string, name=None)
Returns a parser that parses strings that match the given regular expression, and produces the string it consumed. The regular expressions should adhere to the syntax outlined in the re module.
Parameters:
- regex_string (str) – regex that parser should match
- name (str) – name for the parser
Returns: parser that parses strings that match the given regular expression
Return type: Parser
Example:
parser = fro.rgx(r"abc+")
parser.parse_str("abccc") # evaluates to "abccc"
parser.parse_str("abd") # fails
-
seq(parser_value, reducer=<class 'list'>, sep=None, name=None)
Returns a parser that parses sequences of the values parsed by parser_value.
More specifically, the returned parser repeatedly chomps with parser_value until it fails, passes an iterator of the produced values as the argument to reducer, and produces the resulting value. reducer defaults to producing a list of the produced values. If sep is not None, the returned parser chomps using sep between each parser_value chomp (and discards the produced value).
Parameters:
- parser_value (Union[Parser, str]) – parser-like value
- reducer (Callable[[Iterable[str]], T]) – function from iterator of produced values to the resulting value
- sep (Union[Parser, str]) – separating parser to use between adjacent sequence elements
- name (str) – name for the parser
Returns: a parser that parses sequences of the values parsed by parser_value
Return type: Parser
Example:
parser = fro.seq(fro.intp, sep=r",")
parser.parse_str("") # evaluates to []
parser.parse_str("1") # evaluates to [1]
parser.parse_str("1,2,3") # evaluates to [1, 2, 3]
parser.parse_str("1,2,3,") # fails
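Since a reducer is described as a function that receives an iterator of produced values, any callable that consumes an iterator should fit. The sketch below does not call Fro; `produced_values` is a hypothetical stand-in for the iterator a sequence parser would pass along, and `keep_even` is an illustrative custom reducer.

```python
def produced_values():
    """Stand-in for the iterator a sequence parser would hand to its reducer."""
    return iter([1, 2, 3])


def keep_even(values):
    """A custom reducer: keep only the even values, as a list."""
    return [value for value in values if value % 2 == 0]


print(list(produced_values()))       # [1, 2, 3]  (the documented default)
print(sum(produced_values()))        # 6
print(tuple(produced_values()))      # (1, 2, 3)
print(keep_even(produced_values()))  # [2]
```

Because the argument is an iterator, a reducer can only walk through it once; a reducer that needs several passes would first collect the values into a list.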
-
thunk(func, name=None)
Given a function func, which takes no arguments and produces a parser value, returns a parser that, when chomping, calls func() and chomps with the resulting parser. This function is primarily intended for creating parsers whose behavior depends on some sort of external state.
Parameters:
- func (Callable[[], Parser]) – function that takes no arguments and produces a parser value
- name (str) – name for the parser
Returns: a parser that parses with the parsers generated by func
Return type: Parser
Example:
regex_box = fro.BoxedValue(r"ab*")
parser = fro.thunk(lambda: regex_box.get(), name="Boxed regex")
parser.parse_str("abb") # evaluates to "abb"
parser.parse_str("aab") # fails
regex_box.update(r"cd*")
parser.parse_str("cdddd") # evaluates to "cdddd"
parser.parse_str("abb") # fails
-
tie(func, name=None)
Given a function func, which maps one parser to another parser, returns a cyclic parser whose structure matches the parsers returned by func.
Conceptually, what happens is:
stub = some_placeholder
result = func(stub)
... # in result, replace all references to stub to instead point back to result
The parser tie(func) is equivalent to chain(func), except that tie(func) is a cyclic parser, whereas chain(func) is a lazily-evaluated infinite parser. This difference is relevant only when the corresponding parsers depend on external state. In other cases, it is more memory-efficient to use tie(func).
Fro parsers parse top-down, so users of this function should take care to avoid left recursion. In general the parser func(parser) should consume input before delegating parsing to the parser argument.
Since parsers are immutable, the only way to create a self-referencing parser is via tie(..).
Parameters:
- func (Callable[[Parser], Parser]) – function for generating cyclic parser
- name (str) – name for the parser
Returns: a cyclic parser whose structure matches the parsers returned by func
Return type: Parser
Example:
def func(parser):
    return fro.comp([r"~\(", parser.maybe(0), r"~\)"]) >> (lambda n: n + 1)
parser = fro.tie(func)
parser.parse_str("(())") # evaluates to 2
parser.parse_str("(((())))") # evaluates to 4
parser.parse_str("((()") # fails
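The self-reference in the example above corresponds to an ordinary recursive function. The following sketch is plain Python, not Fro's implementation; `chomp_parens` and `parse_parens` are hypothetical names. It consumes an opening parenthesis before recursing, which is the "consume input before delegating" advice given above, and treats a failed inner attempt as the default value 0 in the way `maybe(0)` is described.

```python
def chomp_parens(text, pos=0):
    """Return (depth, new_pos) after chomping nested parentheses, or None."""
    if not text.startswith("(", pos):
        return None
    pos += 1                          # consume "(" before recursing
    depth = 0                         # default when the inner attempt fails
    inner = chomp_parens(text, pos)
    if inner is not None:
        depth, pos = inner
    if not text.startswith(")", pos):
        return None
    return depth + 1, pos + 1


def parse_parens(text):
    """Succeed only if the whole string is consumed."""
    result = chomp_parens(text)
    if result is None or result[1] != len(text):
        return None
    return result[0]


print(parse_parens("(())"))      # 2
print(parse_parens("(((())))"))  # 4
print(parse_parens("((()"))      # None
```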
-
until(regex_str, reducer=<function <lambda>>, name=None)
Returns a parser that consumes all input until it encounters a match to the given regular expression, or the end of the input.
The parser passes an iterator of the chunks it consumed to reducer, and produces the resulting value. By default, the parser produces None. The parser does not consume the match when parsing; it consumes only the input up to the match.
Parameters:
- regex_str (str) – regex until which the parser will consume
- reducer (Callable[[Iterable[str]], T]) – function from iterator of chunks to produced value
- name (str) – name for the parser
Returns: a parser that consumes all input until it encounters a match to regex_str or the end of the input
Return type: Parser
Example:
untilp = fro.until(r"a|b", reducer=lambda chunks: sum(len(chunk) for chunk in chunks), name="until a or b")
parser = fro.comp([untilp, r"apples"], name="composition")
parser.parse(["hello\n", "world\n", "apples"]) # evaluates to (12, "apples")
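The value 12 in the example above is the combined length of the two chunks that contain neither "a" nor "b". A standalone sketch of that behaviour, using only the `re` module, is shown below; `consume_until` is a hypothetical generator written for illustration and is not Fro's implementation.

```python
import re


def consume_until(chunks, regex_str):
    """Yield the text consumed from each chunk, stopping before the first match."""
    pattern = re.compile(regex_str)
    for chunk in chunks:
        match = pattern.search(chunk)
        if match is None:
            yield chunk                      # whole chunk consumed, keep going
            continue
        if match.start() > 0:
            yield chunk[:match.start()]      # consume only up to the match
        return                               # the match itself is left unconsumed


consumed = consume_until(["hello\n", "world\n", "apples"], r"a|b")
print(sum(len(piece) for piece in consumed))  # 12
```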
Built-in Parsers
For convenience, the Fro module provides several common parsers.
-
floatp
A parser that parses floating-point values from their string representations.
Type: Parser
-
intp
A parser that parses int values from their string representations.
Type: Parser
-
natp
A parser that parses non-negative integers (i.e. natural numbers) from their string representations.
Type: Parser
-
posintp
A parser that parses positive integers from their string representations.
Type: Parser
FroParseError
FroParseError exceptions are raised by the parse(..) family of methods upon parsing failures.
-
exception FroParseError
An exception for parsing failures.
-
__str__()
A human-readable description of the error. Includes both the error messages and extra information describing the location of the error. Equivalent to to_str().
Returns: a human-readable description
Return type: str
-
cause()
Returns the Exception that triggered this error, or None if this error was not triggered by another exception.
Returns: the exception that triggered this error
Return type: Exception
-
column(index_from=1)
Returns the column number where the error occurred, or more generally the index inside the chunk where the error occurred. Indices are indexed from index_from.
Parameters: index_from (int) – number to index column numbers by
Returns: column number of error
Return type: int
-
line(index_from=1)
Returns the line number where the error occurred, or more generally the index of the chunk where the error occurred. Indices are indexed from index_from.
Parameters: index_from (int) – number to index line numbers by
Returns: line number of error
Return type: int
-
messages()
A non-empty list of Message objects which describe the reasons for failure.
Returns: a non-empty list of Message objects which describe the reasons for failure
Return type: List[FroParseError.Message]
-
to_str(index_from=1, filename=None)
Returns a readable description of the error, with indices starting at index_from, and with the filename included if one is provided. Includes both the error messages and extra information describing the location of the error. This method is essentially a configurable version of __str__().
Parameters:
- index_from (int) – number to index column/line numbers by
- filename (str) – name of the file whose parse triggered the exception
Returns: a readable description of the error
Return type: str
-
-
class FroParseError.Message
Represents an error message describing a reason for failure.
-
__str__()
A string representation of the message that includes both the content and parser name.
-
content()
The content of the error message.
Returns: the content of the error message
Return type: str
-
name()
The name of the parser at which the message was generated, or None if all relevant parsers are unnamed.
Returns: name of parser where error occurred
Return type: str
-
BoxedValue
To facilitate creating parsers that depend on external state, the Fro module offers the
BoxedValue class. For an example of its usage, see Example 3: XML.
-
class BoxedValue(value)
An updatable boxed value.
-
__init__(value)
Initializes the box with value.
Parameters: value – value for box to hold
-
get()
Returns the box’s current value.
Returns: current value
-
get_and_update(value)
Updates the box’s value, and returns the previous value.
Parameters: value – updated value for box to hold
Returns: the previously held value
-
update(value)
Updates the box’s value.
Parameters: value – updated value for box to hold
-
update_and_get(value)
Updates the box’s value, and then returns the updated value.
Parameters: value – updated value for box to hold
Returns: the updated value
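The four methods differ only in what they return. The sketch below uses a stand-in class, `Box`, written for illustration; it is not Fro's source. It assumes that update_and_get returns the new value, as that method's description states.

```python
class Box:
    """Stand-in mirroring the documented BoxedValue methods (not Fro's source)."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def update(self, value):
        self._value = value

    def get_and_update(self, value):
        previous, self._value = self._value, value
        return previous               # the value held before the update

    def update_and_get(self, value):
        self._value = value
        return value                  # assumption: the newly stored value


box = Box("a")
print(box.get_and_update("b"))  # a
print(box.update_and_get("c"))  # c
print(box.get())                # c
```

A method such as update_and_get can be passed directly as a mapping function, which is how the chain(..) example earlier records the most recently matched opening token.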
-