Fro Interface¶
Parser¶
Parser objects are immutable. Therefore, many Parser methods return a new parser instead of modifying the called parser.
For explanations of terminology like “chomp”, “chunk”, and “significant”, see Fro 101.
-
class Parser¶
An immutable parser.
-
__invert__()¶
Returns a new Parser that is equivalent to self but is insignificant.
Returns: an insignificant copy of the called parser
Return type: Parser
Example:
    commap = fro.rgx(r",")
    composition = fro.comp([~fro.intp, ~commap, fro.intp]).get()
    composition.parse_str("2,3")  # evaluates to 3
-
__or__(func)¶
Returns a new Parser object that applies func to the values produced by self. The new parser has the same name and significance as self.
Parameters: func (Callable[[T], U]) – function applied to produced values
Returns: a new parser that maps produced values using func
Return type: Parser
Example:
    parser = fro.intp | (lambda x: x * x)
    parser.parse_str("4")  # evaluates to 16
-
__rshift__(func)¶
Returns a Parser object that unpacks the values produced by self and then applies func to them. Throws an error if the number of unpacked values does not match the number of arguments that func can take, or if the value produced by self is not unpackable. Equivalent to self | (lambda x: func(*x)).
The new parser has the same name and significance as self.
Parameters: func (Callable[?, U]) – function applied to unpacked produced values
Returns: a new parser that maps produced values using func
Return type: Parser
Example:
    parser = fro.comp([fro.intp, r"~,", fro.intp]) >> (lambda x, y: x + y)
    parser.parse_str("4,5")  # evaluates to 9
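The relationship between mapping (|) and unpack-then-map (>>) can be modeled in plain Python, without Fro. The helper names apply_mapped and apply_unpacked below are hypothetical and only illustrate how the produced value reaches func; they are not part of the Fro API. In this sketch an arity mismatch surfaces as Python's ordinary TypeError; how Fro itself reports the error is not shown here.

```python
def apply_mapped(produced, func):
    """Model of `parser | func`: the produced value is passed as one argument."""
    return func(produced)


def apply_unpacked(produced, func):
    """Model of `parser >> func`: the produced value is unpacked into arguments."""
    return func(*produced)


pair = (4, 5)  # stands in for the tuple produced by a composition

total = apply_unpacked(pair, lambda x, y: x + y)

# The same result expressed through plain mapping, i.e. `| (lambda x: func(*x))`
same_total = apply_mapped(pair, lambda x: (lambda a, b: a + b)(*x))

# Unpacking three values into a two-argument function raises TypeError
try:
    apply_unpacked((1, 2, 3), lambda x, y: x + y)
    arity_error = None
except TypeError as exc:
    arity_error = exc
```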
-
append(value)¶
Returns a parser that chomps with the called parser, then chomps with the Parser represented by value, and produces the value produced by the called parser. The returned parser has the same name and significance as self.
Parameters: value (Union[Parser,str]) – parser to “append” to self
Returns: value appended to self
Return type: Parser
-
get()¶
Returns a Parser object that retrieves the sole element of the value produced by self, and throws an error if self produces a non-iterable value or an iterable value that does not have exactly one element. Equivalent to self >> (lambda x: x).
Returns: parser that unpacks the sole produced value
Return type: Parser
Example:
    # Recall that comp(..) always produces a tuple, in this case a tuple with one value
    parser = fro.comp([r"~\(", fro.intp, r"~\)"]).get()
    parser.parse_str("(-3)")  # evaluates to -3
-
lstrip()¶
Returns a parser that is equivalent to self, but ignores and consumes any leading whitespace inside a single chunk. Equivalent to fro.comp([r"~\s*", self]).get(), but with the same name and significance as self.
Returns: a parser that ignores leading whitespace inside a single chunk
Return type: Parser
Example:
    parser = fro.rgx(r"[a-z]+").lstrip()
    # Will succeed, producing "hello". It's okay if there's no whitespace
    parser.parse_str("hello")
    # Will succeed, producing "world"
    parser.parse_str("\nworld")
    # Will succeed, producing "planet". Note that the leading whitespace is
    # confined to a single chunk (even though this chunk is different than
    # the chunk that "planet" appears in)
    parser.parse([" ", "planet"])
    # Will fail, leading whitespace is across multiple chunks
    parser.parse([" ", "\tgalaxy"])
-
lstrips()¶
Returns a parser that is equivalent to self, but ignores and consumes any leading whitespace across multiple chunks.
Returns: a parser that ignores leading whitespace
Return type: Parser
Example:
    parser = fro.rgx(r"[a-z]+").lstrips()
    # Will succeed, producing "hello". It's okay if there's no whitespace
    parser.parse_str("hello")
    # Will succeed, producing "world"
    parser.parse_str("\nworld")
    # Will succeed, producing "planet". Note that the leading whitespace is
    # confined to a single chunk (even though this chunk is different than
    # the chunk that "planet" appears in)
    parser.parse([" ", "planet"])
    # Will succeed, producing "galaxy". Unlike lstrip(), lstrips() can handle
    # whitespace across multiple chunks
    parser.parse([" ", "\r\r", "\tgalaxy"])
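The single-chunk versus multi-chunk distinction can be modeled with the standard re module. The sketch below is an assumption about the observable behavior described in the examples for lstrip() and lstrips(), not Fro's implementation; skip_ws_single, skip_ws_multi, and parse_word are hypothetical names, and a position is modeled as a (chunk index, column) pair.

```python
import re

WHITESPACE = re.compile(r"\s*")
WORD = re.compile(r"[a-z]+")


def skip_ws_single(chunks, i, j):
    """Skip whitespace inside chunk i only, starting at column j."""
    if i < len(chunks):
        j = WHITESPACE.match(chunks[i], j).end()
        if j == len(chunks[i]):
            # The chunk is exhausted, so continue at the start of the next one
            i, j = i + 1, 0
    return i, j


def skip_ws_multi(chunks, i, j):
    """Skip whitespace repeatedly, crossing as many chunk boundaries as needed."""
    while i < len(chunks):
        next_i, next_j = skip_ws_single(chunks, i, j)
        if (next_i, next_j) == (i, j):
            break  # no progress, so no whitespace is left to skip
        i, j = next_i, next_j
    return i, j


def parse_word(chunks, skip):
    """Skip leading whitespace with `skip`, then require a word to end the input."""
    i, j = skip(chunks, 0, 0)
    if i != len(chunks) - 1:
        return None
    match = WORD.match(chunks[i], j)
    if match is None or match.end() != len(chunks[i]):
        return None
    return match.group()
```

With skip_ws_single, whitespace that continues into a second chunk is left unconsumed, so the word match fails; skip_ws_multi keeps going until it stops making progress.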
-
maybe(default=None)¶
Returns a parser equivalent to self, but defaults to consuming none of the input string and producing default when self fails to chomp a string. See Fro 101 for an explanation of chomping.
Parameters: default (Any) – default value to produce
Returns: parser that defaults to consuming nothing and producing default instead of failing
Return type: Parser
Example:
    parser = fro.comp([fro.rgx(r"ab+").maybe("a"), fro.intp])
    parser.parse_str("abb3")  # evaluates to ("abb", 3)
    parser.parse_str("87")  # evaluates to ("a", 87)
-
name(name)¶
Returns a parser equivalent to self, but with the given name.
Parameters: name (str) – name for new parser
Returns: a parser identical to this, but with specified name
Return type: Parser
-
parse(lines, loud=True)¶
Parse an iterable collection of chunks. Returns the produced value, or throws a FroParseError explaining why the parse failed (or returns None if loud is False).
Parameters:
- lines (Iterable[str]) – iterable collection of chunks to parse
- loud (bool) – if parsing failures should result in an exception
Returns: value produced by parse
-
parse_file(filename, encoding='utf-8', loud=True)¶
Parse the contents of a file with the given filename, treating each line as a separate chunk. Returns the produced value, or throws a FroParseError explaining why the parse failed (or returns None if loud is False).
Parameters:
- filename – filename of file to parse
- encoding – encoding of the file to parse
- loud – if parsing failures should result in an exception
Returns: value produced by parse
-
parse_str(string_to_parse, loud=True)¶
Attempts to parse string_to_parse. Treats the entire string string_to_parse as a single chunk. Returns the produced value, or throws a FroParseError explaining why the parse failed (or returns None if loud is False).
Parameters:
- string_to_parse (str) – string to parse
- loud – if parsing failures should result in an exception
Returns: value produced by parse
-
prepend(value)¶
Returns a parser that chomps with the Parser represented by value, then chomps with self, and produces the value produced by self. The returned parser has the same name and significance as self.
Parameters: value (Union[Parser,str]) – parser to “prepend” to self
Returns: value prepended to self
Return type: Parser
-
rstrip()¶
Returns a parser that is equivalent to self, but ignores and consumes trailing whitespace inside a single chunk. Equivalent to fro.comp([self, r"~\s*"]).get(), but with the same name and significance as self.
Returns: parser that ignores trailing whitespace inside a single chunk
Return type: Parser
Example:
    parser = fro.rgx(r"[a-z]+").rstrip()
    # Will succeed, producing "hello". It's okay if there's no whitespace
    parser.parse_str("hello")
    # Will succeed, producing "world"
    parser.parse_str("world\n")
    # Will succeed, producing "planet". Note that the trailing whitespace is
    # confined to a single chunk (even though this chunk is different than
    # the chunk that "planet" appears in)
    parser.parse(["planet", " "])
    # Will fail, trailing whitespace is across multiple chunks
    parser.parse(["galaxy\t", "\r"])
-
rstrips()¶
Returns a parser that is equivalent to self, but ignores and consumes any trailing whitespace across multiple chunks.
Returns: parser that ignores trailing whitespace
Return type: Parser
Example:
    parser = fro.rgx(r"[a-z]+").rstrips()
    # Will succeed, producing "hello". It's okay if there's no whitespace
    parser.parse_str("hello")
    # Will succeed, producing "world"
    parser.parse_str("world\n")
    # Will succeed, producing "planet".
    parser.parse(["planet", " "])
    # Will succeed, producing "galaxy". Unlike rstrip(), rstrips() can handle
    # whitespace spread across multiple chunks
    parser.parse(["galaxy\n\n", " ", "\r\r"])
-
significant()¶
Returns a parser that is equivalent to self but is significant.
Returns: a significant copy of the called parser
Return type: Parser
-
strip()¶
Returns a parser that is equivalent to self, but ignores and consumes leading and trailing whitespace inside a single chunk. self.strip() is equivalent to self.lstrip().rstrip().
Returns: parser that ignores leading and trailing whitespace inside a single chunk
Return type: Parser
Example:
    parser = fro.rgx(r"[a-z]+").strip()
    # This will succeed, producing "abc". All whitespace is inside a single chunk.
    parser.parse([" abc \t"])
    # This will also succeed, producing "abc". All leading whitespace is inside
    # a single chunk, as is all trailing whitespace (even though those chunks
    # are different!)
    parser.parse(["\n\n", "abc \t"])
    # This will not succeed. Leading whitespace is spread across multiple chunks.
    parser.parse(["\n\n", "\n abc\t\r"])
-
strips()¶
Returns a parser object that is equivalent to self, but ignores and consumes leading and trailing whitespace, across chunk boundaries. self.strips() is equivalent to self.lstrips().rstrips().
Returns: parser that ignores leading and trailing whitespace
Return type: Parser
Example:
    parser = fro.rgx(r"[a-z]+").strips()
    # This will succeed, producing "abc". All whitespace is inside a single chunk.
    parser.parse([" abc \t"])
    # This will also succeed, producing "abc".
    parser.parse(["\n\n", "abc \t"])
    # This will succeed, producing "abc". Unlike strip(), strips() can handle
    # whitespace that spans multiple chunks.
    parser.parse(["\n\n", "\n abc\t\r"])
-
Constructing Parsers¶
Parser objects should not be instantiated directly; instead, Fro provides factory functions for constructing Parser instances.
Many of these factory functions accept “parser-like” values, or more commonly collections of “parser-like” values. A Parser object is a parser-like value, which corresponds to itself. A string s is also a parser-like value, and it corresponds to fro.rgx(s). The decision to automatically cast strings to regular expression parsers is primarily intended to make client code using the Fro module more concise.
To mark a “parser-like” regular expression as insignificant, prepend it with a tilde (~). If you actually want a regular expression that begins with a tilde, escape it (e.g. r"\~..."). This rule only applies to strings that are used as “parser-like” values. It does not apply in other contexts where regular expressions are used, such as the argument of rgx(..).
-
alt(parser_values, name=None)¶
Returns a parser that is the alternation of the parsers in parser_values.
More specifically, the returned parser chomps by successively trying to chomp with the parsers in parser_values, producing the value produced by the first successful chomp, and failing if none of the parsers in parser_values successfully chomps.
Parameters:
- parser_values (Iterable[Union[Parser, str]]) – collection of parser values
- name (str) – name of the created parser
Returns: a parser that is the alternation of the parsers in parser_values
Return type: Parser
Example:
    parser = fro.alt([r"a*b*c*", r"[0-9]{3}", fro.intp])
    parser.parse_str("aac")  # evaluates to "aac"
    parser.parse_str("12")  # evaluates to 12
    parser.parse_str("235")  # evaluates to "235"
    parser.parse_str("abc123")  # fails
    parser.parse_str("")  # evaluates to ""
    parser.parse_str("1234")  # fails
    # The last one is tricky. When r"a*b*c*" tries to chomp "1234", it fails to chomp.
    # Then, when r"[0-9]{3}" tries to chomp "1234", it chomps off "123", leaving behind
    # "4". This is the first successful chomp, so this is what the variable parser chomps.
    # However, since the variable parser did not chomp the entire string "1234", it fails
    # to parse it.
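The "first successful chomp wins" rule can be sketched in plain Python. This is a simplified model built on the standard re module, not Fro's implementation: a chomper here is assumed to be a function from (text, position) to either None or a (value, new position) pair, and rgx, mapped, alt, and parse_str are local stand-ins that only borrow the documented names. The patterns differ slightly from the example above so that none of them can match the empty string.

```python
import re


def rgx(pattern):
    """Chomper that matches `pattern` at the current position."""
    compiled = re.compile(pattern)

    def chomp(text, pos):
        match = compiled.match(text, pos)
        return None if match is None else (match.group(), match.end())

    return chomp


def mapped(chomper, func):
    """Chomper that applies `func` to the value produced by `chomper`."""
    def chomp(text, pos):
        result = chomper(text, pos)
        return None if result is None else (func(result[0]), result[1])

    return chomp


def alt(chompers):
    """Chomper that commits to the first alternative that chomps successfully."""
    def chomp(text, pos):
        for chomper in chompers:
            result = chomper(text, pos)
            if result is not None:
                return result  # later alternatives are never tried
        return None

    return chomp


def parse_str(chomper, text):
    """A parse succeeds only if the chomp consumes the entire text."""
    result = chomper(text, 0)
    if result is None or result[1] != len(text):
        return None
    return result[0]


parser = alt([rgx(r"[a-c]+"), rgx(r"[0-9]{3}"), mapped(rgx(r"-?[0-9]+"), int)])
```

In this model, parse_str(parser, "1234") returns None for the same reason given in the example: the three-digit alternative chomps "123" first, and the leftover "4" makes the overall parse fail even though the integer alternative could have consumed everything.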
-
chain(func, name=None)¶
Given a function func which maps one parser to another, returns a parser that is equivalent to a large number of successive calls to func.
Conceptually, the returned parser is equivalent to func(func(func(...))). During parsing, successive calls to func are made lazily on an as-needed basis.
Fro parsers parse top-down, so users of this function should take care to avoid left recursion. In general the parser func(parser) should consume input before delegating parsing to the parser argument.
Parameters:
- func (Callable[[Parser],Union[Parser,str]]) – function from Parser to parser value
- name – name for created parser
Returns: lazily-evaluated infinite parser
Return type: Parser
Example:
    box = fro.BoxedValue(None)
    def wrap(parser):
        openp = fro.rgx(r"[a-z]", name="open") | box.update_and_get
        closep = fro.thunk(lambda: box.get(), name="close")
        return fro.comp([~openp, parser.maybe(0), ~closep]) >> (lambda n: n + 1)
    parser = fro.chain(wrap)
    parser.parse_str("aa")  # evaluates to 1
    parser.parse_str("ab")  # fails
    parser.parse_str("aeiiea")  # evaluates to 3
    parser.parse_str("aeiie")  # fails
-
comp(parser_values, sep=None, name=None)¶
Returns a parser that is the composition of the parsers in parser_values.
More specifically, the returned parser chomps by successively chomping with the parsers in parser_values, and produces a tuple of the values produced by parser_values. If sep is not None, then the returned parser will chomp with sep between each parser in parser_values (and discard the produced value).
Parameters:
- parser_values (Iterable[Union[Parser,str]]) – collection of parser values to compose
- sep (Union[Parser,str]) – separating parser to use between composition elements
- name (str) – name for the parser
Returns: a parser that is the composition of the parsers in parser_values
Return type: Parser
Example:
    parser = fro.comp([r"ab?c+", r"~,", fro.intp])
    parser.parse_str("abcc,4")  # evaluates to ("abcc", 4)
    parser.parse_str("ac,-1")  # evaluates to ("ac", -1)
    parser.parse_str("abc,0,")  # fails
-
group_rgx(regex_string, name=None)¶
Returns a parser that consumes the regular expression regex_string, and produces a tuple of the groups of the corresponding match. Regular expressions should adhere to the syntax outlined in the re module. Also see the re module for a description of regular expression groups.
Parameters:
- regex_string (str) – regular expression
- name (str) – name for the parser
Returns: parser that consumes the regular expression regex_string, and produces a tuple of the groups of the corresponding match.
Return type: Parser
Example:
    parser = fro.group_rgx(r"(x*)(y*)(z*)")
    parser.parse_str("xxz")  # evaluates to ("xx", "", "z")
    parser.parse_str("wxyz")  # fails
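The tuple of groups corresponds to what the standard re module returns from Match.groups(). As a point of comparison, here is the same pattern used directly with re; match_groups is a hypothetical helper, and using fullmatch to model "the parse must consume the whole string" is an assumption of this sketch.

```python
import re

GROUPS = re.compile(r"(x*)(y*)(z*)")


def match_groups(text):
    """Return the tuple of groups if the whole text matches, else None."""
    match = GROUPS.fullmatch(text)
    return None if match is None else match.groups()
```

Note that a group which matches nothing still contributes an empty string to the tuple, which is why the middle element for "xxz" is "".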
-
nested(open_regex_string, close_regex_string, reducer=<built-in method join of str object>, name=None)¶
Returns a Parser that parses well-nested sequences where the opening token is given by open_regex_string and the closing token is given by close_regex_string.
The parser passes an iterator containing the chunks of content between the first opening token and final closing token into reducer, and produces the resulting value. The default behavior is to concatenate the chunks.
If there are overlapping opening and closing tokens, the token with the earliest start position wins, with ties going to opening tokens.
Parameters:
- open_regex_string (str) – regex for opening tokens
- close_regex_string (str) – regex for closing tokens
- reducer (Callable[[Iterable[str]], T]) – function from iterator of chunks to produced value
- name (str) – name for the parser
Returns: a parser that parses well-nested sequences
Return type: Parser
Example:
    parser = fro.nested(r"\(", r"\)")
    parser.parse_str("(hello (there))")  # evaluates to "hello (there)"
    parser.parse_str("(hello (there)")  # fails, no closing ) for the first (
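Well-nested matching can be modeled with a depth counter over a single string. The function below is a rough sketch of the idea, not Fro's implementation: nested_content is a hypothetical name, it works on one string rather than on chunks, it always concatenates (there is no reducer), and it assumes that neither regex can match the empty string. The tie-breaking rule follows the description above: the earliest token wins, and ties go to the opening token.

```python
import re


def nested_content(text, open_regex=r"\(", close_regex=r"\)"):
    """Return the content between the outermost tokens, or None on failure."""
    open_re = re.compile(open_regex)
    close_re = re.compile(close_regex)

    first = open_re.match(text)
    if first is None:
        return None

    start = pos = first.end()
    depth = 1
    while depth > 0:
        next_open = open_re.search(text, pos)
        next_close = close_re.search(text, pos)
        if next_close is None:
            return None  # an opening token was never closed
        if next_open is not None and next_open.start() <= next_close.start():
            depth += 1
            pos = next_open.end()
        else:
            depth -= 1
            if depth == 0:
                # Succeed only if the final closing token ends the input
                if next_close.end() != len(text):
                    return None
                return text[start:next_close.start()]
            pos = next_close.end()
    return None
```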
-
rgx(regex_string, name=None)¶
Returns a parser that parses strings that match the given regular expression, and produces the string it consumed. The regular expressions should adhere to the syntax outlined in the re module.
Parameters:
- regex_string (str) – regex that parser should match
- name (str) – name for the parser
Returns: parser that parses strings that match the given regular expression
Return type: Parser
Example:
    parser = fro.rgx(r"abc+")
    parser.parse_str("abccc")  # evaluates to "abccc"
    parser.parse_str("abd")  # fails
-
seq(parser_value, reducer=<class 'list'>, sep=None, name=None)¶
Returns a parser that parses sequences of the values parsed by parser_value.
More specifically, the returned parser repeatedly chomps with parser_value until it fails, passes an iterator of the produced values as argument to reducer, and produces the resulting value. reducer defaults to producing a list of the produced values. If sep is not None, the returned parser chomps using sep between each parser_value chomp (and discards the produced value).
Parameters:
- parser_value (Union[Parser,str]) – Parser-like value
- reducer (Callable[[Iterable[str]],T]) – function from iterator of produced values to produced value
- sep (Union[Parser,str]) – separating parser to use between adjacent sequence elements
- name (str) – name for the parser
Returns: a parser that parses sequences of the values parsed by parser_value
Return type: Parser
Example:
    parser = fro.seq(fro.intp, sep=r",")
    parser.parse_str("")  # evaluates to []
    parser.parse_str("1")  # evaluates to [1]
    parser.parse_str("1,2,3")  # evaluates to [1, 2, 3]
    parser.parse_str("1,2,3,")  # fails
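The separator behavior, including why a trailing separator fails, can be illustrated with a small stand-alone function. This is only a sketch for the integer case: seq_ints is a hypothetical name, the integer regex is an assumption, and the separator is treated as a literal string rather than a parser.

```python
import re

INTEGER = re.compile(r"-?[0-9]+")


def seq_ints(text, sep=","):
    """Parse `text` as `sep`-separated integers; return a list, or None on failure."""
    values = []
    pos = 0
    match = INTEGER.match(text, pos)
    while match is not None:
        values.append(int(match.group()))
        pos = match.end()  # position after the last complete element
        if not text.startswith(sep, pos):
            break
        # A separator is only accepted if another element follows it
        match = INTEGER.match(text, pos + len(sep))
    # Leftover input, such as a trailing separator, makes the parse fail
    return values if pos == len(text) else None
```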
-
thunk(func, name=None)¶
Given a function func, which takes no arguments and produces a parser value, returns a parser that, when chomping, calls func() and chomps with the resulting parser. This function is primarily intended for creating parsers whose behavior is dependent on some sort of external state.
Parameters:
- func (Callable[[],Parser]) – function that takes no arguments and produces a parser value
- name (str) – name for the parser
Returns: a parser that parses with the parsers generated by func
Return type: Parser
Example:
    regex_box = fro.BoxedValue(r"ab*")
    parser = fro.thunk(lambda: regex_box.get(), name="Boxed regex")
    parser.parse_str("abb")  # evaluates to "abb"
    parser.parse_str("aab")  # fails
    regex_box.update(r"cd*")
    parser.parse_str("cdddd")  # evaluates to "cdddd"
    parser.parse_str("abb")  # fails
-
tie(func, name=None)¶
Given a function func, which maps one parser to another parser, returns a cyclic parser whose structure matches the parsers returned by func.
Conceptually, what happens is:
    stub = some_placeholder
    result = func(stub)
    ...  # in result, replace all references to stub to instead point back to result
The parser tie(func) is equivalent to chain(func), except that tie(func) is a cyclic parser, whereas chain(func) is a lazily-evaluated infinite parser. This difference is relevant only when the corresponding parsers are dependent on external state. In other cases, it is more memory-efficient to use tie(func).
Fro parsers parse top-down, so users of this function should take care to avoid left recursion. In general the parser func(parser) should consume input before delegating parsing to the parser argument.
Since parsers are immutable, the only way to create a self-referencing parser is via tie(..).
Parameters:
- func (Callable[[Parser],Parser]) – function for generating cyclic parser
- name (str) – name for the parser
Returns: a cyclic parser whose structure matches the parsers returned by func
Return type: Parser
Example:
    def func(parser):
        return fro.comp([r"~\(", parser.maybe(0), r"~\)"]) >> (lambda n: n + 1)
    parser = fro.tie(func)
    parser.parse_str("(())")  # evaluates to 2
    parser.parse_str("(((())))")  # evaluates to 4
    parser.parse_str("((()")  # fails
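The "placeholder that points back to the result" idea can be sketched in plain Python. This is one possible way to tie the knot, assumed for illustration and not taken from Fro's source: a chomper is modeled as a callable from (text, position) to None or (value, new position), and Stub, tie, and nest are local, hypothetical definitions.

```python
class Stub:
    """Placeholder chomper that forwards to a target assigned later."""

    def __init__(self):
        self.target = None

    def __call__(self, text, pos):
        return self.target(text, pos)


def tie(func):
    """Build func(stub), then point the stub back at the result."""
    stub = Stub()
    stub.target = func(stub)
    return stub.target


def nest(inner):
    """Chomper for "(" + optional inner + ")", producing the nesting depth."""
    def chomp(text, pos):
        # Consume input first, then delegate: this avoids left recursion
        if not text.startswith("(", pos):
            return None
        result = inner(text, pos + 1)
        if result is None:
            count, after = 0, pos + 1  # like maybe(0): consume nothing
        else:
            count, after = result
        if not text.startswith(")", after):
            return None
        return count + 1, after + 1

    return chomp


depth = tie(nest)
```

Only one nest chomper is ever created; every level of recursion goes through the same stub, which is the memory advantage over unrolling func repeatedly.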
-
until(regex_str, reducer=<function <lambda>>, name=None)¶
Returns a parser that consumes all input until it encounters a match to the given regular expression, or the end of the input.
The parser passes an iterator of the chunks it consumed to reducer, and produces the resulting value. By default, the parser produces None. The parser does not consume the match when parsing, but only everything up until the match.
Parameters:
- regex_str (str) – regex until which the parser will consume
- reducer (Callable[[Iterable[str]],T]) – function from iterator of chunks to produced value
- name (str) – name for the parser
Returns: a parser that consumes all input until it encounters a match to regex_str or the end of the input
Return type: Parser
Example:
    untilp = fro.until(r"a|b", reducer=lambda chunks: sum(len(chunk) for chunk in chunks), name="until a or b")
    parser = fro.comp([untilp, r"apples"], name="composition")
    parser.parse(["hello\n", "world\n", "apples"])  # evaluates to (12, "apples")
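The consume-up-to-but-not-including behavior can be modeled over a list of chunks with the standard re module. In this sketch consume_until is a hypothetical helper, and returning the stop position as a (chunk index, column) pair is an assumption made for illustration; Fro's internal bookkeeping may differ.

```python
import re


def consume_until(chunks, regex_str):
    """Return (consumed pieces, stop position) for the first match of regex_str."""
    pattern = re.compile(regex_str)
    consumed = []
    for index, chunk in enumerate(chunks):
        match = pattern.search(chunk)
        if match is None:
            consumed.append(chunk)  # the whole chunk is consumed
            continue
        if match.start() > 0:
            consumed.append(chunk[:match.start()])
        # Stop in front of the match; the match itself is not consumed
        return consumed, (index, match.start())
    # No match anywhere: everything up to the end of the input is consumed
    return consumed, (len(chunks), 0)


pieces, stop = consume_until(["hello\n", "world\n", "apples"], r"a|b")
total_length = sum(len(piece) for piece in pieces)
```

Here the reducer from the example above (summing chunk lengths) is applied by hand to the consumed pieces, and the stop position marks where a following parser such as r"apples" would begin.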
Built-in Parsers¶
For convenience, the Fro module provides several common parsers.
-
floatp¶
A parser that parses floating-point values from their string representations.
Type: Parser
-
intp¶
A parser that parses int values from their string representations.
Type: Parser
-
natp¶
A parser that parses non-negative integers (i.e. natural numbers) from their string representations.
Type: Parser
-
posintp¶
A parser that parses positive integers from their string representations.
Type: Parser
FroParseError¶
FroParseError exceptions are raised by the parse(..) family of methods upon parsing failures.
-
exception FroParseError¶
An exception for parsing failures.
-
__str__()¶
A human-readable description of the error. Includes both the error messages and extra information describing the location of the error. Equivalent to to_str().
Returns: a human-readable description
Return type: str
-
cause()¶
Returns the Exception that triggered this error, or None if this error was not triggered by another exception.
Returns: the exception that triggered this error
Return type: Exception
-
column(index_from=1)¶
Returns the column number where the error occurred, or more generally the index inside the chunk where the error occurred. Indices are indexed from index_from.
Parameters: index_from (int) – number to index column numbers by
Returns: column number of error
Return type: int
-
line(index_from=1)¶
Returns the line number where the error occurred, or more generally the index of the chunk where the error occurred. Indices are indexed from index_from.
Parameters: index_from (int) – number to index line numbers by
Returns: line number of error
Return type: int
-
messages()¶
A non-empty list of Message objects which describe the reasons for failure.
Returns: a non-empty list of Message objects which describe the reasons for failure
Return type: List[FroParseError.Message]
-
to_str(index_from=1, filename=None)¶
Returns a readable description of the error, with indices starting at index_from, and with filename included if a filename is provided. Includes both the error messages and extra information describing the location of the error. This method is essentially a configurable version of __str__().
Parameters:
- index_from (int) – number to index column/line numbers by
- filename (str) – name of file whose parse triggered the exception
Returns: a readable description of the error
Return type: str
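The effect of index_from on line(), column(), and to_str() amounts to shifting zero-based positions by a chosen offset. The function below illustrates only that arithmetic: describe_location is a hypothetical helper, and the output format is invented for this sketch, so it should not be read as the text that to_str() actually produces.

```python
def describe_location(chunk_index, char_index, index_from=1, filename=None):
    """Format a zero-based (chunk, character) position using the chosen base."""
    line = chunk_index + index_from
    column = char_index + index_from
    prefix = "" if filename is None else f"{filename}: "
    return f"{prefix}line {line}, column {column}"
```

With the default index_from=1 the first character of the first chunk is reported as line 1, column 1, which matches what most text editors display; index_from=0 gives positions that can be used directly as Python indices.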
-
-
class FroParseError.Message¶
Represents an error message describing a reason for failure.
-
__str__()¶
A string representation of the message that includes both the content and parser name.
-
content()¶
The content of the error message.
Returns: the content of the error message
Return type: str
-
name()¶
The name of the parser at which the message was generated, or None if all relevant parsers are unnamed.
Returns: name of parser where error occurred
Return type: str
-
BoxedValue¶
To facilitate creating parsers that depend on external state, the Fro module offers the BoxedValue class. For an example of its usage, see Example 3: XML.
-
class BoxedValue(value)¶
An updatable boxed value.
-
__init__(value)¶
Initialize the box with value.
Parameters: value – value for box to hold
-
get()¶
Return the box’s current value.
Returns: current value
-
get_and_update(value)¶
Updates the box’s value, and returns the previous value.
Parameters: value – updated value for box to hold
Returns: the previously held value
-
update(value)¶
Update the box’s value.
Parameters: value – updated value for box to hold
-
update_and_get(value)¶
Update the box’s value, and then return the updated value.
Parameters: value – updated value for box to hold
Returns: the updated value
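The four operations above describe a small mutable cell. A minimal stand-in with the same method names might look like the following; the class name Box is hypothetical, and this is a sketch of the documented behavior rather than Fro's source. The matches function imitates the deferred lookup that thunk(..) performs: the regex is read from the box at match time, so later updates change the result.

```python
import re


class Box:
    """A mutable cell mirroring the documented BoxedValue methods."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def update(self, value):
        self._value = value

    def get_and_update(self, value):
        previous, self._value = self._value, value
        return previous

    def update_and_get(self, value):
        self._value = value
        return value


regex_box = Box(r"ab*")


def matches(text):
    """Look up the regex in the box at call time, then match the whole text."""
    return re.fullmatch(regex_box.get(), text) is not None


before_update = (matches("abb"), matches("cdddd"))
previous = regex_box.get_and_update(r"cd*")
after_update = (matches("abb"), matches("cdddd"))
```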
-