Examples

These examples follow the convention of appending p to a variable's name to denote that the variable is a parser.

Example 1: Email Addresses

Suppose you are given a text file where each line is of the form:

<first name> <last name> : <email address>

Unfortunately, the file is rather sloppy: sometimes there is no whitespace between the last name and the colon, and other times there are multiple spaces. There are some blank lines, and some lines contain multiple entries. You need to generate an EmailDirectory object from this file:

class EmailAddress(object):
    def __init__(self, local_name, domain):
        self.local_name = local_name
        self.domain = domain

    def __str__(self):
        return "{loc}@{dom}".format(loc=self.local_name, dom=self.domain)


class Name(object):
    def __init__(self, first, last):
        self.first = first
        self.last = last

    def __str__(self):
        return "{f} {l}".format(f=self.first, l=self.last)


class EmailDirectory(object):
    def __init__(self, entries):  # entries: (Name, EmailAddress) iterator
        self.entries = dict(entries) # dict from Name to EmailAddress

    def __str__(self):
        entry_strings = ("{n} : {a}".format(n=name, a=addr)
                         for (name, addr) in self.entries.items())
        return "\n".join(entry_strings)

Using Fro, you can write code for parsing the text file in under ten lines:

# email address ::= <local name>@<domain>, where <domain> is <something>.<something>
emailaddrp = fro.comp([r"[\w]+", r"~@", r"[\w]+\.[\w]+"], name="email address")\
                 .strips() >> EmailAddress
firstnamep = fro.rgx(r"[\w]+", name="first name").strips()
lastnamep = firstnamep.name("last name")
namep = fro.comp([firstnamep, lastnamep], name="full name") >> Name
# entry ::= <full name> : <email address>
entryp = fro.comp([namep, r"~:", emailaddrp], name="directory entry")
emaildirp = fro.seq(entryp, name="directory") | EmailDirectory

emaildirp.parse_file("input.txt")

Not only will this code exhibit the desired behavior, it will raise informative error messages when it encounters misformatted input.
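
To make the input format concrete, here is a minimal sketch that extracts the same four fields using only the standard library's re module. It is not how Fro works internally, and the sample text, the ENTRY pattern, and the find_entries helper are all hypothetical names introduced for illustration. Unlike the Fro parser, this sketch silently skips anything it cannot match instead of reporting an error.

```python
import re

# Hypothetical sample input: missing spaces, extra spaces, a blank line,
# and one line holding two entries.
SAMPLE = """\
Jane Doe : jane@example.org
John Smith:john@example.org

Mary Major   :   mary@example.org Richard Roe : richard@example.org
"""

# <first name> <last name> : <local name>@<domain>
ENTRY = re.compile(r"(\w+)\s+(\w+)\s*:\s*(\w+)@(\w+\.\w+)")


def find_entries(text):
    """Return a list of (first, last, local_name, domain) tuples."""
    return [m.groups() for m in ENTRY.finditer(text)]


entries = find_entries(SAMPLE)
for first, last, local_name, domain in entries:
    print("{} {} : {}@{}".format(first, last, local_name, domain))
```

Each tuple carries the arguments that Name and EmailAddress expect, so the sloppy spacing and the doubled-up line pose no problem once whitespace is treated as optional around the colon.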

Example 2: TeX

Suppose you want to parse a simplified TeX language. Each document consists of elements delimited by whitespace. Each element is either a command, of the form \commandname or \commandname{argument}, or raw text. You want to parse a (simplified) TeX file into a TexDocument:

class TexElement(object):
    pass  # "abstract" parent class


class TexCommand(TexElement):
    def __init__(self, name, argument=None):
        self.name = name
        self.argument = argument

    def __str__(self):
        if self.argument is None:
            return "\\%s" % self.name
        return "\\%s{%s}" % (self.name, self.argument)


class TexText(TexElement):
    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


class TexDocument(object):
    def __init__(self, elements):
        self.elements = elements

    def __str__(self):
        return " ".join(str(e) for e in self.elements)

With Fro, you can do this quickly and painlessly:

# parser for TexText objects: a run of characters that are neither
# backslashes nor whitespace
textp = fro.rgx(r"[^\\\s]+", name="TeX text") | TexText

# parser for TexCommand objects
namep = fro.group_rgx(r"\\([^\{\s]+)").get()
argumentp = fro.group_rgx(r"\{([^\}]*?)\}").get().maybe()
commandp = fro.comp([namep, argumentp], name="TeX command") >> TexCommand

# parser for TexDocument objects
elementp = fro.alt([commandp, textp]).strips()
documentp = fro.seq(elementp, name="TeX document") | TexDocument

documentp.parse_file("input.txt")
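
As a rough illustration of the grammar described above, the following sketch tokenizes the simplified TeX language with the standard library's re module alone. It is an approximation written for this example, not Fro's implementation; TOKEN and tokenize are hypothetical names. Note that finditer skips over characters it cannot match, whereas a parser would be expected to report them.

```python
import re

# Either a command (\name with an optional {argument}) or a run of raw text.
TOKEN = re.compile(r"\\([^\{\s]+)(?:\{([^\}]*?)\})?|([^\\\s]+)")


def tokenize(source):
    """Split simplified TeX into ("command", name, argument) and ("text", text) tuples."""
    tokens = []
    for m in TOKEN.finditer(source):
        name, argument, text = m.groups()
        if name is not None:
            # argument is None when the command has no {argument} part
            tokens.append(("command", name, argument))
        else:
            tokens.append(("text", text))
    return tokens


tokens = tokenize(r"\section{Intro} this is \bf text")
for token in tokens:
    print(token)
```

The command alternative comes first in the pattern, mirroring the order in which an element is tried as a command before it is tried as raw text.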

Example 3: XML

Disclaimer: This is a more advanced example intended to highlight the powerful and expressive offerings of the Fro library. It is not intended to be a complete XML parser (it doesn’t support comments, for instance). If you actually need to parse an XML file, you would be better off using a specialized library.
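
For comparison, here is a minimal sketch of the specialized-library route using xml.etree.ElementTree from the Python standard library. The sample document is hypothetical. ElementTree elements expose a tag, the text inside the element, the child elements, and the tail text that follows the element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document
root = ET.fromstring("<a>hi<b>x</b>t<c/></a>")

for node in root.iter():
    children = [child.tag for child in node]
    print(node.tag, repr(node.text), children, repr(node.tail))
```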

Suppose you have a large XML file, and you want to parse it into a hierarchy of XmlNode objects, shown below:

class XmlNode(object):
    def __init__(self, tag, text, children, tail):
        self._tag = tag
        self._text = text  # text appearing inside the node
        self._children = children  # list of XmlNodes
        self._tail = tail  # text appearing immediately after the node

With Fro, you can quickly write a declarative solution, and you get features such as informative error messages for free:

import re  # used by close_of_tag below

# Recognizes <open> tags, producing the tag name
open_tagp = fro.rgx(r"\w+", name="open tag").prepend(r"<").append(r">")


def close_of_tag(tag_name):
    # Regex for the </close> tag of a given tag name
    return re.escape("</{}>".format(tag_name))


def xml_node_parser(recursive_parser):
    tag = fro.BoxedValue(None)  # stores the tag name of the current XML node
    boxed_open_tagp = open_tagp.lstrips() | tag.update_and_get
    textp = fro.until(r"<", reducer="".join, name="text")
    childrenp = fro.seq(recursive_parser)
    tailp = textp.name("tail")
    boxed_close_tagp = fro.thunk(lambda: close_of_tag(tag.get()), name="close_tag")
    return fro.comp([boxed_open_tagp, textp,
                     childrenp, ~boxed_close_tagp, tailp]) >> XmlNode


xmlp = fro.chain(xml_node_parser)
xmlp.parse_file("input.xml")
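
The key idea in the code above is that the close tag to expect is only known after the open tag has been read. The sketch below shows the same idea as a hand-written recursive descent parser using only the standard library. It is an illustration under simplifying assumptions (no attributes, no comments, no self-closing tags), not Fro's implementation; OPEN_TAG, TEXT, and parse_node are hypothetical names.

```python
import re


class XmlNode(object):
    def __init__(self, tag, text, children, tail):
        self._tag = tag
        self._text = text
        self._children = children
        self._tail = tail


OPEN_TAG = re.compile(r"\s*<(\w+)>")  # open tag, with leading whitespace stripped
TEXT = re.compile(r"[^<]*")           # everything up to the next "<"


def parse_node(source, pos=0):
    """Parse one node at pos; return (node, new_pos), or None if no open tag is there."""
    m = OPEN_TAG.match(source, pos)
    if m is None:
        return None
    tag = m.group(1)  # remembered so the matching close tag can be checked later
    text_match = TEXT.match(source, m.end())
    pos = text_match.end()

    children = []
    while True:
        result = parse_node(source, pos)
        if result is None:
            break
        child, pos = result
        children.append(child)

    close_tag = "</{}>".format(tag)
    if not source.startswith(close_tag, pos):
        raise ValueError("expected {} at offset {}".format(close_tag, pos))
    tail_match = TEXT.match(source, pos + len(close_tag))
    node = XmlNode(tag, text_match.group(0), children, tail_match.group(0))
    return node, tail_match.end()


root, end = parse_node("<a>hi<b>x</b>t<c></c></a>")
print(root._tag, root._text, [child._tag for child in root._children])
```

Here the local variable tag plays the role that the boxed value plays in the Fro version: it carries the tag name from the point where the open tag is read to the point where the close tag is checked.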