
Class ParserElement


object --+
         |
        ParserElement

Abstract base level parser element class.

Nested Classes
  literalStringClass
Token to exactly match a specified string.
Instance Methods
 
__init__(self, savelist=False)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
 
copy(self)
Make a copy of this ParserElement.
 
setName(self, name)
Define name for this expression, for use in debugging.
 
setResultsName(self, name, listAllMatches=False)
Define name for referencing matching tokens as a nested attribute of the returned parse results.
 
setBreak(self, breakFlag=True)
Method to invoke the Python pdb debugger when this element is about to be parsed.
 
setParseAction(self, *fns, **kwargs)
Define action to perform when successfully matching parse element definition.
 
addParseAction(self, *fns, **kwargs)
Add parse action to expression's list of parse actions.
 
setFailAction(self, fn)
Define action to perform if parsing fails at this expression.
 
preParse(self, instring, loc)
 
parseImpl(self, instring, loc, doActions=True)
 
postParse(self, instring, loc, tokenlist)
 
tryParse(self, instring, loc)
 
parseString(self, instring, parseAll=False)
Execute the parse expression with the given string.
 
scanString(self, instring, maxMatches=2147483647, overlap=False)
Scan the input string for expression matches.
 
transformString(self, instring)
Extension to scanString, to modify matching text with modified tokens that may be returned from a parse action.
 
searchString(self, instring, maxMatches=2147483647)
Another extension to scanString, simplifying the access to the tokens found to match the given parse expression.
 
__add__(self, other)
Implementation of + operator - returns And
 
__radd__(self, other)
Implementation of + operator when left operand is not a ParserElement
 
__sub__(self, other)
Implementation of - operator, returns And with error stop
 
__rsub__(self, other)
Implementation of - operator when left operand is not a ParserElement
 
__mul__(self, other)
Implementation of * operator, allows use of expr * 3 in place of expr + expr + expr.
 
__rmul__(self, other)
 
__or__(self, other)
Implementation of | operator - returns MatchFirst
 
__ror__(self, other)
Implementation of | operator when left operand is not a ParserElement
 
__xor__(self, other)
Implementation of ^ operator - returns Or
 
__rxor__(self, other)
Implementation of ^ operator when left operand is not a ParserElement
 
__and__(self, other)
Implementation of & operator - returns Each
 
__rand__(self, other)
Implementation of & operator when left operand is not a ParserElement
 
__invert__(self)
Implementation of ~ operator - returns NotAny
 
__call__(self, name=None)
Shortcut for setResultsName, with listAllMatches left at its default value.
 
suppress(self)
Suppresses the output of this ParserElement; useful to keep punctuation from cluttering up returned output.
 
leaveWhitespace(self)
Disables the skipping of whitespace before matching the characters in the ParserElement's defined pattern.
 
setWhitespaceChars(self, chars)
Overrides the default whitespace chars
 
parseWithTabs(self)
Overrides default behavior to expand <TAB>s to spaces before parsing the input string.
 
ignore(self, other)
Define expression to be ignored (e.g., comments) while doing pattern matching; may be called repeatedly, to define multiple comment or other ignorable patterns.
 
setDebugActions(self, startAction, successAction, exceptionAction)
Enable display of debugging messages while doing pattern matching.
 
setDebug(self, flag=True)
Enable display of debugging messages while doing pattern matching.
 
__str__(self)
str(x)
 
__repr__(self)
repr(x)
 
streamline(self)
 
checkRecursion(self, parseElementList)
 
validate(self, validateTrace=[])
Check defined expressions for valid structure, check for infinite recursive definitions.
 
parseFile(self, file_or_filename, parseAll=False)
Execute the parse expression on the given file or filename.
 
__eq__(self, other)
 
__ne__(self, other)
 
__hash__(self)
hash(x)
 
__req__(self, other)
 
__rne__(self, other)

Inherited from object: __delattr__, __format__, __getattribute__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

Static Methods
 
setDefaultWhitespaceChars(chars)
Overrides the default whitespace chars
 
inlineLiteralsUsing(cls)
Set class to be used for inclusion of string literals into a parser.
 
resetCache()
 
enablePackrat()
Enables "packrat" parsing, which adds memoizing to the parsing logic.
Class Variables
  DEFAULT_WHITE_CHARS = ' \n\t\r'
  verbose_stacktrace = False
Properties

Inherited from object: __class__

Method Details

__init__(self, savelist=False)
(Constructor)


x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Overrides: object.__init__
(inherited documentation)

copy(self)


Make a copy of this ParserElement. Useful for defining different parse actions for the same parsing pattern, using copies of the original parse element.
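
A minimal sketch of that use case, assuming a simple integer pattern (the names integer, as_int and as_float are illustrative only). Each copy gets its own parse action while the original expression stays untouched:

```python
from pyparsing import Word, nums

integer = Word(nums)

# Two copies of the same pattern, each with its own parse action.
as_int = integer.copy().setParseAction(lambda toks: int(toks[0]))
as_float = integer.copy().setParseAction(lambda toks: float(toks[0]))

int_result = as_int.parseString("42")[0]
float_result = as_float.parseString("42")[0]
plain_result = integer.parseString("42")[0]   # original still yields a string
```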

setResultsName(self, name, listAllMatches=False)


Define name for referencing matching tokens as a nested attribute of the returned parse results. NOTE: this returns a *copy* of the original ParserElement object; this is so that the client can define a basic element, such as an integer, and reference it in multiple places with different names.

You can also set results names using the abbreviated syntax, expr("name") in place of expr.setResultsName("name") - see __call__.
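
As an illustration, the sketch below reuses one basic element under three different names; the date grammar and the names year, month and day are assumptions made for the example. Because each call returns a copy, the original integer element keeps no results name:

```python
from pyparsing import Word, nums

integer = Word(nums)

# The same element referenced three times, each with a different name.
date = (integer.setResultsName("year") + "/"
        + integer.setResultsName("month") + "/"
        + integer("day"))                      # abbreviated syntax

result = date.parseString("1999/12/31")
year, month, day = result.year, result["month"], result.day
```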

setBreak(self, breakFlag=True)


Method to invoke the Python pdb debugger when this element is about to be parsed. Set breakFlag to True to enable, False to disable.

setParseAction(self, *fns, **kwargs)


Define action to perform when successfully matching parse element definition. Parse action fn is a callable method with 0-3 arguments, called as fn(s,loc,toks), fn(loc,toks), fn(toks), or just fn(), where:

  • s = the original string being parsed (see note below)
  • loc = the location of the matching substring
  • toks = a list of the matched tokens, packaged as a ParseResults object

If the functions in fns modify the tokens, they can return them as the return value from fn, and the modified list of tokens will replace the original. Otherwise, fn does not need to return any value.

Note: the default parsing behavior is to expand tabs in the input string before starting the parsing process. See parseString for more information on parsing strings containing <TAB>s, and suggested methods to maintain a consistent view of the parsed string, the parse location, and line and column positions within the parsed string.
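
The following sketch shows two of the accepted signatures side by side. The helper names record and to_int are hypothetical; record uses the full (s, loc, toks) form and returns nothing, while to_int takes only toks and returns a replacement token list:

```python
from pyparsing import Word, nums

seen = []

def record(s, loc, toks):
    # Full signature; returning None leaves the tokens unchanged.
    seen.append((loc, toks[0]))

def to_int(toks):
    # Short signature; the returned list replaces the matched tokens.
    return [int(toks[0])]

number = Word(nums).setParseAction(record).addParseAction(to_int)
result = number.parseString("  123")
```

Here loc is reported as the position of the matched text itself, after leading whitespace has been skipped.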

addParseAction(self, *fns, **kwargs)


Add parse action to expression's list of parse actions. See setParseAction.

setFailAction(self, fn)


Define action to perform if parsing fails at this expression. Fail action fn is a callable function that takes the arguments fn(s,loc,expr,err) where:

  • s = string being parsed
  • loc = location where expression match was attempted and failed
  • expr = the parse expression that failed
  • err = the exception thrown

The function returns no value. It may throw ParseFatalException if it is desired to stop parsing immediately.
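
A small sketch of a fail action that only records what went wrong; note_failure and the failures list are hypothetical names introduced for the example. The ParseException still propagates to the caller after the fail action has run:

```python
from pyparsing import ParseException, Word, nums

failures = []

def note_failure(s, loc, expr, err):
    # Record where the attempt failed and which expression was involved.
    failures.append((loc, expr))

integer = Word(nums).setName("integer").setFailAction(note_failure)

try:
    integer.parseString("abc")
    parsed = True
except ParseException:
    parsed = False
```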

enablePackrat()
Static Method


Enables "packrat" parsing, which adds memoizing to the parsing logic. Repeated parse attempts at the same string location (which happen often in many complex grammars) can immediately return a cached value, instead of re-executing parsing/validating code. Both valid results and parsing exceptions are memoized.

This speedup may break existing programs that use parse actions that have side-effects. For this reason, packrat parsing is disabled when you first import pyparsing. To activate the packrat feature, your program must call the static method ParserElement.enablePackrat(). If your program uses psyco to "compile as you go", you must call enablePackrat before calling psyco.full(). If you do not do this, Python will crash. For best results, call enablePackrat() immediately after importing pyparsing.
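
A minimal sketch of the recommended call order, using a made-up grammar in which the second alternative retries term at the same location that the first alternative already parsed; with packrat enabled, that retry can be served from the cache:

```python
from pyparsing import ParserElement

# Enable memoizing right after import, before any grammar is used.
ParserElement.enablePackrat()

from pyparsing import Word, alphas, nums

term = Word(nums) | Word(alphas)
expr = term + "+" + term | term + "-" + term

result = expr.parseString("12 - x").asList()
```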

parseString(self, instring, parseAll=False)


Execute the parse expression with the given string. This is the main interface to the client code, once the complete expression has been built.

If you want the grammar to require that the entire input string be successfully parsed, then set parseAll to True (equivalent to ending the grammar with StringEnd()).

Note: parseString implicitly calls expandtabs() on the input string, in order to report proper column numbers in parse actions. If the input string contains tabs and the grammar uses parse actions that use the loc argument to index into the string being parsed, you can ensure you have a consistent view of the input string by:

  • calling parseWithTabs on your grammar before calling parseString (see parseWithTabs)
  • defining your parse action using the full (s,loc,toks) signature, and referencing the input string using the parse action's s argument
  • explicitly expanding the tabs in your input string before calling parseString
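
To illustrate the effect of parseAll, the sketch below parses the same input twice with an assumed two-word grammar. Without parseAll the unparsed trailing text is silently ignored; with parseAll=True it causes a ParseException:

```python
from pyparsing import ParseException, Word, alphas

greeting = Word(alphas) + "," + Word(alphas)

# Default: parsing stops after the last match; the trailing "!" is ignored.
partial = greeting.parseString("Hello, World!").asList()

# parseAll=True: the whole input must be consumed.
try:
    greeting.parseString("Hello, World!", parseAll=True)
    full_ok = True
except ParseException:
    full_ok = False
```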

scanString(self, instring, maxMatches=2147483647, overlap=False)


Scan the input string for expression matches. Each match will return the matching tokens, start location, and end location. May be called with optional maxMatches argument, to clip scanning after 'n' matches are found. If overlap is specified, then overlapping matches will be reported.

Note that the start and end locations are reported relative to the string being parsed. See parseString for more information on parsing strings with embedded tabs.
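
A short sketch of iterating over the (tokens, start, end) triples, using an arbitrary sample string. The second call shows maxMatches clipping the scan:

```python
from pyparsing import Word, nums

integer = Word(nums)
text = "ports 80, 443 and 8080"

# Each match yields the tokens plus start and end locations in the string.
matches = [(toks[0], start, end)
           for toks, start, end in integer.scanString(text)]

# Stop after the first two matches.
first_two = [toks[0]
             for toks, start, end in integer.scanString(text, maxMatches=2)]
```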

transformString(self, instring)


Extension to scanString, to modify matching text with modified tokens that may be returned from a parse action. To use transformString, define a grammar and attach a parse action to it that modifies the returned token list. Invoking transformString() on a target string will then scan for matches, and replace the matched text patterns according to the logic in the parse action. transformString() returns the resulting transformed string.
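
As a sketch of that workflow, the parse action below (an assumption chosen for the example) doubles every integer it matches, and transformString splices the replacements back into the surrounding text:

```python
from pyparsing import Word, nums

# The parse action returns the replacement text for each match.
integer = Word(nums).setParseAction(lambda toks: str(int(toks[0]) * 2))

doubled = integer.transformString("3 apples and 10 pears")
```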

searchString(self, instring, maxMatches=2147483647)


Another extension to scanString, simplifying the access to the tokens found to match the given parse expression. May be called with optional maxMatches argument, to clip searching after 'n' matches are found.
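
A minimal sketch, assuming a pattern for capitalized words. Unlike scanString, only the matched tokens are returned, one group per match:

```python
from pyparsing import Word, alphas

# One uppercase letter followed by any number of lowercase letters.
cap_word = Word(alphas.upper(), alphas.lower())

found = cap_word.searchString("Alice met Bob in Paris").asList()
first_two = cap_word.searchString("Alice met Bob in Paris",
                                  maxMatches=2).asList()
```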

__mul__(self, other)


Implementation of * operator, allows use of expr * 3 in place of expr + expr + expr. Expressions may also be multiplied by a 2-integer tuple, similar to {min,max} multipliers in regular expressions. Tuples may also include None as in:

  • expr*(n,None) or expr*(n,) is equivalent to expr*n + ZeroOrMore(expr) (read as "at least n instances of expr")
  • expr*(None,n) is equivalent to expr*(0,n) (read as "0 to n instances of expr")
  • expr*(None,None) is equivalent to ZeroOrMore(expr)
  • expr*(1,None) is equivalent to OneOrMore(expr)

Note that expr*(None,n) does not raise an exception if more than n exprs exist in the input stream; that is, expr*(None,n) does not enforce a maximum number of expr occurrences. To enforce the maximum, write expr*(None,n) + ~expr.
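
The multiplier forms can be sketched as follows, using single-digit tokens; the matches helper is hypothetical and only wraps parseString so that a failed parse shows up as None:

```python
from pyparsing import ParseException, Word, nums

digit = Word(nums, exact=1)

exactly_three = digit * 3                   # digit + digit + digit
two_to_four = digit * (2, 4)                # like {2,4} in a regex
at_least_two = digit * (2, None)            # digit*2 + ZeroOrMore(digit)
at_most_two = digit * (None, 2) + ~digit    # ~digit enforces the maximum

def matches(expr, text):
    """Return the matched tokens as a list, or None if parsing fails."""
    try:
        return expr.parseString(text).asList()
    except ParseException:
        return None
```

With these definitions, two_to_four accepts "1 2 3 4 5" and simply stops after four digits, whereas at_most_two rejects "1 2 3" because of the trailing ~digit.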

__call__(self, name=None)
(Call operator)


Shortcut for setResultsName, with listAllMatches left at its default value. For example:

 userdata = Word(alphas).setResultsName("name") + Word(nums+"-").setResultsName("socsecno")

could be written as:

 userdata = Word(alphas)("name") + Word(nums+"-")("socsecno")

If name is given with a trailing '*' character, then listAllMatches will be passed as True.

If name is omitted, same as calling copy.
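
The trailing '*' form can be sketched like this, with made-up input data. Without the '*', a name that matches repeatedly keeps only its last value; with it, every match is collected:

```python
from pyparsing import OneOrMore, Word, alphas, nums

userdata = Word(alphas)("name") + Word(nums + "-")("socsecno")
user = userdata.parseString("Alice 123-45-6789")

# "tag" keeps only the last match; "tag*" passes listAllMatches=True.
last_only = OneOrMore(Word(alphas)("tag")).parseString("red green blue")
all_tags = OneOrMore(Word(alphas)("tag*")).parseString("red green blue")
```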

leaveWhitespace(self)


Disables the skipping of whitespace before matching the characters in the ParserElement's defined pattern. This is normally only used internally by the pyparsing module, but may be needed in some whitespace-sensitive grammars.
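
A small sketch of a whitespace-sensitive rule, assuming a grammar in which the colon must directly follow the key. The names strict, lenient and accepts are illustrative:

```python
from pyparsing import Literal, ParseException, Word, alphas

key = Word(alphas)

lenient = key + ":"                             # whitespace before ":" is skipped
strict = key + Literal(":").leaveWhitespace()   # ":" must follow immediately

def accepts(expr, text):
    """Return True if expr parses the start of text, False otherwise."""
    try:
        expr.parseString(text)
        return True
    except ParseException:
        return False
```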

parseWithTabs(self)


Overrides default behavior to expand <TAB>s to spaces before parsing the input string. Must be called before parseString when the input grammar contains elements that match <TAB> characters.
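
As a sketch, the grammar below matches a literal tab between two fields using pyparsing's White class. The same grammar without parseWithTabs fails, because the tab has already been expanded to spaces by the time the tab element is tried:

```python
from pyparsing import ParseException, White, Word, alphas

field = Word(alphas)
tab = White("\t")

# Keep tabs intact so that the White("\t") element can match them.
row = field + tab + field
row.parseWithTabs()
with_tabs = row.parseString("abc\tdef").asList()

# Same grammar, default behavior: tabs are expanded before parsing.
row_default = field + tab + field
try:
    row_default.parseString("abc\tdef")
    default_ok = True
except ParseException:
    default_ok = False
```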

setDebug(self, flag=True)


Enable display of debugging messages while doing pattern matching. Set flag to True to enable, False to disable.

__str__(self)
(Informal representation operator)


str(x)

Overrides: object.__str__
(inherited documentation)

__repr__(self)
(Representation operator)


repr(x)

Overrides: object.__repr__
(inherited documentation)

parseFile(self, file_or_filename, parseAll=False)


Execute the parse expression on the given file or filename. If a filename is specified (instead of a file object), the entire file is opened, read, and closed before parsing.
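
A minimal sketch of the file-object form, using io.StringIO as a stand-in for an open file so that the example needs no file on disk; the key = value grammar is an assumption made for illustration:

```python
import io

from pyparsing import Literal, Word, alphas, nums

# key = value, with the "=" kept out of the results.
setting = Word(alphas) + Literal("=").suppress() + Word(nums)

# Any object with a read() method can be passed in place of a filename.
config_file = io.StringIO("timeout = 30\n")
result = setting.parseFile(config_file, parseAll=True).asList()
```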

__hash__(self)
(Hashing function)


hash(x)

Overrides: object.__hash__
(inherited documentation)