Package tipy :: Module tknz
[hide private]
[frames] | no frames]

Module tknz

source code

The Tokenizer class takes an input stream and parses it into tokens.

The parsing process is controlled by the character classification sets:

Each byte read from the input stream is regarded as a character in the range '\u0000' through '\u00FF'.

In addition, an instance has flags that control:

A typical application first constructs an instance of this class, supplying the input stream to be tokenized, the set of blankspaces, and the set of eparators, and then repeatedly loops, while method has_more_tokens() returns true, calling the next_token() method.

Classes [hide private]
  Tokenizer
Abstract class for all tokenizers.
  ForwardTokenizer
Tokenize a stream from the beginning to the end.
  ReverseTokenizer
Tokenize a stream from the end to the beginning.
  TextTokenizer
Tokenizer to tokenize a text file.
Variables [hide private]
  __package__ = 'tipy'