Package prest :: Module tknz :: Class Tokenizer

Class Tokenizer

object --+
         |
        Tokenizer

Abstract class for all tokenizers.

Class Hierarchy for Tokenizer

Instance Methods

  __init__(self, stream, blankspaces=char.blankspaces, separators=char.separators)
      Constructor of the Tokenizer abstract class.
  bool is_blankspace(self, char)
      Test if a character is a blankspace.
  bool is_separator(self, char)
      Test if a character is a separator.
  count_chars(self)
  reset_stream(self)
  count_tokens(self)
  has_more_tokens(self)
  next_token(self)
  progress(self)

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Class Variables
  __metaclass__ = abc.ABCMeta
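
The `__metaclass__` class attribute is the Python 2 way of selecting a metaclass; Python 3 ignores it and expects the `metaclass=` keyword in the class statement instead. The sketch below uses hypothetical stand-in classes (`TokenizerSketch` and `Complete` are not part of prest) to illustrate what `abc.ABCMeta` provides under Python 3: a class that still has unimplemented abstract methods cannot be instantiated.

```python
import abc


class TokenizerSketch(metaclass=abc.ABCMeta):
    """Hypothetical stand-in for Tokenizer; not the prest source."""

    @abc.abstractmethod
    def next_token(self):
        """Subclasses must provide this."""


class Complete(TokenizerSketch):
    """Overriding every abstract method makes the class concrete."""

    def next_token(self):
        return ""


def can_instantiate(cls):
    """Return True if cls() succeeds, False if ABCMeta blocks it."""
    try:
        cls()
    except TypeError:
        return False
    return True


print(can_instantiate(TokenizerSketch))  # False
print(can_instantiate(Complete))         # True
```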

Properties

Inherited from object: __class__

Method Details

__init__(self, stream, blankspaces=char.blankspaces, separators=char.separators)
(Constructor)

Constructor of the Tokenizer abstract class.

Parameters:
  • stream (str or io.IOBase) - The stream to tokenize. Can be a filename or any open IO stream.
  • blankspaces (str) - The characters that represent empty spaces.
  • separators (str) - The characters that separate token units (e.g. word boundaries).
Overrides: object.__init__
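
As a rough illustration of the parameters above, a constructor with this signature might normalize `stream` as follows. This is a minimal sketch, not the prest source: the class name `TokenizerInitSketch`, the two default constants, and the choice to open filenames as UTF-8 text are all assumptions made for the example.

```python
import io

# Assumed defaults. The real values live in prest's char module
# (char.blankspaces, char.separators) and are not shown on this page.
DEFAULT_BLANKSPACES = " \t\n\r\f\v"
DEFAULT_SEPARATORS = ".,;:!?"


class TokenizerInitSketch(object):
    """Hypothetical illustration of the documented constructor signature."""

    def __init__(self, stream, blankspaces=DEFAULT_BLANKSPACES,
                 separators=DEFAULT_SEPARATORS):
        if isinstance(stream, str):
            # Documented case 1: a filename. Opening it here, and the
            # UTF-8 encoding, are assumptions of this sketch.
            self.stream = io.open(stream, "r", encoding="utf-8")
        else:
            # Documented case 2: an already open IO stream, used as-is.
            self.stream = stream
        self.blankspaces = blankspaces
        self.separators = separators


tok = TokenizerInitSketch(io.StringIO("hello world"), separators=",")
print(tok.stream.read())  # hello world
```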

is_blankspace(self, char)

Test if a character is a blankspace.

Parameters:
  • char (str) - The character to test.
Returns: bool
True if character is a blankspace, False otherwise.

is_separator(self, char)

Test if a character is a separator.

Parameters:
  • char (str) - The character to test.
Returns: bool
True if character is a separator, False otherwise.
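
Both tests can plausibly be implemented as membership checks against the strings passed to the constructor. The sketch below assumes exactly that; `CharTestSketch` and its default character sets are hypothetical and may differ from what prest does.

```python
class CharTestSketch(object):
    """Hypothetical stand-in showing only the two character tests."""

    def __init__(self, blankspaces=" \t\n", separators=".,;"):
        # Assumed example values, not prest's char.blankspaces
        # or char.separators.
        self.blankspaces = blankspaces
        self.separators = separators

    def is_blankspace(self, char):
        """Return True if char is one of the configured blankspaces."""
        # The length check guards against "" and multi-character
        # strings, which `in` would otherwise match as substrings.
        return len(char) == 1 and char in self.blankspaces

    def is_separator(self, char):
        """Return True if char is one of the configured separators."""
        return len(char) == 1 and char in self.separators


t = CharTestSketch()
print(t.is_blankspace(" "), t.is_blankspace("a"))  # True False
print(t.is_separator(","), t.is_separator(" "))    # True False
```

The explicit length check is a design choice of this sketch: a plain `char in self.blankspaces` would also return True for the empty string.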

count_chars(self)

Decorators:
  • @abc.abstractmethod

reset_stream(self)

Decorators:
  • @abc.abstractmethod

count_tokens(self)

Decorators:
  • @abc.abstractmethod

has_more_tokens(self)

Decorators:
  • @abc.abstractmethod

next_token(self)

Decorators:
  • @abc.abstractmethod

progress(self)

Decorators:
  • @abc.abstractmethod
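
The six abstract methods carry no descriptions on this page, so a concrete subclass has to choose its own semantics. The sketch below shows one possible reading, inferred only from the method names: tokens are runs of characters between blankspaces and separators, and `progress` is the fraction of characters consumed. Both classes are hypothetical stand-ins written for Python 3, not the prest implementation.

```python
import abc
import io


class AbstractTokenizerSketch(metaclass=abc.ABCMeta):
    """Hypothetical stand-in mirroring the documented interface."""

    def __init__(self, stream, blankspaces=" \t\n", separators=".,;"):
        self.stream = stream
        self.blankspaces = blankspaces
        self.separators = separators

    @abc.abstractmethod
    def count_chars(self): pass

    @abc.abstractmethod
    def reset_stream(self): pass

    @abc.abstractmethod
    def count_tokens(self): pass

    @abc.abstractmethod
    def has_more_tokens(self): pass

    @abc.abstractmethod
    def next_token(self): pass

    @abc.abstractmethod
    def progress(self): pass


class WordTokenizerSketch(AbstractTokenizerSketch):
    """Splits text on blankspaces and separators (assumed behaviour)."""

    def __init__(self, stream, **kwargs):
        super().__init__(stream, **kwargs)
        self.text = self.stream.read()  # whole input kept in memory
        self.offset = 0

    def _is_boundary(self, char):
        return char in self.blankspaces or char in self.separators

    def _skip_boundaries(self):
        while (self.offset < len(self.text)
               and self._is_boundary(self.text[self.offset])):
            self.offset += 1

    def count_chars(self):
        return len(self.text)

    def reset_stream(self):
        self.offset = 0

    def has_more_tokens(self):
        # Skips leading boundary characters, so it advances the offset.
        self._skip_boundaries()
        return self.offset < len(self.text)

    def next_token(self):
        self._skip_boundaries()
        start = self.offset
        while (self.offset < len(self.text)
               and not self._is_boundary(self.text[self.offset])):
            self.offset += 1
        return self.text[start:self.offset]

    def count_tokens(self):
        saved = self.offset  # restore the position afterwards
        self.reset_stream()
        count = 0
        while self.has_more_tokens():
            self.next_token()
            count += 1
        self.offset = saved
        return count

    def progress(self):
        total = self.count_chars()
        return float(self.offset) / total if total else 1.0


tok = WordTokenizerSketch(io.StringIO("Hello, world. Bye"))
tokens = []
while tok.has_more_tokens():
    tokens.append(tok.next_token())
print(tokens)              # ['Hello', 'world', 'Bye']
print(tok.count_tokens())  # 3
```

Reading the whole stream into memory keeps the sketch short; a tokenizer meant for large inputs would more likely read incrementally and rely on the stream's own seek position.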