Package cjklib :: Module build :: Class CharacterVariantBuilder :: Class VariantGenerator

Class VariantGenerator

Generates the character-to-variant mapping from the Unihan table.

Instance Methods
 
__init__(self, variantEntries, typeList, quiet=False)
Initialises the VariantGenerator.
 
generator(self)
Provides one entry per variant and character.
Class Variables
  HEX_INDEX_REGEX = re.compile(r'\s*U\+([0-9A-F]+)\s*$')
  MULT_HEX_INDEX_REGEX = re.compile(r'\s*(U\+([0-9A-F]+)( |(?=$)...
  MULT_HEX_INDEX_FIND_REGEX = re.compile(r'U\+([0-9A-F]+)(?: |(?...
  SEMANTIC_REGEX = re.compile(r'(U\+[0-9A-F]+(<\S+)?( |(?=$)))+$')
  SEMANTIC_FIND_REGEX = re.compile(r'U\+([0-9A-F]+)(?:<\S+)?(?: ...
  ZVARIANT_REGEX = re.compile(r'\s*U\+([0-9A-F]+)(?::\S+)?\s*$')
  VARIANT_REGEX_MAPPING = {'C': (re.compile(r'\s*U\+([0-9A-F]+)\...
Mapping of entry types to regular expression describing the entry's pattern.
Method Details

__init__(self, variantEntries, typeList, quiet=False)
(Constructor)

Initialises the VariantGenerator.

Parameters:
  • variantEntries (list of tuple) - character variant entries from the Unihan database
  • typeList (list of str) - variant types in the order given in variantEntries
  • quiet (bool) - if true, no status information is printed

Class Variable Details

MULT_HEX_INDEX_REGEX

Value:
re.compile(r'\s*(U\+([0-9A-F]+)( |(?=$)))+\s*$')

MULT_HEX_INDEX_FIND_REGEX

Value:
re.compile(r'U\+([0-9A-F]+)(?: |(?=$))')
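
The two patterns above appear to work as a pair: MULT_HEX_INDEX_REGEX checks that a whole entry is a space-separated list of U+XXXX indices, and MULT_HEX_INDEX_FIND_REGEX pulls out the hexadecimal digits of each index. A minimal sketch of that use, in Python 3, follows. The helper name parse_code_points and the sample entries are illustrative assumptions and are not part of cjklib.

```python
import re

# Patterns copied verbatim from the values listed above.
MULT_HEX_INDEX_REGEX = re.compile(r'\s*(U\+([0-9A-F]+)( |(?=$)))+\s*$')
MULT_HEX_INDEX_FIND_REGEX = re.compile(r'U\+([0-9A-F]+)(?: |(?=$))')


def parse_code_points(entry):
    """Hypothetical helper: return the code points in entry as ints.

    Returns an empty list if the entry does not match the validation pattern.
    """
    if not MULT_HEX_INDEX_REGEX.match(entry):
        return []
    return [int(hex_index, 16)
            for hex_index in MULT_HEX_INDEX_FIND_REGEX.findall(entry)]


code_points = parse_code_points('U+4E07 U+842C')    # two well-formed indices
rejected = parse_code_points('U+4E07,U+842C')       # comma is not a valid separator
```

Validating first keeps the extraction pattern from silently picking indices out of a partly malformed entry.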

SEMANTIC_FIND_REGEX

Value:
re.compile(r'U\+([0-9A-F]+)(?:<\S+)?(?: |(?=$))')
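
SEMANTIC_FIND_REGEX differs from MULT_HEX_INDEX_FIND_REGEX only in the optional, non-capturing `<...` group after each index, so any tag attached to an index is skipped and only the hexadecimal digits are captured. The sketch below, in Python 3, pairs it with SEMANTIC_REGEX from the class variable summary. The sample entries are made up for illustration, and reading the `<...` part as a source tag is an assumption.

```python
import re

# Patterns copied verbatim from this page.
SEMANTIC_REGEX = re.compile(r'(U\+[0-9A-F]+(<\S+)?( |(?=$)))+$')
SEMANTIC_FIND_REGEX = re.compile(r'U\+([0-9A-F]+)(?:<\S+)?(?: |(?=$))')

# Illustrative entry: the first index carries a tag, the second does not.
entry = 'U+5EF5<kMatthews U+5EB5'

is_valid = SEMANTIC_REGEX.match(entry) is not None
hex_indices = SEMANTIC_FIND_REGEX.findall(entry)   # tags are not captured

# Unlike MULT_HEX_INDEX_REGEX, SEMANTIC_REGEX has no leading \s*,
# so match() rejects an entry that starts with whitespace.
leading_space_ok = SEMANTIC_REGEX.match(' U+5EB5') is not None
```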

VARIANT_REGEX_MAPPING

Mapping of entry types to regular expression describing the entry's pattern.

Value:
{'C': (re.compile(r'\s*U\+([0-9A-F]+)\s*$'),
       re.compile(r'\s*U\+([0-9A-F]+)\s*$')),
 'M': (re.compile(r'(U\+[0-9A-F]+(<\S+)?( |(?=$)))+$'),
       re.compile(r'U\+([0-9A-F]+)(?:<\S+)?(?: |(?=$))')),
 'P': (re.compile(r'(U\+[0-9A-F]+(<\S+)?( |(?=$)))+$'),
       re.compile(r'U\+([0-9A-F]+)(?:<\S+)?(?: |(?=$))')),
 'S': (re.compile(r'\s*(U\+([0-9A-F]+)( |(?=$)))+\s*$'),
       re.compile(r'U\+([0-9A-F]+)(?: |(?=$))')),
...
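
Each value in the mapping is a pair of patterns: the first matches a complete entry, the second extracts the individual indices. The sketch below, in Python 3, shows one way such a pair could be used to produce one (character, variant, type) tuple per variant. It is not the library's generator() implementation: the helper iter_variants, the output tuple layout, and the sample entries are assumptions, and only the 'C', 'M' and 'S' keys visible above are reproduced.

```python
import re

# Subset of the mapping shown above: type -> (validation regex, extraction regex).
VARIANT_REGEX_MAPPING = {
    'C': (re.compile(r'\s*U\+([0-9A-F]+)\s*$'),
          re.compile(r'\s*U\+([0-9A-F]+)\s*$')),
    'M': (re.compile(r'(U\+[0-9A-F]+(<\S+)?( |(?=$)))+$'),
          re.compile(r'U\+([0-9A-F]+)(?:<\S+)?(?: |(?=$))')),
    'S': (re.compile(r'\s*(U\+([0-9A-F]+)( |(?=$)))+\s*$'),
          re.compile(r'U\+([0-9A-F]+)(?: |(?=$))')),
}


def iter_variants(character, variant_type, entry):
    """Hypothetical helper: yield (character, variant, type) tuples."""
    validate, find = VARIANT_REGEX_MAPPING[variant_type]
    if not validate.match(entry):
        return  # malformed entry: yield nothing
    for hex_index in find.findall(entry):
        # Convert the hexadecimal index to the character it denotes.
        yield character, chr(int(hex_index, 16)), variant_type


rows = list(iter_variants('\u4e07', 'S', 'U+842C'))
```

Looking the pair up by type keeps the per-type entry syntax in one table, so supporting another variant type only needs another key.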