Module build

Provides the building methods for the cjklib package.

Each table that needs to be created must be implemented by a TableBuilder. The DatabaseBuilder is the central instance that manages the build process. Because the creation of a table can depend on other tables, the DatabaseBuilder keeps track of dependencies so that a build is processed in the correct order.
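The dependency handling described above can be illustrated with a minimal sketch. This is not the DatabaseBuilder implementation; it is a generic depth-first ordering, and the function name build_order, the table names and the dependency map are all hypothetical placeholders chosen for illustration.

```python
def build_order(requested, depends):
    """Return table names ordered so that each table follows its dependencies.

    requested -- iterable of table names the caller wants built (hypothetical)
    depends   -- dict mapping a table name to the tables it depends on
    """
    order = []           # final build order
    done = set()         # tables already placed in the order
    in_progress = set()  # tables on the current recursion path

    def visit(table):
        if table in done:
            return
        if table in in_progress:
            raise ValueError("circular dependency involving %r" % table)
        in_progress.add(table)
        for dep in depends.get(table, ()):
            visit(dep)   # dependencies are placed first
        in_progress.discard(table)
        done.add(table)
        order.append(table)

    for table in requested:
        visit(table)
    return order


# Invented dependency map; the real tables and their dependencies may differ.
DEPENDS = {
    "RawData": [],
    "Radicals": ["RawData"],
    "Strokes": [],
    "ResidualStrokes": ["Radicals", "Strokes"],
}

print(build_order(["ResidualStrokes"], DEPENDS))
# prints ['RawData', 'Radicals', 'Strokes', 'ResidualStrokes']
```

Requesting a single table pulls in everything it needs, which is also why an interrupted build can leave behind tables that were created only as dependencies.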

Building is tested on the following storage methods:

Some TableBuilder implementations aren't used by the CJK library but are provided here for additional usage.

On MS Windows, the default Python versions provided seem to be "narrow builds" and do not support characters outside the BMP (see e.g. http://wordaligned.org/articles/narrow-python). Consequently, no Unicode characters outside the BMP are currently supported on Windows platforms.
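Whether an interpreter is a narrow build can be checked through sys.maxunicode, which is 0xFFFF on narrow builds. The sketch below is only an illustration of that check; the helper names are hypothetical and are not part of this module. Note that Python 3.3 and later no longer distinguish narrow and wide builds, so the check always reports a wide build there.

```python
import sys


def is_narrow_build():
    """True if this interpreter cannot represent code points above U+FFFF
    as single characters (a "narrow build")."""
    return sys.maxunicode == 0xFFFF


def outside_bmp(codepoint):
    """True if the code point lies outside the Basic Multilingual Plane."""
    return codepoint > 0xFFFF


def supported(codepoint):
    """Hypothetical helper: a code point is usable unless it lies outside
    the BMP and the interpreter is a narrow build."""
    return not (is_narrow_build() and outside_bmp(codepoint))


print(is_narrow_build())
print(supported(0x4E00))   # a BMP ideograph
print(supported(0x20000))  # first code point of CJK Extension B
```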

Examples

The following examples should give a quick view into how to use this package.


To Do (Fix): On interruption (Ctrl+C) remove tables that were only created because of dependencies.

To Do (Impl): Further character domains: BIG5 (Taiwan), kIRG_GSource (Unicode, Simplified Chinese), kIRG_JSource (Unicode, Japanese), kIRG_KPSource and kIRG_KSource (Unicode, Korean), kIRG_TSource (Unicode, Traditional Chinese), kIRG_VSource (Unicode, Vietnamese)

Classes
    TableBuilder and generic classes
  TableBuilder
TableBuilder provides the abstract layout for classes that build a distinct table.
  EntryGeneratorBuilder
Implements an abstract class for building a table from a generator providing entries.
  ListGenerator
A simple generator for a given list of elements.
    Unihan character information
  UnihanGenerator
Regular expression matching one entry in the Unihan database (e.g.
  UnihanBuilder
Builds the Unihan database from the Unihan file provided by Unicode.
  UnihanBMPBuilder
Builds the Unihan database from the Unihan file provided by Unicode for characters from the Basic Multilingual Plane (BMP) with code values between U+0000 and U+FFFF.
  SlimUnihanBuilder
Builds a slim version of the Unihan database.
  SlimUnihanBMPBuilder
Builds a slim version of the Unihan database from the Unihan file provided by Unicode for characters from the Basic Multilingual Plane (BMP) with code values between U+0000 and U+FFFF.
  Kanjidic2Builder
Builds the Kanjidic database from the Kanjidic2 XML file http://www.csse.monash.edu.au/~jwb/kanjidic2/.
  UnihanDerivedBuilder
Provides an abstract class for building a table with a relation between a Chinese character and another column using the Unihan database.
  UnihanStrokeCountBuilder
Builds a mapping between characters and their stroke count using the Unihan data.
  CharacterRadicalBuilder
Provides an abstract class for building a character radical mapping table using the Unihan database.
  CharacterKangxiRadicalBuilder
Builds the character Kangxi radical mapping table from the Unihan database.
  CharacterKanWaRadicalBuilder
Builds the character Dai Kan-Wa jiten radical mapping table from the Unihan database.
  CharacterJapaneseRadicalBuilder
Builds the character Japanese radical mapping table from the Unihan database.
  CharacterKoreanRadicalBuilder
Builds the character Korean radical mapping table from the Unihan database.
  CharacterVariantBuilder
Builds a character variant mapping table from the Unihan database.
  CharacterVariantBMPBuilder
Builds a character variant mapping table from the Unihan database for characters from the Basic Multilingual Plane (BMP) with code values between U+0000 and U+FFFF.
  UnihanCharacterSetBuilder
Builds a simple list of characters that belong to a specific class using the Unihan data.
  IICoreSetBuilder
Builds a simple list of all characters in IICore (Unicode International Ideograph Core).
  GB2312SetBuilder
Builds a simple list of all characters in the Chinese standard GB2312-80.
    Unihan reading information
  CharacterReadingBuilder
Provides an abstract class for building a character reading mapping table using the Unihan database.
  CharacterUnihanPinyinBuilder
Builds the character Pinyin mapping table from the Unihan database.
  CharacterJyutpingBuilder
Builds the character Jyutping mapping table from the Unihan database.
  CharacterJapaneseKunBuilder
Builds the character Kun'yomi mapping table from the Unihan database.
  CharacterJapaneseOnBuilder
Builds the character On'yomi mapping table from the Unihan database.
  CharacterHangulBuilder
Builds the character Hangul mapping table from the Unihan database.
  CharacterVietnameseBuilder
Builds the character Vietnamese mapping table from the Unihan database.
  CharacterXHPCReadingBuilder
Builds the Xiandai Hanyu Pinlu Cidian Pinyin mapping table using the Unihan database.
  CharacterXHCReadingBuilder
Builds the Xiandai Hanyu Cidian Pinyin mapping table using the Unihan database.
  CharacterPinyinBuilder
Builds the character Pinyin mapping table from several sources.
    CSV file based
  CSVFileLoader
Builds a table by loading its data from a list of comma separated values (CSV).
  PinyinSyllablesBuilder
Builds a list of Pinyin syllables.
  PinyinInitialFinalBuilder
Builds a mapping from Pinyin syllables to their initial/final parts.
  WadeGilesSyllablesBuilder
Builds a list of Wade-Giles syllables.
  GRSyllablesBuilder
Builds a list of Gwoyeu Romatzyh syllables.
  GRRhotacisedFinalsBuilder
Builds a list of Gwoyeu Romatzyh rhotacised finals.
  GRAbbreviationBuilder
Builds a list of Gwoyeu Romatzyh abbreviated spellings.
  JyutpingSyllablesBuilder
Builds a list of Jyutping syllables.
  JyutpingInitialFinalBuilder
Builds a mapping from Jyutping syllables to their initial/final parts.
  CantoneseYaleSyllablesBuilder
Builds a list of Cantonese Yale syllables.
  CantoneseYaleInitialNucleusCodaBuilder
Builds a mapping of Cantonese syllables in the Yale romanisation system to their initial, nucleus and coda.
  JyutpingYaleMappingBuilder
Builds a mapping between syllables in Jyutping and the Yale romanization system.
  WadeGilesPinyinMappingBuilder
Builds a mapping between syllables in Wade-Giles and Pinyin.
  PinyinGRMappingBuilder
Builds a mapping between syllables in Pinyin and Gwoyeu Romatzyh.
  PinyinIPAMappingBuilder
Builds a mapping between syllables in Pinyin and their representation in IPA.
  MandarinIPAInitialFinalBuilder
Builds a mapping from Mandarin syllables in IPA to their initial/final parts.
  JyutpingIPAMappingBuilder
Builds a mapping between syllables in Jyutping and their representation in IPA.
  CantoneseIPAInitialFinalBuilder
Builds a mapping from Cantonese syllables in IPA to their initial/final parts.
  KangxiRadicalBuilder
Builds a mapping between Kangxi radical index and radical characters.
  KangxiRadicalIsolatedCharacterBuilder
Builds a mapping between Kangxi radical index and radical equivalent characters without radical form.
  RadicalEquivalentCharacterBuilder
Builds a mapping between Unicode radical forms and Unicode radical variants on one side and equivalent characters on the other side.
  StrokesBuilder
Builds a list of strokes and their names.
  StrokeOrderBuilder
Builds a mapping between characters and their stroke order.
  CharacterDecompositionBuilder
Builds a mapping between characters and their decomposition.
  LocaleCharacterVariantBuilder
Builds a mapping between a character under a locale and its default variant.
  MandarinBraileInitialBuilder
Builds a mapping of Mandarin Chinese syllable initials in Pinyin to Braille characters.
  MandarinBraileFinalBuilder
Builds a mapping of Mandarin Chinese syllable finals in Pinyin to Braille characters.
    Library dependent
  ZVariantBuilder
Builds a list of glyph indices for characters.
  StrokeCountBuilder
Builds a mapping between characters and their stroke count.
  CombinedStrokeCountBuilder
Builds a mapping between characters and their stroke count.
  CharacterComponentLookupBuilder
Builds a mapping between characters and their components.
  CharacterRadicalStrokeCountBuilder
Builds a mapping between characters and their radical with stroke count of residual components.
  CharacterResidualStrokeCountBuilder
Builds a mapping between characters and their residual stroke count when splitting off the radical form.
  CombinedCharacterResidualStrokeCountBuilder
Builds a mapping between characters and their residual stroke count when splitting off the radical form.
    Dictionary builder
  EDICTFormatBuilder
Provides an abstract class for loading EDICT formatted dictionaries.
  WordIndexBuilder
Builds a translation word index for a given dictionary.
  EDICTBuilder
Builds the EDICT dictionary.
  EDICTWordIndexBuilder
Builds the word index of the EDICT dictionary.
  CEDICTFormatBuilder
Provides an abstract class for loading CEDICT formatted dictionaries.
  CEDICTBuilder
Builds the CEDICT dictionary.
  CEDICTWordIndexBuilder
Builds the word index of the CEDICT dictionary.
  CEDICTGRBuilder
Builds the CEDICT-GR dictionary.
  CEDICTGRWordIndexBuilder
Builds the word index of the CEDICT-GR dictionary.
  HanDeDictBuilder
Builds the HanDeDict dictionary.
  HanDeDictWordIndexBuilder
Builds the word index of the HanDeDict dictionary.
    DatabaseBuilder
  DatabaseBuilder
DatabaseBuilder provides the main class for building up a database for the cjklib package.
Functions
    Global methods
 
warn(message)
Prints the given message to stderr with the system's default encoding.
Function Details

warn(message)

Prints the given message to stderr with the system's default encoding.

Parameters:
  • message (str) - message to print
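
The behaviour documented for warn can be approximated as follows. This is a sketch under stated assumptions, not the module's source: the names encode_message and warn_sketch are hypothetical, the system's default encoding is assumed to be what locale.getpreferredencoding() reports, and characters that cannot be encoded are assumed to be replaced rather than raising an error.

```python
import locale
import sys


def encode_message(message, encoding=None):
    """Encode message plus a newline; unencodable characters become '?'.

    encoding defaults to the locale's preferred encoding (an assumption
    about what "the system's default encoding" means here).
    """
    if encoding is None:
        encoding = locale.getpreferredencoding() or "ascii"
    return (message + "\n").encode(encoding, "replace")


def warn_sketch(message, stream=None):
    """Write message to stderr (or the given stream)."""
    if stream is None:
        stream = sys.stderr
    raw = getattr(stream, "buffer", None)
    if raw is not None:
        # Binary layer available: write the explicitly encoded bytes.
        raw.write(encode_message(message))
        raw.flush()
    else:
        # Text-only stream: let the stream handle the encoding itself.
        stream.write(message + "\n")


warn_sketch("table not found")
```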