Script buildcjkdb

Builds the database for cjklib.

For the Unihan data, only characters in the Basic Multilingual Plane (BMP), with code points between U+0000 and U+FFFF, are currently included, because MySQL < 6 doesn't support 4-byte UTF-8. To include characters outside the BMP, change 'UnihanBMPBuilder' and 'SlimUnihanBMPBuilder' to 'UnihanBuilder' and 'SlimUnihanBuilder' respectively.
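As an illustration of the BMP restriction described above, the following minimal sketch checks whether a character falls inside the range U+0000 to U+FFFF. It assumes Python 3, where every string element is a full code point; the helper names `is_bmp` and `filter_bmp` are hypothetical and are not part of cjklib.

```python
def is_bmp(char):
    """Return True if the single character lies in the BMP (U+0000..U+FFFF)."""
    return ord(char) <= 0xFFFF


def filter_bmp(chars):
    """Keep only the characters that a BMP-only builder would include."""
    return [char for char in chars if is_bmp(char)]


if __name__ == '__main__':
    # U+4E2D is inside the BMP, U+20000 (CJK Extension B) is not.
    sample = ['\u4e2d', '\U00020000']
    print([hex(ord(char)) for char in filter_bmp(sample)])
```

A character such as U+20000 needs four bytes in UTF-8, which is the case the BMP-only builders avoid.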

On MS Windows, the default Python versions provided seem to be "narrow builds", which do not support characters outside the BMP (see e.g. http://wordaligned.org/articles/narrow-python). Unicode characters outside the BMP are therefore currently not supported on Windows platforms.
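Whether a given interpreter is a narrow build can be checked through `sys.maxunicode`. The sketch below is an illustration only; the helper name `is_narrow_build` is hypothetical and not part of this script. Note that the distinction applies to older interpreters: from Python 3.3 on, `sys.maxunicode` is always 0x10FFFF.

```python
import sys


def is_narrow_build():
    """Return True if this interpreter cannot represent code points above U+FFFF
    as single string elements (a "narrow build")."""
    return sys.maxunicode == 0xFFFF


if __name__ == '__main__':
    if is_narrow_build():
        print("Narrow build: characters outside the BMP are not supported")
    else:
        print("Wide build: maxunicode is %#x" % sys.maxunicode)
```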

Some TableBuilders make assumptions about the names of the files they load (a builder only knows the directory of the data files), so the input files must be named according to the builder's settings.


Copyright: Copyright (C) 2006-2008 Christoph Burgmer.

To Do (Impl): Add option for rebuilding dependencies by setting rebuildDepending=False/True (True by default). Consider asking the user if all dependencies should be rebuilt.

Functions
 
version()
Prints the version of this script.
 
usage()
Prints the usage for this script.
 
printFormattedLine(outputString, lineLength=80, subsequentPrefix='')
Formats the given input string to fit an output with a limited line length and prints it to stdout using the system's encoding.
 
main()
Main method of the script.
Variables
  DEFAULT_DATA_PATH = ['.', '/tmp/cjklib-read-only/cjklib/data']
  buildModulePath = '/tmp/cjklib-read-only/cjklib'
  BUILD_GROUPS = {'KangxiRadicalData': ['CharacterKangxiRadical'...
Definition of build groups available to the user.
Function Details

printFormattedLine(outputString, lineLength=80, subsequentPrefix='')

 

Formats the given input string to fit an output with a limited line length and prints it to stdout using the system's encoding.

Parameters:
  • outputString (str) - the string to be formatted to fit the screen
  • lineLength (int) - width of the screen
  • subsequentPrefix (str) - prefix used after line break
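
The behaviour described by these parameters can be approximated with the standard `textwrap` module. This is a minimal sketch and not the script's actual implementation: the names `format_line` and `print_formatted_line` are hypothetical, and it assumes Python 3, where `print` encodes output using the encoding of `sys.stdout`.

```python
import textwrap


def format_line(output_string, line_length=80, subsequent_prefix=''):
    """Wrap output_string to line_length columns; every line after the first
    starts with subsequent_prefix."""
    return textwrap.fill(output_string, width=line_length,
                         subsequent_indent=subsequent_prefix)


def print_formatted_line(output_string, line_length=80, subsequent_prefix=''):
    """Print the wrapped string to stdout."""
    print(format_line(output_string, line_length, subsequent_prefix))


if __name__ == '__main__':
    print_formatted_line('--build=BUILDER  builds the given table or group',
                         line_length=30, subsequent_prefix='    ')
```

A non-empty prefix is useful for option listings, where continuation lines should be indented under the description.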

Variables Details

BUILD_GROUPS

Definition of build groups available to the user. Recursive definitions are not allowed and will lead to a lock up.

Value:
{'KangxiRadicalData': ['CharacterKangxiRadical',
                       'KangxiRadical',
                       'KangxiRadicalIsolatedCharacter',
                       'RadicalEquivalentCharacter',
                       'CharacterRadicalResidualStrokeCount',
                       'CharacterResidualStrokeCount'],
 'Readings': ['PinyinSyllables',
              'WadeGilesSyllables',
...
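
To illustrate why recursive group definitions are a problem, the sketch below expands group names into a flat list of builder names and raises an error on a cycle instead of locking up. It is not the script's own resolution code: the function `expand_groups` and the group and table names in `EXAMPLE_GROUPS` are hypothetical and chosen only for illustration.

```python
# Hypothetical groups in the same shape as BUILD_GROUPS: name -> list of members,
# where a member is either a builder name or the name of another group.
EXAMPLE_GROUPS = {
    'GroupA': ['TableX', 'GroupB'],
    'GroupB': ['TableY', 'TableX'],
}


def expand_groups(names, groups, _active=()):
    """Expand names into builder names, keeping order and dropping duplicates.

    _active holds the groups currently being expanded, so a group that
    contains itself (directly or indirectly) is detected and reported.
    """
    result = []
    for name in names:
        if name in _active:
            raise ValueError('recursive build group: %s' % name)
        if name in groups:
            members = expand_groups(groups[name], groups, _active + (name,))
        else:
            members = [name]
        for member in members:
            if member not in result:
                result.append(member)
    return result


if __name__ == '__main__':
    print(expand_groups(['GroupA'], EXAMPLE_GROUPS))
```

Tracking the chain of groups currently being expanded is one simple way to turn a recursive definition into an explicit error.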