Development
Development takes place on GitHub, see <https://github.com/mbr/slugger>.
External libraries
There are very few actual rules for slugging inside the library itself; most of them come from external libraries. These are:
- glibc's locales, specifically the LC_CTYPE section
- unihandecode, a fork of unidecode that also handles Asian languages other than Chinese. unihandecode itself brings in four different transcription libraries for Chinese, Japanese, Korean and Vietnamese.
Both sources are combined mainly to offset each other's weaknesses: glibc handles Asian transliterations poorly and incompletely, while unidecode (and with it, unihandecode) doesn't handle any language-specific substitutions at all.
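The two-stage idea described above can be sketched roughly as follows. This is only an illustration, not slugger's actual code: DE_RULES is a hypothetical hand-written table standing in for rules taken from a glibc locale, and generic_fallback uses the standard library's unicodedata module as a simple stand-in for unihandecode.

```python
import unicodedata

# Hypothetical per-language table, standing in for substitutions that
# would come from a glibc locale (German-style rules here).
DE_RULES = {
    "\u00e4": "ae",  # a with diaeresis
    "\u00f6": "oe",  # o with diaeresis
    "\u00fc": "ue",  # u with diaeresis
    "\u00df": "ss",  # sharp s
}


def generic_fallback(text):
    """Stand-in for unihandecode: decompose, then drop anything non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def transliterate(text, rules):
    # Stage 1: language-specific substitutions.
    substituted = "".join(rules.get(ch, ch) for ch in text)
    # Stage 2: language-agnostic fallback for whatever is left.
    return generic_fallback(substituted)
```

With the German table, "M\u00fcller" becomes "Mueller"; with an empty table, only the generic fallback applies and the result is "Muller". That difference is the kind of language-specific substitution a generic transliterator alone cannot provide.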
The glibc-locale parser
The glcp.py script contains a parser for glibc locale files. It extracts the LC_CTYPE section of each locale for use by slugger. Try python glcp.py --help for a bit of help.
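To give a feel for what such a parser has to deal with, here is a minimal sketch that reads transliteration rules from a locale snippet. It is not the code in glcp.py. The SAMPLE text, the function names, and the simplified handling (no include resolution, no line continuations) are assumptions made for illustration.

```python
import re

# Matches code point tokens such as <U00E4>.
CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn a token like '<U0061><U0065>' into the string 'ae'."""
    return "".join(chr(int(h, 16)) for h in CODEPOINT.findall(token))


def parse_translit(lines):
    """Collect {char: [alternatives]} from a translit_start/translit_end block."""
    table = {}
    inside = False
    for raw in lines:
        line = raw.strip()
        if line == "translit_start":
            inside = True
        elif line == "translit_end":
            inside = False
        elif inside and line.startswith("<U"):
            parts = line.split(None, 1)
            if len(parts) != 2:
                continue  # no replacement given, skip the line
            source, targets = parts
            # Alternatives are separated by ';', in order of preference.
            table[decode(source)] = [decode(t) for t in targets.split(";")]
    return table


# Assumed sample input, modelled on the layout of glibc locale files.
SAMPLE = """\
LC_CTYPE
translit_start
include "translit_combining";""
% German umlauts
<U00E4> "<U0061><U0308>";"<U0061><U0065>"
<U00DF> "<U0073><U0073>"
translit_end
END LC_CTYPE
"""

table = parse_translit(SAMPLE.splitlines())
```

Each key in table is a single character and each value lists its replacement candidates in the order given in the file. Lines outside the translit block, comments, and include directives are ignored in this sketch.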
Generating the localedata files
First, make sure the development dependencies are installed:
$ pip install click remember logbook
Afterwards, if you haven't done so already, check out the glibc submodule:
$ git submodule update --init --recursive
Now the glcp tool can be used to generate the necessary localedata:
$ mkdir -p slugger/localedata
$ glcp -o slugger/localedata glibc/localedata/locales/*