Package cjklib

Han character library. cjklib provides language routines for Han characters, that is, characters derived from Chinese characters and known as Hanzi, Kanji, Hanja and chu Han in their respective languages. These characters are used to write Chinese and Japanese, infrequently Korean, and formerly Vietnamese. The library covers character pronunciations, radicals, glyph components, stroke decomposition and variant information.

Supported

Tools

Data

This project makes use of the Unicode Han database provided by the Unicode Consortium: Unicode Standard Annex #38 - Unicode Han database (Unihan): http://www.unicode.org/reports/tr38/tr38-5.html, 28.03.2008.
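To give a feel for the source data, the following is a minimal sketch of reading Unihan-style records, assuming the tab-separated layout of the Unihan data files (code point, field name, value, with `#` starting a comment line). The function name `parse_unihan` and the sample excerpt are illustrative only; they are not part of cjklib, and the sketch is written for a current Python 3 interpreter rather than Python 2.5.

```python
import io

# Illustrative excerpt in the Unihan layout: code point, field, value,
# separated by tabs. Lines starting with '#' are comments.
SAMPLE = (
    "# illustrative excerpt, not a complete data file\n"
    "U+4E00\tkCantonese\tjat1\n"
    "U+4E00\tkTotalStrokes\t1\n"
    "U+4E8C\tkTotalStrokes\t2\n"
)


def parse_unihan(lines):
    """Return {character: {field: value}} built from Unihan-style lines."""
    table = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        code_point, field, value = line.split("\t", 2)
        # "U+4E00" -> the character with code point 0x4E00.
        char = chr(int(code_point[2:], 16))
        table.setdefault(char, {})[field] = value
    return table


records = parse_unihan(io.StringIO(SAMPLE))
print(records[chr(0x4E00)]["kCantonese"])
```

A nested dictionary keeps the sketch short; a library that answers many kinds of queries would more plausibly load such records into a database table.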

No data validation scheme is implemented yet, as this library is still in early development. Rather than restricting itself to a small set of data, cjklib tries to support as many options as possible. The library aims to be accurate, but mistakes do happen, especially for data that differs between locales.

Dependencies

cjklib is written in Python and is well tested on Python 2.5. Beyond Python itself, most parts of the library need a database back-end and the SQLAlchemy library. Currently tested are:

Author: Christoph Burgmer <cburgmer@ira.uka.de>

Requires: Python 2.5+, SQLAlchemy 0.5+ and either SQLite 3+ or MySQL 5+ and MySQL-Python
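One way to see which of these requirements are present on a machine is a small check script. The sketch below is not shipped with cjklib; the helper name `check_dependencies` is hypothetical, and the script targets a current Python 3 interpreter. It assumes only that SQLAlchemy is importable as `sqlalchemy` and MySQL-Python as `MySQLdb`, and it reports `None` for anything that cannot be imported.

```python
import importlib
import sqlite3
import sys


def check_dependencies():
    """Return a mapping of requirement name to detected version, or None."""
    report = {
        "python": "%d.%d" % sys.version_info[:2],
        # Version of the SQLite library that the sqlite3 module is linked to.
        "sqlite": sqlite3.sqlite_version,
    }
    optional = (("sqlalchemy", "sqlalchemy"), ("mysql-python", "MySQLdb"))
    for label, module_name in optional:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            report[label] = None
        else:
            report[label] = getattr(module, "__version__", "unknown")
    return report


report = check_dependencies()
for name in sorted(report):
    print("%-12s %s" % (name, report[name] or "not installed"))
```

Only one database back-end is needed, so a missing `mysql-python` entry is not a problem when SQLite is available.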

Version: 0.1alpha

cjklib comes with absolutely no warranty; for details see License.

Parts of the data used by this library are copyrighted by the following organisations:

Copyright © 1991-2007 Unicode, Inc. All rights reserved. Distributed under the Terms of Use in http://www.unicode.org/copyright.html.
Permission is hereby granted, free of charge, to any person obtaining a copy of the Unicode data files and any associated documentation (the "Data Files") or Unicode software and any associated documentation (the "Software") to deal in the Data Files or Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, and/or sell copies of the Data Files or Software, and to permit persons to whom the Data Files or Software are furnished to do so, provided that (a) the above copyright notice(s) and this permission notice appear with all copies of the Data Files or Software, (b) both the above copyright notice(s) and this permission notice appear in associated documentation, and (c) there is clear notice in each modified Data File or in the Software as well as in the documentation associated with the Data File(s) or Software that the data or software has been modified.
The Jyutping phrase box, Linguistic Society of Hong Kong.
The copyright of the Jyutping phrase box belongs to the Linguistic Society of Hong Kong. We would like to thank the Jyutping Group of the Linguistic Society of Hong Kong for permission to use the electronic file in our research and/or product development. Note that the inclusion of the phrase box in the Unihan database requires that any products developed using the kCantonese field needs to include this acknowledgement.

License: The library and all its parts are distributed under the terms of the LGPL Version 3, 29 June 2007 (http://www.gnu.org/licenses/lgpl.html) unless otherwise noted.