hanziconv package

Hanzi Converter 繁簡轉換器 | 繁简转换器

This module converts between simplified and traditional Chinese characters. It consists of two parts:

  • a command line tool: hanzi-convert
  • a Python library: hanziconv

It supports both Python 2 and Python 3.

Installation

$ pip install hanziconv

Uninstallation

$ [sudo] pip uninstall hanziconv

Command Line Tool

Synopsis

$ ./hanzi-convert --help
usage: hanzi-convert [-h] [-o OUTFILE] [-s] [-v] infile

Simplified and Traditional Chinese Character Conversion
Version 0.3.2 (By Bernard Yue)

Converting to Traditional Hanzi by default with no -s flag

positional arguments:
  infile                filename | "-", corresponds to stdin

optional arguments:
  -h, --help            show this help message and exit
  -o OUTFILE, --output OUTFILE
                        filename to save output, stdout if omitted
  -s, --simplified      convert to simplified characters
  -v, --version         show program's version number and exit

Example

Conversion from stdin

$ ./hanzi-convert -
Press Crtl-D when finished
Typing away
Now write some chinese characters
繁简转换器
^D
Typing away
Now write some chinese characters
繁簡轉換器
$
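
The same conversion can be driven from Python when a file needs converting inside a script. The following is a minimal sketch, not the tool's actual implementation: convert_file and its parameters are hypothetical names, and only HanziConv.toSimplified and HanziConv.toTraditional come from the library. It assumes UTF-8 files and mirrors the tool's default of converting to traditional characters unless simplified output is requested.

```python
import io


def convert_file(infile, outfile, simplified=False, convert=None):
    """Convert the text in `infile` and write the result to `outfile`.

    `convert_file` is a hypothetical helper, not part of hanziconv.
    `convert` may be any str -> str callable; when omitted, the matching
    HanziConv classmethod is used (hanziconv must then be installed).
    """
    if convert is None:
        # Imported lazily so the helper also works with a custom callable.
        from hanziconv import HanziConv
        convert = HanziConv.toSimplified if simplified else HanziConv.toTraditional

    # Assumption: both files are UTF-8 encoded.
    with io.open(infile, encoding="utf-8") as src:
        text = src.read()
    with io.open(outfile, "w", encoding="utf-8") as dst:
        dst.write(convert(text))
```

Under these assumptions, convert_file("input.txt", "output.txt", simplified=True) corresponds roughly to running hanzi-convert -s -o output.txt input.txt.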

Testing

The module uses pytest. Use pip to install pytest.

$ [sudo] pip install pytest

Then check out the source code and run the tests as usual.

$ git clone https://github.com/berniey/hanziconv.git
$ cd hanziconv
$ python setup.py test

You are encouraged to use virtualenv and virtualenvwrapper to avoid changing your current operating environment.

License

This module is distributed under the Apache License, Version 2.0.

The character map used in this module is based on the Multi-function Chinese Character Database developed by the Chinese University of Hong Kong.

Python API

Example

>>> from hanziconv import HanziConv
>>> print(HanziConv.toSimplified('繁簡轉換器'))
繁简转换器
>>> print(HanziConv.toTraditional('繁简转换器'))
繁簡轉換器
>>> HanziConv.same('繁簡轉換器', '繁简转换器')
True

API

class hanziconv.HanziConv

Bases: object

This class supports hanzi (漢字) conversion between simplified and traditional formats.

classmethod same(text1, text2)

Return True if text1 and text2 mean literally the same thing, False otherwise.

Parameters:
  • text1 – string to compare to text2
  • text2 – string to compare to text1
Returns: True – text1 and text2 are the same in meaning, False – otherwise

>>> from hanziconv import HanziConv
>>> print(HanziConv.same('繁简转换器', '繁簡轉換器'))
True
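
One way to picture what same checks is to normalize both strings to a single script and compare the results. The sketch below illustrates that idea under stated assumptions and is not the library's confirmed implementation: same_after_normalizing, to_simplified, and the three-entry TO_SIMPLIFIED table are hypothetical stand-ins for the library's full character map. It targets Python 3, since it relies on str.maketrans.

```python
# Hypothetical, tiny traditional -> simplified table for illustration only;
# the real library ships a far larger character map.
TO_SIMPLIFIED = str.maketrans({"簡": "简", "轉": "转", "換": "换"})


def to_simplified(text):
    """Map every character in the toy table; leave all others unchanged."""
    return text.translate(TO_SIMPLIFIED)


def same_after_normalizing(text1, text2, normalize=to_simplified):
    """Compare two strings after normalizing both to one script."""
    return normalize(text1) == normalize(text2)


print(same_after_normalizing("繁简转换器", "繁簡轉換器"))  # True
```

Passing the normalizer as a parameter keeps the comparison independent of any particular character map.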

classmethod toSimplified(text)

Convert text to a simplified character string, assuming text is a traditional character string.

Parameters: text – text to convert
Returns: converted UTF-8 characters

>>> from hanziconv import HanziConv
>>> print(HanziConv.toSimplified('繁簡轉換器'))
繁简转换器

classmethod toTraditional(text)

Convert text to a traditional character string, assuming text is a simplified character string.

Parameters: text – text to convert
Returns: converted UTF-8 characters

>>> from hanziconv import HanziConv
>>> print(HanziConv.toTraditional('繁简转换器'))
繁簡轉換器
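
Because the module is built on a character map, both conversions can be pictured as per-character lookups. The following is a minimal sketch of that technique under stated assumptions, not the library's code: convert_with_map and the three-entry maps are hypothetical, and inverting the map is only valid here because this toy map is one-to-one. It targets Python 3.

```python
# Hypothetical three-entry map (simplified -> traditional); illustration only.
SIMPLIFIED_TO_TRADITIONAL = {"简": "簡", "转": "轉", "换": "換"}
# Inverting is only safe here because this toy map is one-to-one.
TRADITIONAL_TO_SIMPLIFIED = {t: s for s, t in SIMPLIFIED_TO_TRADITIONAL.items()}


def convert_with_map(text, char_map):
    """Replace each character found in char_map; leave all others unchanged."""
    return "".join(char_map.get(ch, ch) for ch in text)


print(convert_with_map("繁简转换器", SIMPLIFIED_TO_TRADITIONAL))  # 繁簡轉換器
print(convert_with_map("繁簡轉換器", TRADITIONAL_TO_SIMPLIFIED))  # 繁简转换器
```

Characters that do not appear in the map, such as 繁 and 器 above, pass through unchanged.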