Provides a converter for different representations of the Chinese
romanisation Hanyu Pinyin.
The following examples show how to convert between different
representations of Pinyin.
- Create the converter and convert from standard Pinyin to Pinyin
  with tones represented by numbers:
>>> from cjklib.reading import *
>>> targetOp = operator.PinyinOperator(toneMarkType='Numbers')
>>> pinyinConv = converter.PinyinDialectConverter(
... targetOperators=[targetOp])
>>> pinyinConv.convert(u'hànzì', 'Pinyin', 'Pinyin')
u'han4zi4'
- Convert Pinyin written with tone numbers, with the ü (u with umlaut)
  replaced by the character v and the fifth tone left unmarked, to
  standard Pinyin:
>>> sourceOp = operator.PinyinOperator(toneMarkType='Numbers',
... yVowel='v', missingToneMark='fifth')
>>> pinyinConv = converter.PinyinDialectConverter(
... sourceOperators=[sourceOp])
>>> pinyinConv.convert('nv3hai2zi', 'Pinyin', 'Pinyin')
u'nǚháizi'
- Or more elegantly:
>>> f = ReadingFactory()
>>> f.convert('nv3hai2zi', 'Pinyin', 'Pinyin',
... sourceOptions={'toneMarkType': 'Numbers', 'yVowel': 'v',
... 'missingToneMark': 'fifth'})
u'nǚháizi'
- Decompose the reading of a dictionary entry from CEDICT into
  syllables, then convert the ü-vowel and the form of the Erhua sound:
>>> pinyinFrom = operator.PinyinOperator(toneMarkType='Numbers',
... yVowel='u:', Erhua='oneSyllable')
>>> syllables = pinyinFrom.decompose('sun1nu:r3')
>>> print syllables
['sun1', 'nu:r3']
>>> pinyinTo = operator.PinyinOperator(toneMarkType='Numbers',
... Erhua='twoSyllables')
>>> pinyinConv = converter.PinyinDialectConverter(
... sourceOperators=[pinyinFrom], targetOperators=[pinyinTo])
>>> pinyinConv.convertEntities(syllables, 'Pinyin', 'Pinyin')
[u'sun1', u'nü3', u'r5']
- Or more elegantly, with the entities already decomposed:
>>> f.convertEntities(['sun1', 'nu:r3'], 'Pinyin', 'Pinyin',
... sourceOptions={'toneMarkType': 'Numbers', 'yVowel': 'u:',
... 'Erhua': 'oneSyllable'},
... targetOptions={'toneMarkType': 'Numbers',
... 'Erhua': 'twoSyllables'})
[u'sun1', u'nü3', u'r5']
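The tone-number examples above can be pictured with a small standalone sketch. It is not cjklib's implementation and does not import cjklib; `mark_syllable` and its `y_vowel` parameter are hypothetical names chosen for illustration. It assumes lower-case, already decomposed syllables and applies the usual placement rule (a or e takes the mark, then the o of ou, otherwise the last vowel):

```python
import unicodedata

# Combining diacritics for tones 1-4; the fifth (neutral) tone carries no mark.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiou\u00fc"


def mark_syllable(syllable, y_vowel="v"):
    """Turn one numbered syllable such as 'hai2' into 'hái' (illustrative helper)."""
    if syllable[-1] in "12345":
        body, tone = syllable[:-1], int(syllable[-1])
    else:
        # Assumption: a missing tone number means the fifth tone.
        body, tone = syllable, 5
    body = body.replace(y_vowel, "\u00fc")
    if tone == 5:
        return body
    if "a" in body:
        index = body.index("a")
    elif "e" in body:
        index = body.index("e")
    elif "ou" in body:
        index = body.index("o")
    else:
        positions = [i for i, ch in enumerate(body) if ch in VOWELS]
        if not positions:
            return body  # no vowel to carry the mark; left unchanged in this sketch
        index = positions[-1]
    marked = body[:index + 1] + TONE_MARKS[tone] + body[index + 1:]
    # Compose base letter and diacritic into a single precomposed character.
    return unicodedata.normalize("NFC", marked)


print("".join(mark_syllable(s) for s in ["nv3", "hai2", "zi"]))  # nǚháizi
```

The sketch works on a list of syllables because splitting an unbroken string such as 'nv3hai2zi' is a separate step, which the examples above delegate to the operator's decompose method.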
__init__(self, *args, **options)
    Creates an instance of the PinyinDialectConverter.

convertEntities(self, readingEntities, fromReading='Pinyin', toReading='Pinyin')
    Converts a list of entities in the source reading to the given target
    reading. Returns a list of str.
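To make the list-in, list-out shape of an entity conversion concrete, here is a rough standalone sketch of the Erhua split shown in the examples (one-syllable form to two-syllable form). It does not use cjklib and is not how the library does it; `split_erhua` and `y_vowel` are hypothetical names, and it assumes lower-case entities that carry a tone number:

```python
import re


def split_erhua(entities, y_vowel="u:"):
    """Split entities such as 'nu:r3' into a base syllable plus 'r5' (illustrative)."""
    result = []
    for entity in entities:
        entity = entity.replace(y_vowel, "\u00fc")
        match = re.fullmatch(r"(.+)r([1-5])", entity)
        # 'er' is a full syllable of its own, not a base with an Erhua suffix.
        if match and match.group(1) != "e":
            result.append(match.group(1) + match.group(2))
            result.append("r5")
        else:
            result.append(entity)
    return result


print(split_erhua(["sun1", "nu:r3"]))  # ['sun1', 'nü3', 'r5']
```

The result list can be longer than the input list, which is one reason such a conversion is naturally expressed over a whole list of entities rather than one entity at a time.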

Inherited from ReadingConverter: convert, getOption
Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__