
Class MandarinIPAOperator

Provides an operator on strings in Mandarin Chinese written in the International Phonetic Alphabet (IPA).

Features:

Tones

Tones in IPA can be expressed using different schemes. Four schemes are implemented here and are selected with the toneMarkType argument: Numbers, ChaoDigits, IPAToneBar and Diacritics.

Unlike other operators for Mandarin, this operator distinguishes six different tonal occurrences. The third tone is affected by tone sandhi, so it basically has two different tone contours. For this reason getTonalEntity() and splitEntityTone() work with the string representations of tones defined in TONES. The behaviour of the other Mandarin operators can be reproduced by using only the first character of the returned string:

>>> from cjklib.reading import operator
>>> ipaOp = operator.MandarinIPAOperator(toneMarkType='IPAToneBar')
>>> syllable, toneName = ipaOp.splitEntityTone(u'mən˧˥')
>>> tone = int(toneName[0])
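
The first-character rule can be checked without cjklib installed. The following is a minimal sketch that uses only the tone names visible in the TONES listing on this page (that listing is truncated, so the sample is not exhaustive); the helper name tone_number is hypothetical and not part of cjklib:

```python
# Tone names copied from the (truncated) TONES listing on this page.
SAMPLE_TONES = [
    '1stTone', '2ndTone', '3rdToneRegular', '3rdToneLow',
    '4thTone', '5thTone', '5thToneHalfHigh', '5thToneMiddle',
]


def tone_number(tone_name):
    """Collapse a tone name to the tone number given by its leading digit."""
    return int(tone_name[0])


numbers = [tone_number(name) for name in SAMPLE_TONES]
print(numbers)  # [1, 2, 3, 3, 4, 5, 5, 5]
```

Both third-tone variants collapse to 3 and all listed fifth-tone variants collapse to 5, which matches the five-tone view used by the other Mandarin operators.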

The implemented schemes render tone information differently. Mapping from one scheme to another might lose information, so a full back-transformation cannot be guaranteed.
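
The loss of information can be seen directly in the ChaoDigits entries of TONE_MARK_MAPPING. The sketch below copies only the entries listed on this page (the listing is truncated, so more entries may exist) and groups tone names by their mark; the helper invert is hypothetical and is not how cjklib itself resolves marks:

```python
from collections import defaultdict

# Subset of TONE_MARK_MAPPING['ChaoDigits'] as listed on this page
# (the listing is truncated, so further entries may exist).
CHAO_DIGITS = {
    '1stTone': '55',
    '2ndTone': '35',
    '3rdToneLow': '21',
    '3rdToneRegular': '214',
    '4thTone': '51',
    '5thTone': '',
    '5thToneHalfHigh': '',
    '5thToneHalfLow': '',
}


def invert(mapping):
    """Group tone names by tone mark (hypothetical helper)."""
    by_mark = defaultdict(list)
    for tone_name, mark in sorted(mapping.items()):
        by_mark[mark].append(tone_name)
    return dict(by_mark)


by_mark = invert(CHAO_DIGITS)
# Marks shared by more than one tone name cannot be mapped back uniquely.
ambiguous = {mark: names for mark, names in by_mark.items() if len(names) > 1}
print(ambiguous)
# {'': ['5thTone', '5thToneHalfHigh', '5thToneHalfLow']}
```

In this subset every fifth-tone variant is written with an empty mark, so a syllable rendered with ChaoDigits no longer records which variant it came from.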

Instance Methods
  getPlainReadingEntities(self) -> set of str
Gets the set of plain entities supported by this reading.
  getOnsetRhyme(self, plainSyllable) -> tuple of str
Splits the given plain syllable into onset (initial) and rhyme (final).

Inherited from TonalIPAOperator: __init__, compose, decompose, getTonalEntity, getToneForToneMark, getTones, splitEntityTone

Inherited from TonalFixedEntityOperator: getReadingEntities, isPlainReadingEntity, isReadingEntity

Inherited from ReadingOperator: getOption

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Methods

Inherited from TonalIPAOperator: getDefaultOptions

Class Variables
  READING_NAME = 'MandarinIPA'
Unique name of reading
  TONE_MARK_PREFER = {'ChaoDigits': {}, 'Diacritics': {}, 'IPATo...
Mapping of tone marks to the tone name that is preferred when a mark maps to more than one tone.
  TONES = ['1stTone', '2ndTone', '3rdToneRegular', '3rdToneLow',...
List of tone names.
  TONE_MARK_MAPPING = {'ChaoDigits': {'1stTone': '55', '2ndTone'...
Mapping of tone names to tone mark for each tone mark type.

Inherited from TonalIPAOperator: DEFAULT_TONE_MARK_TYPE, TONE_MARK_REGEX

Properties

Inherited from object: __class__

Method Details

getPlainReadingEntities(self)

Gets the set of plain entities supported by this reading. These entities carry no tone mark.

Returns: set of str - set of supported syllables
Overrides: TonalFixedEntityOperator.getPlainReadingEntities

getOnsetRhyme(self, plainSyllable)

Splits the given plain syllable into onset (initial) and rhyme (final).

Parameters:
  • plainSyllable (str) - syllable in IPA without tone marks
Returns: tuple of str - tuple of syllable onset and rhyme
Raises:

Class Variable Details

TONE_MARK_PREFER

Mapping of tone marks to the tone name that is preferred when a mark maps to more than one tone. Needs to be implemented in child classes.

Value:
{'ChaoDigits': {},
 'Diacritics': {},
 'IPAToneBar': {},
 'Numbers': {'3': '3rdToneRegular', '5': '5thTone'}}
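
How such a preference table can break ties is sketched below. This is an illustration only, not cjklib's implementation of getToneForToneMark: the helper tone_for_mark is hypothetical, and the ASSUMED_NUMBERS table is an assumption made for the example, since the actual 'Numbers' entries of TONE_MARK_MAPPING are not shown on this page.

```python
# Value of TONE_MARK_PREFER['Numbers'] as listed above.
PREFER_NUMBERS = {'3': '3rdToneRegular', '5': '5thTone'}

# ASSUMPTION: illustrative tone name -> mark pairs. The real 'Numbers'
# mapping is not shown on this page and may differ.
ASSUMED_NUMBERS = {
    '1stTone': '1',
    '2ndTone': '2',
    '3rdToneRegular': '3',
    '3rdToneLow': '3',
    '4thTone': '4',
    '5thTone': '5',
    '5thToneHalfHigh': '5',
}


def tone_for_mark(mark, mapping, prefer):
    """Return the tone name for a mark, using `prefer` to break ties."""
    candidates = [name for name, m in mapping.items() if m == mark]
    if not candidates:
        raise ValueError('unknown tone mark: %r' % (mark,))
    if len(candidates) == 1:
        return candidates[0]
    # Several tone names share this mark: fall back to the preferred one.
    return prefer[mark]


print(tone_for_mark('3', ASSUMED_NUMBERS, PREFER_NUMBERS))  # 3rdToneRegular
print(tone_for_mark('1', ASSUMED_NUMBERS, PREFER_NUMBERS))  # 1stTone
```

The schemes with an empty preference table above would need no tie-breaking under this reading, because each of their marks would identify a single tone name.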

TONES

List of tone names. Needs to be implemented in child classes.

Value:
['1stTone',
 '2ndTone',
 '3rdToneRegular',
 '3rdToneLow',
 '4thTone',
 '5thTone',
 '5thToneHalfHigh',
 '5thToneMiddle',
...

TONE_MARK_MAPPING

Mapping of tone names to tone mark for each tone mark type. Needs to be implemented in child classes.

Value:
{'ChaoDigits': {'1stTone': '55',
                '2ndTone': '35',
                '3rdToneLow': '21',
                '3rdToneRegular': '214',
                '4thTone': '51',
                '5thTone': '',
                '5thToneHalfHigh': '',
                '5thToneHalfLow': '',
...