Package cjklib :: Package reading :: Module operator :: Class CantoneseIPAOperator

Class CantoneseIPAOperator

Provides an operator on strings of the Cantonese language written in the International Phonetic Alphabet (IPA).

CantoneseIPAOperator does not supply the same closed set of syllables as other ReadingOperators, because IPA provides different ways to represent pronunciation. As a result, a user-defined IPA syllable will not easily map to another transcription system, so only basic support is provided for conversion in this direction.

This operator supplies an additional method getOnsetRhyme() which allows breaking down syllables into their onset and rhyme.

Features:

Tones

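Tones in IPA can be expressed using different schemes. The following schemes are implemented here and can be selected with the toneMarkType option: 'Numbers', 'ChaoDigits', 'IPAToneBar' and 'Diacritics'.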

Instance Methods

  • __init__(self, **options) - Creates an instance of the CantoneseIPAOperator.
  • getTones(self) (list) - Returns a set of tones supported by the reading.
  • getPlainReadingEntities(self) (set of str) - Gets the list of plain entities supported by this reading.
  • getOnsetRhyme(self, plainSyllable) (tuple of str) - Splits the given plain syllable into onset (initial) and rhyme (final).
  • getTonalEntity(self, plainEntity, tone) (str) - Gets the entity with tone mark for the given plain entity and tone.
  • splitEntityTone(self, entity) (tuple) - Splits the entity into an entity without tone mark and the name of the entity's tone.
  • getExplicitTone(self, plainSyllable, baseTone) (str) - Gets the explicit tone for the given plain syllable and base tone.
  • getBaseToneForToneMark(self, toneMark) (str) - Gets the base tone (one of the 6/7 general tones) for the given tone mark.

Inherited from TonalIPAOperator: compose, decompose, getToneForToneMark

Inherited from TonalFixedEntityOperator: getReadingEntities, isPlainReadingEntity, isReadingEntity

Inherited from ReadingOperator: getOption

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Methods

  • getDefaultOptions(cls) (dict) - Returns the reading operator's default options.

Class Variables

  • READING_NAME = 'CantoneseIPA' - Unique name of reading.
  • TONES = ['HighLevel', 'MidLevel', 'MidLowLevel', 'HighRising',... - List of tone names.
  • STOP_TONES = {'HighStopped': 'HighLevel', 'MidLowStopped': 'Mi... - Cantonese stop tone mapping to general level tones.
  • STOP_TONES_EXPLICIT = {'HighStopped_Long': ('HighLevel', 'L'),... - Mapping of explicit Cantonese stop tones to general level tones, with each stop tone marked explicitly for short or long pronunciation.
  • TONE_MARK_PREFER = {'ChaoDigits': {}, 'Diacritics': {}, 'IPATo... - Mapping of tone marks to tone name which will be preferred on ambiguous mappings.
  • TONE_MARK_MAPPING = {'ChaoDigits': {'HighFalling': '52', 'High... - Mapping of tone names to tone mark for each tone mark type.

Inherited from TonalIPAOperator: DEFAULT_TONE_MARK_TYPE, TONE_MARK_REGEX

Properties

Inherited from object: __class__

Method Details

__init__(self, **options)
(Constructor)

Creates an instance of the CantoneseIPAOperator.

By default no tone marks will be shown.

Parameters:
  • options - extra options
  • dbConnectInst - instance of a DatabaseConnector; if none is given, default settings will be assumed.
  • toneMarkType - type of tone marks, one of 'Numbers', 'ChaoDigits', 'IPAToneBar', 'Diacritics', 'None'
  • 1stToneName - tone for mark 1 under tone mark type 'Numbers', either 'HighLevel' or 'HighFalling'.
  • stopTones - how stop tones are handled: if set to 'none', the basic 6 (7) tones will be used and stop tones will be reported as one of them; if set to 'general', the three stop tones will be included; if set to 'explicit', the short and long forms will be explicitly supported.
Overrides: object.__init__
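
The effect of the stopTones option on the available tone names can be sketched as follows. This is a minimal illustration that copies the class variables documented on this page; the helper tones_for is hypothetical and is not part of cjklib, and the ordering of the added stop tones is an assumption.

```python
# Values copied from the class variables documented on this page.
TONES = ['HighLevel', 'MidLevel', 'MidLowLevel', 'HighRising',
         'MidLowRising', 'MidLowFalling', 'HighFalling']
STOP_TONES = {'HighStopped': 'HighLevel',
              'MidLowStopped': 'MidLowLevel',
              'MidStopped': 'MidLevel'}
STOP_TONES_EXPLICIT = {'HighStopped_Long': ('HighLevel', 'L'),
                       'HighStopped_Short': ('HighLevel', 'S'),
                       'MidLowStopped_Long': ('MidLowLevel', 'L'),
                       'MidLowStopped_Short': ('MidLowLevel', 'S'),
                       'MidStopped_Long': ('MidLevel', 'L'),
                       'MidStopped_Short': ('MidLevel', 'S')}


def tones_for(stop_tones):
    """Hypothetical helper: tone names available for a stopTones setting."""
    if stop_tones == 'none':
        # Stop tones are reported as one of the basic tones.
        return list(TONES)
    if stop_tones == 'general':
        # The three general stop tones are added.
        return TONES + sorted(STOP_TONES)
    if stop_tones == 'explicit':
        # Short and long forms are listed separately.
        return TONES + sorted(STOP_TONES_EXPLICIT)
    raise ValueError("stopTones must be 'none', 'general' or 'explicit'")


print(len(tones_for('none')), len(tones_for('general')),
      len(tones_for('explicit')))  # prints: 7 10 13
```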

getDefaultOptions(cls)
Class Method

Returns the reading operator's default options.

The default implementation returns an empty dictionary. The keyword 'dbConnectInst' is not regarded as a configuration option of the operator and is thus not included in the dict returned.

Returns: dict
the reading operator's default options.
Overrides: ReadingOperator.getDefaultOptions
(inherited documentation)

getTones(self)

Returns a set of tones supported by the reading. These tones don't necessarily reflect the tones of the underlying language but may differ to reflect notational or other features.

The default implementation will raise a NotImplementedError.

Returns: list
list of supported tone marks.
Overrides: TonalFixedEntityOperator.getTones
(inherited documentation)

getPlainReadingEntities(self)

Gets the list of plain entities supported by this reading. Unlike getReadingEntities(), the entities will carry no tone mark.

The default implementation will raise a NotImplementedError.

Returns: set of str
set of supported syllables
Overrides: TonalFixedEntityOperator.getPlainReadingEntities
(inherited documentation)

getOnsetRhyme(self, plainSyllable)

Splits the given plain syllable into onset (initial) and rhyme (final).

Parameters:
  • plainSyllable (str) - syllable in IPA without tone marks
Returns: tuple of str
tuple of syllable onset and rhyme
Raises:
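
The idea of splitting a syllable into onset and rhyme can be sketched as below. This is a rough illustration only: the vowel inventory and the rule "split at the first vowel symbol" are assumptions made for this example, the function name split_onset_rhyme is hypothetical, and cjklib's own getOnsetRhyme() may work differently (for instance with a lookup table).

```python
# Assumed inventory of vowel symbols for Cantonese IPA; not taken from cjklib.
VOWELS = set('aɐeɛiɪoɔœɵuʊy')


def split_onset_rhyme(plain_syllable):
    """Hypothetical sketch: split a plain IPA syllable at its first vowel."""
    for index, char in enumerate(plain_syllable):
        if char in VOWELS:
            return plain_syllable[:index], plain_syllable[index:]
    # No vowel symbol found, e.g. a syllabic nasal: everything is rhyme.
    return '', plain_syllable


print(split_onset_rhyme('kʷɔːŋ'))  # prints: ('kʷ', 'ɔːŋ')
```

A syllable without an initial consonant, such as 'ɐp', yields an empty onset in this sketch.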

getTonalEntity(self, plainEntity, tone)

Gets the entity with tone mark for the given plain entity and tone.

The entity returned will always be in Unicode's Normalization Form C (NFC, see http://www.unicode.org/reports/tr15/).

Parameters:
  • plainEntity - entity without tonal information
  • tone - tone
Returns: str
entity with appropriate tone
Raises:
Overrides: TonalFixedEntityOperator.getTonalEntity
(inherited documentation)
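
A minimal sketch of composing a tonal entity is shown below. It assumes the 'ChaoDigits' tone mark type and that the mark is simply appended to the plain entity; only the part of the TONE_MARK_MAPPING table visible on this page is used, and the function name tonal_entity is hypothetical.

```python
import unicodedata

# Visible part of the documented 'ChaoDigits' table for the general tones;
# the full table on this page is truncated and is not completed here.
CHAO_DIGITS = {'HighFalling': '52', 'HighLevel': '55', 'HighRising': '25',
               'MidLevel': '33', 'MidLowFalling': '21', 'MidLowLevel': '22'}


def tonal_entity(plain_entity, tone):
    """Hypothetical sketch: append the tone mark and normalise to NFC."""
    if tone not in CHAO_DIGITS:
        raise ValueError('unsupported tone: %r' % tone)
    return unicodedata.normalize('NFC', plain_entity + CHAO_DIGITS[tone])


print(tonal_entity('kʷɔːŋ', 'HighRising'))  # prints: kʷɔːŋ25
```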

splitEntityTone(self, entity)

Splits the entity into an entity without tone mark and the name of the entity's tone.

The plain entity returned will always be in Unicode's Normalization Form C (NFC, see http://www.unicode.org/reports/tr15/).

Parameters:
  • entity - entity with tonal information
Returns: tuple
plain entity without tone mark and additionally the tone
Raises:
Overrides: TonalFixedEntityOperator.splitEntityTone
(inherited documentation)
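
The reverse direction can be sketched with a regular expression that peels a trailing tone mark off the entity. The example assumes 'ChaoDigits' marks written as trailing digits and uses only the part of TONE_MARK_MAPPING visible on this page; the names split_entity_tone, TONE_FOR_MARK and TRAILING_MARK are hypothetical.

```python
import re
import unicodedata

# Visible part of the documented 'ChaoDigits' table for the general tones.
CHAO_DIGITS = {'HighFalling': '52', 'HighLevel': '55', 'HighRising': '25',
               'MidLevel': '33', 'MidLowFalling': '21', 'MidLowLevel': '22'}
# Reverse lookup from tone mark to tone name.
TONE_FOR_MARK = {mark: tone for tone, mark in CHAO_DIGITS.items()}
# Plain entity (lazy match) followed by a run of Chao digits at the end.
TRAILING_MARK = re.compile(r'^(.*?)([1-5]+)$')


def split_entity_tone(entity):
    """Hypothetical sketch: return (plain entity in NFC, tone name)."""
    entity = unicodedata.normalize('NFC', entity)
    match = TRAILING_MARK.match(entity)
    if match is None or match.group(2) not in TONE_FOR_MARK:
        raise ValueError('no known tone mark in %r' % entity)
    return match.group(1), TONE_FOR_MARK[match.group(2)]


print(split_entity_tone('kʷɔːŋ25'))  # prints: ('kʷɔːŋ', 'HighRising')
```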

getExplicitTone(self, plainSyllable, baseTone)

Gets the explicit tone for the given plain syllable and base tone.

If the 6 (7) base tones are used, the stop tone value can be deduced from the given syllable. The stop tone returned is more precise than the general stop tone, as it also denotes the vowel length, which influences the tone contour.

Parameters:
  • plainSyllable (str) - syllable without tonal information
  • baseTone (str) - tone
Returns: str
explicit tone
Raises:
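
How an explicit stop tone might be deduced from a syllable can be sketched as follows. The rules used here are assumptions for illustration, not cjklib's implementation: a syllable counts as stopped if it ends in p, t or k (optionally followed by the unreleased diacritic U+031A), and as long if it contains the IPA length mark U+02D0. The function name explicit_tone is hypothetical.

```python
# Inverse of the STOP_TONES mapping documented on this page.
STOP_TONE_FOR_LEVEL = {'HighLevel': 'HighStopped',
                       'MidLevel': 'MidStopped',
                       'MidLowLevel': 'MidLowStopped'}
UNRELEASED = '\u031a'   # combining left angle above (unreleased stop)
LENGTH_MARK = '\u02d0'  # IPA length mark


def explicit_tone(plain_syllable, base_tone):
    """Hypothetical sketch: refine a level tone to an explicit stop tone."""
    coda = plain_syllable.rstrip(UNRELEASED)
    is_stopped = coda.endswith(('p', 't', 'k'))
    if base_tone not in STOP_TONE_FOR_LEVEL or not is_stopped:
        # Not a stop tone: the base tone is already explicit.
        return base_tone
    length = 'Long' if LENGTH_MARK in plain_syllable else 'Short'
    return '%s_%s' % (STOP_TONE_FOR_LEVEL[base_tone], length)


print(explicit_tone('paːt', 'MidLevel'))  # prints: MidStopped_Long
```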

getBaseToneForToneMark(self, toneMark)

Gets the base tone (one of the 6/7 general tones) for the given tone mark.

Parameters:
  • toneMark (str) - tone mark representation of the tone
Returns: str
base tone
Raises:
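
The following sketch illustrates why a base tone can be found even for a mark shared by several tones: under 'ChaoDigits', the mark '55' stands for both 'HighLevel' and 'HighStopped_Long', but both reduce to the same base tone. Only the part of TONE_MARK_MAPPING visible on this page is used, and the function name base_tone_for_mark is hypothetical.

```python
# Values copied from the class variables documented on this page.
STOP_TONES_EXPLICIT = {'HighStopped_Long': ('HighLevel', 'L'),
                       'HighStopped_Short': ('HighLevel', 'S'),
                       'MidLowStopped_Long': ('MidLowLevel', 'L'),
                       'MidLowStopped_Short': ('MidLowLevel', 'S'),
                       'MidStopped_Long': ('MidLevel', 'L'),
                       'MidStopped_Short': ('MidLevel', 'S')}
# Visible part of the documented 'ChaoDigits' table (the rest is truncated).
CHAO_DIGITS = {'HighFalling': '52', 'HighLevel': '55', 'HighRising': '25',
               'HighStopped_Long': '55', 'HighStopped_Short': '5',
               'MidLevel': '33', 'MidLowFalling': '21', 'MidLowLevel': '22'}


def base_tone_for_mark(tone_mark):
    """Hypothetical sketch: reduce every tone using this mark to its base."""
    bases = set()
    for tone, mark in CHAO_DIGITS.items():
        if mark == tone_mark:
            # Explicit stop tones map to (level tone, length flag).
            base, _length = STOP_TONES_EXPLICIT.get(tone, (tone, None))
            bases.add(base)
    if len(bases) != 1:
        raise ValueError('no unique base tone for mark %r' % tone_mark)
    return bases.pop()


print(base_tone_for_mark('55'))  # prints: HighLevel
```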

Class Variable Details

TONES

List of tone names. Needs to be implemented in child class.

Value:
['HighLevel',
 'MidLevel',
 'MidLowLevel',
 'HighRising',
 'MidLowRising',
 'MidLowFalling',
 'HighFalling']

STOP_TONES

Cantonese stop tone mapping to general level tones.

Value:
{'HighStopped': 'HighLevel',
 'MidLowStopped': 'MidLowLevel',
 'MidStopped': 'MidLevel'}

STOP_TONES_EXPLICIT

Mapping of explicit Cantonese stop tones to general level tones, with each stop tone marked explicitly for short or long pronunciation.

Value:
{'HighStopped_Long': ('HighLevel', 'L'),
 'HighStopped_Short': ('HighLevel', 'S'),
 'MidLowStopped_Long': ('MidLowLevel', 'L'),
 'MidLowStopped_Short': ('MidLowLevel', 'S'),
 'MidStopped_Long': ('MidLevel', 'L'),
 'MidStopped_Short': ('MidLevel', 'S')}
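
To illustrate how the two stop tone tables relate, the sketch below resolves an explicit stop tone into its general stop tone, its level tone and its length flag. The values are copied from this page; the helper describe_explicit_tone is hypothetical.

```python
# Values copied from the class variables documented on this page.
STOP_TONES = {'HighStopped': 'HighLevel',
              'MidLowStopped': 'MidLowLevel',
              'MidStopped': 'MidLevel'}
STOP_TONES_EXPLICIT = {'HighStopped_Long': ('HighLevel', 'L'),
                       'HighStopped_Short': ('HighLevel', 'S'),
                       'MidLowStopped_Long': ('MidLowLevel', 'L'),
                       'MidLowStopped_Short': ('MidLowLevel', 'S'),
                       'MidStopped_Long': ('MidLevel', 'L'),
                       'MidStopped_Short': ('MidLevel', 'S')}


def describe_explicit_tone(explicit_tone):
    """Hypothetical helper: (general stop tone, level tone, length flag)."""
    level_tone, length = STOP_TONES_EXPLICIT[explicit_tone]
    # Find the general stop tone that maps to the same level tone.
    general = next(stop for stop, level in STOP_TONES.items()
                   if level == level_tone)
    return general, level_tone, length


print(describe_explicit_tone('MidStopped_Short'))
# prints: ('MidStopped', 'MidLevel', 'S')
```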

TONE_MARK_PREFER

Mapping of tone marks to tone name which will be preferred on ambiguous mappings. Needs to be implemented in child classes.

Value:
{'ChaoDigits': {},
 'Diacritics': {},
 'IPAToneBar': {},
 'Numbers': {'1': 'HighLevel'}}
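
The role of such a preference table can be sketched as follows. According to the 1stToneName option, mark '1' under 'Numbers' may stand for 'HighLevel' or 'HighFalling'; the two-entry mapping below is an illustrative fragment built from that statement, not the library's full table, and the function invert_with_preference is hypothetical. In this sketch, a mark without a preference keeps the first tone encountered.

```python
# Illustrative fragment only: both tones share mark '1' under 'Numbers'.
NUMBERS_FRAGMENT = {'HighFalling': '1', 'HighLevel': '1'}
# Documented value of TONE_MARK_PREFER['Numbers'].
PREFER = {'1': 'HighLevel'}


def invert_with_preference(tone_to_mark, prefer):
    """Hypothetical sketch: build mark -> tone, resolving clashes via prefer."""
    mark_to_tone = {}
    for tone, mark in tone_to_mark.items():
        # Take the first tone seen for a mark, or override it with the
        # preferred tone when the preference table names this tone.
        if mark not in mark_to_tone or prefer.get(mark) == tone:
            mark_to_tone[mark] = tone
    return mark_to_tone


print(invert_with_preference(NUMBERS_FRAGMENT, PREFER))
# prints: {'1': 'HighLevel'}
```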

TONE_MARK_MAPPING

Mapping of tone names to tone mark for each tone mark type. Needs to be implemented in child classes.

Value:
{'ChaoDigits': {'HighFalling': '52',
                'HighLevel': '55',
                'HighRising': '25',
                'HighStopped_Long': '55',
                'HighStopped_Short': '5',
                'MidLevel': '33',
                'MidLowFalling': '21',
                'MidLowLevel': '22',
...