Package cjklib :: Package reading :: Module converter :: Class GRPinyinConverter

Class GRPinyinConverter

Provides a converter between the Chinese romanisation Gwoyeu Romatzyh and Hanyu Pinyin.


Upper or lower case is transferred between syllables; no special formatting according to the standards (i.e. Pinyin) is applied. Case is identified according to three classes: either the whole syllable is upper case, only the initial letter is upper case, or otherwise the whole syllable is assumed to be lower case.
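
The three case classes can be illustrated with a minimal sketch. This is not the library's implementation; the helper names classify_case and transfer_case are hypothetical and exist only for this example.

```python
def classify_case(syllable):
    """Return 'upper', 'title' or 'lower' for a syllable (hypothetical helper)."""
    if syllable.isupper():
        return "upper"          # whole syllable is upper case
    if syllable[:1].isupper():
        return "title"          # only the initial letter is upper case
    return "lower"              # everything else is treated as lower case


def transfer_case(source, converted):
    """Apply the case class of `source` to the lower-case `converted` syllable."""
    case = classify_case(source)
    if case == "upper":
        return converted.upper()
    if case == "title":
        return converted[:1].upper() + converted[1:]
    return converted
```

For example, transfer_case("Jong", "zhōng") yields a syllable with a capital initial, while mixed-case input such as "jOng" falls into the lower-case class and leaves the converted syllable unchanged.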


Conversion cannot in general be done in a one-to-one manner. Gwoyeu Romatzyh (GR) gives the etymological tone for a syllable in the neutral tone, while Pinyin does not. While tones in GR thus carry more information, r-coloured syllables (Erlhuah) are rendered the way they are pronounced, thereby losing the original syllable. Converting those forms to Pinyin in a general manner is not possible, although supplying the original string in Chinese characters might help to disambiguate. Another tone-related issue is that Pinyin allows the changed tone to be specified for tone sandhi instead of the etymological one, while GR does not. Only working with the Chinese character string might help to restore the original tone.

Conversion from Pinyin is limited, as the neutral tone in Pinyin cannot be transferred to GR for the reason described above; more information is needed to resolve this. In the other direction the neutral tone can be mapped, but the etymological tone information is lost. A syllable with the optional neutral tone is mapped either to the neutral tone in Pinyin or to the original (etymological) tone.

Instance Methods
__init__(self, *args, **options)
Creates an instance of the GRPinyinConverter.
convertBasicEntity(self, entity, fromReading, toReading)
Converts a basic entity (e.g. a syllable) in the source reading to the given target reading.
Creates an instance of a GROperator if needed and returns it.

Inherited from RomanisationConverter: convertEntities

Inherited from ReadingConverter: convert, getOption

Inherited from ReadingConverter (private): _getFromOperator, _getToOperator

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Methods
getDefaultOptions(cls)
Returns the reading converter's default options.
Class Variables
  CONVERSION_DIRECTIONS = [('GR', 'Pinyin'), ('Pinyin', 'GR')]
List of tuples for specifying supported conversion directions from reading A to reading B.
  DEFAULT_READING_OPTIONS = {'GR': {'abbreviations': False}, 'Pi...
Defines the default reading options for each reading: input is converted from the user-specified dialect to this default dialect before conversion, and output is converted from this default dialect to the user-specified dialect afterwards.
Properties

Inherited from object: __class__

Method Details

__init__(self, *args, **options)

Creates an instance of the GRPinyinConverter.

  • args - optional list of RomanisationOperators to use for handling source and target readings.
  • options - extra options
  • dbConnectInst - instance of a DatabaseConnector; if none is given, default settings will be assumed.
  • sourceOperators - list of ReadingOperators used for handling source readings.
  • targetOperators - list of ReadingOperators used for handling target readings.
  • GROptionalNeutralToneMapping - if set to 'original', GR syllables marked with an optional neutral tone will be mapped to the etymological tone; if set to 'neutral', they will be mapped to the neutral tone in Pinyin.
Overrides: object.__init__
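
The effect of GROptionalNeutralToneMapping can be sketched as a simple choice between two tones. The following is an illustration only, not the converter's code: the function name map_optional_neutral_tone is hypothetical, and representing tones as the numbers 1 to 4 with 5 for the neutral tone is an assumption made for this example.

```python
NEUTRAL_TONE = 5  # assumption: tones are numbered 1-4, with 5 for the neutral tone


def map_optional_neutral_tone(etymological_tone, mapping):
    """Choose the Pinyin tone for a GR syllable marked with an optional neutral tone.

    `mapping` mirrors the two documented values of GROptionalNeutralToneMapping.
    """
    if mapping == "original":
        return etymological_tone   # keep the etymological tone
    if mapping == "neutral":
        return NEUTRAL_TONE        # map to the neutral tone in Pinyin
    raise ValueError("mapping must be 'original' or 'neutral', got %r" % (mapping,))
```

With 'neutral' the etymological tone is discarded, which is why the result cannot be converted back to GR without further information.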

getDefaultOptions(cls)
Class Method

Returns the reading converter's default options.

The keyword 'dbConnectInst' is not regarded as a configuration option of the converter and is thus not included in the dict returned.

Returns: dict
the reading converter's default options.
Overrides: ReadingConverter.getDefaultOptions
(inherited documentation)

convertBasicEntity(self, entity, fromReading, toReading)

Converts a basic entity (e.g. a syllable) in the source reading to the given target reading.

This method is called by convertEntities() and a lower case entity is given for conversion. The returned value should be in lower case characters too, as convertEntities() will take care of capitalisation.

Even if only a single entity needs to be converted, it is recommended to use convertEntities() instead. In the general case it cannot be ensured that a mapping from one reading to another can be done by the simple conversion of a basic entity. One-to-many mappings are possible, and there is no guarantee that every entity of a reading recognised by operator.ReadingOperator.isReadingEntity() will be mapped here.

The default implementation will raise a NotImplementedError.

  • entity - string written in the source reading in lower case letters
  • fromReading - name of the source reading
  • toReading - name of the target reading
Returns: str
the entity converted to the toReading in lower case
Overrides: EntityWiseReadingConverter.convertBasicEntity
(inherited documentation)
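
The contract described above (lower-case input, lower-case output, no guarantee that every entity has a mapping) can be sketched with a plain lookup table. This is a simplified illustration, not the library's implementation: the two-entry table, the function convert_basic_entity and the exception NoMappingError are all hypothetical.

```python
# Tiny illustrative table; a real converter covers the full syllable inventory.
GR_TO_PINYIN = {"jong": "zhōng", "gwo": "guó"}


class NoMappingError(Exception):
    """Hypothetical error raised when an entity has no mapping."""


def convert_basic_entity(entity, table):
    """Convert one lower-case entity using `table`; the result is lower case too."""
    if entity != entity.lower():
        # Capitalisation is the caller's job, so mixed-case input is rejected.
        raise ValueError("entity must be given in lower case: %r" % (entity,))
    try:
        return table[entity]
    except KeyError:
        # A valid entity of the source reading is not guaranteed to be mapped.
        raise NoMappingError("no mapping for %r" % (entity,))
```

A caller that handles capitalisation would lower-case each syllable, call this function, and then reapply the original case class to the result.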

Class Variable Details

DEFAULT_READING_OPTIONS

Defines the default reading options for each reading: input is converted from the user-specified dialect to this default dialect before conversion, and output is converted from this default dialect to the user-specified dialect afterwards.

The most general reading dialect should be specified so as to allow for a broad range of input.

{'GR': {'abbreviations': False}, 'Pinyin': {'Erhua': 'oneSyllable'}}