Package cjklib :: Package reading :: Class ReadingFactory

Class ReadingFactory


Provides an abstract factory for creating ReadingOperators and ReadingConverters and a façade to directly access the methods offered by these classes.

Instances of other classes are cached in the background and reused on later calls for methods accessed through the façade. createReadingOperator() and createReadingConverter() can be used to create new instances for use outside of the ReadingFactory.
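
The difference between cached façade access and explicitly created instances can be illustrated with a minimal sketch. It is not cjklib's implementation: JoinOperator, MiniFactory and all method names below are hypothetical stand-ins, and the cache key assumes that every option value is hashable.

```python
class JoinOperator:
    """Hypothetical stand-in for a ReadingOperator."""

    READING_NAME = "Joined"

    def __init__(self, separator=""):
        self.separator = separator

    def compose(self, entities):
        # Join basic entities into one reading string.
        return self.separator.join(entities)


class MiniFactory:
    """Toy factory offering both fresh instances and a cached facade."""

    def __init__(self):
        self._classes = {JoinOperator.READING_NAME: JoinOperator}
        self._cache = {}

    def create_operator(self, reading, **options):
        # Always builds a new instance, for use outside the factory.
        return self._classes[reading](**options)

    def _get_operator(self, reading, **options):
        # Reuse an instance created earlier with the same reading and options.
        # Assumption: option values are hashable.
        key = (reading, frozenset(options.items()))
        if key not in self._cache:
            self._cache[key] = self.create_operator(reading, **options)
        return self._cache[key]

    def compose(self, entities, reading, **options):
        # Facade method: delegates to a cached operator instance.
        return self._get_operator(reading, **options).compose(entities)


factory = MiniFactory()
joined = factory.compose(["lao3", "shi1"], "Joined", separator="-")
```

Repeated façade calls with the same reading and options share one operator in this sketch, while create_operator() returns an independent object each time.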


To Do (Bug): Non-standard reading options seem to be accepted when default in converter:

>>> print f.convert('lao3shi1', 'Pinyin', 'MandarinIPA')
lau˨˩.ʂʅ˥˥

To Do (Impl):
Nested Classes
  SimpleReadingConverterAdaptor
Defines a simple converter between two character readings that keeps the real converter doing the work in the background.
Instance Methods

__init__(self, databaseUrl=None, dbConnectInst=None)
Initialises the ReadingFactory.

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

    Meta

publishReadingOperator(self, readingOperator)
Publishes a ReadingOperator to the list and thus makes it available for other methods in the library.

getSupportedReadings(self)
Gets a list of all supported readings.
Returns: list of str

getReadingOperatorClass(self, readingN)
Gets the ReadingOperator's class for the given reading.
Returns: classobj

createReadingOperator(self, readingN, **options)
Creates an instance of a ReadingOperator for the given reading.
Returns: instance

publishReadingConverter(self, readingConverter)
Publishes a ReadingConverter to the list and thus makes it available for other methods in the library.

getReadingConverterClass(self, fromReading, toReading)
Gets the ReadingConverter's class for the given source and target reading.
Returns: classobj

createReadingConverter(self, fromReading, toReading, *args, **options)
Creates an instance of a ReadingConverter for the given source and target reading and returns it wrapped as a SimpleReadingConverterAdaptor.
Returns: instance

isReadingConversionSupported(self, fromReading, toReading)
Checks if the conversion from reading A to reading B is supported.
Returns: bool

getDefaultOptions(*args)
Returns the default options for the ReadingOperator or ReadingConverter applied for the given reading name or names respectively.

_getReadingOperatorInstance(self, readingN, **options)
Returns an instance of a ReadingOperator for the given reading from the internal cache and creates it if it doesn't exist yet.
Returns: instance

_getReadingConverterInstance(self, fromReading, toReading, *args, **options)
Returns an instance of a ReadingConverter for the given source and target reading from the internal cache and creates it if it doesn't exist yet.
Returns: instance

_checkSpecialOperators(self, fromReading, toReading, args, options)
Checks for special operators requested for the given source and target reading.

    ReadingConverter methods

convert(self, readingStr, fromReading, toReading, *args, **options)
Converts the given string in the source reading to the given target reading.
Returns: str

convertEntities(self, readingEntities, fromReading, toReading, *args, **options)
Converts a list of entities in the source reading to the given target reading.
Returns: list of str

    ReadingOperator methods

decompose(self, string, readingN, **options)
Decomposes the given string into basic entities that can be mapped to one Chinese character each for the given reading.
Returns: list of str

compose(self, readingEntities, readingN, **options)
Composes the given list of basic entities to a string for the given reading.
Returns: str

isReadingEntity(self, entity, readingN, **options)
Checks if the given string is an entity of the given reading.
Returns: bool

    RomanisationOperator methods

getDecompositions(self, string, readingN, **options)
Decomposes the given string into basic entities that can be mapped to one Chinese character each for ambiguous decompositions.
Returns: list of list of str

segment(self, string, readingN, **options)
Takes a string written in the romanisation and returns the possible segmentations as a list of syllables.
Returns: list of list of str

isStrictDecomposition(self, decomposition, readingN, **options)
Checks if the given decomposition follows the romanisation format strictly to allow unambiguous decomposition.
Returns: bool

getReadingEntities(self, readingN, **options)
Gets a set of all entities supported by the reading.
Returns: set of str

    TonalRomanisationOperator methods

getTones(self, readingN, **options)
Returns the tones supported by the reading.
Returns: list

getTonalEntity(self, plainEntity, tone, readingN, **options)
Gets the entity with tone mark for the given plain entity and tone.
Returns: str

splitEntityTone(self, entity, readingN, **options)
Splits the entity into an entity without tone mark (plain entity) and the entity's tone.
Returns: tuple

getPlainReadingEntities(self, readingN, **options)
Gets the set of plain entities supported by this reading.
Returns: set of str

isPlainReadingEntity(self, entity, readingN, **options)
Returns true if the given plain entity (without any tone mark) is recognised by the romanisation operator, i.e. it is a valid entity of the reading returned by the segmentation method.
Returns: bool

Static Methods
    Meta

_getHashableCopy(data)
Constructs a unique hashable (partially deep-)copy for a given instance, replacing non-hashable datatypes set, dict and list recursively.

Class Variables
  READING_OPERATORS = [<class 'cjklib.reading.operator.HangulOpe...
A list of supported reading operators.
  READING_CONVERTERS = [<class 'cjklib.reading.converter.PinyinD...
A list of supported reading converters.
  sharedState = {'readingConverterClasses': {}, 'readingOperator...
Dictionary holding global state information used by all instances of the ReadingFactory.
Properties

Inherited from object: __class__

Method Details

__init__(self, databaseUrl=None, dbConnectInst=None)
(Constructor)

Initialises the ReadingFactory.

If no parameters are given, default values are assumed for the connection to the database. The database connection parameters can be given in databaseUrl, or an instance of DatabaseConnector can be passed in dbConnectInst; if both are specified, dbConnectInst takes precedence.

Parameters:
  • databaseUrl (str) - database connection setting in the format driver://user:pass@host/database.
  • dbConnectInst (instance) - instance of a DatabaseConnector
Overrides: object.__init__

Bug: Specifying another database connector overwrites settings of other instances.

publishReadingOperator(self, readingOperator)

Publishes a ReadingOperator to the list and thus makes it available for other methods in the library.

Parameters:
  • readingOperator (classobj) - a new readingOperator to be published

getSupportedReadings(self)

Gets a list of all supported readings.

Returns: list of str
a list of readings a ReadingOperator is available for

getReadingOperatorClass(self, readingN)

Gets the ReadingOperator's class for the given reading.

Parameters:
  • readingN (str) - name of a supported reading
Returns: classobj
a ReadingOperator class
Raises:

createReadingOperator(self, readingN, **options)

Creates an instance of a ReadingOperator for the given reading.

Parameters:
  • readingN (str) - name of a supported reading
  • options - options for the created instance
Returns: instance
a ReadingOperator instance
Raises:

publishReadingConverter(self, readingConverter)

Publishes a ReadingConverter to the list and thus makes it available for other methods in the library.

Parameters:
  • readingConverter (classobj) - a new readingConverter to be published
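
How publishing makes a converter class available to lookups such as getReadingConverterClass() and isReadingConversionSupported() can be sketched with a small registry. This is an illustration only: ToyConverter, ConverterRegistry and the CONVERSION_DIRECTIONS attribute as used here are assumptions made for the example, not a description of cjklib's internals.

```python
class ToyConverter:
    """Hypothetical converter class declaring the directions it supports."""

    CONVERSION_DIRECTIONS = [("Lower", "Upper"), ("Upper", "Lower")]


class ConverterRegistry:
    """Toy registry mapping (source, target) pairs to converter classes."""

    def __init__(self):
        self._classes = {}

    def publish(self, converter_class):
        # Register the class once for every direction it declares.
        for direction in converter_class.CONVERSION_DIRECTIONS:
            self._classes[direction] = converter_class

    def is_supported(self, from_reading, to_reading):
        return (from_reading, to_reading) in self._classes

    def get_class(self, from_reading, to_reading):
        if not self.is_supported(from_reading, to_reading):
            raise LookupError(
                "no converter for %s to %s" % (from_reading, to_reading))
        return self._classes[(from_reading, to_reading)]


registry = ConverterRegistry()
registry.publish(ToyConverter)
```

Because one class is stored under several direction keys, a single published converter answers lookups for each direction it declares.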

getReadingConverterClass(self, fromReading, toReading)

Gets the ReadingConverter's class for the given source and target reading.

Parameters:
  • fromReading (str) - name of the source reading
  • toReading (str) - name of the target reading
Returns: classobj
a ReadingConverter class
Raises:

createReadingConverter(self, fromReading, toReading, *args, **options)

Creates an instance of a ReadingConverter for the given source and target reading and returns it wrapped as a SimpleReadingConverterAdaptor.

As ReadingConverters generally support more than one conversion direction, a regular instance requires the user to specify the source and target reading on each conversion. Wrapping the created instance in the adaptor provides simple convert() and convertEntities() routines for which the source and target readings don't have to be specified. Other method signatures remain unchanged.
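
The wrapping idea can be sketched as follows. CaseConverter and SimpleAdaptor are hypothetical classes written for this illustration; the real SimpleReadingConverterAdaptor may be implemented differently.

```python
class CaseConverter:
    """Hypothetical converter that supports two conversion directions."""

    def convert(self, string, from_reading, to_reading):
        if (from_reading, to_reading) == ("Lower", "Upper"):
            return string.upper()
        if (from_reading, to_reading) == ("Upper", "Lower"):
            return string.lower()
        raise ValueError("unsupported conversion direction")

    def convert_entities(self, entities, from_reading, to_reading):
        return [self.convert(e, from_reading, to_reading) for e in entities]

    def describe(self):
        return "case converter"


class SimpleAdaptor:
    """Binds one source and target reading to a wrapped converter."""

    def __init__(self, converter, from_reading, to_reading):
        self._converter = converter
        self._from = from_reading
        self._to = to_reading

    def convert(self, string):
        # The bound readings are filled in for the caller.
        return self._converter.convert(string, self._from, self._to)

    def convert_entities(self, entities):
        return self._converter.convert_entities(
            entities, self._from, self._to)

    def __getattr__(self, name):
        # Called only for attributes the adaptor lacks, so every other
        # method of the wrapped converter keeps its signature.
        return getattr(self._converter, name)


to_upper = SimpleAdaptor(CaseConverter(), "Lower", "Upper")
```

Calling to_upper.convert("lao") needs no reading names, while to_upper.describe() is passed through to the wrapped converter unchanged.
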

Parameters:
  • fromReading (str) - name of the source reading
  • toReading (str) - name of the target reading
  • args - optional list of RomanisationOperators to use for handling source and target readings.
  • options - options for the created instance
  • hideComplexConverter - if true the ReadingConverter is wrapped as a SimpleReadingConverterAdaptor (default).
  • sourceOperators - list of ReadingOperators used for handling source readings.
  • targetOperators - list of ReadingOperators used for handling target readings.
  • sourceOptions - dictionary of options to configure the ReadingOperators used for handling source readings. If an operator for the source reading is explicitly specified, no options can be given.
  • targetOptions - dictionary of options to configure the ReadingOperators used for handling target readings. If an operator for the target reading is explicitly specified, no options can be given.
Returns: instance
a SimpleReadingConverterAdaptor or ReadingConverter instance
Raises:

isReadingConversionSupported(self, fromReading, toReading)

Checks if the conversion from reading A to reading B is supported.

Returns: bool
true if conversion is supported, false otherwise

getDefaultOptions(*args)

Returns the default options for the ReadingOperator or ReadingConverter applied for the given reading name or names respectively.

The keyword 'dbConnectInst' is not regarded as a configuration option and is thus not included in the dict returned.

Raises:
  • ValueError - if the number of reading names given is neither one nor two.
  • UnsupportedError - if no ReadingOperator or ReadingConverter exists for the given reading or readings respectively.
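
The dispatch on the number of reading names might look like the sketch below. The option tables, the reading names ReadingA and ReadingB, and the local UnsupportedError class are all made up for the example; only the one-name/two-name distinction, the two exceptions and the exclusion of 'dbConnectInst' are taken from the description above.

```python
class UnsupportedError(Exception):
    """Local stand-in for the UnsupportedError named in this documentation."""


# Hypothetical default tables; real defaults come from the classes themselves.
OPERATOR_DEFAULTS = {
    "ReadingA": {"caseSensitive": False, "dbConnectInst": None},
}
CONVERTER_DEFAULTS = {
    ("ReadingA", "ReadingB"): {"keepCase": True, "dbConnectInst": None},
}


def get_default_options(*readings):
    if len(readings) == 1:
        # One name: defaults of the operator for that reading.
        defaults = OPERATOR_DEFAULTS.get(readings[0])
    elif len(readings) == 2:
        # Two names: defaults of the converter for that direction.
        defaults = CONVERTER_DEFAULTS.get(readings)
    else:
        raise ValueError("expected one or two reading names")
    if defaults is None:
        raise UnsupportedError("no defaults for %r" % (readings,))
    # 'dbConnectInst' is not treated as a configuration option.
    return {k: v for k, v in defaults.items() if k != "dbConnectInst"}


operator_defaults = get_default_options("ReadingA")
converter_defaults = get_default_options("ReadingA", "ReadingB")
```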

_getReadingOperatorInstance(self, readingN, **options)

Returns an instance of a ReadingOperator for the given reading from the internal cache and creates it if it doesn't exist yet.

Parameters:
  • readingN (str) - name of a supported reading
  • options - additional options for instance
Returns: instance
a ReadingOperator instance
Raises:

To Do (Impl): Get all options when calculating key for an instance and use the information on standard parameters thus minimising instances in cache. Same for _getReadingConverterInstance().

_getReadingConverterInstance(self, fromReading, toReading, *args, **options)

Returns an instance of a ReadingConverter for the given source and target reading from the internal cache and creates it if it doesn't exist yet.

Parameters:
  • fromReading (str) - name of the source reading
  • toReading (str) - name of the target reading
  • args - optional list of RomanisationOperators to use for handling source and target readings.
  • options - additional options for instance
  • sourceOperators - list of ReadingOperators used for handling source readings.
  • targetOperators - list of ReadingOperators used for handling target readings.
  • sourceOptions - dictionary of options to configure the ReadingOperators used for handling source readings. If an operator for the source reading is explicitly specified, no options can be given.
  • targetOptions - dictionary of options to configure the ReadingOperators used for handling target readings. If an operator for the target reading is explicitly specified, no options can be given.
Returns: instance
a ReadingConverter instance
Raises:

To Do (Fix): Reusing instances for other supported conversion directions isn't that efficient if a special ReadingOperator is specified for one direction that doesn't affect the others.

_checkSpecialOperators(self, fromReading, toReading, args, options)

Checks for special operators requested for the given source and target reading.

Parameters:
  • fromReading (str) - name of the source reading
  • toReading (str) - name of the target reading
  • args - optional list of RomanisationOperators to use for handling source and target readings.
  • options - additional options for handling the input
  • sourceOperators - list of ReadingOperators used for handling source readings.
  • targetOperators - list of ReadingOperators used for handling target readings.
  • sourceOptions - dictionary of options to configure the ReadingOperators used for handling source readings. If an operator for the source reading is explicitly specified, no options can be given.
  • targetOptions - dictionary of options to configure the ReadingOperators used for handling target readings. If an operator for the target reading is explicitly specified, no options can be given.
Raises:
  • ValueError - if options are given to create a specific ReadingOperator, but an instance is already given in args.
  • UnsupportedError - if source or target reading is not supported.

_getHashableCopy(data)
Static Method

Constructs a unique hashable (partially deep-)copy for a given instance, replacing non-hashable datatypes set, dict and list recursively.

Parameters:
  • data - non-hashable object
Returns:
hashable object, set converted to a frozenset, dict converted to a frozenset of key-value-pairs (tuple), and list converted to a tuple.
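
The conversion rules listed above can be sketched in a few lines. This is an illustrative reimplementation written from the description, with the hypothetical name get_hashable_copy, not the library's source.

```python
def get_hashable_copy(data):
    """Return a hashable equivalent of data, converting containers recursively."""
    if isinstance(data, set):
        # Members of a set are already hashable.
        return frozenset(data)
    if isinstance(data, dict):
        # A frozenset of (key, value) tuples ignores insertion order.
        return frozenset(
            (key, get_hashable_copy(value)) for key, value in data.items())
    if isinstance(data, list):
        return tuple(get_hashable_copy(item) for item in data)
    # Anything else is assumed to be hashable as it is.
    return data


options = {"sourceOptions": {"tones": [1, 2]}, "flags": {"a", "b"}}
cache_key = get_hashable_copy(options)
```

A value produced this way can serve as a dictionary key, for example to look up a cached instance by the options it was created with.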

convert(self, readingStr, fromReading, toReading, *args, **options)

Converts the given string in the source reading to the given target reading.

Parameters:
  • readingStr (str) - string that needs to be converted
  • fromReading (str) - name of the source reading
  • toReading (str) - name of the target reading
  • args - optional list of RomanisationOperators to use for handling source and target readings.
  • options - additional options for handling the input
  • sourceOperators - list of ReadingOperators used for handling source readings.
  • targetOperators - list of ReadingOperators used for handling target readings.
  • sourceOptions - dictionary of options to configure the ReadingOperators used for handling source readings. If an operator for the source reading is explicitly specified, no options can be given.
  • targetOptions - dictionary of options to configure the ReadingOperators used for handling target readings. If an operator for the target reading is explicitly specified, no options can be given.
Returns: str
the converted string
Raises:
  • DecompositionError - if the string cannot be decomposed into basic entities with regard to the source reading or the given information is insufficient.
  • ConversionError - on operations specific to the conversion between the two readings (e.g. error on converting entities).
  • UnsupportedError - if source or target reading is not supported for conversion.

convertEntities(self, readingEntities, fromReading, toReading, *args, **options)

Converts a list of entities in the source reading to the given target reading.

Parameters:
  • readingEntities (list of str) - list of entities written in source reading
  • fromReading (str) - name of the source reading
  • toReading (str) - name of the target reading
  • args - optional list of RomanisationOperators to use for handling source and target readings.
  • options - additional options for handling the input
  • sourceOperators - list of ReadingOperators used for handling source readings.
  • targetOperators - list of ReadingOperators used for handling target readings.
  • sourceOptions - dictionary of options to configure the ReadingOperators used for handling source readings. If an operator for the source reading is explicitly specified, no options can be given.
  • targetOptions - dictionary of options to configure the ReadingOperators used for handling target readings. If an operator for the target reading is explicitly specified, no options can be given.
Returns: list of str
list of entities written in target reading
Raises:
  • ConversionError - on operations specific to the conversion between the two readings (e.g. error on converting entities).
  • UnsupportedError - if source or target reading is not supported for conversion.
  • InvalidEntityError - if an invalid entity is given.

decompose(self, string, readingN, **options)

Decomposes the given string into basic entities that can be mapped to one Chinese character each for the given reading.

The given input string can contain other non-reading characters, e.g. punctuation marks.

The returned list contains a mix of basic reading entities and other characters, e.g. spaces and punctuation marks.
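
The shape of the returned list can be illustrated with a toy decomposer. It assumes a romanisation in which every syllable ends in a tone digit from 1 to 5, as in the 'lao3shi1' example near the top of this page, and it uses a regular expression where a real operator would consult its table of entities. The function names are local to the sketch.

```python
import re

# Either a run of letters with an optional trailing tone digit,
# or a run of any other characters (spaces, punctuation, ...).
TOKEN = re.compile(r"[A-Za-z]+[1-5]?|[^A-Za-z]+")


def decompose(string):
    """Split a string into reading entities and other character runs."""
    return TOKEN.findall(string)


def compose(entities):
    """Inverse of decompose() for this toy format."""
    return "".join(entities)


parts = decompose("lao3shi1, ni3hao3!")
```

Here parts holds the four syllables together with the ", " and "!" runs, and compose(parts) restores the input string. Without tone digits the toy pattern cannot find syllable boundaries, which is where a real entity table is needed.
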

Parameters:
  • string (str) - reading string
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: list of str
a list of basic entities of the input string
Raises:

compose(self, readingEntities, readingN, **options)

Composes the given list of basic entities to a string for the given reading.

Parameters:
  • readingEntities (list of str) - list of basic syllables or other content
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: str
composed entities
Raises:

isReadingEntity(self, entity, readingN, **options)

Checks if the given string is an entity of the given reading.

Parameters:
  • entity (str) - entity to check
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: bool
true if string is an entity of the reading, false otherwise.
Raises:

getDecompositions(self, string, readingN, **options)

Decomposes the given string into basic entities that can be mapped to one Chinese character each for ambiguous decompositions. It returns all possible decompositions. This method is a more general version of decompose().

The returned lists consist of two entity types: entities of the romanisation and other strings.

Parameters:
  • string (str) - reading string
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: list of list of str
a list of all possible decompositions consisting of basic entities.
Raises:
  • DecompositionError - if the given string has a wrong format.
  • UnsupportedError - if the given reading is not supported or the reading doesn't support the specified method.

segment(self, string, readingN, **options)

Takes a string written in the romanisation and returns the possible segmentations as a list of syllables.

In contrast to decompose(), this method merely segments continuous entities of the romanisation. Characters not part of the romanisation are not dealt with; that is the task of the more general decompose() method.

Parameters:
  • string (str) - reading string
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: list of list of str
a list of possible segmentations (several if ambiguous) into single syllables
Raises:
  • DecompositionError - if the given string has an invalid format.
  • UnsupportedError - if the given reading is not supported or the reading doesn't support the specified method.
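
Why a segmentation can be ambiguous is easiest to see with a small recursive sketch. The syllable set below is a tiny hypothetical sample, and the function returns an empty list for unsegmentable input, whereas the method documented above raises DecompositionError in that case.

```python
def segment(string, syllables):
    """Return every way to split string into syllables from the given set."""
    if not string:
        # Exactly one way to segment the empty string: no syllables.
        return [[]]
    results = []
    for end in range(1, len(string) + 1):
        head = string[:end]
        if head in syllables:
            # Combine this head with every segmentation of the remainder.
            for tail in segment(string[end:], syllables):
                results.append([head] + tail)
    return results


SYLLABLES = {"xi", "an", "xian"}
segmentations = segment("xian", SYLLABLES)
```

For this sample set, "xian" can be read as the two syllables "xi" and "an" or as the single syllable "xian", so two segmentations are returned.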

isStrictDecomposition(self, decomposition, readingN, **options)

Checks if the given decomposition follows the romanisation format strictly to allow unambiguous decomposition.

The romanisation should offer a way/protocol to make an unambiguous decomposition into its basic syllables possible, so that the process of appending syllables to a string is reversible. Testing for compliance with this protocol has to be implemented here. Thus, for any given string, this method can return true for one and only one of its possible decompositions.

Parameters:
  • decomposition (list of str) - decomposed reading string
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: bool
False, as this method needs to be implemented by the subclass
Raises:
  • UnsupportedError - if the given reading is not supported or the reading doesn't support the specified method.

getReadingEntities(self, readingN, **options)

Gets a set of all entities supported by the reading.

The set is used in the segmentation process to find entity boundaries.

Parameters:
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: set of str
set of supported syllables
Raises:
  • UnsupportedError - if the given reading is not supported or the reading doesn't support the specified method.

getTones(self, readingN, **options)

Returns the tones supported by the reading.

Parameters:
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: list
list of supported tone marks.
Raises:
  • UnsupportedError - if the given reading is not supported or the reading doesn't support the specified method.

getTonalEntity(self, plainEntity, tone, readingN, **options)

Gets the entity with tone mark for the given plain entity and tone.

Parameters:
  • plainEntity (str) - entity without tonal information
  • tone - tone
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: str
entity with appropriate tone
Raises:

splitEntityTone(self, entity, readingN, **options)

Splits the entity into an entity without tone mark (plain entity) and the entity's tone.

Parameters:
  • entity (str) - entity with tonal information
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: tuple
plain entity without tone mark and entity's tone
Raises:

getPlainReadingEntities(self, readingN, **options)

Gets the set of plain entities supported by this reading. Unlike getReadingEntities(), the entities carry no tone mark.

Parameters:
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: set of str
set of supported syllables
Raises:
  • UnsupportedError - if the given reading is not supported or the reading doesn't support the specified method.
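
The relation between tonal entities, plain entities and tones described for getTonalEntity(), splitEntityTone() and getPlainReadingEntities() can be sketched under a simplifying assumption: tones are written as a trailing digit from 1 to 5, as in 'lao3'. Real readings may use diacritics or other marks, and the function names below are hypothetical.

```python
TONE_DIGITS = {"1", "2", "3", "4", "5"}


def get_tonal_entity(plain_entity, tone):
    """Attach a numeric tone mark to a plain entity."""
    return "%s%d" % (plain_entity, tone)


def split_entity_tone(entity):
    """Return (plain entity, tone); tone is None if no tone digit is present."""
    if entity[-1:] in TONE_DIGITS:
        return entity[:-1], int(entity[-1])
    return entity, None


def get_plain_entities(tonal_entities):
    """Strip tone marks, collapsing entities that differ only in tone."""
    return {split_entity_tone(entity)[0] for entity in tonal_entities}


plain = get_plain_entities({"ma1", "ma3", "shi1"})
```

In this sketch, splitting undoes get_tonal_entity(), and the set of plain entities is smaller than the set of tonal entities whenever a syllable occurs in several tones.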

isPlainReadingEntity(self, entity, readingN, **options)

Returns true if the given plain entity (without any tone mark) is recognised by the romanisation operator, i.e. it is a valid entity of the reading returned by the segmentation method.

Reading entities are treated as case-insensitive.

Parameters:
  • entity (str) - entity to check
  • readingN (str) - name of reading
  • options - additional options for handling the input
Returns: bool
True if string is an entity of the reading, False otherwise.
Raises:
  • UnsupportedError - if the given reading is not supported or the reading doesn't support the specified method.

Class Variable Details

READING_OPERATORS

A list of supported reading operators.

Value:
[<class 'cjklib.reading.operator.HangulOperator'>,
 <class 'cjklib.reading.operator.PinyinOperator'>,
 <class 'cjklib.reading.operator.WadeGilesOperator'>,
 <class 'cjklib.reading.operator.GROperator'>,
 <class 'cjklib.reading.operator.MandarinIPAOperator'>,
 <class 'cjklib.reading.operator.MandarinBrailleOperator'>,
 <class 'cjklib.reading.operator.JyutpingOperator'>,
 <class 'cjklib.reading.operator.CantoneseYaleOperator'>,
...

READING_CONVERTERS

A list of supported reading converters.

Value:
[<class 'cjklib.reading.converter.PinyinDialectConverter'>,
 <class 'cjklib.reading.converter.WadeGilesDialectConverter'>,
 <class 'cjklib.reading.converter.PinyinWadeGilesConverter'>,
 <class 'cjklib.reading.converter.GRDialectConverter'>,
 <class 'cjklib.reading.converter.GRPinyinConverter'>,
 <class 'cjklib.reading.converter.PinyinIPAConverter'>,
 <class 'cjklib.reading.converter.PinyinBrailleConverter'>,
 <class 'cjklib.reading.converter.JyutpingDialectConverter'>,
...

sharedState

Dictionary holding global state information used by all instances of the ReadingFactory.

Value:
{'readingConverterClasses': {}, 'readingOperatorClasses': {}}
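
What it means for state to be shared by all instances can be shown with a class-level dictionary. The Registry class and its methods are hypothetical; only the idea of one dictionary used by every instance, and the key name 'readingOperatorClasses', are taken from the value shown above.

```python
class Registry:
    """Toy class whose instances all read and write one shared dictionary."""

    # Defined on the class, so every instance sees the same object.
    sharedState = {"readingOperatorClasses": {}}

    def publish(self, name, operator_class):
        self.sharedState["readingOperatorClasses"][name] = operator_class

    def supported(self):
        return sorted(self.sharedState["readingOperatorClasses"])


first = Registry()
second = Registry()
first.publish("ToyReading", object)
names_seen_by_second = second.supported()
```

Anything published through one instance is immediately visible through every other. The same property means a setting changed through one instance also changes for the rest, which may be related to the bug noted for the constructor.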