Package cjklib :: Module build :: Class HanDeDictBuilder

Class HanDeDictBuilder

Builds the HanDeDict dictionary.

Nested Classes

Inherited from EDICTFormatBuilder: TableGenerator

Instance Methods

tuple
filterSpacing(self, entry)
Corrects faulty spacing in the readings of HanDeDict entries.

tuple
FILTER(self, entry)
Filter applied to each entry read before it is written to the table.

extractTimeStamp(self, filePath)

getPreferredFile(self, filePaths)

str
getArchiveContentName(self, filePath)
Derives the name of the file contained in the zipped archive from the archive's file name.

str
findFile(self, fileGlobs, fileType=None)
Tries to locate a file with a given list of possible file names under the class's default data paths.

Inherited from CEDICTFormatBuilder: __init__

Inherited from EDICTFormatBuilder: build, buildFTS3CreateTableStatement, buildFTS3Tables, getFileHandle, getGenerator, insertFTS3Tables, remove, testFTS3

Inherited from EntryGeneratorBuilder: getEntryDict

Inherited from TableBuilder: buildIndexObjects, buildTableObject

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Variables
  PROVIDES = 'HanDeDict'
Contains the name of the table provided by this module.
  FILE_NAMES = ['handedict-*.zip', 'handedict-*.tar.bz2', 'hande...
Names of the files containing the EDICT-formatted dictionary.
  ENCODING = 'utf-8'
Encoding of the dictionary file.

Inherited from CEDICTFormatBuilder: COLUMNS, COLUMN_TYPES, INDEX_KEYS

Inherited from EDICTFormatBuilder: ENTRY_REGEX, FULLTEXT_COLUMNS, IGNORE_LINES, PRIMARY_KEYS

Inherited from TableBuilder: DEPENDS

Properties

Inherited from object: __class__

Method Details

filterSpacing(self, entry)

Corrects faulty spacing in the readings of HanDeDict entries.

Parameters:
  • entry (tuple) - a dictionary entry
Returns: tuple
the given entry with corrected spacing
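
The kind of correction described above might look like the following. This is a minimal sketch, not the module's actual code: the function name filter_spacing, the regular expression, and the assumed entry layout (traditional, simplified, reading, translation) are all illustrative assumptions.

```python
import re

# Hypothetical pattern: a tone-numbered syllable directly followed by a letter,
# i.e. two syllables written without a separating space.
_MISSING_SPACE = re.compile(r'([A-Za-z:]+[1-5])(?=[A-Za-z])')


def filter_spacing(entry):
    """Return the entry with normalized spacing in its reading field."""
    # Assumed layout: (traditional, simplified, reading, translation).
    traditional, simplified, reading, translation = entry
    if reading:
        # Insert the missing space between run-together syllables.
        reading = _MISSING_SPACE.sub(r'\1 ', reading)
        # Collapse runs of whitespace and trim both ends.
        reading = ' '.join(reading.split())
    return (traditional, simplified, reading, translation)
```

Under these assumptions, a reading such as 'zhong1guo2' would come back as 'zhong1 guo2', while the other fields of the tuple are left untouched.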

FILTER(self, entry)

Filter applied to each entry read before it is written to the table.

Parameters:
  • entry (tuple) - a dictionary entry
Returns: tuple
the given entry with corrected spacing
Overrides: FILTER
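
To illustrate how a filter hook of this kind can sit between reading and writing, here is a rough sketch. The generator generate_entries and its parse and entry_filter parameters are hypothetical names chosen for the example; how the builder classes actually invoke FILTER is not shown on this page.

```python
def generate_entries(lines, parse, entry_filter=None):
    """Yield parsed entries, passing each through the optional filter."""
    for line in lines:
        entry = parse(line)
        if entry is None:
            continue  # the line did not parse; skip it
        if entry_filter is not None:
            # The filter receives one entry tuple and returns the
            # (possibly modified) tuple that should be stored.
            entry = entry_filter(entry)
        yield entry
```

A subclass could then supply its own filter, such as a spacing correction, without touching the reading or writing logic.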

getArchiveContentName(self, filePath)

Derives the name of the file contained in the zipped archive from the archive's file name. Reimplement and adapt to your own needs.

Parameters:
  • filePath - path of file
Returns: str
name of file in archive
Overrides: EDICTFormatBuilder.getArchiveContentName
(inherited documentation)
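
As a sketch of what a reimplementation could look like, the function below maps an archive name to a member name. Both the naming scheme ('handedict-<timestamp>.zip') and the member path ('handedict-<timestamp>/handedict.u8') are assumptions made for illustration and are not confirmed by this page.

```python
import os
import re


def get_archive_content_name(file_path):
    """Guess the archive member name from the archive's own file name."""
    file_name = os.path.basename(file_path)
    # Assumed naming scheme: 'handedict-<digits>.zip' or '.tar.bz2'.
    match = re.match(r'handedict-(\d+)\.(?:zip|tar\.bz2)$', file_name)
    if match is None:
        raise ValueError('unexpected archive name: %r' % file_name)
    # Assumed archive layout: a dated directory holding 'handedict.u8'.
    return 'handedict-%s/handedict.u8' % match.group(1)
```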

findFile(self, fileGlobs, fileType=None)

Tries to locate a file with a given list of possible file names under the class's default data paths.

Uses the newest version of all files found.

Parameters:
  • fileGlobs (str/list of str) - possible file names
  • fileType (str) - textual type of file used in error messages
Returns: str
path to the first existing file matched by the search
Raises:
  • IOError - if no file found
Overrides: TableBuilder.findFile
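
The lookup described above can be approximated as follows. This is only a sketch: find_file is a free function rather than a method, the data_paths argument stands in for the class's default data paths, and judging "newest" by file name is an assumption; the real class may compare files differently.

```python
import glob
import os


def find_file(file_globs, data_paths, file_type=None):
    """Return the newest file matching any glob under the given directories."""
    if isinstance(file_globs, str):
        file_globs = [file_globs]  # accept a single pattern as well
    matches = []
    for directory in data_paths:
        for pattern in file_globs:
            matches.extend(glob.glob(os.path.join(directory, pattern)))
    if not matches:
        raise IOError('No %s found for %r' % (file_type or 'file', file_globs))
    # Assumption: a later timestamp in the file name sorts last, so the
    # maximum by base name is treated as the newest version.
    return max(matches, key=os.path.basename)
```

Passing the file type only affects the wording of the error, which matches the role the fileType parameter is documented to have.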

Class Variable Details

FILE_NAMES

Names of the files containing the EDICT-formatted dictionary.

Value:
['handedict-*.zip', 'handedict-*.tar.bz2', 'handedict.u8']
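
The entries in this list are shell-style patterns. A small sketch of how a candidate file name could be checked against them, using the standard fnmatch module; the helper matches_file_names is a hypothetical name and not part of the class:

```python
from fnmatch import fnmatch

# Value copied from the class variable documented above.
FILE_NAMES = ['handedict-*.zip', 'handedict-*.tar.bz2', 'handedict.u8']


def matches_file_names(file_name, patterns=FILE_NAMES):
    """Return True if file_name matches any of the shell-style patterns."""
    return any(fnmatch(file_name, pattern) for pattern in patterns)
```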