eleve.leveldb

Provides a Storage (eleve.leveldb.LeveldbStorage) and a Trie (eleve.leveldb.LeveldbTrie) that use LevelDB as their on-disk backend. The LevelDB layer is implemented in Python using plyvel.

eleve.leveldb.to_bytes(o)

Encode the object as a bytes object: if it is already a bytes object, return it unchanged; otherwise, take its string representation and encode it as bytes.
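
The behaviour described above can be sketched as follows. This is a minimal illustration, not the library's source: the function name to_bytes_sketch is hypothetical, and the choice of UTF-8 as the encoding is an assumption (the description does not name one).

```python
def to_bytes_sketch(o):
    """Return `o` as bytes (hypothetical stand-in for to_bytes)."""
    if isinstance(o, bytes):
        # Already bytes: hand it back untouched.
        return o
    # Otherwise encode the string representation.
    # UTF-8 is an assumption; the documentation does not state the encoding.
    return str(o).encode("utf-8")
```

For instance, an integer token such as 42 would be stored under the same bytes as the string "42", so callers that mix types should be aware that the two collide.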

eleve.leveldb.ngram_to_key(ngram)

Convert an n-gram to a LevelDB key (a bytes object).

The first byte is the length of the n-gram. Then, for each token, the key contains SEPARATOR followed by the bytes representation of that token.
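
The key layout described above can be sketched like this. The value of SEPARATOR is not given in this page, so the null byte used here is an assumption, as are the helper names and the UTF-8 encoding; a single length byte also implies that n-grams are limited to 255 tokens.

```python
SEPARATOR = b"\x00"  # assumed value; the real constant is not documented here


def _to_bytes(o):
    # Hypothetical helper mirroring to_bytes(); UTF-8 is an assumption.
    return o if isinstance(o, bytes) else str(o).encode("utf-8")


def ngram_to_key_sketch(ngram):
    """Build a key: one length byte, then SEPARATOR + token for each token."""
    if len(ngram) > 255:
        raise ValueError("the length must fit in a single byte")
    body = b"".join(SEPARATOR + _to_bytes(token) for token in ngram)
    return bytes([len(ngram)]) + body
```

Under these assumptions, the n-gram ["le", "chat"] maps to the length byte 2 followed by the two separated tokens, and the empty n-gram maps to a single zero byte.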

class eleve.leveldb.Node(db, key, data=None)

Bases: object

Represents a node of the trie in LevelDB. A node is loaded by its key. It can update its entropy, save itself to LevelDB, and list its children.

__init__(db, key, data=None)
Parameters:
  • db – the LevelDB object (used to retrieve and save the nodes)
  • key (bytes) – the key of the node in the database
  • data – should generally be left as None. If you already have the data, you can pass it as a bytes object. If you pass False, no retrieval is attempted and the node is assumed not to exist.
iter_childs()
Returns: the children of the node, as other Node objects.
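
One way the key layout makes listing children cheap is a prefix scan over sorted keys, which is how LevelDB orders its entries. The sketch below simulates that with a sorted Python list instead of a real database. It is an illustration under assumptions, not the library's code: the function names are hypothetical, SEPARATOR is assumed to be the null byte, and the idea that children are found by incrementing the length byte is inferred from the key format rather than stated in this page.

```python
from bisect import bisect_left

SEPARATOR = b"\x00"  # assumed value


def child_prefix(key):
    # A child holds one more token than its parent, so its length byte is
    # one higher; it shares the parent's tokens, then adds SEPARATOR + token.
    return bytes([key[0] + 1]) + key[1:] + SEPARATOR


def iter_child_keys(sorted_keys, key):
    """Yield the keys in `sorted_keys` that are direct children of `key`."""
    prefix = child_prefix(key)
    start = bisect_left(sorted_keys, prefix)  # first key >= prefix
    for candidate in sorted_keys[start:]:
        if not candidate.startswith(prefix):
            break  # keys are sorted, so no later key can match
        yield candidate
```

Because the length byte comes first, grandchildren (length + 2) sort into a different region of the key space and are never visited by the scan.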
save(db=None)

Save the node in the database.

Parameters:db – optionally, a database to save the node to instead of the default database.
update_entropy(terminals)

Update the entropy of the node (and save the node if the entropy changed).

Parameters:terminals – a set of bytes objects. If a token is in that set, it counts as N different tokens, each seen once, instead of one token with count N.
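
The effect of terminals on the entropy can be illustrated with a small standalone function. This is a sketch under assumptions: it uses Shannon entropy in base 2 over the counts of a node's children, which this page does not spell out, and the function name and the dict-based input are hypothetical.

```python
import math


def branching_entropy(child_counts, terminals=frozenset()):
    """Entropy (in bits, by assumption) of the tokens following a node.

    child_counts maps each child token (bytes) to its count.
    """
    total = sum(child_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for token, count in child_counts.items():
        if token in terminals:
            # Count as `count` distinct tokens, each with probability 1/total.
            entropy += count * (1.0 / total) * math.log2(total)
        else:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy
```

With two children each seen twice, the plain entropy is 1 bit. Marking one of them as a terminal raises it to 1.5 bits, because that child is then treated as two different continuations rather than one repeated continuation.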
class eleve.leveldb.LeveldbTrie(path, terminals=[])

Bases: eleve.memory.MemoryTrie

__init__(path, terminals=[])

Create or open a Trie using LevelDB as backend.

root

Returns the root node.

close()
clear()

Delete the trie that’s in the database.

update_stats()
node(ngram)
add_ngram(ngram, freq=1)
query_count(ngram)
query_entropy(ngram)
query_ev(ngram)
query_autonomy(ngram)
class eleve.leveldb.LeveldbStorage(path, default_ngram_length=None)

Bases: eleve.memory.MemoryStorage

__init__(path, default_ngram_length=None)

Initialize the model.

Parameters:
  • path – path to the database from which the model is loaded and in which it is stored. If the path does not exist, an empty model is created.
  • default_ngram_length – the default maximum length of the n-grams being stored. It equals 5 for a newly created storage. Note that it may be overridden in add_sentence().
default_ngram_length
close()