twistml.filtering.ldig package

Submodules

twistml.filtering.ldig.da module

class twistml.filtering.ldig.da.DoubleArray(verbose=False)

Bases: object

add_element(s, v)
extend_array(max_cand)
extract_features(st)
get(s)
get_child(c, subtree)
get_subtree(s)
get_value(subtree)
initialize(list)
load(filename)
log(format, param)
save(filename)
shrink_array(max_index)
validate_list(list)
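
The method names above suggest a trie that maps strings to integer values and can count which stored strings occur inside a text. The sketch below illustrates that idea only; it is not the twistml implementation. It uses a plain nested-dict trie instead of a double array, and the class name DictTrie, the strictly-ascending-order check, the use of the list index as the stored value, and the return types are all assumptions made for illustration.

```python
from collections import Counter


class DictTrie:
    """Nested-dict trie; a stand-in for a double array, not the twistml class."""

    _VALUE = object()  # sentinel key marking "a stored string ends here"

    def __init__(self):
        self.root = {}

    def initialize(self, words):
        # Assumption: keys must be strictly ascending; value = position in the list.
        for prev, cur in zip(words, words[1:]):
            if prev >= cur:
                raise ValueError("list is not in strictly ascending order")
        for index, word in enumerate(words):
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[self._VALUE] = index

    def get(self, s):
        # Walk the trie character by character; None means "not stored".
        node = self.root
        for ch in s:
            node = node.get(ch)
            if node is None:
                return None
        return node.get(self._VALUE)

    def extract_features(self, text):
        # Count every stored string that occurs as a substring of `text`.
        counts = Counter()
        for start in range(len(text)):
            node = self.root
            for ch in text[start:]:
                node = node.get(ch)
                if node is None:
                    break
                if self._VALUE in node:
                    counts[node[self._VALUE]] += 1
        return dict(counts)


trie = DictTrie()
trie.initialize(["ab", "abc", "bc"])
features = trie.extract_features("abcab")
```

A double array stores the same transitions in two flat integer arrays, which is more compact and faster to load from disk than nested dictionaries; the lookup logic stays the same in spirit.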

twistml.filtering.ldig.ldig module

twistml.filtering.ldig.ldig.generate_doublearray(file, features)
twistml.filtering.ldig.ldig.htmlentity2unicode(text)
twistml.filtering.ldig.ldig.inference(param, labels, corpus, idlist, trie, options)
class twistml.filtering.ldig.ldig.ldig(model_dir)

Bases: object

debug(args)
detect(options, args)
init(temp_path, corpus_list, lbff, ngram_bound)

Extract features from corpus and generate TRIE(DoubleArray) data

- load corpus
- generate temporary file for maxsubst
- generate double array and save it

parameter: lbff = lower bound of feature frequency
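
The role of a lower bound on feature frequency can be sketched as follows. This is a simplified illustration, not the init pipeline itself: it counts plain character n-grams instead of running maxsubst, the function name select_features is hypothetical, and reading ngram_bound as a maximum n-gram length is an assumption.

```python
from collections import Counter


def select_features(texts, lbff, ngram_bound):
    """Return the sorted character n-grams that occur at least `lbff` times.

    Hypothetical helper: plain n-gram counting stands in for maxsubst here.
    """
    counts = Counter()
    for text in texts:
        for n in range(1, ngram_bound + 1):
            for i in range(len(text) - n + 1):
                counts[text[i:i + n]] += 1
    # Keep only features at or above the frequency bound; sort them so the
    # result could be fed to a structure that expects ascending keys.
    return sorted(f for f, c in counts.items() if c >= lbff)


features = select_features(["abab", "ab"], lbff=2, ngram_bound=2)
```

Raising lbff shrinks the feature set, which in turn shrinks the trie built from it.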

learn(options, args)
load_da()
load_features()
load_labels()
shrink()
twistml.filtering.ldig.ldig.likelihood(param, labels, trie, filelist, options)
twistml.filtering.ldig.ldig.load_corpus(filelist, labels)
twistml.filtering.ldig.ldig.normalize_text(org)
twistml.filtering.ldig.ldig.normalize_twitter(text)

Normalization for Twitter text.
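
The docstring does not say which rules are applied. As a rough idea of what Twitter-specific normalization typically involves, the sketch below strips mentions, hashtags and URLs and collapses whitespace. The function name strip_twitter_markup and the rules themselves are assumptions, not the behaviour of normalize_twitter.

```python
import re


def strip_twitter_markup(text):
    """Hypothetical helper: drop @mentions, #hashtags and http(s) URLs."""
    # Remove any token that starts with "@", "#" or an http(s) scheme.
    text = re.sub(r"(@|#|https?://)\S+", "", text)
    # Collapse the whitespace runs left behind and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


cleaned = strip_twitter_markup("@bob check https://t.co/x #wow nice day")
```

Removing such tokens before feature extraction keeps user names and link fragments, which carry little language signal, out of the feature counts.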

twistml.filtering.ldig.ldig.predict(param, events)
twistml.filtering.ldig.ldig.shuffle(idlist)

Module contents

Author:

Matthias Manhertz

Copyright:

(c) Matthias Manhertz 2015

Licence:

MIT