dsegmenter.bparseg.bparsegmenter
-
class dsegmenter.bparseg.bparsegmenter.BparSegmenter(a_featgen=<function featgen>, a_classify=<function classify>, a_model=u'/home/sidorenko/Projects/DiscourseSegmenter/dsegmenter/bparseg/data/bpar.model')
Class for performing discourse segmentation on constituency trees.
-
DEFAULT_CLASSIFIER = LinearSVC(C=0.3, class_weight=None, dual=True, fit_intercept=True, intercept_scaling=1, loss='squared_hinge', max_iter=1000, multi_class=u'crammer_singer', penalty='l2', random_state=None, tol=0.0001, verbose=0)
classifier object – default classification method
-
DEFAULT_MODEL = u'/home/sidorenko/Projects/DiscourseSegmenter/dsegmenter/bparseg/data/bpar.model'
str – path to default model to use in classification
-
DEFAULT_PIPELINE = Pipeline(steps=[(u'vectorizer', DictVectorizer(dtype=<type 'numpy.float64'>, separator='=', sort=True, sparse=True)), (u'var_filter', VarianceThreshold(threshold=0.0)), (u'LinearSVC', LinearSVC(C=0.3, class_weight=None, dual=True, fit_intercept=True, intercept_scaling=1, loss='squared_hinge', max_iter=1000, multi_class=u'crammer_singer', penalty='l2', random_state=None, tol=0.0001, verbose=0))])
pipeline object – default pipeline object used for classification
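The repr of DEFAULT_PIPELINE is dense, so a minimal sketch of an equivalent scikit-learn pipeline may help. The three steps and the hyperparameters are taken from the repr; the toy feature dicts, the label names, and the variable names are assumptions made only for illustration and are not the features or classes used by the real segmenter.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_selection import VarianceThreshold
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Same three steps and the same non-default hyperparameters as in the repr.
classifier = LinearSVC(C=0.3, multi_class="crammer_singer", dual=True, max_iter=1000)
pipeline = Pipeline([
    ("vectorizer", DictVectorizer()),                  # feature dicts -> sparse matrix
    ("var_filter", VarianceThreshold(threshold=0.0)),  # drop constant features
    ("LinearSVC", classifier),
])

# Hypothetical toy data: one feature dict per tree node, one label per dict.
features = [
    {"label": "S", "n_children": 3.0},
    {"label": "NP", "n_children": 1.0},
    {"label": "S", "n_children": 4.0},
    {"label": "PP", "n_children": 2.0},
]
labels = ["SEGMENT", "NONE", "SEGMENT", "NONE"]

pipeline.fit(features, labels)
predictions = list(pipeline.predict([{"label": "S", "n_children": 3.0}]))
```

The pipeline consumes plain feature dictionaries, which is why the first step is a DictVectorizer rather than a numeric array.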
-
__init__(a_featgen=<function featgen>, a_classify=<function classify>, a_model=u'/home/sidorenko/Projects/DiscourseSegmenter/dsegmenter/bparseg/data/bpar.model')
Class constructor.
Parameters:
- a_featgen (method) – function to be used for feature generation
- a_classify (method) – two-argument function that predicts the segment class for a BitPar tree from the model and the features generated for that tree
- a_model (str) – path to a pre-trained model (previously dumped with joblib), a valid classifier object, or None
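The contract between the two callables can be sketched with stand-ins. Everything below is hypothetical: the tuple-based tree, the names toy_featgen, toy_classify and toy_model, and in particular the argument order (model first, features second) are assumptions, since the parameter description only states that the classifier takes two arguments.

```python
def toy_featgen(tree):
    """Hypothetical feature generator: map a (label, children) tuple to a feature dict."""
    label, children = tree
    return {"label": label, "n_children": len(children)}


def toy_classify(model, features):
    """Hypothetical two-argument classifier; the (model, features) order is an assumption."""
    return model.get(features["label"], "NONE")


# Stand-in "model": a lookup table from node label to segment class.
toy_model = {"S": "SEGMENT"}

# Stand-in tree: a node label plus a list of child nodes (not a real BitPar tree).
tree = ("S", [("NP", []), ("VP", [])])
predicted = toy_classify(toy_model, toy_featgen(tree))
```

Callables with the real signatures would be passed as a_featgen and a_classify; the stand-ins only show how features flow from one function into the other.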
-
__weakref__
list of weak references to the object (if defined)
-
segment(a_trees)
Create discourse segments based on the BitPar trees.
Parameters: a_trees (list) – list of sentence trees to be parsed
Returns: constructed segment trees
Return type: list
-