dsegmenter.bparseg.constituency_tree

Module providing class for handling constituency syntax trees.

dsegmenter.bparseg.constituency_tree.OP

str – special token used to substitute opening parentheses

dsegmenter.bparseg.constituency_tree.OP_RE

re – regexp for matching opening parentheses

dsegmenter.bparseg.constituency_tree.CP

str – special token used to substitute closing parentheses

dsegmenter.bparseg.constituency_tree.CP_RE

re – regexp for matching closing parentheses
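
The role of the four constants above can be illustrated with a minimal sketch. The token values and patterns below are hypothetical stand-ins, not the module's actual definitions, and the helper names escape_parens() and unescape_parens() are invented for this example. The sketch assumes that literal parentheses inside tokens are swapped for placeholders so that they cannot be confused with the brackets that delimit constituents.

```python
import re

# Hypothetical values: the real OP, CP, OP_RE and CP_RE may differ.
OP = "-LRB-"
CP = "-RRB-"
OP_RE = re.compile(r"\(")
CP_RE = re.compile(r"\)")


def escape_parens(token):
    """Replace literal parentheses in a token with placeholder tokens."""
    return CP_RE.sub(CP, OP_RE.sub(OP, token))


def unescape_parens(token):
    """Restore literal parentheses from the placeholder tokens."""
    return token.replace(OP, "(").replace(CP, ")")
```

With these assumed values, escape_parens("(a)") gives "-LRB-a-RRB-", and unescape_parens() reverses the substitution.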

class dsegmenter.bparseg.constituency_tree.CTree

Class for reading and modifying constituency trees.

This class subclasses Tree and extends it with one additional public class method, parse_lines().

__init__()

Class constructor.

classmethod parse_lines(a_lines, a_one_per_line=False)

Parse input lines and yield BitPar constituency trees.

Parameters:
  • a_lines (list) – decoded lines of the input file
  • a_one_per_line (bool) – flag indicating whether the file is in one-sentence-per-line format

Yields: constituency trees
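
Because parse_lines() yields its results, a caller has to iterate over the returned generator (or wrap it in list()) before any tree is produced. The sketch below is not the library's implementation. It is a self-contained illustration of the same calling pattern, under the assumption that a multi-line tree ends where its parentheses balance. The function name iter_tree_strings() is hypothetical, and it yields bracketed strings rather than CTree objects.

```python
def iter_tree_strings(lines, one_per_line=False):
    """Yield one bracketed tree string per tree found in lines.

    Assumption for this sketch: in multi-line mode a tree is complete
    as soon as its opening and closing parentheses balance.
    """
    buffered = []
    depth = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if one_per_line:
            # Each non-empty line is taken to be a complete tree.
            yield line
            continue
        buffered.append(line)
        depth += line.count("(") - line.count(")")
        if depth == 0:
            yield " ".join(buffered)
            buffered = []


lines = ["(S (NP x)", "(VP y))", "", "(S z)"]
trees = list(iter_tree_strings(lines))
```

Nothing is parsed until the generator is consumed, which is why the example wraps the call in list().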

class dsegmenter.bparseg.constituency_tree.Tree(*args)

Direct subclass of nltk.tree.ParentedTree providing hashing.

This class extends its parent with the method __hash__(), which returns the object's id() and thereby allows trees to be used as dictionary keys and set members. It also provides the method prnt_label(), which returns the label of the parent tree.

__hash__()

Return id of this object.

__init__(*args)

Class constructor (simply delegates to super-class).

Parameters: args (list) – arguments to be passed to the parent class

prnt_label()

Return label of the parent node.

Returns: label of parent node or empty string if no parent exists
Return type: str
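
The two additions described for Tree can be sketched without NLTK. The Node class below is a hypothetical stand-in, not the library's code. It subclasses list, which, like a list-based tree, compares by value and is unhashable by default, and then adds identity-based hashing and a parent-label lookup.

```python
class Node(list):
    """Hypothetical list-based tree node used only for illustration."""

    def __init__(self, label, children=()):
        super().__init__(children)
        self.label = label
        self.parent = None
        # Register this node as the parent of every child node.
        for child in self:
            if isinstance(child, Node):
                child.parent = self

    def __hash__(self):
        # Identity-based hash: equal-looking nodes remain distinct keys.
        return id(self)

    def prnt_label(self):
        """Return the parent's label, or an empty string for a root."""
        return self.parent.label if self.parent is not None else ""


np_node = Node("NP", ["dog"])
root = Node("S", [np_node])
twin = Node("NP", ["dog"])      # same content, different object
index = {np_node: "first", twin: "second"}
```

Here np_node == twin holds, because list equality compares contents, yet index keeps two separate entries since each node hashes to its own id(). The trade-off is that a lookup succeeds only with the very same object, not with an equal copy.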