eleve.segment

The segmenter is available by importing eleve.Segmenter. It segments sentences, that is, it groups together the tokens that belong together.

class eleve.segment.Segmenter(storage, max_ngram_length=None)

Bases: object

__init__(storage, max_ngram_length=None)

Create a segmenter.

Parameters:
  • storage – A storage object that has been trained on a corpus (it should have a query_autonomy method).
  • max_ngram_length – The maximum length of an n-gram that can be “merged” into a single fragment. It should be strictly smaller than the storage’s n-gram length.

segment(sentence)

Segment a sentence.

Parameters: sentence – A list of tokens.
Returns: A list of sentence fragments. Each sentence fragment is itself a list of tokens.
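
The input and output shapes described above can be illustrated with a small, self-contained sketch. It does not import eleve and is not the library's implementation: ToyStorage, ToySegmenter, the hand-written scores, and the "maximise the sum of autonomy scores" rule are all assumptions made for illustration. Only the names query_autonomy, max_ngram_length and segment come from the documentation above.

```python
from typing import Dict, List, Sequence, Tuple


class ToyStorage:
    """Hypothetical stand-in for a trained storage; it only offers query_autonomy."""

    def __init__(self, scores: Dict[Tuple[str, ...], float]) -> None:
        self._scores = scores

    def query_autonomy(self, ngram: Sequence[str]) -> float:
        # Assumption: n-grams never seen in "training" get a negative score.
        return self._scores.get(tuple(ngram), -1.0)


class ToySegmenter:
    """Illustrative segmenter with the same call shape as the class documented above."""

    def __init__(self, storage: ToyStorage, max_ngram_length: int) -> None:
        self.storage = storage
        self.max_ngram_length = max_ngram_length

    def segment(self, sentence: List[str]) -> List[List[str]]:
        # best[i] holds (total score, fragments) for the best split of sentence[:i].
        best: List[Tuple[float, List[List[str]]]] = [(0.0, [])]
        for end in range(1, len(sentence) + 1):
            candidates = []
            # A fragment may span at most max_ngram_length tokens.
            for start in range(max(0, end - self.max_ngram_length), end):
                fragment = sentence[start:end]
                score = best[start][0] + self.storage.query_autonomy(fragment)
                candidates.append((score, best[start][1] + [fragment]))
            best.append(max(candidates, key=lambda c: c[0]))
        return best[len(sentence)][1]


# Hand-written scores standing in for a trained corpus model.
storage = ToyStorage({
    ("i",): 1.0,
    ("love",): 1.0,
    ("new",): 0.5,
    ("york",): 0.5,
    ("new", "york"): 2.0,
})
segmenter = ToySegmenter(storage, max_ngram_length=2)
fragments = segmenter.segment(["i", "love", "new", "york"])
print(fragments)  # [['i'], ['love'], ['new', 'york']]
```

The result is a list of fragments, each of which is a list of tokens, and concatenating the fragments gives back the original sentence. With the real library, the storage would be trained on a corpus rather than filled in by hand.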