Represents the state of word persistence in a page. See https://meta.wikimedia.org/wiki/Research:Content_persistence
Parameters: |
|
---|---|
Example: | >>> from pprint import pprint
>>> from mw.lib import persistence
>>>
>>> state = persistence.State()
>>>
>>> pprint(state.process("Apples are red.", revision=1))
([Token(text='Apples', revisions=[1]),
Token(text=' ', revisions=[1]),
Token(text='are', revisions=[1]),
Token(text=' ', revisions=[1]),
Token(text='red', revisions=[1]),
Token(text='.', revisions=[1])],
[Token(text='Apples', revisions=[1]),
Token(text=' ', revisions=[1]),
Token(text='are', revisions=[1]),
Token(text=' ', revisions=[1]),
Token(text='red', revisions=[1]),
Token(text='.', revisions=[1])],
[])
>>> pprint(state.process("Apples are blue.", revision=2))
([Token(text='Apples', revisions=[1, 2]),
Token(text=' ', revisions=[1, 2]),
Token(text='are', revisions=[1, 2]),
Token(text=' ', revisions=[1, 2]),
Token(text='blue', revisions=[2]),
Token(text='.', revisions=[1, 2])],
[Token(text='blue', revisions=[2])],
[Token(text='red', revisions=[1])])
>>> pprint(state.process("Apples are red.", revision=3)) # A revert!
([Token(text='Apples', revisions=[1, 2, 3]),
Token(text=' ', revisions=[1, 2, 3]),
Token(text='are', revisions=[1, 2, 3]),
Token(text=' ', revisions=[1, 2, 3]),
Token(text='red', revisions=[1, 3]),
Token(text='.', revisions=[1, 2, 3])],
[],
[])
|
Modifies the internal state based on a change to the content and returns the sets of words added and removed.
Parameters: |
|
---|---|
Returns: | Three Tokens lists: the current tokens, the tokens added, and the tokens removed |
Represents a list of Token objects with some useful helper functions.
Example: | >>> from mw.lib.persistence import Token, Tokens
>>>
>>> tokens = Tokens()
>>> tokens.append(Token("foo"))
>>> tokens.extend([Token(" "), Token("bar")])
>>>
>>> tokens[0]
Token(text='foo', revisions=[])
>>>
>>> "".join(tokens.texts())
'foo bar'
|
---|
Generates a sequence of operations using difflib.SequenceMatcher.
Parameters: |
|
---|
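The operations are of the kind produced by difflib.SequenceMatcher.get_opcodes() in the standard library. The following is a minimal sketch of that idea, not this module's own code: the helper name diff_tokens is hypothetical, and the plain-string tokens stand in for whatever token type the library actually passes.

```python
from difflib import SequenceMatcher


def diff_tokens(old, new):
    """Hypothetical helper: return difflib opcodes that turn `old` into `new`.

    Each opcode is a tuple (tag, i1, i2, j1, j2), where tag is one of
    'equal', 'replace', 'insert' or 'delete', old[i1:i2] is the affected
    span of the old sequence and new[j1:j2] the matching span of the new one.
    """
    # autojunk=False disables difflib's popularity heuristic so that
    # frequent tokens (such as spaces) are never ignored during matching.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return matcher.get_opcodes()


old = ["Apples", " ", "are", " ", "red", "."]
new = ["Apples", " ", "are", " ", "blue", "."]

ops = diff_tokens(old, new)
for op in ops:
    print(op)
# ('equal', 0, 4, 0, 4)
# ('replace', 4, 5, 4, 5)
# ('equal', 5, 6, 5, 6)
```

Only the 'replace' span differs between the two revisions, which corresponds to 'red' being removed and 'blue' being added in the State example.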
Applies operations (delta) to copy items from old to new.
Parameters: |
|
---|---|
Returns: | An iterator over elements matching new but copied from old |
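As a rough illustration of what "matching new but copied from old" can mean, the sketch below walks a list of difflib opcodes and yields the old object wherever a span is unchanged, so that any history attached to it is preserved. The names Item and apply_ops are hypothetical and exist only for this example; the library's actual signature and token type may differ.

```python
from difflib import SequenceMatcher


class Item:
    """Hypothetical stand-in for a token that carries revision history."""

    def __init__(self, text, revisions=None):
        self.text = text
        self.revisions = list(revisions or [])


def apply_ops(ops, old, new):
    """Yield items in the order of `new`, reusing objects from `old`
    wherever the opcodes mark a span as unchanged."""
    for tag, i1, i2, j1, j2 in ops:
        if tag == "equal":
            # Unchanged span: copy the existing objects from old.
            yield from old[i1:i2]
        elif tag in ("insert", "replace"):
            # New content: take the fresh objects from new.
            yield from new[j1:j2]
        # 'delete' spans contribute nothing to the new sequence.


old = [Item(t, [1]) for t in ["Apples", " ", "are", " ", "red", "."]]
new = [Item(t) for t in ["Apples", " ", "are", " ", "blue", "."]]

# Diff on the token texts, then apply the result to the objects.
ops = SequenceMatcher(
    None, [i.text for i in old], [i.text for i in new], autojunk=False
).get_opcodes()

result = list(apply_ops(ops, old, new))
print("".join(item.text for item in result))  # Apples are blue.
print(result[0] is old[0])                     # True
print(result[4] is new[4])                     # True
```

Because unchanged items are the same objects as in old, appending a revision ID to each of them afterwards would extend their existing history rather than start a new one.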