Detection

The primary purpose of this library is to help detect reverting activity. It provides two options: mwreverts.detect() takes an iterable of (checksum, revision_data) pairs and returns an iterator of mwreverts.Revert, while mwreverts.Detector provides a process() method that lets you process revisions one at a time.

mwreverts.detect(checksum_revisions, radius=15)

Detects reverts that occur in a sequence of revisions. Note that the revision metadata is simply returned as given when a revert is detected.

This function serves as a convenience wrapper around calls to mwreverts.Detector's process() method.

Parameters:
checksum_revisions
: iterable ( (checksum, revision) )

an iterable over tuples of checksum and revision metadata

radius
: int

a positive integer indicating the maximum revision distance that a revert can span.

Returns:

an iterator over mwreverts.Revert

Example:
>>> import mwreverts
>>>
>>> checksum_revisions = [
...     ("aaa", {'rev_id': 1}),
...     ("bbb", {'rev_id': 2}),
...     ("aaa", {'rev_id': 3}),
...     ("ccc", {'rev_id': 4})
... ]
>>>
>>> list(mwreverts.detect(checksum_revisions))
[Revert(reverting={'rev_id': 3},
        reverteds=[{'rev_id': 2}],
        reverted_to={'rev_id': 1})]
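
The checksums in the example above are placeholders. In practice a checksum is usually a hash of the revision text. The following is a minimal sketch of building the (checksum, revision) pairs, assuming SHA-1 over UTF-8 encoded text. The helper name checksum_pairs and the 'text' field are hypothetical and are not part of mwreverts.

```python
import hashlib


def checksum_pairs(revisions):
    """Yield (checksum, revision) pairs from dicts that carry a 'text' field.

    Hypothetical helper for illustration; not part of mwreverts.
    """
    for revision in revisions:
        # Identical text always produces an identical hex digest.
        digest = hashlib.sha1(revision["text"].encode("utf-8")).hexdigest()
        # Keep only the metadata that should be echoed back in a Revert.
        yield digest, {"rev_id": revision["rev_id"]}


revisions = [
    {"rev_id": 1, "text": "foo"},
    {"rev_id": 2, "text": "foo bar"},
    {"rev_id": 3, "text": "foo"},
]
pairs = list(checksum_pairs(revisions))
```

Pairs of this shape match the input that mwreverts.detect() is documented to accept. Revisions 1 and 3 share a checksum because their text is identical.
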
class mwreverts.Detector(*args, **kwargs)

Detects revert events in a stream of revisions (to the same page) based on matching checksums. To detect reverts, construct an instance of this class and call process() on each revision in chronological order.

See https://meta.wikimedia.org/wiki/R:Identity_revert

Parameters:
radius
: int

a positive integer indicating the maximum revision distance that a revert can span.

Example:
>>> import mwreverts
>>> detector = mwreverts.Detector()
>>>
>>> detector.process("aaa", {'rev_id': 1})
>>> detector.process("bbb", {'rev_id': 2})
>>> detector.process("aaa", {'rev_id': 3})
Revert(reverting={'rev_id': 3},
       reverteds=[{'rev_id': 2}],
       reverted_to={'rev_id': 1})
>>> detector.process("ccc", {'rev_id': 4})
process(checksum, revision=None)

Process a new revision and detect a revert if it occurred. Note that you can pass whatever you like as revision and it will be returned in the case that a revert occurs.

Parameters:
checksum
: str

Any identity-matchable string-based hash of revision content

revision
: mixed

Revision metadata. Whatever is passed here is returned unchanged in the case of a revert.

Returns:

a Revert if one occurred, or None
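
To make the checksum-matching idea concrete, here is a simplified, self-contained sketch of identity-revert detection. It is not the library's implementation. The class name SketchDetector and the stand-in Revert tuple are hypothetical, and the sketch assumes that radius limits how many intervening revisions a single revert may undo.

```python
from collections import namedtuple

# Stand-in for mwreverts.Revert so that the sketch runs on its own.
Revert = namedtuple("Revert", ["reverting", "reverteds", "reverted_to"])


class SketchDetector:
    """Illustrative identity-revert detector (hypothetical, simplified)."""

    def __init__(self, radius=15):
        if radius < 1:
            raise ValueError("radius must be a positive integer")
        self.radius = radius
        self.history = []  # recent (checksum, revision) pairs, oldest first

    def process(self, checksum, revision=None):
        revert = None
        # Search the recent history, newest first, for the same checksum.
        for i in range(len(self.history) - 1, -1, -1):
            if self.history[i][0] == checksum:
                # Everything after the match was undone by this revision.
                reverteds = [rev for _, rev in self.history[i + 1:]]
                if reverteds:  # a repeat of the previous revision undoes nothing
                    revert = Revert(revision, reverteds, self.history[i][1])
                break
        self.history.append((checksum, revision))
        # Assumption: keep the reverted-to candidate plus `radius` later revisions.
        del self.history[:-(self.radius + 1)]
        return revert


detector = SketchDetector()
results = [
    detector.process("aaa", {"rev_id": 1}),
    detector.process("bbb", {"rev_id": 2}),
    detector.process("aaa", {"rev_id": 3}),
    detector.process("ccc", {"rev_id": 4}),
]
```

Only the third call yields a revert, mirroring the Detector example above. Bounding the history keeps memory use constant on long page histories, at the cost of missing reverts that reach further back than the window.
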

class mwreverts.Revert(*args, **kwargs)

Represents a revert event. This class behaves like collections.namedtuple. Note that the datatypes of reverting, reverteds and reverted_to are not specified since those types will depend on the revision data provided during revert detection.

Attributes:
reverting

The reverting revision data : mixed

reverteds

The reverted revision data (ordered chronologically) : list( mixed )

reverted_to

The reverted-to revision data : mixed
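
Since the class is described as behaving like collections.namedtuple, a result can presumably be read by attribute name or unpacked like a tuple. The sketch below uses a plain namedtuple as a stand-in so that it runs without mwreverts installed. The field order (reverting, reverteds, reverted_to) is an assumption taken from the repr shown in the examples.

```python
from collections import namedtuple

# Stand-in with the documented attribute names; the real class is mwreverts.Revert.
Revert = namedtuple("Revert", ["reverting", "reverteds", "reverted_to"])

revert = Revert(
    reverting={"rev_id": 3},
    reverteds=[{"rev_id": 2}],
    reverted_to={"rev_id": 1},
)

# Access fields by attribute name...
reverted_ids = [rev["rev_id"] for rev in revert.reverteds]

# ...or unpack positionally, as with any tuple.
reverting, reverteds, reverted_to = revert
```
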

class mwreverts.DummyChecksum

Used when checking for reverts when the checksum of the revision of interest is unknown. DummyChecksums won't match each other or anything else, but they will match themselves and they are hashable.

>>> from mwreverts import DummyChecksum
>>> dummy1 = DummyChecksum()
>>> dummy1
<#140687347334280>
>>> dummy1 == dummy1
True
>>>
>>> dummy2 = DummyChecksum()
>>> dummy2
<#140687347334504>
>>> dummy1 == dummy2
False
>>>
>>> {"foo", "bar", dummy1, dummy1, dummy2}
{<#140687347334280>, 'foo', <#140687347334504>, 'bar'}
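
One situation where this may be useful is a revision whose text is missing, for example because it was deleted or suppressed. The sketch below shows the pattern with a stand-in class so that it runs without mwreverts installed. PlaceholderChecksum and checksum_or_dummy are hypothetical names, and SHA-1 is an assumed choice of hash.

```python
import hashlib


class PlaceholderChecksum:
    """Hypothetical stand-in for mwreverts.DummyChecksum.

    Plain Python objects compare and hash by identity, so an instance
    is equal only to itself.
    """


def checksum_or_dummy(text):
    """Hash the text if available, otherwise return a never-matching value."""
    if text is None:
        return PlaceholderChecksum()
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


checksums = [checksum_or_dummy(text) for text in ["foo", None, "foo", None]]
```

Here the two real checksums match each other, while the two placeholders match neither each other nor any hash, so a revision with unknown content cannot be mistaken for a revert.
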