Detection
The primary purpose of this library is to provide facilities to aid in
detecting reverting activity. You are provided with two options.
mwreverts.detect() takes an iterable of (checksum, revision_data) pairs and
returns an iterator of mwreverts.Revert. mwreverts.Detector, on the other
hand, provides a process() method that allows you to process revisions one
at a time.
- mwreverts.detect(checksum_revisions, radius=15)
Detects reverts that occur in a sequence of revisions. Note that revision metadata is not inspected; it is simply returned in the case of a revert.
This function serves as a convenience wrapper around calls to mwreverts.Detector's process() method.
Parameters:
- checksum_revisions : iterable( (checksum, revision) )
  an iterable over tuples of checksum and revision metadata
- radius : int
  a positive integer indicating the maximum revision distance that a revert can span
Returns: an iterator over mwreverts.Revert
Example:
    >>> import mwreverts
    >>>
    >>> checksum_revisions = [
    ...     ("aaa", {'rev_id': 1}),
    ...     ("bbb", {'rev_id': 2}),
    ...     ("aaa", {'rev_id': 3}),
    ...     ("ccc", {'rev_id': 4})
    ... ]
    >>>
    >>> list(mwreverts.detect(checksum_revisions))
    [Revert(reverting={'rev_id': 3}, reverteds=[{'rev_id': 2}], reverted_to={'rev_id': 1})]
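The example above uses hand-written checksums. As a minimal sketch of where such pairs might come from, the following builds (checksum, revision) tuples from revision text. The helper name checksum_pairs, the (metadata, text) input shape, and the choice of SHA-1 are all assumptions made for illustration; they are not part of mwreverts.

```python
import hashlib


def checksum_pairs(revisions):
    """Yield (checksum, metadata) pairs from (metadata, text) pairs.

    `revisions` is a hypothetical iterable of (metadata, text) tuples in
    chronological order. SHA-1 is an assumed choice of hash; any hash that
    maps identical content to identical strings would serve.
    """
    for metadata, text in revisions:
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        yield digest, metadata


# Made-up page history: revision 3 restores the text of revision 1.
history = [
    ({'rev_id': 1}, "Original text."),
    ({'rev_id': 2}, "Vandalized text!"),
    ({'rev_id': 3}, "Original text."),
]
pairs = list(checksum_pairs(history))
```

Pairs shaped like these are what detect() expects as checksum_revisions: identical text yields identical checksums, which is the signal used to identify a revert.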
- class mwreverts.Detector(*args, **kwargs)
Detects revert events in a stream of revisions (to the same page) based on matching checksums. To detect reverts, construct an instance of this class and call process() on each revision in chronological order.
See https://meta.wikimedia.org/wiki/R:Identity_revert
Parameters:
- radius : int
  a positive integer indicating the maximum revision distance that a revert can span
Example:
    >>> import mwreverts
    >>> detector = mwreverts.Detector()
    >>>
    >>> detector.process("aaa", {'rev_id': 1})
    >>> detector.process("bbb", {'rev_id': 2})
    >>> detector.process("aaa", {'rev_id': 3})
    Revert(reverting={'rev_id': 3}, reverteds=[{'rev_id': 2}], reverted_to={'rev_id': 1})
    >>> detector.process("ccc", {'rev_id': 4})
- process(checksum, revision=None)
Process a new revision and detect a revert if one occurred. Note that you can pass whatever you like as revision and it will be returned if a revert occurs.
Parameters:
- checksum : str
  Any identity-matchable string-based hash of revision content
- revision : mixed
  Revision metadata. Note that any data will just be returned in the case of a revert.
Returns: a Revert if one occurred, or None
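To make the idea of checksum-based detection within a radius concrete, here is a minimal, self-contained sketch. It is not the library's implementation: SketchDetector and SketchRevert are hypothetical stand-ins, and the exact boundary behaviour of radius in mwreverts may differ from the assumption made here (that the reverting revision and the reverted-to revision are at most radius revisions apart).

```python
from collections import namedtuple

# Hypothetical stand-in with the same three fields as mwreverts.Revert.
SketchRevert = namedtuple(
    "SketchRevert", ["reverting", "reverteds", "reverted_to"])


class SketchDetector:
    """Illustrative identity-revert detector; not the library's code."""

    def __init__(self, radius=15):
        if radius < 1:
            raise ValueError("radius must be a positive integer")
        self.radius = radius
        self.history = []  # recent (checksum, revision) pairs, oldest first

    def process(self, checksum, revision=None):
        revert = None
        # Search backwards for the most recent revision with this checksum.
        for i in range(len(self.history) - 1, -1, -1):
            if self.history[i][0] == checksum:
                reverteds = [rev for _, rev in self.history[i + 1:]]
                if reverteds:  # an immediate duplicate reverts nothing
                    revert = SketchRevert(
                        revision, reverteds, self.history[i][1])
                break
        self.history.append((checksum, revision))
        # Assumption: remember only the last `radius` revisions.
        self.history = self.history[-self.radius:]
        return revert


detector = SketchDetector(radius=2)
results = [
    detector.process("aaa", {'rev_id': 1}),
    detector.process("bbb", {'rev_id': 2}),
    detector.process("aaa", {'rev_id': 3}),
]
```

In this sketch, only the third call returns a revert. With radius=1 the same sequence would return None every time, because the first revision has already been forgotten by the time the third arrives.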
- class mwreverts.Revert(*args, **kwargs)
Represents a revert event. This class behaves like collections.namedtuple. Note that the data types of reverting, reverteds and reverted_to are not specified, since those types depend on the revision data provided during revert detection.
Attributes:
- reverting : mixed
  The reverting revision data
- reverteds : list( mixed )
  The reverted revision data (ordered chronologically)
- reverted_to : mixed
  The reverted-to revision data
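Because Revert behaves like a namedtuple, its fields can be read by name or unpacked by position. The sketch below uses a plain collections.namedtuple called RevertLike as a hypothetical stand-in with the same field names, so that it runs without the library installed.

```python
from collections import namedtuple

# Hypothetical stand-in with the same field names as mwreverts.Revert.
RevertLike = namedtuple(
    "RevertLike", ["reverting", "reverteds", "reverted_to"])

revert = RevertLike(
    reverting={'rev_id': 3},
    reverteds=[{'rev_id': 2}],
    reverted_to={'rev_id': 1},
)

# Access fields by attribute name ...
reverted_ids = [rev['rev_id'] for rev in revert.reverteds]

# ... or unpack positionally, like any tuple.
reverting, reverteds, reverted_to = revert
```

The revision data here are dictionaries only because that is what was passed in; whatever objects are supplied during detection are what appear in these fields.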
- class mwreverts.DummyChecksum
Used when checking for reverts when the checksum of the revision of interest is unknown. DummyChecksums won't match each other or anything else, but they will match themselves and they are hashable.
Example:
    >>> from mwreverts import DummyChecksum
    >>>
    >>> dummy1 = DummyChecksum()
    >>> dummy1
    <#140687347334280>
    >>> dummy1 == dummy1
    True
    >>>
    >>> dummy2 = DummyChecksum()
    >>> dummy2
    <#140687347334504>
    >>> dummy1 == dummy2
    False
    >>>
    >>> {"foo", "bar", dummy1, dummy1, dummy2}
    {<#140687347334280>, 'foo', <#140687347334504>, 'bar'}
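The behaviour described above (equal only to itself, yet hashable) can be approximated with identity-based equality. The following is a sketch of that idea and of one way a placeholder might be substituted for missing checksums; SketchDummyChecksum and checksum_or_dummy are hypothetical names, and the library's own class may be implemented differently.

```python
class SketchDummyChecksum:
    """Placeholder checksum that equals only itself (illustrative)."""

    def __eq__(self, other):
        # Identity comparison: two distinct placeholders never match.
        return self is other

    def __hash__(self):
        # Hashable, so it can be stored in sets and used as a dict key.
        return id(self)

    def __repr__(self):
        return "<#{0}>".format(id(self))


def checksum_or_dummy(checksum):
    """Return the checksum, or a fresh placeholder if it is missing."""
    return checksum if checksum is not None else SketchDummyChecksum()


# Made-up checksums: the two middle revisions have no known checksum.
checksums = [checksum_or_dummy(c) for c in ["aaa", None, None, "aaa"]]
```

Because each placeholder is unique, the two revisions with unknown checksums can never be mistaken for a revert of one another, while the known "aaa" checksums still match as usual.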