This library provides facilities for sessionizing chronological sequences of activities into mwsessions.Session objects. There are two options: mwsessions.sessionize() takes an iterable of (user, timestamp, event_data) triples and returns an iterator of mwsessions.Session, while mwsessions.Sessionizer provides a process() method that lets you process events one at a time.
Clusters user sessions from a sequence of user events. Note that event data is not interpreted; it is simply returned in the resulting sessions.
This function serves as a convenience wrapper around calls to Sessionizer's process() method.
Parameters: |
|
---|---|
Returns: | an iterator over Session |
Example: | >>> import mwsessions
>>>
>>> user_events = [
... ("Willy on wheels", 100000, {'rev_id': 1}),
... ("Walter", 100001, {'rev_id': 2}),
... ("Willy on wheels", 100001, {'rev_id': 3}),
... ("Walter", 100035, {'rev_id': 4}),
... ("Willy on wheels", 103602, {'rev_id': 5})
... ]
>>>
>>> for user, events in mwsessions.sessionize(user_events):
... (user, events)
...
('Willy on wheels', [{'rev_id': 1}, {'rev_id': 3}])
('Walter', [{'rev_id': 2}, {'rev_id': 4}])
('Willy on wheels', [{'rev_id': 5}])
|
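The overview describes the input as a chronological sequence of (user, timestamp, event_data) triples. The sketch below shows one way such input might be prepared from unordered records. The `raw_revisions` list and its keys are hypothetical, and timestamps are assumed to be plain comparable numbers.

```python
# Hypothetical, unordered records; the keys are assumptions for illustration.
raw_revisions = [
    {"user": "Walter", "timestamp": 100035, "rev_id": 4},
    {"user": "Willy on wheels", "timestamp": 100000, "rev_id": 1},
    {"user": "Walter", "timestamp": 100001, "rev_id": 2},
]

# Sort by timestamp, then reshape into (user, timestamp, event_data) triples.
user_events = [
    (rev["user"], rev["timestamp"], {"rev_id": rev["rev_id"]})
    for rev in sorted(raw_revisions, key=lambda rev: rev["timestamp"])
]
```

The resulting `user_events` list has the same shape as the one used in the example above.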
Constructs an object that manages state for sessionization. Since sessions expire once activities stop for at least cutoff seconds, this class manages a cache of active sessions and uses that to process new events.
Parameters: |
|
---|---|
Example: | >>> import mwsessions
>>>
>>> cache = mwsessions.Sessionizer(cutoff=3600)
>>>
>>> list(cache.process("Willy on wheels", 100000, {'rev_id': 1}))
[]
>>> list(cache.process("Walter", 100001, {'rev_id': 2}))
[]
>>> list(cache.process("Willy on wheels", 100001, {'rev_id': 3}))
[]
>>> list(cache.process("Walter", 100035, {'rev_id': 4}))
[]
>>> list(cache.process("Willy on wheels", 103602, {'rev_id': 5}))
[Session(user='Willy on wheels',
events=[{'rev_id': 1}, {'rev_id': 3}])]
>>> list(cache.get_active_sessions())
[Session(user='Walter', events=[{'rev_id': 2}, {'rev_id': 4}]),
Session(user='Willy on wheels', events=[{'rev_id': 5}])]
|
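The cache-based approach described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the names `SketchSession`, `SketchSessionizer` and `sessionize_sketch` are hypothetical, timestamps are assumed to be plain numbers of seconds, and a session is assumed to expire once the gap since its last event is at least `cutoff` seconds.

```python
from collections import namedtuple

# Hypothetical stand-in for a session record (assumed fields: user, events).
SketchSession = namedtuple("SketchSession", ["user", "events"])


class SketchSessionizer:
    """Illustrative cache of active sessions; not the library's code."""

    def __init__(self, cutoff=3600):
        self.cutoff = cutoff
        # user -> (timestamp of last event, list of event data)
        self.active = {}

    def process(self, user, timestamp, event):
        """Record one event and return the sessions expired by `timestamp`."""
        # Assumption: a session expires once the gap since its last event
        # is at least `cutoff` seconds.
        expired_users = [
            u for u, (last, _) in self.active.items()
            if timestamp - last >= self.cutoff
        ]
        expired = []
        for u in expired_users:
            _, events = self.active.pop(u)
            expired.append(SketchSession(u, events))

        # Append to the user's active session, or start a new one.
        _, events = self.active.get(user, (None, []))
        events.append(event)
        self.active[user] = (timestamp, events)
        return expired

    def get_active_sessions(self):
        """Yield the sessions still open, in cache insertion order."""
        for user, (_, events) in self.active.items():
            yield SketchSession(user, events)


def sessionize_sketch(user_events, cutoff=3600):
    """Drive the sketch over chronological (user, timestamp, event) triples."""
    sessionizer = SketchSessionizer(cutoff)
    for user, timestamp, event in user_events:
        yield from sessionizer.process(user, timestamp, event)
    # Input exhausted: flush whatever is still active.
    yield from sessionizer.get_active_sessions()


user_events = [
    ("Willy on wheels", 100000, {"rev_id": 1}),
    ("Walter", 100001, {"rev_id": 2}),
    ("Willy on wheels", 100001, {"rev_id": 3}),
    ("Walter", 100035, {"rev_id": 4}),
    ("Willy on wheels", 103602, {"rev_id": 5}),
]
sessions = list(sessionize_sketch(user_events))
```

In this sketch the fifth event arrives 3601 seconds after the first user's previous event, so that user's earlier session is closed and a new one is started, while the second user's session stays active until the input is exhausted.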
Represents a user session (a cluster of events for a user). This class behaves like collections.namedtuple. Note that the datatype of events is not specified, since it depends on the event data provided during sessionization.
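Attributes: | user: the user whose events make up the session; events: the list of event data in the session, in the order processed |
---|---|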
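Because the class is described as behaving like collections.namedtuple, a session can be read by field name or unpacked positionally. The sketch below uses a hypothetical stand-in, `SketchSession`, built with `collections.namedtuple`; the field names `user` and `events` are taken from the example output above.

```python
from collections import namedtuple

# Hypothetical stand-in; the real class is mwsessions.Session.
SketchSession = namedtuple("SketchSession", ["user", "events"])

session = SketchSession("Walter", [{"rev_id": 2}, {"rev_id": 4}])

# Access fields by name ...
rev_ids = [event["rev_id"] for event in session.events]

# ... or unpack positionally, as in the sessionize() example.
user, events = session
```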