Data Model

A timeline is a chronological list of site activity that is relevant to the viewer. Users indirectly control the content of their timelines by “following” other users or by “watching” projects.

The primary constructs required for timelines are:

  • A graph of connections between nodes in the network. In our case, the nodes will be Users and Projects (at least at first).
  • An Activity object, representing something that happens on the site.

The Graph

The graph is a list of nodes, each node having inbound edges (from other nodes that are watching or following it) and outbound edges (to other nodes that it is watching or following).

The graph will be represented in a MongoDB collection, with one document per node:

[{"node_id": "user:4f9f4cbf0594ca4156000a5e",
  "followers": [... list of node ids ...],
  "following": [... list of node ids ...]},
  ...
]

According to this scheme, each graph edge is stored twice. So, for example, when a user watches a project, the project node would be added to the user’s following list, and the user node would be added to the project’s followers list.
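
The double bookkeeping described above can be sketched as follows. This is a minimal in-memory illustration, not the library's code: the `graph` dict stands in for the collection, and `get_node`, `follow`, and the `project:example` node id are hypothetical names introduced only for this example.

```python
# In-memory stand-in for the graph collection: node_id -> node document.
# (Assumption: the real library persists these documents in MongoDB.)
graph = {}


def get_node(node_id):
    """Return the document for node_id, creating an empty one if needed."""
    return graph.setdefault(
        node_id, {"node_id": node_id, "followers": [], "following": []})


def follow(follower_id, followee_id):
    """Record one edge by writing both of its stored copies."""
    follower = get_node(follower_id)
    followee = get_node(followee_id)
    # Outbound copy of the edge, kept on the follower.
    if followee_id not in follower["following"]:
        follower["following"].append(followee_id)
    # Inbound copy of the same edge, kept on the node being followed.
    if follower_id not in followee["followers"]:
        followee["followers"].append(follower_id)


# A user watches a project (the project id is a made-up example value).
follow("user:4f9f4cbf0594ca4156000a5e", "project:example")
```

Storing the edge on both sides means either direction ("who follows this node?" or "whom does this node follow?") can be answered by reading a single document, at the cost of two writes per edge.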

In Python code, any object that implements the NodeBase interface can participate in the graph, and can therefore follow or be followed by other nodes. The easiest way to do this is to derive your class from Node (or use it as a mixin) and override the node_id property:

class User(MappedClass, Node):
    @property
    def node_id(self):
        """A string that uniquely identifies this node in the network."""
        return "user:%s" % self._id

Activities

An activity consists of an actor, a verb, an object, and optionally, a target. This is best illustrated with an example:

John   posted a comment on ticket #42.
----   ------   -------    ----------
actor  verb     object     target

The datatypes of these are as follows:

  • actor - Node, ActivityObject
  • verb - str
  • object - ActivityObject
  • target - ActivityObject

To use an object in an Activity, subclass from ActivityObject (or use it as a mixin), and override the required properties:

class Ticket(VersionedArtifact, ActivityObject):
    @property
    def activity_name(self):
        """Display name for this object, in the context of an Activity."""
        return "Ticket #%s" % self.ticket_num

    @property
    def activity_url(self):
        """URL of this object."""
        return self.url()

Note that the actor must be an ActivityObject and a Node, since it must be a participant in the network in order to perform an activity.

Note

ActivityObject + Node might be combined into an Actor interface.

Activities will be represented in a MongoDB collection:

[{"node_id": "user:4f9f4cbf0594ca4156000a5e",
  "actor": {
      "activity_name": "John",
      "activity_url":  "http://..."},
  "verb": "posted",
  "object": {
      "activity_name": "comment",
      "activity_url":  "http://..."},
  "target": {
      "activity_name": "ticket #42",
      "activity_url":  "http://..."},
  "published": ISODate("2011-10-12T14:54:02.069Z")},
  ...
]

As you can see, an activity is related to a Node via a node_id when it is stored. A copy of the activity is stored for every Node involved in the activity.
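
To make the "one copy per node" rule concrete, here is a rough sketch that builds the documents to be stored. It assumes the caller already knows which nodes are involved in the activity (the text above does not define that set), and `build_activity_docs` is a hypothetical helper, not part of the library.

```python
import copy
from datetime import datetime, timezone


def build_activity_docs(node_ids, actor, verb, obj, target=None, published=None):
    """Return one activity document per involved node.

    node_ids is assumed to be supplied by the caller; how the set of
    involved nodes is determined is outside the scope of this sketch.
    """
    published = published or datetime.now(timezone.utc)
    base = {"actor": actor, "verb": verb, "object": obj, "published": published}
    if target is not None:  # the target is optional
        base["target"] = target
    docs = []
    for node_id in node_ids:
        doc = copy.deepcopy(base)  # each node gets an independent copy
        doc["node_id"] = node_id
        docs.append(doc)
    return docs


docs = build_activity_docs(
    ["user:4f9f4cbf0594ca4156000a5e", "project:example"],  # example ids
    actor={"activity_name": "John", "activity_url": "http://..."},
    verb="posted",
    obj={"activity_name": "comment", "activity_url": "http://..."},
    target={"activity_name": "ticket #42", "activity_url": "http://..."},
)
```

Each resulting document differs only in its node_id, which is what lets a timeline for one node be read without consulting the others.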

Timelines

Timelines (or “activity streams”) are ordered lists of activities. We want to be able to see a timeline of activity for a Node in our graph. For example, when I look at my timeline, I want to see my recent activities as well as the activities of the people I am following.

When someone else looks at my timeline, they probably want to see only those activities that I performed (where I was the actor). When they look at the timeline of a project, they probably want to see all recent activity on that project, regardless of actor.

Timelines are requested from the Aggregator:

Aggregator.get_timeline(node, page=0, limit=100, actor_only=False)

As you can see, timelines are requested for a particular node, and can be paged and limited in size. actor_only=True means, “only give me activities that this node performed.”
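
The effect of these parameters can be illustrated with a small standalone function. This is only a sketch under stated assumptions: `page_timeline` is a hypothetical name, the newest-first ordering is assumed, and the `actor_node_id` field is invented here to identify who performed an activity (the stored documents shown earlier do not include it).

```python
def page_timeline(activities, node_id, page=0, limit=100, actor_only=False):
    """Return one page of a node's activities, newest first (assumed order)."""
    if actor_only:
        # Keep only activities this node performed. "actor_node_id" is an
        # assumed field, not one shown in the stored documents above.
        activities = [a for a in activities
                      if a.get("actor_node_id") == node_id]
    ordered = sorted(activities, key=lambda a: a["published"], reverse=True)
    start = page * limit  # page is zero-based
    return ordered[start:start + limit]
```

With `page=0, limit=100` this returns the 100 most recent activities; `page=1` returns the next 100, and so on.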

The Aggregator aggregates activities together from nodes in the network that are connected. If I am following someone, I want to see their activity in my timeline. It’s the Aggregator’s job to combine these activities together into one timeline.
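
One way to picture this combining step is as a merge of several per-node activity lists. The sketch below is not the Aggregator's actual implementation; `aggregate` and `activities_by_node` are hypothetical names, and it assumes each per-node list is already sorted newest first.

```python
import heapq
from itertools import islice


def aggregate(node_id, following, activities_by_node, limit=100):
    """Merge a node's own activities with those of the nodes it follows.

    Assumes every list in activities_by_node is sorted newest first;
    heapq.merge relies on that to produce a correctly ordered result.
    """
    streams = [activities_by_node.get(nid, [])
               for nid in [node_id] + list(following)]
    merged = heapq.merge(*streams, key=lambda a: a["published"], reverse=True)
    # Stop as soon as enough activities have been collected.
    return list(islice(merged, limit))
```

Because a copy of an activity is stored for every node involved, a real aggregation step would likely also need to drop duplicates; that is left out here to keep the merge itself visible.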

In general we want the aggregation to be done out-of-band so that when a timeline is requested, it’s already prepared. This can be achieved by using a task processing system. The Aggregator can also be subclassed to customize the way timelines are created. More on these topics later.
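
The out-of-band idea can be sketched with a plain queue standing in for a task processing system. Everything named here (`pending`, `prepared_timelines`, `request_aggregation`, `run_pending`, `get_prepared_timeline`) is hypothetical and exists only to show the separation between the write path, the worker, and the read path.

```python
from collections import deque

pending = deque()        # node ids whose timelines need rebuilding
prepared_timelines = {}  # node_id -> timeline built ahead of time


def request_aggregation(node_id):
    """Write path: cheaply record that this node's timeline is stale."""
    if node_id not in pending:
        pending.append(node_id)


def run_pending(build_timeline):
    """Worker: rebuild every queued timeline using the given callable."""
    while pending:
        node_id = pending.popleft()
        prepared_timelines[node_id] = build_timeline(node_id)


def get_prepared_timeline(node_id):
    """Read path: return the prepared timeline without aggregating."""
    return prepared_timelines.get(node_id, [])
```

In a real deployment `run_pending` would run in a separate worker process, so a timeline request only pays for a lookup rather than for the aggregation itself.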

Future Goals

  • Get rid of the global _director variable.
  • The Aggregator is meant to be subclassed and changed/extended, but there is no way to tell the library to use an external Aggregator. Provide a hook for doing so.
  • When this library was started, we were having performance issues with Ming, so raw pymongo was used. With Ming performing well now, it may be worthwhile to revisit that decision. Using Ming could arguably make the code easier to understand and maintain.
  • It would be really nice to neatly decouple the persistence code from the rest of the library so that alternate storage backends could be implemented and used.
  • Timeline aggregation may be costly, and should be performed in a separate process. Provide a hook point that will be called when an aggregation is needed, so client code can use whatever task processing system is desired.
