revscoring.datasources.revision_oriented
Implements a set of datasources oriented around a single revision. These are
useful for extracting features of edit and article quality.
-
revscoring.datasources.revision_oriented.revision = {revision}
Represents the base revision of interest. Its structure follows the Revision class documented below.
Supporting classes
-
class revscoring.datasources.revision_oriented.Revision(name, include_parent=True, include_user=True, include_user_info=True, include_user_last_revision=False, include_page=True, include_page_creation=False, include_content=False)
Represents a revision
-
id = None
int : Revision ID
-
timestamp = None
mwtypes.Timestamp : Timestamp the revision was saved
-
comment = None
str : The comment saved with the revision
-
byte_len = None
int : The length of the revision content in bytes
-
minor = None
bool : Was the revision flagged as minor?
-
content_model = None
str : Describes the format of revision content
-
text = None
str : The decoded (Unicode) text of the revision content
-
parent = None
Revision : The parent (aka “previous”) revision of the page.
-
page = None
Page : The page in which the revision was saved.
-
user = None
User : The user who saved the revision.
-
diff = None
Diff : The difference between this revision and the parent revision.
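
The include_* flags on Revision decide which nested structures are attached. The sketch below imitates that shape with plain Python stand-ins. SketchNode, SketchUser, SketchRevision and the dotted-name scheme are assumptions made for illustration, not revscoring's implementation, and only a few attributes are shown.

```python
class SketchNode:
    """Hypothetical stand-in for a named datasource: it only holds a name."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<{0}>".format(self.name)


class SketchUser(SketchNode):
    """Hypothetical stand-in for User, with an optional info branch."""

    def __init__(self, name, include_info=True):
        super().__init__(name)
        self.id = SketchNode(name + ".id")
        self.text = SketchNode(name + ".text")
        # Optional branch: left as None unless requested.
        self.info = SketchNode(name + ".info") if include_info else None


class SketchRevision(SketchNode):
    """Hypothetical stand-in for Revision, with a subset of its flags."""

    def __init__(self, name, include_parent=True, include_user=True,
                 include_content=False):
        super().__init__(name)
        self.id = SketchNode(name + ".id")
        self.timestamp = SketchNode(name + ".timestamp")
        # Content is only attached when include_content is set.
        self.text = SketchNode(name + ".text") if include_content else None
        self.user = SketchUser(name + ".user") if include_user else None
        # The parent is itself a revision; it is built without a parent of
        # its own so the nesting stops after one level.
        self.parent = (
            SketchRevision(name + ".parent", include_parent=False,
                           include_user=include_user,
                           include_content=include_content)
            if include_parent else None
        )


revision = SketchRevision("revision", include_content=True)
print(revision.parent.text.name)  # revision.parent.text
print(revision.parent.parent)     # None
```

Keeping optional branches as None mirrors the "attribute = None" defaults listed above, so callers can check for a branch before descending into it.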
-
class revscoring.datasources.revision_oriented.Diff(name)
Represents the difference between two sequential revisions.
-
class revscoring.datasources.revision_oriented.Page(name, include_creation=False)
Represents a revision’s page
-
id = None
int : The page’s ID
-
title = None
str : The page’s title (namespace stripped)
-
namespace = None
Namespace : The namespace information.
-
creation = None
Revision : The first revision to the page.
-
class revscoring.datasources.revision_oriented.Namespace(name)
Represents a page’s namespace
-
id = None
int : The namespace’s ID
-
name = None
str : The name of the namespace
-
class revscoring.datasources.revision_oriented.User(name, include_info=True, include_last_revision=False)
Represents a user’s id and name/ip
-
id = None
int : The id of the user who saved the edit. 0 for IPs.
-
text = None
str : The user’s name or IP address
-
info = None
UserInfo : Information about the user.
-
last_revision = None
Revision : The last revision the user saved before the revision of reference.
-
class revscoring.datasources.revision_oriented.UserInfo(name)
Represents a user’s information
-
editcount = None
int : A count of edits the user has ever saved
-
registration = None
mwtypes.Timestamp : The date the user registered
-
groups = None
set ( str ) : The groups the user is a member of
-
emailable = None
bool : True if the user is emailable, False otherwise
-
gender = None
str : A string representing the user’s gender preference.
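
As a rough illustration of how values like these might feed edit-quality features, the sketch below computes two simple signals from a dictionary keyed by dotted names such as revision.byte_len. The dictionary, its keys, the sample values, and both helper functions are hypothetical. revscoring resolves datasources through its own dependency machinery, which is not shown here.

```python
# Hypothetical extracted values, keyed by dotted names that mirror the
# attribute tree documented above. The numbers are made up for the example.
values = {
    "revision.byte_len": 5120,
    "revision.parent.byte_len": 4800,
    "revision.minor": False,
    "revision.user.id": 0,  # the docs above state that IPs have id 0
    "revision.user.text": "192.0.2.17",
}


def bytes_changed(values):
    """Size difference between the revision and its parent, in bytes."""
    return values["revision.byte_len"] - values["revision.parent.byte_len"]


def is_anonymous(values):
    """True when the saving user is an IP, i.e. the user id is 0."""
    return values["revision.user.id"] == 0


print(bytes_changed(values))  # 320
print(is_anonymous(values))   # True
```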