revscoring.features
This module implements a set of revscoring.Feature objects
for use in scoring revisions. Lists of revscoring.Feature
can be provided to revscoring.dependencies.solve() or,
more commonly, to a revscoring.Extractor to obtain simple
numerical or boolean values that can be used when modeling revision
scores. The provided features are split conceptually into a set of modules:
Feature collections
- revision_oriented
- Basic features of revisions. E.g. revision.user.text_matches(r'.*Bot')
- bytes
- Features of the number of bytes of content, byte length of characters,
etc.
- temporal
- Features of the time between events of interest. E.g.
revision.user.last_revision.seconds_since
- wikibase
- Features of wikibase items and changes made to them. E.g.
revision.diff.property_changed('P31')
- wikitext
- Features of wikitext content and differences between revisions. E.g.
revision.diff.uppercase_words_added
Functions
-
revscoring.features.trim(features, context=None)
Trims a feature set down to a bare set of Feature instances by
removing Modifier and Constant instances.
Parameters:
- features : list ( revscoring.Feature )
A feature list to trim
- context : dict | set
A context to apply while trimming
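The trimming idea described above can be illustrated with a small, self-contained sketch. It does not import revscoring: SketchFeature, SketchModifier, SketchConstant and trim_sketch are hypothetical stand-ins written for this example, and the sketch assumes that a modifier is replaced by the base features it depends on while constants are dropped. The context parameter is left out entirely.

```python
class SketchFeature:
    """Hypothetical stand-in for a base feature (not the revscoring class)."""

    def __init__(self, name, depends_on=()):
        self.name = name
        self.depends_on = list(depends_on)


class SketchModifier(SketchFeature):
    """Hypothetical stand-in for a feature that modifies other features."""


class SketchConstant(SketchFeature):
    """Hypothetical stand-in for a feature that always returns one value."""

    def __init__(self, value):
        super().__init__(repr(value))
        self.value = value


def trim_sketch(features):
    """Return the base features found underneath modifiers and constants."""
    trimmed = []

    def visit(feature):
        if isinstance(feature, SketchConstant):
            return  # constants are dropped
        if isinstance(feature, SketchModifier):
            for dependency in feature.depends_on:
                visit(dependency)  # look through the modifier
        else:
            trimmed.append(feature)  # a bare feature is kept as-is

    for feature in features:
        visit(feature)
    return trimmed


chars_added = SketchFeature("chars_added")
is_bot = SketchFeature("is_bot")

# A modifier wrapping a modifier wrapping a base feature and a constant
plus_one = SketchModifier("chars_added + 1",
                          depends_on=[chars_added, SketchConstant(1)])
log_chars = SketchModifier("log(chars_added + 1)", depends_on=[plus_one])

names = [f.name for f in trim_sketch([log_chars, is_bot])]
print(names)
```

In this sketch only the two base features survive; the nested modifiers and the constant are removed.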
Base classes
-
class revscoring.Feature(name, process=None, *, returns=None, depends_on=None)
Represents a predictive feature.
Parameters:
- name : str
The name of the feature
- process : func
A function that will generate a feature value
- returns : type
A type to compare the return value of process to.
- depends_on : list ( hashable )
An ordered list of dependencies that correspond
to the *args of process
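To make the relationship between process, returns and depends_on concrete, here is a minimal sketch that does not use revscoring itself. SketchFeature and solve_sketch are hypothetical names invented for this example; they only mimic the constructor signature shown above and assume that dependencies are resolved in order, passed to process as positional arguments, and that the result is checked against returns.

```python
class SketchFeature:
    """Hypothetical stand-in for revscoring.Feature (not the library class)."""

    def __init__(self, name, process=None, *, returns=None, depends_on=None):
        self.name = name
        self.process = process
        self.returns = returns
        self.depends_on = list(depends_on or [])


def solve_sketch(dependent, cache):
    """Resolve a dependent, reusing any value already present in cache."""
    if dependent in cache:
        return cache[dependent]
    if not isinstance(dependent, SketchFeature):
        raise KeyError("no value available for {!r}".format(dependent))

    # Dependencies are solved in order and become the *args of process
    args = [solve_sketch(dep, cache) for dep in dependent.depends_on]
    value = dependent.process(*args)

    # Compare the produced value with the declared return type
    if dependent.returns is not None and not isinstance(value, dependent.returns):
        raise TypeError("{} returned {!r}, expected {}".format(
            dependent.name, value, dependent.returns.__name__))

    cache[dependent] = value
    return value


# "text" is just a hashable key whose value is supplied through the cache
chars = SketchFeature("chars", len, returns=int, depends_on=["text"])
words = SketchFeature("words", lambda text: len(text.split()),
                      returns=int, depends_on=["text"])
chars_per_word = SketchFeature("chars_per_word",
                               lambda c, w: c / max(w, 1),
                               returns=float, depends_on=[chars, words])

value = solve_sketch(chars_per_word, {"text": "an example edit"})
print(value)
```

The order of depends_on matters here: chars is passed as the first argument of the lambda and words as the second.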
-
class revscoring.features.Modifier(name, process=None, *, returns=None, depends_on=None)
Represents a modification of one or more predictive features.
Parameters:
- name : str
The name of the feature
- process : func
A function that will generate a feature value
- returns : type
A type to compare the return value of process to.
- depends_on : list ( hashable )
An ordered list of dependencies that correspond
to the *args of process
-
class revscoring.features.Constant(value, name=None)
A special sub-type of revscoring.Feature that returns a constant value.
Parameters:
- value : mixed
Any type of potential feature value
- name : str
A name to give the feature