This module provides a set of wikiclass.Extractor classes, each implementing a strategy for identifying historical article quality labeling events. These labelings are used as training data for building prediction models.
This extractor looks for instances of templates that contain “class=<some class>” on article talk pages (namespace = 1) and parses the template name to obtain a project.
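
The matching described above can be sketched with regular expressions. This is a simplified illustration, not the module's implementation: the names `find_class_templates` and `project_from_name` are hypothetical, the patterns ignore nested templates, and the "WikiProject" prefix handling is an assumption about how a project might be derived from a template name.

```python
import re

# Hypothetical pattern: matches {{Name|params}} templates without nested braces.
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+)\|([^{}]*)\}\}")
# Matches a "class=<value>" parameter at the start of the parameter list
# or directly after a pipe, so "subclass=" is not picked up.
CLASS_RE = re.compile(r"(?:^|\|)\s*class\s*=\s*([^|]*)", re.IGNORECASE)


def find_class_templates(text):
    """Yield (template_name, class_value) for templates with a class= parameter."""
    for match in TEMPLATE_RE.finditer(text):
        name, params = match.group(1).strip(), match.group(2)
        class_match = CLASS_RE.search(params)
        if class_match:
            yield name, class_match.group(1).strip()


def project_from_name(name):
    """Derive a project from a template name (assumed 'WikiProject X' style)."""
    lowered = name.strip().lower()
    prefix = "wikiproject"
    if lowered.startswith(prefix):
        return lowered[len(prefix):].strip()
    return lowered


talk_text = (
    "{{WikiProject Biology|class=B|importance=low}} "
    "{{Talk header}} "
    "{{WikiProject Medicine | class = Start }}"
)
found = list(find_class_templates(talk_text))
```

A real talk page would call for a proper wikitext parser, since templates are often nested and parameters may span several lines.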
Implements a labeling event extraction strategy.
Parameters: |
|
---|
Processes an mw.xml_dump.Page and returns a generator of first-observations of a project/label pair.
Parameters: |
|
---|
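
The "first-observation" behavior can be illustrated with a small generator. This is a sketch under stated assumptions: the function name `first_observations`, the `(rev_id, timestamp, pairs)` input shape, and the keys of the yielded dictionaries are all hypothetical and are not taken from the module.

```python
def first_observations(revisions):
    """Yield one record per (project, label) pair the first time it is seen.

    `revisions` is assumed to be an iterable of (rev_id, timestamp, pairs)
    tuples in chronological order, where `pairs` holds (project, label) tuples.
    """
    seen = set()
    for rev_id, timestamp, pairs in revisions:
        for project, label in pairs:
            if (project, label) in seen:
                continue  # Already reported in an earlier revision.
            seen.add((project, label))
            yield {
                "rev_id": rev_id,
                "timestamp": timestamp,
                "project": project,
                "label": label,
            }


history = [
    (1, "2010-01-01T00:00:00Z", [("biology", "stub")]),
    (2, "2011-06-01T00:00:00Z", [("biology", "stub"), ("medicine", "start")]),
    (3, "2012-03-01T00:00:00Z", [("biology", "b"), ("medicine", "start")]),
]
events = list(first_observations(history))
```

Because the function is a generator, the revisions of a large page history can be consumed lazily rather than held in memory at once.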
Implements a template-based extraction strategy based on a from_template function that takes a template and returns a (project, label) pair.
Parameters: |
|
---|
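
As an illustration, a from_template function might look like the sketch below. The `Template` namedtuple is a hypothetical stand-in for whatever template object the extractor actually passes in, and the "WikiProject" prefix stripping and lowercasing are assumptions rather than documented behavior.

```python
from collections import namedtuple

# Hypothetical template representation: a name plus a dict of parameters.
Template = namedtuple("Template", ["name", "params"])


def from_template(template):
    """Return a (project, label) pair for a template, or None if it has no class."""
    label = template.params.get("class", "").strip().lower()
    if not label:
        return None
    project = template.name.strip().lower()
    prefix = "wikiproject"
    if project.startswith(prefix):
        project = project[len(prefix):].strip()
    return project, label


pair = from_template(Template("WikiProject Biology", {"class": " B "}))
missing = from_template(Template("Talk header", {}))
```

Returning None for templates without a class parameter lets the caller skip templates that carry no quality label.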
Processes an mw.xml_dump.Page and returns a generator of first-observations of a project/label pair.
Parameters: |
|
---|
Extracts a set of labels for a version of text by parsing templates.
Parameters: |
|
---|---|
Returns: | An iterator over (project, label) pairs |
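
The label extraction step can be sketched as a loop that applies a from_template function to each template and yields the distinct pairs. This sketch assumes the templates have already been parsed into `(name, params)` tuples. The names `extract_labels` (as a free function) and `simple_from_template` are illustrative only.

```python
def simple_from_template(template):
    """Toy from_template: map a (name, params) tuple to (project, label) or None."""
    name, params = template
    label = params.get("class", "").strip().lower()
    if not label:
        return None
    return name.lower().replace("wikiproject", "").strip(), label


def extract_labels(templates, from_template):
    """Yield each distinct (project, label) pair found in one version of a page."""
    seen = set()
    for template in templates:
        pair = from_template(template)
        if pair is not None and pair not in seen:
            seen.add(pair)
            yield pair


parsed_templates = [
    ("WikiProject Biology", {"class": "B"}),
    ("Talk header", {}),
    ("WikiProject Biology", {"class": "b"}),
    ("WikiProject Medicine", {"class": "Start"}),
]
labels = list(extract_labels(parsed_templates, simple_from_template))
```

Passing from_template in as an argument keeps the iteration logic independent of any one wiki's template conventions.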