revscoring.scorer_models
This module contains a collection of models that implement a simple function:
score(). Currently, all models are subclasses of
revscoring.scorer_models.MLScorerModel, which means that they also implement
train() and test() methods. See revscoring.scorer_models.statistics for
statistics that can be applied to models.
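The contract described above can be sketched without revscoring installed. The class below is a hypothetical stand-in (`MajorityClassModel` and the `prediction` key are illustrative, not part of the library) that mirrors only the method names documented on this page: train(), test() and score().

```python
from collections import Counter


class MajorityClassModel:
    """Hypothetical stand-in that mirrors the score/train/test method names."""

    def __init__(self, features, version=None):
        self.features = features  # plain names here; real models take Feature objects
        self.version = version
        self.majority = None

    def train(self, values_labels):
        # Remember the most common label seen in the training data.
        labels = [label for _, label in values_labels]
        self.majority = Counter(labels).most_common(1)[0][0]
        return {"n": len(labels)}

    def score(self, feature_values):
        # Ignores the feature values and always predicts the majority label.
        return {"prediction": self.majority}

    def test(self, values_labels):
        # Compare the prediction for each observation with its known label.
        values_labels = list(values_labels)
        correct = sum(
            self.score(values)["prediction"] == label
            for values, label in values_labels
        )
        return {"n": len(values_labels), "accuracy": correct / len(values_labels)}


model = MajorityClassModel(["chars_added", "is_anon"], version="0.0.1")
model.train([((10, False), False), ((500, True), True), ((3, False), False)])
result = model.test([((7, False), False), ((900, True), True)])
```

Each observation is a pair of (feature values, label), with the feature values in the same order as the features given to the constructor.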
Support Vector Classifiers
A collection of Support Vector Machine type classifier models.
-
class revscoring.scorer_models.LinearSVC(*args, **kwargs)
Implements a Support Vector Classifier model with a linear kernel.
Params: |
- features : list ( revscoring.Feature )
The features that the model will be trained on
- version : str
A version string representing the version of the model
- **kwargs
Passed to sklearn.svm.SVC
|
-
class revscoring.scorer_models.RBFSVC(*args, **kwargs)
Implements a Support Vector Classifier model with an RBF kernel.
Params: |
- features : list ( revscoring.Feature )
The features that the model will be trained on
- version : str
A version string representing the version of the model
- **kwargs
Passed to sklearn.svm.SVC
|
-
class revscoring.scorer_models.SVC(features, version=None, svc=None, balanced_sample_weight=False, scale=False, center=False, test_statistics=None, **kwargs)
Implements a Support Vector Classifier model.
Params: |
- features : list ( revscoring.Feature )
The features that the model will be trained on
- version : str
A version string representing the version of the model
- **kwargs
Passed to sklearn.svm.SVC
|
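The linear and RBF variants differ in the kernel function used to compare two feature vectors. A minimal sketch of the two standard kernels in plain Python follows. The helper names are hypothetical (they are not revscoring or scikit-learn functions), and the `gamma` default of 0.5 is an arbitrary value chosen for illustration.

```python
import math


def linear_kernel(x, y):
    # K(x, y) = x . y
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


u = (1.0, 2.0)
v = (3.0, 4.0)
linear_value = linear_kernel(u, v)  # 1*3 + 2*4
rbf_value = rbf_kernel(u, v)        # exp(-0.5 * 8)
```

The linear kernel grows with the magnitude of the vectors, while the RBF kernel depends only on the distance between them and equals 1 for identical vectors.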
Naive Bayes Classifiers
A collection of Naive Bayes type classifier models.
-
class revscoring.scorer_models.GaussianNB(*args, **kwargs)
Implements a Gaussian Naive Bayes model.
-
class revscoring.scorer_models.MultinomialNB(*args, **kwargs)
Implements a Multinomial Naive Bayes model.
-
class revscoring.scorer_models.BernoulliNB(*args, **kwargs)
Implements a Bernoulli Naive Bayes model.
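To illustrate what a Gaussian Naive Bayes model does with labeled feature values, here is a small sketch in plain Python. It is not the library's implementation; `fit_gaussian_nb` and `predict_gaussian_nb` are hypothetical helpers, and the variance floor of 1e-9 is an assumption made to avoid division by zero.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(values_labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_label = defaultdict(list)
    for values, label in values_labels:
        by_label[label].append(values)
    total = sum(len(rows) for rows in by_label.values())
    params = {}
    for label, rows in by_label.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # Small constant keeps the variance away from zero.
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        params[label] = (len(rows) / total, means, variances)
    return params


def predict_gaussian_nb(params, feature_values):
    """Return the label with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in params.items():
        score = math.log(prior)
        for value, mean, var in zip(feature_values, means, variances):
            score += -0.5 * math.log(2 * math.pi * var)
            score += -((value - mean) ** 2) / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


training = [
    ((1.0, 20.0), "good"), ((2.0, 22.0), "good"), ((1.5, 19.0), "good"),
    ((9.0, 2.0), "bad"), ((8.0, 3.0), "bad"), ((10.0, 1.0), "bad"),
]
params = fit_gaussian_nb(training)
```

The Gaussian variant assumes continuous feature values. The multinomial and Bernoulli variants are generally used for count-valued and binary features respectively.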
Random Forest
A collection of Random Forest type classifier models.
-
class revscoring.scorer_models.RF(features, *, version=None, rf=None, balanced_sample_weight=False, scale=False, center=False, test_statistics=None, **kwargs)
Implements a Random Forest model.
Gradient Boosting
A collection of Gradient Boosting type classifier models.
-
class revscoring.scorer_models.GradientBoosting(features, *, version=None, gb=None, balanced_sample_weight=False, scale=False, center=False, test_statistics=None, **kwargs)
Implements a Gradient Boosting model.
Abstract classes
-
class revscoring.ScorerModel(features, version=None, stats=None)
A model used to score a revision based on a set of features.
-
dump(f)
Writes serialized model information to a file.
-
format_info(format='str')
Returns formatted information about the model.
-
info()
Returns a raw dict containing all information about the model.
-
classmethod load(f)
Reads serialized model information from a file.
-
score(feature_values)
Make a prediction or otherwise use the model to generate a score.
Parameters: |
- feature_values : collection(mixed)
an ordered collection of values that correspond to the
Features provided to the constructor
|
Returns: | A dict of statistics
|
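The dump() and load() pair above forms a round trip through an open file object. The sketch below shows that pattern with a hypothetical `TinyModel` class. It serializes with JSON only to stay self-contained; the format revscoring actually writes is not stated here, so treat the choice as an assumption.

```python
import io
import json


class TinyModel:
    """Hypothetical stand-in with the same dump()/load() method names."""

    def __init__(self, features, version=None):
        self.features = list(features)
        self.version = version

    def info(self):
        # Raw dict describing the model.
        return {"features": self.features, "version": self.version}

    def dump(self, f):
        # Write the serialized model information to an open file object.
        json.dump(self.info(), f)

    @classmethod
    def load(cls, f):
        # Rebuild a model from an open file object.
        doc = json.load(f)
        return cls(doc["features"], version=doc["version"])


buffer = io.StringIO()  # stands in for a file opened for writing
TinyModel(["chars_added"], version="0.0.1").dump(buffer)
buffer.seek(0)
restored = TinyModel.load(buffer)
```

Because load() is a classmethod, a caller can restore a model without first constructing one.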
-
class revscoring.scorer_models.MLScorerModel(*args, **kwargs)
A machine learned model used to score a revision based on a set of
features.
Machine learned models are trained and tested against labeled data.
-
classmethod from_config(config, name, section_key='scorer_models')
Constructs a model from configuration.
-
test(values_labels, test_statistics=None, store_stats=False)
Tests the model against labeled data. Note that test data should be
withheld from the training data.
Parameters: |
- values_labels : iterable (( <feature_values>, <label> ))
an iterable of labeled data, where <feature_values> is an ordered
collection of predictive values that correspond to the
Features provided to the constructor
- test_statistics : list ( TestStatistic )
a list of test statistics to apply
- store_stats : bool
whether the new test statistics should replace any previously stored
statistics (or be stored if none exist yet)
|
Returns: | A dictionary of test results.
|
-
train(values_labels)
Trains the model on labeled data.
Parameters: |
- values_labels : iterable (( <feature_values>, <label> ))
an iterable of labeled data, where <feature_values> is an ordered
collection of predictive values that correspond to the
Features provided to the constructor
|
Returns: | A dictionary of model statistics.
|
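Since test data should be withheld from the training data, a labeled set is usually split before calling train() and test(). A minimal sketch of such a split follows. `split_values_labels`, its `test_fraction` and `seed` parameters, and the sample data are all hypothetical and not part of revscoring.

```python
import random


def split_values_labels(values_labels, test_fraction=0.2, seed=0):
    """Shuffle labeled observations and withhold a fraction for testing."""
    observations = list(values_labels)
    random.Random(seed).shuffle(observations)
    test_size = max(1, int(len(observations) * test_fraction))
    # The first test_size shuffled observations are withheld for testing.
    return observations[test_size:], observations[:test_size]


# Each observation is (<feature_values>, <label>).
labeled = [((i, i % 2 == 0), i > 5) for i in range(10)]
train_set, test_set = split_values_labels(labeled, test_fraction=0.3)
```

Under these assumptions, `train_set` would be passed to train() and `test_set` to test(), so that no observation is used for both.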
-
class revscoring.scorer_models.ScikitLearnClassifier(features, classifier_model, version=None, balanced_sample_weight=False, scale=False, center=False, test_statistics=None)
-
score(feature_values)
Generates a score for a single revision based on a set of extracted
feature_values.
Parameters: |
- feature_values : collection(mixed)
an ordered collection of values that correspond to the
Features provided to the constructor
|
Returns: | A dict with the fields:
|
-
test(values_labels, test_statistics=None, store_stats=True)
Returns: | A dictionary of test statistics with the fields:
- n – The number of observations tested against
- accuracy – The accuracy of classification
- table – A truth table for classification
- test_statistics – A map of test statistic values
|
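The n, accuracy and table fields listed above can be derived from paired actual and predicted labels. The sketch below shows one way to compute them. `summarize_predictions` is a hypothetical helper, and the nested `table[actual][predicted]` layout is an assumption rather than the documented structure.

```python
from collections import defaultdict


def summarize_predictions(actual_labels, predicted_labels):
    """Compute n, accuracy and a truth table from paired labels."""
    table = defaultdict(lambda: defaultdict(int))
    correct = 0
    for actual, predicted in zip(actual_labels, predicted_labels):
        table[actual][predicted] += 1
        correct += actual == predicted
    n = sum(sum(row.values()) for row in table.values())
    return {
        "n": n,
        "accuracy": correct / n if n else 0.0,
        # Plain dicts: table[actual][predicted] -> count
        "table": {a: dict(row) for a, row in table.items()},
    }


summary = summarize_predictions(
    [True, True, False, False, False],
    [True, False, False, False, True],
)
```

The diagonal entries of the table (where actual equals predicted) sum to the number of correct classifications, which is how accuracy relates to the table.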
-
train(values_labels, **kwargs)
Returns: | A dictionary with the fields:
- seconds_elapsed – Time in seconds spent fitting the model
|
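A seconds_elapsed value like the one described above can be produced by timing the fitting call. This is a minimal sketch of that pattern, not the library's code. `timed_fit` is a hypothetical helper, and `sorted` merely stands in for an expensive model-fitting function.

```python
import time


def timed_fit(fit, *args, **kwargs):
    """Call fit(*args, **kwargs) and report how long it took."""
    start = time.perf_counter()
    fitted = fit(*args, **kwargs)
    return fitted, {"seconds_elapsed": time.perf_counter() - start}


# sorted() stands in for an expensive model-fitting call.
fitted, stats = timed_fit(sorted, [3, 1, 2])
```

`time.perf_counter()` is used here because it is monotonic, so the elapsed value cannot be negative.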