revscoring.scorer_models

This module contains a collection of models that implement a simple function: score(). Currently, all models are subclasses of revscoring.scorer_models.MLScorerModel, which means that they also implement the train() and test() methods. See revscoring.scorer_models.statistics for statistics that can be applied to models.

Support Vector Classifiers

A collection of Support Vector Machine type classifier models.

class revscoring.scorer_models.LinearSVC(*args, **kwargs)

Implements a Support Vector Classifier model with a Linear kernel.

Params:
features : list ( revscoring.Feature )

The features that the model will be trained on

version : str

A version string representing the version of the model

**kwargs

Passed to sklearn.svm.SVC

class revscoring.scorer_models.RBFSVC(*args, **kwargs)

Implements a Support Vector Classifier model with an RBF kernel.

Params:
features : list ( revscoring.Feature )

The features that the model will be trained on

version : str

A version string representing the version of the model

**kwargs

Passed to sklearn.svm.SVC

class revscoring.scorer_models.SVC(features, version=None, svc=None, balanced_sample_weight=False, scale=False, center=False, test_statistics=None, **kwargs)

Implements a Support Vector Classifier model.

Params:
features : list ( revscoring.Feature )

The features that the model will be trained on

version : str

A version string representing the version of the model

**kwargs

Passed to sklearn.svm.SVC
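
As a rough illustration of how the kernel-specific classes above could relate to SVC, the sketch below pins the kernel in a subclass and forwards every other keyword argument to the underlying estimator. FakeEstimator, SketchSVC and SketchLinearSVC are hypothetical stand-ins written for this example, not the revscoring or scikit-learn classes, and rejecting an explicit kernel argument is an assumption rather than documented behaviour.

```python
class FakeEstimator:
    """Hypothetical stand-in for an underlying estimator such as sklearn.svm.SVC."""

    def __init__(self, **params):
        # Simply record whatever keyword arguments were forwarded.
        self.params = params


class SketchSVC:
    """Hypothetical stand-in for a generic support vector classifier model."""

    def __init__(self, features, version=None, **kwargs):
        self.features = features
        self.version = version
        # Remaining keyword arguments go to the underlying estimator.
        self.estimator = FakeEstimator(**kwargs)


class SketchLinearSVC(SketchSVC):
    """Same model with the kernel fixed to 'linear'."""

    def __init__(self, *args, **kwargs):
        # Assumption: a caller-supplied kernel is treated as an error.
        if "kernel" in kwargs:
            raise TypeError("kernel is fixed to 'linear' for this class")
        super().__init__(*args, kernel="linear", **kwargs)


model = SketchLinearSVC(["chars_added"], version="0.0.1", C=1.0)
```

Here model.estimator.params holds both the pinned kernel and the forwarded C value, which is the general shape the "Passed to sklearn.svm.SVC" notes describe.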

Naive Bayes Classifiers

A collection of Naive Bayes type classifier models.

class revscoring.scorer_models.GaussianNB(*args, **kwargs)

Implements a Gaussian Naive Bayes model.

Params:
features : list ( revscoring.Feature )

The features that the model will be trained on

version : str

A version string representing the version of the model

**kwargs

Passed to sklearn.naive_bayes.GaussianNB

class revscoring.scorer_models.MultinomialNB(*args, **kwargs)

Implements a Multinomial Naive Bayes model.

Params:
features : list ( revscoring.Feature )

The features that the model will be trained on

version : str

A version string representing the version of the model

**kwargs

Passed to sklearn.naive_bayes.MultinomialNB

class revscoring.scorer_models.BernoulliNB(*args, **kwargs)

Implements a Bernoulli Naive Bayes model.

Params:
features : list ( revscoring.Feature )

The features that the model will be trained on

version : str

A version string representing the version of the model

**kwargs

Passed to sklearn.naive_bayes.BernoulliNB

Random Forest

A collection of Random Forest type classifier models.

class revscoring.scorer_models.RF(features, *, version=None, rf=None, balanced_sample_weight=False, scale=False, center=False, test_statistics=None, **kwargs)

Implements a Random Forest model.

Params:
features : list ( revscoring.Feature )

The features that the model will be trained on

version : str

A version string representing the version of the model

**kwargs

Passed to sklearn.ensemble.RandomForestClassifier

Gradient Boosting

A collection of Gradient Boosting type classifier models.

class revscoring.scorer_models.GradientBoosting(features, *, version=None, gb=None, balanced_sample_weight=False, scale=False, center=False, test_statistics=None, **kwargs)

Implements a Gradient Boosting model.

Params:
features : list ( revscoring.Feature )

The features that the model will be trained on

version : str

A version string representing the version of the model

**kwargs

Passed to sklearn.ensemble.GradientBoostingClassifier

Abstract classes

class revscoring.ScorerModel(features, version=None, stats=None)

A model used to score a revision based on a set of features.

dump(f)

Writes serialized model information to a file.

format_info(format='str')

Returns formatted information about the model.

info()

Returns a raw dict containing all information about the model.

classmethod load(f)

Reads serialized model information from a file.

score(feature_values)

Make a prediction or otherwise use the model to generate a score.

Parameters:
feature_values : collection(mixed)

an ordered collection of values that correspond to the Features provided to the constructor

Returns:

A dict of statistics
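
The dump()/load() pair and the ordered feature_values argument can be pictured with a small round trip. This is only a sketch: ToyScorerModel is a hypothetical stand-in, not revscoring.ScorerModel, and the source does not state the serialization format, so JSON is used here purely for illustration.

```python
import io
import json


class ToyScorerModel:
    """Hypothetical stand-in for illustration; not revscoring.ScorerModel."""

    def __init__(self, features, version=None):
        self.features = list(features)
        self.version = version

    def info(self):
        # Raw dict describing the model.
        return {"features": self.features, "version": self.version}

    def score(self, feature_values):
        # Toy score: pair each feature name with its value, in order.
        return dict(zip(self.features, feature_values))

    def dump(self, f):
        # Assumption: JSON text stands in for the real serialization format.
        json.dump(self.info(), f)

    @classmethod
    def load(cls, f):
        doc = json.load(f)
        return cls(doc["features"], version=doc["version"])


model = ToyScorerModel(["chars_added", "is_anon"], version="0.0.1")
buffer = io.StringIO()
model.dump(buffer)
buffer.seek(0)
restored = ToyScorerModel.load(buffer)
```

The restored object reports the same info() as the original, and score() depends on the values arriving in the same order as the features given to the constructor.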

class revscoring.scorer_models.MLScorerModel(*args, **kwargs)

A machine learned model used to score a revision based on a set of features.

Machine learned models are trained and tested against labeled data.

classmethod from_config(config, name, section_key='scorer_models')

Constructs a model from configuration.

test(values_labels, test_statistics=None, store_stats=False)

Tests the model against labeled data. Note that test data should be withheld from the training data.

Parameters:
values_labels : iterable (( <feature_values>, <label> ))

an iterable of labeled data, where <feature_values> is an ordered collection of predictive values that correspond to the Features provided to the constructor

test_statistics : list ( TestStatistic )

a list of test statistics to apply

store_stats : bool

whether the new test statistics should overwrite the stored ones (if any)

Returns:

A dictionary of test results.

train(values_labels)

Trains the model on labeled data.

Parameters:
values_labels : iterable (( <feature_values>, <label> ))

an iterable of labeled data, where <feature_values> is an ordered collection of predictive values that correspond to the Features provided to the constructor

Returns:

A dictionary of model statistics.
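
Because test data should be withheld from the training data, the labeled observations are usually split before calling train() and test(). The sketch below shows one way to do that with the standard library only; split_values_labels, its test_fraction and seed parameters, and the sample data are hypothetical names introduced for this example.

```python
import random


def split_values_labels(values_labels, test_fraction=0.2, seed=0):
    """Shuffle labeled observations and split them into (train, test)."""
    observations = list(values_labels)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(observations)
    n_test = int(round(len(observations) * test_fraction))
    # The first n_test shuffled observations are held out for testing.
    return observations[n_test:], observations[:n_test]


# Hypothetical labeled data in the (<feature_values>, <label>) shape.
values_labels = [([i, i % 3], i % 2 == 0) for i in range(10)]
train_set, test_set = split_values_labels(values_labels)
```

train_set would then be passed to train() and test_set to test(), so that no observation is used for both.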

class revscoring.scorer_models.ScikitLearnClassifier(features, classifier_model, version=None, balanced_sample_weight=False, scale=False, center=False, test_statistics=None)

score(feature_values)

Generates a score for a single revision based on a set of extracted feature_values.

Parameters:
feature_values : collection(mixed)

an ordered collection of values that correspond to the Features provided to the constructor

Returns:

A dict with the fields:

  • prediction – The most likely class

  • probability – A mapping of probabilities for input classes corresponding to the classes the classifier was trained on. Generating this probability is slower than a simple prediction.
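
The shape of the returned dict can be sketched as follows. build_score is a hypothetical helper, and it assumes the prediction is simply the most probable class, which need not hold for every underlying estimator.

```python
def build_score(probabilities):
    """Build a score dict from a mapping of class label -> probability."""
    # Assumption: the predicted class is the one with the highest probability.
    prediction = max(probabilities, key=probabilities.get)
    return {"prediction": prediction, "probability": dict(probabilities)}


score_doc = build_score({True: 0.25, False: 0.75})
```

A caller that only needs the class reads score_doc["prediction"]; one that needs a confidence for a particular class reads score_doc["probability"][label].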

test(values_labels, test_statistics=None, store_stats=True)

Returns:

A dictionary of test statistics with the fields:

  • n – The number of observations tested against
  • accuracy – The accuracy of classification
  • table – A truth table for classification
  • test_statistics – A map of test statistic values
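
For intuition, the n, accuracy and table fields can be derived from paired actual and predicted labels as in the sketch below. summarize_test is a hypothetical helper, the table is represented as counts keyed by (actual, predicted), which is an assumption about its layout, and the test_statistics field is left out.

```python
from collections import Counter


def summarize_test(actual_labels, predicted_labels):
    """Summarize classification results from paired labels."""
    pairs = list(zip(actual_labels, predicted_labels))
    n = len(pairs)
    correct = sum(1 for actual, predicted in pairs if actual == predicted)
    # Count how often each (actual, predicted) combination occurred.
    table = dict(Counter(pairs))
    return {"n": n, "accuracy": correct / n if n else 0.0, "table": table}


actual = [True, True, False, False]
predicted = [True, False, False, False]
summary = summarize_test(actual, predicted)
```

With these four observations, three predictions match their labels, so the accuracy is 0.75.
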
train(values_labels, **kwargs)

Returns:

A dictionary with the fields:

  • seconds_elapsed – Time in seconds spent fitting the model
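
A value like seconds_elapsed can be produced by wrapping the fitting call in a timer. The sketch below assumes a hypothetical timed_train helper and an arbitrary fit callable; it does not reproduce how the library measures time.

```python
import time


def timed_train(fit, values_labels):
    """Call fit(values_labels) and report how long the call took."""
    start = time.perf_counter()
    fit(values_labels)
    return {"seconds_elapsed": time.perf_counter() - start}


# Stand-in "fit" that just collects the observations it is given.
seen = []
stats = timed_train(seen.extend, [([1, 0], True), ([0, 1], False)])
```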
