7.1.3. mclearn.classifier.learning_curve

mclearn.classifier.learning_curve(data, feature_cols, target_col, classifier, train_sizes, test_sizes=200000, random_state=None, balanced=True, normalise=True, degree=1, pickle_path=None)

Compute the learning curve of a classifier.

Parameters:
  • data (DataFrame) – The DataFrame containing all the data.
  • feature_cols (array) – A list of column names in data that are used as features.
  • target_col (str) – The column name of the target.
  • classifier (Classifier object) – A classifier object that will be trained and tested on the data. It should have the same interface as scikit-learn classifiers.
  • train_sizes (array) – The list of sample sizes on which the classifier will be trained.
  • test_sizes (int or list of ints) – The size of the test set, or a list of test-set sizes.
  • random_state (int) – The seed of the random number generator (used for reproducibility).
  • normalise (boolean) – Whether we should first normalise the data to zero mean and unit variance.
  • degree (int) – If greater than 1, the data will first be polynomially transformed with the given degree.
  • pickle_path (str) – The path where the values of the learning curve will be saved.
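
The normalise and degree options describe two preprocessing steps: scaling each feature to zero mean and unit variance, and a polynomial transform of the features. The sketch below illustrates both ideas in plain Python. It is not mclearn's implementation: the helper names standardise and polynomial_expand are hypothetical, and the order in which the library applies the two steps, and whether it adds a constant term, are not stated here.

```python
from itertools import combinations_with_replacement
from math import prod
from statistics import fmean, pstdev


def standardise(rows):
    """Scale each column to zero mean and unit variance (population std)."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has standard deviation 0; divide by 1 instead.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


def polynomial_expand(rows, degree):
    """Return all monomials of total degree 1..degree (no constant term)."""
    if degree <= 1:
        # Mirrors the documented behaviour: only degree > 1 transforms the data.
        return [list(row) for row in rows]
    expanded = []
    for row in rows:
        terms = []
        for d in range(1, degree + 1):
            # Every multiset of d features gives one monomial, e.g. x0 * x1.
            for combo in combinations_with_replacement(row, d):
                terms.append(prod(combo))
        expanded.append(terms)
    return expanded
```

For two features x0 and x1, polynomial_expand with degree=2 yields the columns x0, x1, x0², x0·x1 and x1², so the number of features grows quickly with the degree.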
Returns:
  lc_accuracy_test – The list of balanced accuracy scores for the given sample sizes.
Return type:
  array
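
To make the return value concrete, here is a minimal sketch of how a learning curve of balanced accuracy scores could be computed for any object with scikit-learn-style fit and predict methods. It is an illustration under stated assumptions, not the library's code: sketch_learning_curve and NearestNeighbour are hypothetical names, balanced accuracy is taken to mean the average of per-class recall, and the effects of balanced, normalise, degree and pickle_path are not reproduced.

```python
import random
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += truth == guess
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


class NearestNeighbour:
    """Hypothetical 1-nearest-neighbour classifier with fit/predict."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def nearest_label(x):
            # Index of the stored sample with the smallest squared distance.
            best = min(
                range(len(self.X_)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, self.X_[j])),
            )
            return self.y_[best]

        return [nearest_label(x) for x in X]


def sketch_learning_curve(X, y, classifier, train_sizes, test_size, seed=None):
    """Return one balanced accuracy score per entry in train_sizes."""
    rng = random.Random(seed)  # seeded for reproducibility
    order = list(range(len(X)))
    rng.shuffle(order)
    # Hold out a fixed test set; train only on the remaining pool.
    test_idx, pool = order[:test_size], order[test_size:]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    scores = []
    for size in train_sizes:
        train_idx = pool[:size]
        classifier.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(balanced_accuracy(y_test, classifier.predict(X_test)))
    return scores
```

Averaging recall over classes matters on imbalanced data: a classifier that always predicts the majority class in a 3-to-1 split has a plain accuracy of 0.75 but a balanced accuracy of only 0.5.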