yellowbrick package

Submodules

yellowbrick.anscombe module

Plots Anscombe’s Quartet as an illustration of the importance of visualization.

yellowbrick.anscombe.anscombe()[source]

Creates a 2x2 grid plot of the four Anscombe datasets for illustration.

yellowbrick.base module

Abstract base classes and interface for Yellowbrick.

class yellowbrick.base.ModelVisualizer(model, ax=None, **kwargs)[source]

Bases: yellowbrick.base.Visualizer

A model visualization accepts one or more unfitted Scikit-Learn estimators as input and enables the user to visualize the performance of models across a range of hyperparameter values (e.g. using VisualGridsearch and ValidationCurve).

fit(X, y=None, **kwargs)[source]
Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: dict

keyword arguments passed to Scikit-Learn API.

predict(X)[source]
class yellowbrick.base.MultiModelMixin(models, ax=None, **kwargs)[source]

Bases: object

Does predict for each of the models and generates subplots.

generate_subplots()[source]

Generates the subplots for the number of given models.

predict(X, y)[source]

Returns a generator containing the predictions for each of the internal models (using cross_val_predict with cv=12).

class yellowbrick.base.ScoreVisualizer(model, ax=None, **kwargs)[source]

Bases: yellowbrick.base.Visualizer

Base class to follow an estimator in a visual pipeline.

Draws the score for the fitted model.

draw(X, y)[source]
Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

fit(X, y=None, **kwargs)[source]
Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: dict

keyword arguments passed to Scikit-Learn API.

predict(X)[source]
class yellowbrick.base.Visualizer(ax=None, **kwargs)[source]

Bases: sklearn.base.BaseEstimator

The root of the visual object hierarchy that defines how yellowbrick creates, stores, and renders visual artifacts using matplotlib.

Inherits from Scikit-Learn’s BaseEstimator class.

The base class for feature visualization and model visualization primarily ensures that styling arguments are passed in.

draw(**kwargs)[source]

Rendering function

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:

kwargs: dict

generic keyword arguments.

fit(X, y=None, **kwargs)[source]

Fits a transformer to X and y

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: dict

keyword arguments passed to Scikit-Learn API.

fit_draw(X, y=None, **kwargs)[source]

Fits a transformer to X and y then returns visualization of features or fitted model.

fit_draw_poof(X, y=None, **kwargs)[source]
gca()[source]

Creates axes if they don’t already exist

poof(outpath=None, **kwargs)[source]

The user calls poof, which is the primary entry point for producing a visualization.

Visualizes either data features or fitted model scores

Parameters:

outpath: string

path or None. Save figure to disk or if None show in window

kwargs: generic keyword arguments.
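The call flow described above (fit, then draw, then poof, which in turn calls finalize) can be sketched with a toy class. This is a minimal illustration under stated assumptions, not Yellowbrick's implementation: ToyVisualizer and its calls list are hypothetical names, and nothing is actually rendered.

```python
class ToyVisualizer:
    """Hypothetical stand-in that only records the order of method calls."""

    def __init__(self):
        self.calls = []

    def fit(self, X, y=None, **kwargs):
        self.calls.append("fit")
        return self

    def draw(self, **kwargs):
        self.calls.append("draw")

    def finalize(self, **kwargs):
        self.calls.append("finalize")

    def poof(self, outpath=None, **kwargs):
        # poof is the user-facing entry point; it delegates to finalize first.
        self.finalize(**kwargs)
        # Per the outpath description: save if a path is given, else show.
        self.calls.append("show" if outpath is None else "save")

    def fit_draw_poof(self, X, y=None, **kwargs):
        # Assumed chaining order, inferred from the method name.
        self.fit(X, y)
        self.draw()
        self.poof()


viz = ToyVisualizer()
viz.fit_draw_poof([[1, 2], [3, 4]], [0, 1])
print(viz.calls)
```

The recorded order shows that finalize runs as part of poof rather than being called by the user directly.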

set_title(title=None)[source]

Sets the title on the current axes.

yellowbrick.bestfit module

Uses Scikit-Learn to compute a best fit function, then draws it in the plot.

yellowbrick.bestfit.draw_best_fit(X, y, ax, estimator='linear', **kwargs)[source]

Uses Scikit-Learn to fit a model to X and y then uses the resulting model to predict the curve based on the X values. This curve is drawn to the ax (matplotlib axis) which must be passed as the third variable.

The estimator function can be one of the following:

‘linear’: Uses OLS to fit the regression
‘quadratic’: Uses OLS with Polynomial order 2
‘exponential’: Not implemented yet
‘log’: Not implemented yet
‘select_best’: Selects the best fit via MSE

The remaining keyword arguments are passed to ax.plot to define and describe the line of best fit.

yellowbrick.bestfit.fit_exponential(X, y)[source]

Fits an exponential curve to the data.

yellowbrick.bestfit.fit_linear(X, y)[source]

Uses OLS to fit the regression.

yellowbrick.bestfit.fit_log(X, y)[source]

Fits a logarithmic curve to the data.

yellowbrick.bestfit.fit_quadratic(X, y)[source]

Uses OLS with Polynomial order 2.

yellowbrick.bestfit.fit_select_best(X, y)[source]

Selects the best fit of the estimators already implemented by choosing the model with the smallest mean square error metric for the trained values.
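The idea of selecting by smallest mean squared error on the trained values can be sketched in plain Python. This is an illustration only and does not reproduce the library's Scikit-Learn based code: polyfit_ols, mse, and select_best are hypothetical helpers, and the fits are computed from the normal equations rather than with an estimator object.

```python
def polyfit_ols(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


def mse(coeffs, xs, ys):
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_best(xs, ys, degrees=(1, 2)):
    """Fit each candidate degree and keep the one with the smallest training MSE."""
    fits = {d: polyfit_ols(xs, ys, d) for d in degrees}
    best = min(fits, key=lambda d: mse(fits[d], xs, ys))
    return best, fits[best]


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 5.0, 10.0, 17.0]  # generated from y = x^2 + 1
best_degree, best_coeffs = select_best(xs, ys)
print(best_degree, best_coeffs)
```

Because the error is measured on the same values used for fitting, the more flexible candidate can never score worse than the simpler one, so this criterion tends to favor the quadratic fit.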

yellowbrick.classifier module

Visualizations related to evaluating Scikit-Learn classification models

class yellowbrick.classifier.ClassBalance(model, ax=None, classes=None, **kwargs)[source]

Bases: yellowbrick.classifier.ClassificationScoreVisualizer

Class balance chart that shows the support for each class in the fitted classification model as a bar plot. It is initialized with a fitted model and generates a class balance chart on draw.

Parameters:

ax: axes

the axis to plot the figure on.

model: estimator

Scikit-Learn estimator object. Should be an instance of a classifier, else __init__() will raise an exception.

classes: list

A list of class names for the legend. If classes is None and a y value is passed to fit then the classes are selected from the target vector.

kwargs: dict

Keyword arguments passed to the super class. Here, used to colorize the bars in the histogram.

These parameters can be influenced later on in the visualization process, but can and should be set as early as possible.

draw()[source]

Renders the class balance chart across the axis.

Returns:

ax : the axis with the plotted figure

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:

kwargs: generic keyword arguments.

fit(X, y=None, **kwargs)[source]
Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: keyword arguments passed to Scikit-Learn API.

Returns:

self : instance

Returns the instance of the classification score visualizer

score(X, y=None, **kwargs)[source]

Generates the Scikit-Learn precision_recall_fscore_support

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

Returns:

ax : the axis with the plotted figure

class yellowbrick.classifier.ClassificationReport(model, ax=None, classes=None, **kwargs)[source]

Bases: yellowbrick.classifier.ClassificationScoreVisualizer

Classification report that shows the precision, recall, and F1 scores for the model. Integrates numerical scores as well as a color-coded heatmap.

draw(y, y_pred)[source]

Renders the classification report across each axis.

Parameters:

y : ndarray or Series of length n

An array or series of target or class values

y_pred : ndarray or Series of length n

An array or series of predicted target values

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:

kwargs: generic keyword arguments.

fit(X, y=None, **kwargs)[source]
Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

kwargs: keyword arguments passed to Scikit-Learn API.

score(X, y=None, **kwargs)[source]

Generates the Scikit-Learn classification_report

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values
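The per-class scores that the classification report displays (precision, recall, F1, plus the support used by the class balance chart) can be computed by hand as follows. This is a plain-Python sketch for intuition; per_class_report is a hypothetical helper and is not how the visualizer obtains its numbers, which come from Scikit-Learn.

```python
def per_class_report(y_true, y_pred):
    """Return precision, recall, F1, and support for every class label."""
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,  # number of true instances of this class
        }
    return report


y_true = ["a", "a", "a", "b", "b"]
y_pred = ["a", "a", "b", "b", "b"]
report = per_class_report(y_true, y_pred)
for label, scores in report.items():
    print(label, scores)
```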

class yellowbrick.classifier.ClassificationScoreVisualizer(model, ax=None, **kwargs)[source]

Bases: yellowbrick.base.ScoreVisualizer

class yellowbrick.classifier.ROCAUC(model, ax=None, **kwargs)[source]

Bases: yellowbrick.classifier.ClassificationScoreVisualizer

Plot the ROC to visualize the tradeoff between the classifier’s sensitivity and specificity.

draw(y, y_pred)[source]

Renders the ROC-AUC plot. Called internally by score, possibly more than once.

Parameters:

y : ndarray or Series of length n

An array or series of target or class values

y_pred : ndarray or Series of length n

An array or series of predicted target values

Returns:

ax : the axis with the plotted figure

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:

kwargs: generic keyword arguments.

score(X, y=None, **kwargs)[source]

Generates the predicted target values using the Scikit-Learn estimator.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target or class values

Returns:

ax : the axis with the plotted figure
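The sensitivity/specificity tradeoff that a ROC curve shows can be illustrated without any plotting. The sketch below is an assumption-laden illustration, not the visualizer's code: roc_points and trapezoid_auc are hypothetical helpers, labels are assumed to be 0 or 1, and scores are assumed to be higher for the positive class.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Lowering the threshold admits more predictions as positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
area = trapezoid_auc(points)
print(points, area)
```

Each step to a lower threshold raises the true positive rate (sensitivity) at the cost of a higher false positive rate (one minus specificity).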

yellowbrick.classifier.class_balance(model, X, y=None, ax=None, classes=None, **kwargs)[source]

Quick method:

Displays the support for each class in the fitted classification model as a bar plot.

This helper function is a quick wrapper to utilize the ClassBalance ScoreVisualizer for one-off analysis.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features.

y : ndarray or Series of length n

An array or series of target or class values.

ax : matplotlib axes

The axes to plot the figure on.

model : the Scikit-Learn estimator (should be a classifier)

classes : list of strings

The names of the classes in the target

Returns:

ax : matplotlib axes

Returns the axes that the class balance plot was drawn on.

yellowbrick.classifier.classification_report(model, X, y=None, ax=None, classes=None, **kwargs)[source]

Quick method:

Displays precision, recall, and F1 scores for the model. Integrates numerical scores as well as a color-coded heatmap.

This helper function is a quick wrapper to utilize the ClassificationReport ScoreVisualizer for one-off analysis.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features.

y : ndarray or Series of length n

An array or series of target or class values.

ax : matplotlib axes

The axes to plot the figure on.

model : the Scikit-Learn estimator (should be a classifier)

classes : list of strings

The names of the classes in the target

Returns:

ax : matplotlib axes

Returns the axes that the classification report was drawn on.

yellowbrick.classifier.roc_auc(model, X, y=None, ax=None, **kwargs)[source]

Quick method:

Displays the tradeoff between the classifier’s sensitivity and specificity.

This helper function is a quick wrapper to utilize the ROCAUC ScoreVisualizer for one-off analysis.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features.

y : ndarray or Series of length n

An array or series of target or class values.

ax : matplotlib axes

The axes to plot the figure on.

model : the Scikit-Learn estimator (should be a classifier)

Returns:

ax : matplotlib axes

Returns the axes that the roc-auc curve was drawn on.

yellowbrick.exceptions module

Exceptions hierarchy for the yellowbrick library

exception yellowbrick.exceptions.ModelError[source]

Bases: yellowbrick.exceptions.YellowbrickError

A problem when interacting with sklearn or the ML framework.

exception yellowbrick.exceptions.VisualError[source]

Bases: yellowbrick.exceptions.YellowbrickError

A problem when interacting with matplotlib or the display framework.

exception yellowbrick.exceptions.YellowbrickError[source]

Bases: Exception

The root exception for all yellowbrick related errors.

exception yellowbrick.exceptions.YellowbrickTypeError[source]

Bases: yellowbrick.exceptions.YellowbrickError, TypeError

There was an unexpected type or None for a property or input.

exception yellowbrick.exceptions.YellowbrickValueError[source]

Bases: yellowbrick.exceptions.YellowbrickError, ValueError

A bad value was passed into a function.
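Because YellowbrickValueError and YellowbrickTypeError also derive from the built-in ValueError and TypeError, callers can catch them with either handler. The sketch below re-declares the classes locally to mirror the documented bases; it is not the library's source, and check_positive is a hypothetical function.

```python
# Local re-declarations that mirror the documented "Bases:" lines.
class YellowbrickError(Exception):
    pass


class YellowbrickValueError(YellowbrickError, ValueError):
    pass


class YellowbrickTypeError(YellowbrickError, TypeError):
    pass


def check_positive(n):
    """Hypothetical validator that raises the library-style value error."""
    if n <= 0:
        raise YellowbrickValueError("n must be positive")
    return n


caught_as = []
for handler in (ValueError, YellowbrickError):
    try:
        check_positive(-1)
    except handler:
        # The same exception matches both the built-in and the root handler.
        caught_as.append(handler.__name__)

print(caught_as)
```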

yellowbrick.pipeline module

Implements a visual pipeline that subclasses Scikit-Learn pipelines.

class yellowbrick.pipeline.VisualPipeline(steps)[source]

Bases: sklearn.pipeline.Pipeline

Pipeline of transforms and visualizers with a final estimator.

Sequentially apply a list of transforms, visualizers, and a final estimator which may be evaluated by additional visualizers. Intermediate steps of the pipeline must be ‘transforms’, that is, they must implement fit and transform methods. The final estimator only needs to implement fit.

Any step that implements draw or poof methods can be called sequentially directly from the VisualPipeline, allowing multiple visual diagnostics to be generated, displayed, and saved on demand. If draw or poof is not called, the visual pipeline should be equivalent to the simple pipeline to ensure no reduction in performance.

The purpose of the pipeline is to assemble several steps that can be cross-validated together while setting different parameters. These steps can be visually diagnosed by visualizers at every point in the pipeline.

Parameters:

steps : list

List of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an estimator. Any intermediate step can be a FeatureVisualizer and the last step can be a ScoreVisualizer.

Attributes

named_steps (dict) Read-only attribute to access any step parameter by user-given name. Keys are step names and values are step parameters.
visual_steps (dict) Read-only attribute to access any visualizer in the pipeline by user-given name. Keys are step names and values are visualizer steps.
draw(*args, **kwargs)[source]

Calls draw on every step (including the final estimator) that has a draw method and passes the args and kwargs to that draw function.

fit_transform_poof(X, y=None, **kwargs)[source]

Fit the model and transforms and then call poof.

poof(*args, **kwargs)[source]

Calls poof on every step (including the final estimator) that has a poof method and passes the args and kwargs to that poof function.
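The dispatch described for draw and poof, calling the method only on steps that provide it, can be sketched with duck typing. This is a hedged illustration: PlainStep, ToyVisualStep, and call_on_steps are hypothetical names, and checking for the attribute with getattr is an assumption about how such a dispatch could work, not a statement of how VisualPipeline selects its visual steps.

```python
class PlainStep:
    """Hypothetical transform or estimator with no visual methods."""

    def fit(self, X, y=None):
        return self


class ToyVisualStep(PlainStep):
    """Hypothetical visualizer step that counts how often draw is called."""

    def __init__(self):
        self.drawn = 0

    def draw(self, *args, **kwargs):
        self.drawn += 1


def call_on_steps(steps, method_name, *args, **kwargs):
    """Call method_name on every (name, step) pair that provides it."""
    called = []
    for name, step in steps:
        method = getattr(step, method_name, None)
        if callable(method):
            method(*args, **kwargs)
            called.append(name)
    return called


visual = ToyVisualStep()
steps = [("scale", PlainStep()), ("viz", visual), ("model", PlainStep())]
called = call_on_steps(steps, "draw")
print(called, visual.drawn)
```

Steps without the method are skipped silently, which is consistent with the stated goal that a pipeline without visual calls behaves like a plain pipeline.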

visual_steps

yellowbrick.regressor module

Visualizations related to evaluating Scikit-Learn regressor models

class yellowbrick.regressor.PredictionError(model, ax=None, **kwargs)[source]

Bases: yellowbrick.regressor.RegressionScoreVisualizer

Plot the actual targets from the dataset against the predicted values generated by our model(s).

draw(y, y_pred)[source]
Parameters:

y : ndarray or Series of length n

An array or series of target or class values

y_pred : ndarray or Series of length n

An array or series of predicted target values

Returns:

ax : the axis with the plotted figure

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:

kwargs: generic keyword arguments.

score(X, y=None, **kwargs)[source]

Originally score for prediction error was conceived as generating y_pred by calling the sklearn function cross_val_predict on the model, X, y, and the specified number of folds, e.g.:

y_pred = cv.cross_val_predict(model, X, y, cv=12)

With the new API, there’s not much for score to do.

Parameters:

X : array-like

X (also X_test) holds the independent variables (features) of the test set to predict from

y : array-like

y (also y_test) holds the actual dependent (target) values to score against

Returns:

ax : the axis with the plotted figure

class yellowbrick.regressor.RegressionScoreVisualizer(model, ax=None, **kwargs)[source]

Bases: yellowbrick.base.ScoreVisualizer

class yellowbrick.regressor.ResidualsPlot(model, ax=None, **kwargs)[source]

Bases: yellowbrick.regressor.RegressionScoreVisualizer

A residual plot shows the residuals on the vertical axis and the independent variable on the horizontal axis.

If the points are randomly dispersed around the horizontal axis, a linear regression model is appropriate for the data; otherwise, a non-linear model is more appropriate.

draw(y_pred, residuals, train=False, **kwargs)[source]
Parameters:

y_pred : ndarray or Series of length n

An array or series of predicted target values

residuals : ndarray or Series of length n

An array or series of the difference between the predicted and the target values

train : boolean

If False, draw assumes that the residual points being plotted are from the test data; if True, draw assumes the residuals are the train data.

Returns:

ax : the axis with the plotted figure

finalize(**kwargs)[source]

Finalize executes any subclass-specific axes finalization steps. The user calls poof and poof calls finalize.

Parameters:

kwargs: generic keyword arguments.

fit(X, y=None, **kwargs)[source]
Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features

y : ndarray or Series of length n

An array or series of target values

kwargs: keyword arguments passed to Scikit-Learn API.

score(X, y=None, train=False, **kwargs)[source]

Generates predicted target values using the Scikit-Learn estimator.

Parameters:

X : array-like

X (also X_test) holds the independent variables (features) of the test set to predict from

y : array-like

y (also y_test) holds the actual dependent (target) values to score against

train : boolean

If False, score assumes that the residual points being plotted are from the test data; if True, score assumes the residuals are the train data.

Returns:

ax : the axis with the plotted figure
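The values that draw receives, predicted targets paired with residuals, can be computed as in the sketch below. It is an illustration only: compute_residuals is a hypothetical helper, and the sign convention (predicted minus target) is an assumption taken from the wording of the residuals parameter.

```python
def compute_residuals(y_true, y_pred):
    """Difference between each predicted value and its target value."""
    return [pred - true for true, pred in zip(y_true, y_pred)]


y_true = [3.0, 5.0, 7.0]
y_pred = [2.5, 5.0, 8.0]
residuals = compute_residuals(y_true, y_pred)

# Points as they would be laid out: prediction on x, residual on y.
points = list(zip(y_pred, residuals))
print(points)
```

Residuals scattered around zero with no visible pattern are what the class description refers to as points randomly dispersed around the horizontal axis.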

yellowbrick.regressor.prediction_error(model, X, y=None, ax=None, **kwargs)[source]

Quick method:

Plot the actual targets from the dataset against the predicted values generated by our model(s).

This helper function is a quick wrapper to utilize the PredictionError ScoreVisualizer for one-off analysis.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features.

y : ndarray or Series of length n

An array or series of target or class values.

ax : matplotlib axes

The axes to plot the figure on.

model : the Scikit-Learn estimator (should be a regressor)

Returns:

ax : matplotlib axes

Returns the axes that the prediction error plot was drawn on.

yellowbrick.regressor.residuals_plot(model, X, y=None, ax=None, **kwargs)[source]

Quick method:

Plot the residuals on the vertical axis and the independent variable on the horizontal axis.

This helper function is a quick wrapper to utilize the ResidualsPlot ScoreVisualizer for one-off analysis.

Parameters:

X : ndarray or DataFrame of shape n x m

A matrix of n instances with m features.

y : ndarray or Series of length n

An array or series of target or class values.

ax : matplotlib axes

The axes to plot the figure on.

model : the Scikit-Learn estimator (should be a regressor)

Returns:

ax : matplotlib axes

Returns the axes that the residuals plot was drawn on.

yellowbrick.utils module

Utility functions and helpers for the Yellowbrick library.

class yellowbrick.utils.docutil(func)[source]

Bases: object

This decorator can be used to apply the docstring from another function to the decorated function. It is used for our single-call wrapper functions, which implement the visualizer API without forcing the user to jump through all the hoops. The docstrings of the visualizer and the single-call wrapper should be identical; this decorator ensures that we only have to edit one docstring.

Usage:

@docutil(Visualizer.__init__)
def visualize(*args, **kwargs):
    pass

The basic usage is that you instantiate the decorator with the function whose docstring you want to copy, then apply that decorator to the function whose docstring you would like modified.

Note that this decorator performs no wrapping of the target function.
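A decorator with the behavior described here, copying a docstring without wrapping the target, could look roughly like the following. This is a sketch under stated assumptions and not the library's source: copy_doc, ToyVisualizer, and toy_visualize are hypothetical names.

```python
class copy_doc(object):
    """Hypothetical decorator class: copy another function's docstring."""

    def __init__(self, func):
        # Remember the docstring of the source function.
        self.doc = func.__doc__

    def __call__(self, func):
        # Overwrite the docstring and return the same function object,
        # so the target is not wrapped in any way.
        func.__doc__ = self.doc
        return func


class ToyVisualizer(object):
    def __init__(self, ax=None):
        """Draws a toy chart on the given axes."""
        self.ax = ax


def undecorated(ax=None):
    return ToyVisualizer(ax)


# Applying the decorator by hand makes the "no wrapping" property visible.
toy_visualize = copy_doc(ToyVisualizer.__init__)(undecorated)
print(toy_visualize.__doc__)
```

Since the decorator returns the original function object, attributes such as its name and signature are left unchanged.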

yellowbrick.utils.get_model_name(model)[source]

Detects the model name for a Scikit-Learn model or pipeline

Parameters:

model: class or instance

The object to determine the name for

yellowbrick.utils.is_classifier(estimator)[source]

Returns True if the given estimator is (probably) a classifier.

Parameters:

estimator: class or instance

The object to check for being a Scikit-Learn classifier.

yellowbrick.utils.is_dataframe(obj)[source]

Returns True if the given object is a Pandas Data Frame.

Parameters:

obj: instance

The object to check for being a Pandas DataFrame.

yellowbrick.utils.is_estimator(model)[source]

Determines if a model is an estimator using issubclass and isinstance.

Parameters:

model: class or instance

The object to check for being a Scikit-Learn estimator.
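Accepting either a class or an instance, as the parameter description states, usually means choosing between issubclass and isinstance. The sketch below shows that pattern with a stand-in base class; is_kind and BaseEstimatorStandIn are hypothetical names, and the real function checks against Scikit-Learn's own base class rather than this placeholder.

```python
import inspect


class BaseEstimatorStandIn(object):
    """Placeholder for the estimator base class (an assumption for this sketch)."""


class ToyModel(BaseEstimatorStandIn):
    pass


def is_kind(model, base):
    """Return True if model is a subclass of base or an instance of it."""
    if inspect.isclass(model):
        return issubclass(model, base)
    return isinstance(model, base)


results = [
    is_kind(ToyModel, BaseEstimatorStandIn),    # class input
    is_kind(ToyModel(), BaseEstimatorStandIn),  # instance input
    is_kind(dict, BaseEstimatorStandIn),        # unrelated class
    is_kind("text", BaseEstimatorStandIn),      # unrelated instance
]
print(results)
```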

yellowbrick.utils.is_regressor(estimator)[source]

Returns True if the given estimator is (probably) a regressor.

Parameters:

estimator: class or instance

The object to check for being a Scikit-Learn regressor.

yellowbrick.utils.isclassifier(estimator)

Returns True if the given estimator is (probably) a classifier.

Parameters:

estimator: class or instance

The object to check for being a Scikit-Learn classifier.

yellowbrick.utils.isdataframe(obj)

Returns True if the given object is a Pandas Data Frame.

Parameters:

obj: instance

The object to check for being a Pandas DataFrame.

yellowbrick.utils.isestimator(model)

Determines if a model is an estimator using issubclass and isinstance.

Parameters:

model: class or instance

The object to check for being a Scikit-Learn estimator.

yellowbrick.utils.isregressor(estimator)

Returns True if the given estimator is (probably) a regressor.

Parameters:

estimator: class or instance

The object to check for being a Scikit-Learn regressor.

yellowbrick.utils.memoized(fget)[source]

Return a property attribute for new-style classes that only calls its getter on the first access. The result is stored and on subsequent accesses is returned, preventing the need to call the getter any more.

Parameters:

fget: function

The getter method to memoize for subsequent access.

See also

python-memoized-property
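A property that calls its getter only on first access can be sketched as follows. This is a minimal illustration, not the library's implementation: memoized_sketch, the cache attribute name, and the Report class are all hypothetical.

```python
import functools


def memoized_sketch(fget):
    """Return a property that computes its value once per instance."""
    cache_attr = "_memoized_" + fget.__name__

    @functools.wraps(fget)
    def getter(self):
        # Call the original getter only if nothing is stored yet.
        if not hasattr(self, cache_attr):
            setattr(self, cache_attr, fget(self))
        return getattr(self, cache_attr)

    return property(getter)


class Report(object):
    def __init__(self):
        self.computations = 0

    @memoized_sketch
    def total(self):
        self.computations += 1
        return sum(range(1000))


report = Report()
first = report.total
second = report.total
print(first, second, report.computations)
```

The stored value lives on the instance, so each object computes its own result once and separate instances do not share a cache.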

yellowbrick.version module

Maintains version and package information for deployment.

yellowbrick.version.get_version(short=False)[source]

Prints the version.

Module contents

A suite of visual analysis and diagnostic tools to facilitate feature selection, model selection, and parameter tuning for machine learning.