User Testing Instructions
We are looking for people to help us alpha test the Yellowbrick project! Helping is simple: create a notebook that applies the concepts in this Getting Started guide to a small-to-medium-sized dataset of your choice. Run through the examples with the dataset, and try to change options and customize as much as possible. After you’ve exercised the code with your examples, respond to our alpha testing survey!
Step One: Dataset
Select a multivariate dataset of your own; the more varied the datasets we can run through Yellowbrick, the more likely we are to discover edge cases and exceptions! Note that your dataset must be well suited to modeling with Scikit-Learn. In particular, we recommend you choose a dataset whose target is suited to one of the following supervised learning tasks:
- Regression (target is a continuous variable)
- Classification (target is a discrete variable)
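If you are unsure which task a target column suggests, a rough heuristic like the one below may help. It is only a sketch: the function name `suggest_task` and the `max_classes` threshold are illustrative assumptions, not part of Yellowbrick or Scikit-Learn, and an integer target with many distinct values can still be a legitimate classification problem.

```python
from numbers import Real


def suggest_task(target, max_classes=20):
    """Guess whether a target column suggests classification or regression.

    Heuristic only: non-numeric targets, or integer-valued targets with at
    most `max_classes` distinct values, are treated as discrete.
    """
    values = list(target)
    if not values:
        raise ValueError("target is empty")

    # Any non-numeric value (e.g. a string label) means a discrete target.
    # bool is a subclass of int, so treat booleans as labels as well.
    if any(isinstance(v, bool) or not isinstance(v, Real) for v in values):
        return "classification"

    distinct = set(values)
    all_integral = all(float(v).is_integer() for v in distinct)
    if all_integral and len(distinct) <= max_classes:
        return "classification"
    return "regression"


print(suggest_task(["setosa", "virginica", "setosa"]))  # classification
print(suggest_task([0, 1, 1, 0, 2]))                    # classification
print(suggest_task([21.5, 34.7, 18.2, 40.1]))           # regression
```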
Some datasets are well suited to both types of analysis; you can use the testing methodology described here for either type of task (or both). In order to find a dataset, we recommend you try the following places:
You’re more than welcome to choose a dataset of your own, but we do ask that you make at least the notebook containing your testing results publicly available for us to review. If the data is also public (or you’re willing to share it with the primary contributors) that will help us figure out bugs and required features much more easily!
Step Two: Notebook
Create a notebook in a GitHub repository. We suggest the following:
- Fork the Yellowbrick repository
- Under the examples directory, create a directory named with your GitHub username
- Create the notebook in examples/USERNAME/testing.ipynb
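If you would rather script the layout than create it by hand, the following sketch builds the directory and an empty notebook with a markdown title cell. The helper name `create_testing_notebook` and the default title are hypothetical; the sketch assumes you run it from a local clone of your fork and that a minimal nbformat 4 file is enough for your notebook tooling to open.

```python
import json
from pathlib import Path


def create_testing_notebook(repo_root, username, title="Yellowbrick User Testing"):
    """Create examples/<username>/testing.ipynb with one markdown title cell."""
    notebook_dir = Path(repo_root) / "examples" / username
    notebook_dir.mkdir(parents=True, exist_ok=True)

    notebook_path = notebook_dir / "testing.ipynb"
    if notebook_path.exists():
        # Never overwrite work that is already there.
        return notebook_path

    # Minimal nbformat 4 document: a single markdown cell holding the title.
    notebook = {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": [f"# {title}"]}
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }
    notebook_path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")
    return notebook_path
```

Calling `create_testing_notebook(".", "USERNAME")` from the root of the clone, with your own GitHub username in place of USERNAME, should produce examples/USERNAME/testing.ipynb.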
Alternatively, you could just send us a notebook via Gist or your own repository. However, if you fork Yellowbrick, you can initiate a pull request to have your example added to our gallery!
Step Three: Model with Yellowbrick and Scikit-Learn
Add the following to the notebook:
- A title in markdown
- A description of the dataset and where it was obtained
- A section that loads the data into a Pandas dataframe or NumPy matrix
Then conduct the following modeling activities:
- Feature analysis using Scikit-Learn and Yellowbrick
- Estimator fitting using Scikit-Learn and Yellowbrick
You can follow along with our examples directory (check out examples.ipynb) or even create your own custom visualizers! The goal is that you create an end-to-end model from data loading to estimator(s) with visualizers along the way.
IMPORTANT: please record every error you encounter, along with its traceback, so that you can report it in step four!
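One way to keep those tracebacks is to wrap each experiment in a small helper that stores failures instead of stopping the notebook. The sketch below uses only the standard library; the helper name `run_and_record` and the structure of the log entries are illustrative assumptions, so adapt them to whatever you plan to paste into the feedback form.

```python
import traceback


def run_and_record(label, func, error_log):
    """Call func(); on failure, store the traceback in error_log and return None."""
    try:
        return func()
    except Exception as exc:
        error_log.append(
            {
                "step": label,
                "error": f"{type(exc).__name__}: {exc}",
                "traceback": traceback.format_exc(),
            }
        )
        return None


errors = []
run_and_record("feature analysis", lambda: 1 / 0, errors)  # deliberately fails
run_and_record("estimator fitting", lambda: "ok", errors)  # succeeds, nothing logged

# Print everything that went wrong so it can be copied into the feedback form.
for entry in errors:
    print(f"[{entry['step']}] {entry['error']}")
    print(entry["traceback"])
```

In a real notebook, the lambdas would be replaced by functions that fit and render your visualizers.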
Step Four: Feedback
Finally, submit feedback via the Google Form we have created:
https://goo.gl/forms/naoPUMFa1xNcafY83
This form allows us to aggregate multiple submissions and bugs so that we can coordinate the creation and management of issues. If you are the first to report a bug or feature request, we will make sure you’re notified about the created issue (we’ll tag you using your GitHub username)!
Step Five: Thanks!
Thank you for helping us make Yellowbrick better! We’d love to see pull requests for features you think would extend the library. We’ll also be doing a user study that we would love for you to participate in. Stay tuned for more great things from Yellowbrick!