7. API Reference
This section lists every function in mclearn. The content matches the docstrings: the parameters and the return type of each function. If this is your first time using mclearn, you may prefer to start with the User Guide, which explains the algorithms in depth.
As a convenience, all functions listed below can be accessed directly from the top-level module.
7.1. Classifiers
General-purpose classifier routines for astronomical data.
train_classifier |
Standard classifier routine. |
print_classification_result |
Train the specified classifier and print out the results. |
learning_curve |
Compute the learning curve of a classifier. |
compute_all_learning_curves |
Compute the learning curves with the most popular classifiers. |
grid_search |
A general grid search routine. |
grid_search_svm_rbf |
Do a grid search on SVM with an RBF kernel. |
grid_search_svm_sigmoid |
Do a grid search on SVM with a sigmoid kernel. |
grid_search_svm_poly_degree |
Do a grid search on a Linear SVM given the specified polynomial transformation. |
grid_search_svm_poly |
Do a grid search on SVM with polynomial transformation of the features. |
grid_search_logistic_degree |
Do a grid search on Logistic Regression given the specified polynomial transformation. |
grid_search_logistic |
Do a grid search on Logistic Regression. |
predict_unlabelled_objects |
Predict the classes of unlabelled objects given a classifier. |
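The grid search routines above share one pattern: score every combination of hyperparameter values and keep the best one. The following is a minimal sketch of that pattern in plain Python. It does not use mclearn's actual signatures (see the docstrings for those); `simple_grid_search` and `toy_score` are hypothetical names, and `toy_score` merely stands in for a cross-validated accuracy.

```python
from itertools import product


def simple_grid_search(param_grid, score_fn):
    """Evaluate score_fn on every combination in param_grid.

    param_grid maps parameter names to lists of candidate values.
    score_fn (hypothetical) takes keyword arguments and returns a score,
    where higher is better.
    """
    names = sorted(param_grid)
    results = []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((score_fn(**params), params))
    # Keep the combination with the highest score.
    best_score, best_params = max(results, key=lambda item: item[0])
    return best_params, best_score, results


def toy_score(C, gamma):
    # Stand-in for a validation accuracy; it peaks at C=10, gamma=0.1.
    return 1.0 - abs(C - 10) / 100 - abs(gamma - 0.1)


grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}
best_params, best_score, all_results = simple_grid_search(grid, toy_score)
```

The full list of scores is returned alongside the best parameters so that it can later be arranged into a grid and inspected, for example as a heatmap.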
7.2. Active Learner
The main routines for running active learning algorithms.
active_learn |
Conduct active learning and return a learning curve. |
run_active_learning_with_heuristic |
Experiment routine with a particular classifier heuristic. |
active_learning_experiment |
Run an active learning experiment with specified heuristics. |
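To make "conduct active learning and return a learning curve" concrete, here is a rough sketch of a pool-based loop: train on the labelled set, record the test accuracy, ask an oracle to label the most uncertain candidate, and repeat. Everything in it is an assumption made for illustration, including the toy one-dimensional nearest-centroid classifier, the smallest-margin query rule, and all function names. It is not how mclearn implements `active_learn`.

```python
import random


def fit_centroids(points, labels):
    """Return the mean of each class (a toy nearest-centroid classifier)."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def margin(centroids, x):
    """Gap between the two nearest centroids; a small gap means uncertainty."""
    distances = sorted(abs(x - c) for c in centroids.values())
    return distances[1] - distances[0]


def toy_active_learn(pool, oracle, test_x, test_y, initial, n_queries):
    """Pool-based loop: train, score, query the most uncertain point, repeat."""
    train_x = list(initial)  # must contain at least one example of each class
    train_y = [oracle(x) for x in train_x]
    pool = [x for x in pool if x not in train_x]
    curve = []
    for _ in range(n_queries):
        centroids = fit_centroids(train_x, train_y)
        correct = sum(predict(centroids, x) == y for x, y in zip(test_x, test_y))
        curve.append(correct / len(test_x))
        # Query the candidate with the smallest margin and move it to the training set.
        candidate = min(pool, key=lambda x: margin(centroids, x))
        pool.remove(candidate)
        train_x.append(candidate)
        train_y.append(oracle(candidate))
    return curve


def oracle(x):
    # Hypothetical ground truth: class 1 if x lies above 0.5, else class 0.
    return int(x > 0.5)


rng = random.Random(0)
pool = [rng.random() for _ in range(200)]
test_x = [rng.random() for _ in range(100)]
test_y = [oracle(x) for x in test_x]
curve = toy_active_learn(pool, oracle, test_x, test_y, initial=[0.1, 0.9], n_queries=10)
```

The returned `curve` holds one accuracy value per query, which is the shape of data a learning curve plot needs.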
7.3. Active Learning Heuristics
Heuristics used to query the most uncertain candidate out of the unlabelled pool.
random_h |
Return a random candidate. |
entropy_h |
Return the candidate whose prediction vector displays the greatest Shannon entropy. |
margin_h |
Return the candidate with the smallest margin. |
qbb_margin_h |
Return the candidate with the smallest average margin. |
qbb_kl_h |
Return the candidate with the largest average KL divergence from the mean. |
compute_A |
Compute the A matrix in the variance estimation technique. |
compute_F |
Compute the F matrix in the variance estimation technique. |
compute_pool_variance |
Estimate the variance of the pool. |
pool_variance_h |
Return the candidate that will minimise the expected variance of the predictions. |
compute_pool_entropy |
Estimate the entropy of the pool. |
pool_entropy_h |
Return the candidate that will minimise the expected entropy of the predictions. |
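The entropy and margin rules described above can be sketched in a few lines of plain Python. This is an illustration of the two selection criteria only: the helper names and the example probabilities are made up, and the functions do not follow the signatures of `entropy_h` or `margin_h`. The natural logarithm is assumed here; the base does not change which candidate is selected.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one prediction vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def margin(probs):
    """Gap between the two most probable classes."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def pick_by_entropy(predictions):
    """Index of the candidate with the greatest entropy."""
    return max(range(len(predictions)), key=lambda i: shannon_entropy(predictions[i]))


def pick_by_margin(predictions):
    """Index of the candidate with the smallest margin."""
    return min(range(len(predictions)), key=lambda i: margin(predictions[i]))


# Illustrative class probabilities for three unlabelled candidates.
predictions = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.49, 0.01],
]
entropy_choice = pick_by_entropy(predictions)  # index 1: probability spread over all classes
margin_choice = pick_by_margin(predictions)    # index 2: top two classes nearly tied
```

The two rules pick different candidates here. Entropy favours the vector that is uncertain across all classes, while the margin rule looks only at the two leading classes.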
7.4. Performance Measures
Various measures that evaluate the performance of a classifier.
naive_accuracy |
Compute the naive accuracy rate. |
get_beta_parameters |
Extract the beta parameters from a confusion matrix. |
convolve_betas |
Convolve k beta distributions. |
balanced_accuracy_expected |
Compute the expected value of the posterior balanced accuracy. |
beta_sum_pdf |
Compute the pdf of the sum of beta distributions. |
beta_avg_pdf |
Compute the pdf of the average of the k beta distributions. |
beta_sum_cdf |
Compute the cdf of the sum of the k beta distributions. |
beta_avg_inv_cdf |
Compute the inverse cdf of the average of the k beta distributions. |
recall |
Compute the recall from a confusion matrix. |
precision |
Compute the precision from a confusion matrix. |
compute_balanced_accuracy |
Compute the accuracy of a classifier based on some test set. |
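As a quick illustration of how these measures relate, the sketch below computes recall, precision, overall accuracy, and balanced accuracy from a confusion matrix. It assumes that rows are true classes and columns are predicted classes, and that "naive accuracy" means the overall fraction of correct predictions. Both are assumptions rather than statements about mclearn, and the function names are hypothetical.

```python
def recall_per_class(cm):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


def precision_per_class(cm):
    """Fraction of each predicted class that was actually correct."""
    n = len(cm)
    return [cm[i][i] / sum(cm[r][i] for r in range(n)) for i in range(n)]


def overall_accuracy(cm):
    """Overall fraction of correct predictions."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def balanced_accuracy(cm):
    """Mean of the per-class recalls."""
    recalls = recall_per_class(cm)
    return sum(recalls) / len(recalls)


# Illustrative, imbalanced two-class confusion matrix
# (rows: true class, columns: predicted class).
cm = [
    [90, 10],
    [5, 5],
]
recalls = recall_per_class(cm)
precisions = precision_per_class(cm)
naive = overall_accuracy(cm)
balanced = balanced_accuracy(cm)
```

On imbalanced data the overall accuracy is dominated by the majority class, whereas the balanced accuracy weights every class equally.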
7.5. Photometric Data
Procedures specific to photometric data.
reddening_correction_sfd98 |
Compute the reddening values using the SFD98 correction set. |
reddening_correction_sf11 |
Compute the reddening values using the SF11 correction set. |
reddening_correction_w14 |
Compute the reddening values using the W14 correction set. |
correct_magnitudes |
Correct the values of magnitudes given a correction set. |
compute_colours |
Compute specified combinations of colours. |
fetch_sloan_data |
Run an SQL query on the Sloan Sky Server. |
fetch_filter |
Get a filter from the internet. |
fetch_spectrum |
Get a spectrum from the internet. |
clean_up_subclasses |
Clean up the names of the subclasses in the SDSS dataset. |
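The arithmetic behind magnitude correction and colours is simple enough to sketch. The example assumes that a correction amounts to subtracting a per-band extinction from each observed magnitude and that a colour is the difference between two bands. The function names and all numbers are made up for illustration; they are not real SDSS measurements, not values from any correction set, and not mclearn's signatures.

```python
BANDS = ["u", "g", "r", "i", "z"]


def correct_magnitudes_sketch(magnitudes, extinction):
    """Subtract the per-band extinction (in magnitudes) from each observed magnitude."""
    return {band: magnitudes[band] - extinction[band] for band in magnitudes}


def adjacent_colours(magnitudes, bands=BANDS):
    """Colour indices such as u-g, formed from neighbouring bands."""
    return {
        f"{blue}-{red}": magnitudes[blue] - magnitudes[red]
        for blue, red in zip(bands, bands[1:])
    }


# Made-up values purely for illustration.
observed = {"u": 19.5, "g": 18.375, "r": 17.75, "i": 17.4375, "z": 17.125}
extinction = {"u": 0.5, "g": 0.375, "r": 0.25, "i": 0.1875, "z": 0.125}

corrected = correct_magnitudes_sketch(observed, extinction)
colours = adjacent_colours(corrected)
```

Computing colours from the corrected magnitudes, rather than from the observed ones, keeps the reddening from leaking into the colour features.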
7.6. Data Preprocessing
Useful general-purpose preprocessing functions.
normalise_z |
Normalise each feature to have zero mean and unit variance. |
normalise_unit_var |
Normalise each feature to have unit variance. |
normalise_01 |
Normalise each feature to unit interval. |
draw_random_sample |
Split the data into a training set and a test set of a given size. |
balanced_train_test_split |
Split the data into a balanced training set and test set of some given size. |
csv_to_hdf |
Convert CSV files to an HDF5 table. |
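The three normalisation schemes listed above differ only in what is subtracted and what is divided by. A minimal plain-Python sketch follows, assuming the population standard deviation and features that are not constant (a constant feature would cause a division by zero). The function names are hypothetical and are not the signatures of `normalise_z`, `normalise_unit_var`, or `normalise_01`.

```python
import math


def column_stats(rows):
    """Per-feature mean and population standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / n)
        for col, m in zip(columns, means)
    ]
    return means, stds


def zscore(rows):
    """Zero mean and unit variance: subtract the mean, divide by the std."""
    means, stds = column_stats(rows)
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def unit_variance(rows):
    """Unit variance only: divide by the std without centring."""
    _, stds = column_stats(rows)
    return [[x / s for x, s in zip(row, stds)] for row in rows]


def unit_interval(rows):
    """Rescale each feature linearly so that it spans [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(x - lo) / (hi - lo) for x, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Three objects with two features each (illustrative values).
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]]
z = zscore(data)
v = unit_variance(data)
u = unit_interval(data)
```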
7.7. Visualisations
Selected plots commonly used in astronomy and active learning.
plot_class_distribution |
Plot the distribution of the classes. |
plot_scores |
Make a barplot of the scores of some performance measure. |
plot_balanced_accuracy_violin |
Make a violin plot of the posterior balanced accuracy. |
plot_learning_curve |
Plot the learning curve. |
plot_average_learning_curve |
Plot the average learning curve from many trials. |
plot_hex_map |
Plot the density of objects on a hex map. |
plot_recall_maps |
Plot the recall map. |
plot_filters_and_spectrum |
Plot ugriz filters and spectrum in the same figure. |
plot_scatter_with_classes |
Plot a scatter plot of the classes. |
reshape_grid_socres |
Reshape the scores to be used as input for the heatmap. |
plot_validation_accuracy_heatmap |
Plot heatmap of the validation accuracy from a grid search. |