Functions related to computing statistics on a set of data.
Calculate the points of the ROC curve from a set of labels and evaluations of a classifier.
Uses the efficient single-pass algorithm of Fawcett (2006), and assumes a binary classification task.
| Parameters: | |
|---|---|
| Returns: | Points on the ROC curve |
| Return type: | 1D ndarray of float |
See also
sklearn.metrics.roc_curve()
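
As a rough illustration of the single-pass approach described above, the following sketch sorts the examples once by descending score and then sweeps the threshold downward, emitting a point each time the score changes. The name `roc_points`, the (false positive rate, true positive rate) output layout, and the convention that positive examples carry the label 1 are assumptions made for this example; the documented function may differ on all three.

```python
def roc_points(labels, scores):
    """Sketch of a single-pass ROC sweep (hypothetical helper).

    labels: sequence where 1 marks a positive example (assumed convention).
    scores: classifier evaluations, higher meaning "more positive".
    Returns a list of (false_positive_rate, true_positive_rate) pairs.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Sort once by descending score; everything after this is a single sweep.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    points = []
    tp = fp = 0
    prev_score = None
    for score, label in ranked:
        # Emit a point only when the score changes, so tied scores
        # are treated as falling on the same side of one threshold.
        if score != prev_score:
            points.append((fp / n_neg, tp / n_pos))
            prev_score = score
        if label == 1:
            tp += 1
        else:
            fp += 1
    # Final point: every example is classified positive.
    points.append((fp / n_neg, tp / n_pos))
    return points
```

The curve always starts at (0.0, 0.0) and ends at (1.0, 1.0); sorting dominates the cost, so the sweep itself is linear in the number of examples.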
Calculate area under the ROC curve (AUC) from a set of target labels and predicted labels of a classifier.
| Parameters: | |
|---|---|
| Returns: | AUC value |
| Return type: | float |
See also
sklearn.metrics.auc()
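
One common way to obtain the AUC is trapezoidal integration over the points of the ROC curve. The sketch below assumes the points are already available as (false positive rate, true positive rate) pairs sorted by non-decreasing false positive rate. The helper name `auc_trapezoid` is hypothetical, and the documented function starts from labels rather than from curve points.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear curve (hypothetical helper).

    points: list of (x, y) pairs sorted by non-decreasing x,
            e.g. (false_positive_rate, true_positive_rate).
    """
    area = 0.0
    # Sum the area of the trapezoid between each pair of neighbors.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# A perfect classifier traces (0,0) -> (0,1) -> (1,1); chance is the diagonal.
perfect = auc_trapezoid([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
chance = auc_trapezoid([(0.0, 0.0), (1.0, 1.0)])
```

Vertical segments (where only the true positive rate changes) contribute zero area, so tied or repeated x values are harmless.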
Calculate discriminability (d’) measure.
| Parameters: | |
|---|---|
| Return type: | float |
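
The parameters of this function are not listed above, so the following is only a sketch of the textbook signal-detection definition, d’ = z(hit rate) − z(false alarm rate), where z is the inverse of the standard normal CDF. The function name `d_prime` and its two rate arguments are assumptions for illustration; the documented function may take different inputs or use a different formulation.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Textbook d' from two rates (hypothetical helper).

    Both rates must lie strictly between 0 and 1; NormalDist.inv_cdf
    raises StatisticsError at exactly 0 or 1.
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)
```

A value of zero means hits and false alarms are equally likely (no discrimination), and larger positive values indicate better separation.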
Compute the Principal Component Analysis (PCA) transformation for a dataset.
The first k rows of the transformation correspond to a projection onto a k-dimensional surface, chosen such that the L2 approximation error on the training set is minimized. The transformation is defined in terms of \(\mu\), the mean of the training data, and A, a matrix whose columns are the eigenvectors of the training data’s covariance matrix.
Eigenvectors are sorted by descending eigenvalue, which ensures that the first k rows of the transformation give the optimal linear transformation under L2 approximation error. The function returns the transform and the standard deviation for each axis of the training data.
Usage:
>>> T, S = Pca(X)
where X is a matrix of training data, T is the transformation matrix, and S is the array of standard deviations. To transform a data point W given as an array, use:
>>> Y = numpy.dot(T, W)
| Parameters: | X (2D ndarray of float) – Input data, with variables given by columns and observations given by rows. |
|---|---|
See also
This function was adapted from similar code by Jan Erik Solem. Also see sklearn.decomposition.PCA().
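
To make the description above concrete, here is a minimal NumPy sketch of eigenvector-based PCA. It is not the implementation documented here: the name `pca_sketch`, the extra returned mean, and the reading of "standard deviation for each axis" as the square root of each eigenvalue are assumptions made for illustration. It also assumes X has at least two columns.

```python
import numpy as np


def pca_sketch(X):
    """Eigen-decomposition PCA (hypothetical helper).

    X: 2D array, observations in rows and variables in columns.
    Returns (T, S, mu): rows of T are principal directions sorted by
    descending eigenvalue, S holds the standard deviation along each
    direction, and mu is the mean of the training data.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)

    # Covariance of the variables (columns); np.cov centers the data itself.
    cov = np.cov(X, rowvar=False)

    # eigh is meant for symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # reorder to descending eigenvalue
    eigvals = eigvals[order]
    A = eigvecs[:, order]              # columns are eigenvectors

    T = A.T                            # rows are principal directions
    # Clip tiny negative eigenvalues caused by floating-point round-off.
    S = np.sqrt(np.clip(eigvals, 0.0, None))
    return T, S, mu
```

With these assumed return values, a data point `W` would be projected as `np.dot(T, W - mu)`, and keeping only the first k rows of `T` gives the k-dimensional projection.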