

class SvmModel

Entity SvmModel

Attributes

last_read_date  Read-only property - Last time this model’s data was accessed.
name            Set or get the name of the model object.
status          Read-only property - Current model life cycle status.

Methods

__init__(self[, name, _info])                                            [ALPHA] Create a ‘new’ instance of a Support Vector Machine model.
predict(self, frame[, observation_columns])                              [ALPHA] Predict the labels for the data points.
publish(self)                                                            [BETA] Create a tar file that will be used as input to the scoring engine.
test(self, frame, label_column[, observation_columns])                   [ALPHA] Predict test frame labels and return metrics.
train(self, frame, label_column, observation_columns[, intercept, ...])  [ALPHA] Build an SVM model with SGD.

__init__(self, name=None)

[ALPHA] Create a ‘new’ instance of a Support Vector Machine model.

Parameters:

name : unicode (default=None)

User supplied name.

Returns:

: Model

A new instance of SvmModel

Support Vector Machine [R57] is a supervised algorithm used to perform binary classification. A Support Vector Machine constructs a hyperplane in a high-dimensional space; the separation is considered good when the hyperplane has the largest distance to the nearest training data point of any class. This model runs the MLlib implementation of SVM [R58] with the SGD [R59] optimizer. The SVMWithSGD model is initialized, trained on columns of a frame, used to predict the labels of observations in a frame, and tested by comparing the predicted labels against the true labels. During testing, the labels of the observations are predicted and compared with the true labels using built-in binary classification metrics.

Footnotes

[R57] https://en.wikipedia.org/wiki/Support_vector_machine
[R58] https://spark.apache.org/docs/1.5.0/mllib-linear-methods.html#linear-support-vector-machines-svms
[R59] https://en.wikipedia.org/wiki/Stochastic_gradient_descent
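
To make the idea of training a linear SVM with SGD more concrete, the following is a minimal, self-contained sketch in plain Python. It is not the MLlib implementation that this model runs: it assumes per-sample updates on the hinge loss with a constant step size and L2 regularization, and the names train_linear_svm_sgd and predict_label, as well as the default values for epochs, step, and reg, are hypothetical choices made for illustration only.

```python
import random


def train_linear_svm_sgd(xs, ys, epochs=20, step=0.01, reg=0.01, seed=0):
    """Fit a linear SVM with plain SGD on the hinge loss (illustrative only).

    xs: list of feature lists; ys: labels in {0, 1}.
    Returns the weight vector and the intercept.
    """
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    b = 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the observations in random order
        for i in order:
            # Map the 0/1 labels to -1/+1 for the hinge loss.
            y = 1.0 if ys[i] == 1 else -1.0
            score = sum(wj * xj for wj, xj in zip(w, xs[i])) + b
            # L2 regularization shrinks the weights on every step.
            w = [wj * (1.0 - step * reg) for wj in w]
            # The hinge loss contributes a gradient only when the margin is below 1.
            if y * score < 1.0:
                w = [wj + step * y * xj for wj, xj in zip(w, xs[i])]
                b += step * y
    return w, b


def predict_label(w, b, x):
    """Return 1 if x lies on the positive side of the hyperplane, else 0."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0.0 else 0


# The ten observations from the sample frame used in the Examples section.
data = [-48.0, -75.0, -63.0, -57.0, 73.0, -33.0, 100.0, -54.0, 78.0, 48.0]
xs = [[value] for value in data]
ys = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]

w, b = train_linear_svm_sgd(xs, ys)
predictions = [predict_label(w, b, x) for x in xs]
print(predictions == ys)
```

Because the sample data is linearly separable (negative values are labeled 1, positive values 0), the learned weight is negative and the sketch reproduces every label.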

Examples

The following example trains and tests a model on the sample data set in frame ‘frame’, which contains two columns.

>>> frame.inspect()
[#]  data   label
=================
[0]  -48.0  1
[1]  -75.0  1
[2]  -63.0  1
[3]  -57.0  1
[4]   73.0  0
[5]  -33.0  1
[6]  100.0  0
[7]  -54.0  1
[8]   78.0  0
[9]   48.0  0
>>> model = ta.SvmModel()
[===Job Progress===]
>>> train_output = model.train(frame, 'label', ['data'])
[===Job Progress===]
>>> predicted_frame = model.predict(frame, ['data'])
[===Job Progress===]
>>> predicted_frame.inspect()
[#]  data   label  predicted_label
==================================
[0]  -48.0  1                    1
[1]  -75.0  1                    1
[2]  -63.0  1                    1
[3]  -57.0  1                    1
[4]   73.0  0                    0
[5]  -33.0  1                    1
[6]  100.0  0                    0
[7]  -54.0  1                    1
[8]   78.0  0                    0
[9]   48.0  0                    0
>>> test_metrics = model.test(predicted_frame, 'predicted_label')
[===Job Progress===]
>>> test_metrics
Precision: 1.0
Recall: 1.0
Accuracy: 1.0
FMeasure: 1.0
Confusion Matrix:
            Predicted_Pos  Predicted_Neg
Actual_Pos              7              0
Actual_Neg              0              7
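
The relationship between the reported metrics and the confusion matrix can be sketched as follows. This assumes the standard textbook definitions of precision, recall, accuracy, and the F1 form of the F-measure; the source does not state the exact formulas the library uses, and the function name binary_metrics is hypothetical, not part of the API.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute binary classification metrics from confusion-matrix counts.

    tp: actual positive, predicted positive
    fn: actual positive, predicted negative
    fp: actual negative, predicted positive
    tn: actual negative, predicted negative
    """
    total = tp + fp + fn + tn
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Counts taken from the confusion matrix shown above.
metrics = binary_metrics(tp=7, fp=0, fn=0, tn=7)
print(metrics)
```

With no off-diagonal counts in the confusion matrix, every metric evaluates to 1.0, which matches the values printed by the test call.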