Models SvmModel
class SvmModel
Entity SvmModel
Attributes
last_read_date  Read-only property - Last time this model’s data was accessed.
name  Set or get the name of the model object.
status  Read-only property - Current model life cycle status.
Methods
__init__(self[, name, _info])  [ALPHA] Create a ‘new’ instance of a Support Vector Machine model.
predict(self, frame[, observation_columns])  [ALPHA] Predict the labels for the data points.
publish(self)  [BETA] Creates a tar file that will be used as input to the scoring engine.
test(self, frame, label_column[, observation_columns])  [ALPHA] Predict test frame labels and return metrics.
train(self, frame, label_column, observation_columns[, intercept, ...])  [ALPHA] Build SVM with SGD model.
__init__(self, name=None)
[ALPHA] Create a ‘new’ instance of a Support Vector Machine model.
Parameters:
name : unicode (default=None)
    User supplied name.
Returns:
Model
    A new instance of SvmModel
Support Vector Machine [R57] is a supervised algorithm used to perform binary classification. A Support Vector Machine constructs a hyperplane in a high-dimensional space; the hyperplane achieves a good separation when it has the largest distance to the nearest training data point of any class. This model runs the MLlib implementation of SVM [R58] with the SGD [R59] optimizer. The SVMWithSGD model is initialized, trained on columns of a frame, used to predict the labels of observations in a frame, and tested against the true labels. During testing, the labels of the observations are predicted and compared with the true labels using built-in binary classification metrics.
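To make the description above concrete, here is a minimal pure-Python sketch of a linear SVM fitted by gradient steps on the hinge loss with an L2 penalty. It is not the MLlib code and not part of the SvmModel API: the function names, the hyperparameter names and their default values are assumptions chosen for illustration, and each step uses every row (a full batch) instead of a random sample so that the result is deterministic. The data are the data and label values of the sample frame used in the Examples.

```python
import math


def train_linear_svm(points, labels, num_iterations=100,
                     step_size=1.0, reg_param=0.01):
    """Fit weights w and intercept b on 0/1 labels (illustrative sketch)."""
    n = len(points)
    n_features = len(points[0])
    w = [0.0] * n_features
    b = 0.0
    for t in range(1, num_iterations + 1):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, label in zip(points, labels):
            y = 2.0 * label - 1.0  # map labels {0, 1} to {-1, +1}
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # hinge loss is active for this point
                for j in range(n_features):
                    grad_w[j] -= y * x[j]
                grad_b -= y
        eta = step_size / math.sqrt(t)  # decaying step size (assumed schedule)
        # Averaged hinge subgradient plus the L2 penalty term on the weights.
        w = [wi - eta * (gw / n + reg_param * wi) for wi, gw in zip(w, grad_w)]
        b -= eta * grad_b / n
    return w, b


def predict_label(w, b, x):
    """Return 1 if x lies on the non-negative side of the hyperplane, else 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0.0 else 0


# 'data' and 'label' values taken from the sample frame in the Examples.
points = [[-48.0], [-75.0], [-63.0], [-57.0], [73.0],
          [-33.0], [100.0], [-54.0], [78.0], [48.0]]
labels = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]

w, b = train_linear_svm(points, labels)
predictions = [predict_label(w, b, x) for x in points]
```

On this data the learned weight is negative, so negative data values fall on the label 1 side of the hyperplane and positive values on the label 0 side.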
footnotes
[R57] https://en.wikipedia.org/wiki/Support_vector_machine
[R58] https://spark.apache.org/docs/1.5.0/mllib-linear-methods.html#linear-support-vector-machines-svms
[R59] https://en.wikipedia.org/wiki/Stochastic_gradient_descent
Examples
Consider the following model trained and tested on the sample data set in frame ‘frame’.
The frame contains two columns, data and label.
>>> frame.inspect()
[#]  data   label
=================
[0]  -48.0      1
[1]  -75.0      1
[2]  -63.0      1
[3]  -57.0      1
[4]   73.0      0
[5]  -33.0      1
[6]  100.0      0
[7]  -54.0      1
[8]   78.0      0
[9]   48.0      0
>>> model = ta.SvmModel()
[===Job Progress===]
>>> train_output = model.train(frame, 'label', ['data'])
[===Job Progress===]
>>> predicted_frame = model.predict(frame, ['data'])
[===Job Progress===]
>>> predicted_frame.inspect()
[#]  data   label  predicted_label
==================================
[0]  -48.0      1                1
[1]  -75.0      1                1
[2]  -63.0      1                1
[3]  -57.0      1                1
[4]   73.0      0                0
[5]  -33.0      1                1
[6]  100.0      0                0
[7]  -54.0      1                1
[8]   78.0      0                0
[9]   48.0      0                0
>>> test_metrics = model.test(predicted_frame, 'predicted_label')
[===Job Progress===]
>>> test_metrics
Precision: 1.0
Recall: 1.0
Accuracy: 1.0
FMeasure: 1.0
Confusion Matrix:
            Predicted_Pos  Predicted_Neg
Actual_Pos              7              0
Actual_Neg              0              7
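The reported metrics can be recomputed from the four cells of the confusion matrix. The sketch below is a standalone illustration, not the toolkit's implementation: binary_metrics is a hypothetical helper, and it assumes FMeasure is the balanced F1 score (beta = 1). It is applied to the counts printed above, reading the Actual_Pos row as true positives and false negatives and the Actual_Neg row as false positives and true negatives.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / float(tp + fp) if (tp + fp) else 0.0
    recall = tp / float(tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / float(total) if total else 0.0
    # Balanced F-measure (F1); beta = 1 is an assumption of this sketch.
    if precision + recall:
        f_measure = 2.0 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Counts read from the confusion matrix printed by test_metrics above.
metrics = binary_metrics(tp=7, fp=0, fn=0, tn=7)

# A made-up imperfect classifier, to show values other than 1.0.
imperfect = binary_metrics(tp=3, fp=1, fn=2, tn=4)
```

With no false positives and no false negatives, every metric evaluates to 1.0, which agrees with the output above.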