SvmModel __init__


__init__(self, name=None)

[ALPHA] Create a ‘new’ instance of a Support Vector Machine model.

Parameters:

name : unicode (default=None)

User supplied name.

Returns:

: Model

A new instance of SvmModel

Support Vector Machine [R60] is a supervised algorithm used to perform binary classification. A Support Vector Machine constructs a separating hyperplane in a high-dimensional space; the separation is considered good when the hyperplane has the largest distance to the nearest training-data point of any class. This model runs the MLlib implementation of SVM [R61] with the SGD [R62] optimizer. The SVMWithSGD model is initialized, trained on columns of a frame, and used to predict the labels of observations in a frame. During testing, the labels of the observations are predicted and compared against the true labels using built-in binary Classification Metrics.

footnotes

[R60] https://en.wikipedia.org/wiki/Support_vector_machine
[R61] https://spark.apache.org/docs/1.5.0/mllib-linear-methods.html#linear-support-vector-machines-svms
[R62] https://en.wikipedia.org/wiki/Stochastic_gradient_descent
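
The idea of a linear SVM fitted by stochastic gradient descent, as described above, can be illustrated with a minimal pure-Python sketch. This is not the MLlib SVMWithSGD code and not part of the SvmModel API: the function names (train_linear_svm_sgd, predict_label) and the hyperparameters (epochs, step, reg) are hypothetical and chosen only for illustration. The sketch assumes labels of 1 and 0, which it maps to +1 and -1 for the hinge loss.

```python
def train_linear_svm_sgd(rows, labels, epochs=20, step=0.01, reg=0.01):
    """Fit weights and bias by SGD on the L2-regularized hinge loss.

    Hypothetical helper for illustration; not the MLlib implementation.
    """
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, label in zip(rows, labels):
            y = 1.0 if label == 1 else -1.0  # map {1, 0} to {+1, -1}
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            # The L2 penalty shrinks the weights on every step.
            weights = [w * (1.0 - step * reg) for w in weights]
            # The hinge loss contributes a subgradient only when the
            # observation lies inside the margin or is misclassified.
            if margin < 1.0:
                weights = [w + step * y * xi for w, xi in zip(weights, x)]
                bias += step * y
    return weights, bias


def predict_label(weights, bias, x):
    """Return 1 if x falls on the positive side of the hyperplane, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0.0 else 0


# One observation column ("data") and a binary "label" column.
data = [[-48.0], [-75.0], [-63.0], [-57.0], [73.0],
        [-33.0], [100.0], [-54.0], [78.0], [48.0]]
labels = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]

weights, bias = train_linear_svm_sgd(data, labels)
predictions = [predict_label(weights, bias, x) for x in data]
print(predictions)
```

Because this sample is linearly separable on a single column, the learned weight is negative and every observation is assigned its true label. MLlib's optimizer differs in details such as mini-batching and step-size decay.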

Examples

Consider the following frame containing two columns. A model is trained and tested on this sample data set in frame ‘frame’.

>>> frame.inspect()
[#]  data   label
=================
[0]  -48.0  1
[1]  -75.0  1
[2]  -63.0  1
[3]  -57.0  1
[4]   73.0  0
[5]  -33.0  1
[6]  100.0  0
[7]  -54.0  1
[8]   78.0  0
[9]   48.0  0
>>> model = ta.SvmModel()
[===Job Progress===]
>>> train_output = model.train(frame, 'label', ['data'])
[===Job Progress===]
>>> predicted_frame = model.predict(frame, ['data'])
[===Job Progress===]
>>> predicted_frame.inspect()
[#]  data   label  predicted_label
==================================
[0]  -48.0  1                    1
[1]  -75.0  1                    1
[2]  -63.0  1                    1
[3]  -57.0  1                    1
[4]   73.0  0                    0
[5]  -33.0  1                    1
[6]  100.0  0                    0
[7]  -54.0  1                    1
[8]   78.0  0                    0
[9]   48.0  0                    0
>>> test_metrics = model.test(predicted_frame, 'predicted_label')
[===Job Progress===]
>>> test_metrics
Precision: 1.0
Recall: 1.0
Accuracy: 1.0
FMeasure: 1.0
Confusion Matrix:
            Predicted_Pos  Predicted_Neg
Actual_Pos              7              0
Actual_Neg              0              7
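
The metrics reported by the test step follow the standard definitions for binary classification. The sketch below shows how they can be derived from true and predicted labels. The function name binary_metrics and the two label lists are hypothetical and do not come from the frame above; they include two misclassified observations so that the metrics are not all 1.0.

```python
def binary_metrics(actual, predicted, pos_label=1):
    """Compute confusion-matrix counts and the metrics derived from them.

    Hypothetical helper for illustration; not part of the SvmModel API.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == pos_label and p == pos_label)
    fn = sum(1 for a, p in pairs if a == pos_label and p != pos_label)
    fp = sum(1 for a, p in pairs if a != pos_label and p == pos_label)
    tn = sum(1 for a, p in pairs if a != pos_label and p != pos_label)

    # Guard against division by zero when a class is never predicted or seen.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    denom = precision + recall
    f_measure = 2.0 * precision * recall / denom if denom else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "confusion": {"tp": tp, "fn": fn, "fp": fp, "tn": tn},
    }


# Illustrative labels only (assumed, not taken from the frame above).
actual = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 1, 0, 0, 1, 1, 0, 1]

metrics = binary_metrics(actual, predicted)
print(metrics["confusion"])
print(metrics["accuracy"])
```

Here the confusion matrix holds 4 true positives, 1 false negative, 1 false positive, and 2 true negatives, which gives an accuracy of 0.75.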