RandomForestRegressorModel __init__
__init__(self, name=None)
Create a ‘new’ instance of a Random Forest Regressor model.
Parameters:
    name : unicode (default=None)
        User supplied name.
Returns:
    : Model
        A new instance of RandomForestRegressor Model
Random Forest [R55] is a supervised ensemble learning algorithm used to perform regression. A Random Forest Regressor model is initialized, trained on columns of a frame, and then used to predict the value of each observation in the frame. This model runs the MLlib implementation of Random Forest [R56]. During training, the decision trees are trained in parallel. During prediction, the random forest's predicted value is the average of the values predicted by all of its trees.
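The averaging step described above can be sketched in plain Python. This is a minimal illustration only, not the MLlib or model code: the `forest_predict` helper and the three threshold "trees" are hypothetical stand-ins for trained decision trees, and their split points are invented for the example.

```python
from statistics import mean


def forest_predict(trees, observation):
    """Return the mean of the per-tree predictions for one observation."""
    return mean([tree(observation) for tree in trees])


# Hypothetical "trees": each is a single threshold rule on Dim_1.
# The thresholds are assumptions chosen for illustration, not learned values.
trees = [
    lambda obs: 1.0 if obs["Dim_1"] < 30.0 else 0.0,
    lambda obs: 1.0 if obs["Dim_1"] < 25.0 else 0.0,
    lambda obs: 1.0 if obs["Dim_1"] < 18.0 else 0.0,
]

observation = {"Dim_1": 19.8446136104, "Dim_2": 2.2985856384}

# Two trees vote 1.0 and one votes 0.0, so the forest predicts their mean.
prediction = forest_predict(trees, observation)
print(prediction)
```

With a single tree (as in a model trained with num_trees=1), the average reduces to that one tree's prediction.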
footnotes
[R55] https://en.wikipedia.org/wiki/Random_forest
[R56] https://spark.apache.org/docs/1.5.0/mllib-ensembles.html#random-forests

Examples
Consider the following model, trained and tested on the sample data set in the frame ‘frame’, which contains three columns.
>>> frame.inspect()
[#]  Class  Dim_1          Dim_2
=======================================
[0]      1  19.8446136104  2.2985856384
[1]      1  16.8973559126  2.6933495054
[2]      1   5.5548729596  2.7777687995
[3]      0  46.1810010826  3.1611961917
[4]      0  44.3117586448  3.3458963222
[5]      0  34.6334526911  3.6429838715

>>> model = ta.RandomForestRegressorModel()
[===Job Progress===]
>>> train_output = model.train(frame, 'Class', ['Dim_1', 'Dim_2'], num_trees=1, impurity="variance", max_depth=4, max_bins=100)
[===Job Progress===]
>>> train_output
{u'impurity': u'variance', u'max_bins': 100, u'observation_columns': [u'Dim_1', u'Dim_2'], u'num_nodes': 3, u'max_depth': 4, u'seed': -1632404927, u'num_trees': 1, u'label_column': u'Class', u'feature_subset_category': u'all'}
>>> train_output['num_nodes']
3
>>> train_output['label_column']
u'Class'
>>> predicted_frame = model.predict(frame, ['Dim_1', 'Dim_2'])
[===Job Progress===]
>>> predicted_frame.inspect()
[#]  Class  Dim_1          Dim_2         predicted_value
========================================================
[0]      1  19.8446136104  2.2985856384              1.0
[1]      1  16.8973559126  2.6933495054              1.0
[2]      1   5.5548729596  2.7777687995              1.0
[3]      0  46.1810010826  3.1611961917              0.0
[4]      0  44.3117586448  3.3458963222              0.0
[5]      0  34.6334526911  3.6429838715              0.0

>>> model.publish()
[===Job Progress===]