KMeansModel train
train(self, frame, observation_columns, column_scalings, k=2, max_iterations=20, epsilon=0.0001, initialization_mode='k-means||')
[BETA] Creates a KMeans model from a training frame.
Parameters: frame : Frame
A frame to train the model on.
observation_columns : list
Columns containing the observations.
column_scalings : list
Column scalings for each of the observation columns. The scaling value is multiplied by the corresponding value in the observation column.
k : int32 (default=2)
Desired number of clusters. Default is 2.
max_iterations : int32 (default=20)
Maximum number of iterations for which the algorithm should run. Default is 20.
epsilon : float64 (default=0.0001)
Distance threshold within which k-means is considered to have converged. Default is 1e-4. If all centers move less than this Euclidean distance in an iteration, the run stops iterating.
initialization_mode : unicode (default=k-means||)
The initialization technique for the algorithm. It can be either “random”, to choose random points as the initial cluster centers, or “k-means||”, to use a parallel variant of k-means++. Default is “k-means||”.
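The column_scalings and epsilon descriptions above can be illustrated with a minimal sketch in plain Python. It is not the library's implementation. The helper names scale_row and has_converged are hypothetical, and the sketch assumes one scaling value per observation column.

```python
import math

def scale_row(row, column_scalings):
    """Multiply each observation value by the scaling for its column."""
    # Assumption: exactly one scaling per observation column.
    if len(row) != len(column_scalings):
        raise ValueError("need one scaling per observation column")
    return [value * scaling for value, scaling in zip(row, column_scalings)]

def has_converged(old_centers, new_centers, epsilon=1e-4):
    """Return True when every center moved less than epsilon (Euclidean distance)."""
    return all(
        math.dist(old, new) < epsilon
        for old, new in zip(old_centers, new_centers)
    )

# Hypothetical data: two observation columns, the second scaled by 0.5.
scaled = scale_row([2.0, 10.0], [1.0, 0.5])

# Both centers moved less than 1e-4, so iteration would stop here.
converged = has_converged(
    [(0.0, 0.0), (5.0, 5.0)],
    [(0.0, 0.00005), (5.0, 5.0)],
)
```

In this sketch a single center moving by epsilon or more is enough to keep iterating, which matches the "all centers" wording of the epsilon parameter.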
Returns: dict
A dictionary describing the trained KMeans model, with the following keys:
‘cluster_size’ : dictionary with ‘Cluster:id’ as the key and the corresponding cluster size as the value.
‘within_set_sum_of_squared_error’ : The within-set sum of squared errors for the model.
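To make the two returned keys concrete, the following sketch computes them by hand for a tiny, already-clustered data set. The function summarize_clusters is hypothetical and is not part of the library. The 1-based key format 'Cluster:1', 'Cluster:2' is an assumption about how ‘Cluster:id’ is rendered, and the error is taken to be the sum of squared Euclidean distances from each point to its assigned center.

```python
from collections import Counter

def summarize_clusters(points, centers, assignments):
    """Build a dictionary with the two documented keys from a finished clustering.

    assignments[i] is the 0-based index into centers for points[i].
    """
    counts = Counter(assignments)
    # Assumption: keys look like 'Cluster:1', 'Cluster:2', ... (1-based ids).
    cluster_size = {
        "Cluster:%d" % (idx + 1): counts.get(idx, 0)
        for idx in range(len(centers))
    }
    # Sum of squared distances from each point to its assigned center.
    wsse = sum(
        sum((p - c) ** 2 for p, c in zip(point, centers[idx]))
        for point, idx in zip(points, assignments)
    )
    return {
        "cluster_size": cluster_size,
        "within_set_sum_of_squared_error": wsse,
    }

# Hypothetical data: three points, two centers.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
centers = [(0.0, 1.0), (10.0, 10.0)]
result = summarize_clusters(points, centers, [0, 0, 1])
```

A lower within-set sum of squared errors means the points sit closer to their assigned centers, so the value is commonly compared across different choices of k.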
Creates a KMeans model using the observation columns.
Examples
See here for examples.