Hierarchical Clustering algorithm derived from the R package ‘amap’ [Amap].
Hierarchical Cluster.
Initialize Hierarchical Cluster.
Input
- method - [string] the distance measure to be used
  - ‘euclidean’
- link - [string] the agglomeration method to be used
  - ‘single’
  - ‘complete’
  - ‘mcquitty’
  - ‘median’
Example:
>>> import numpy as np
>>> import mlpy
>>> data = np.array([[1.0, 1.1, 2.0, 3.2, 3.4],
...                  [1.5, 1.8, 2.8, 3.1, 3.2]])
>>> hc = mlpy.HCluster()
>>> hc.compute(data)
>>> hc.ia
array([-4, -1, -3, 2])
>>> hc.ib
array([-5, -2, 1, 3])
>>> hc.heights
array([ 0.2236068 , 0.31622776, 1.4560219 , 2.94108844])
>>> hc.cut(0.5)
array([1, 1, 2, 3, 3])
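The heights in the example above can be reproduced by hand. The following is a minimal sketch in plain NumPy, not the mlpy or amap implementation. It assumes Euclidean distance and complete linkage, which is the combination that matches the example output; the function name complete_linkage_heights is hypothetical.

```python
import numpy as np

def complete_linkage_heights(x):
    """Return merge heights for naive complete-linkage clustering.

    x is laid out as (feature x sample), as in the example above.
    Assumption: Euclidean distance, complete linkage.
    """
    pts = np.asarray(x, dtype=float).T            # one row per sample
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))       # pairwise Euclidean distances
    clusters = [[i] for i in range(len(pts))]     # start from singletons
    heights = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # complete linkage: distance between the farthest pair of members
                d = dist[np.ix_(clusters[a], clusters[b])].max()
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        heights.append(d)
        clusters[a] = clusters[a] + clusters[b]   # merge b into a
        del clusters[b]
    return np.array(heights)

data = np.array([[1.0, 1.1, 2.0, 3.2, 3.4],
                 [1.5, 1.8, 2.8, 3.1, 3.2]])
heights = complete_linkage_heights(data)
print(heights)
```

The sketch rescans every pair of clusters at each step, so it is only suitable for small inputs such as this one.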
Compute Hierarchical Cluster.
Input
- x - [2D numpy array float] (feature x sample) input data
Output
- self.ia - [1D numpy array integer] merge
- self.ib - [1D numpy array integer] merge
The i-th entries of ia and ib together describe the merge made at step i of the clustering. If an entry j is negative, then observation -j was merged at this step. If j is positive, then the merge was with the cluster formed at the (earlier) step j of the algorithm. Thus negative entries indicate agglomerations of singletons, and positive entries indicate agglomerations of non-singletons. Observations and steps are numbered from 1.
- self.heights - [1D numpy array float] the clustering heights: a set of n-1 non-decreasing real values, where n is the number of samples. Each height is the value of the criterion associated with the chosen agglomeration method at the corresponding merge.
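The merge encoding can be unpacked by replaying the steps in order. The sketch below uses the ia and ib values from the example and lists the observations contained in the cluster formed at each step. The helper name decode_merges is hypothetical and not part of mlpy.

```python
def decode_merges(ia, ib):
    """List the observations (numbered from 1) in the cluster formed at each step."""
    formed = {}                                   # step number -> sorted member list
    for step, (a, b) in enumerate(zip(ia, ib), start=1):
        members = []
        for j in (a, b):
            if j < 0:
                members.append(-j)                # negative: single observation -j
            else:
                members.extend(formed[j])         # positive: cluster formed at step j
        formed[step] = sorted(members)
    return formed

ia = [-4, -1, -3, 2]
ib = [-5, -2, 1, 3]
formed = decode_merges(ia, ib)
for step, members in formed.items():
    print(step, members)
```

For the example data, step 3 joins observation 3 with the cluster formed at step 1, and step 4 joins the clusters formed at steps 2 and 3.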
Cut the tree into groups at a specified height.
Input
- ht - [float] height where the tree should be cut
Output
- gm - [1D numpy array integer] group memberships, one per sample. Group labels run from 1 to N, where N is the number of groups.
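One way to picture the cut is to apply only the merges whose height does not exceed ht and then label the resulting groups. The following is a minimal sketch, not the mlpy implementation. It assumes that a merge at exactly ht is kept, that heights are non-decreasing, and that groups are numbered in order of first appearance; the function name cut_tree is hypothetical.

```python
def cut_tree(ia, ib, heights, ht):
    """Return 1-based group labels after applying merges with height <= ht."""
    n = len(ia) + 1
    parent = list(range(n + 1))      # union-find over observations 1..n (index 0 unused)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]         # path halving
            i = parent[i]
        return i

    rep = {}                         # step -> one observation inside that step's cluster
    for step, (a, b, h) in enumerate(zip(ia, ib, heights), start=1):
        oa = -a if a < 0 else rep[a]
        ob = -b if b < 0 else rep[b]
        rep[step] = oa
        if h <= ht:                  # assumption: merges at exactly ht are applied
            parent[find(ob)] = find(oa)

    labels, gm = {}, []
    for obs in range(1, n + 1):
        root = find(obs)
        labels.setdefault(root, len(labels) + 1)  # number groups by first appearance
        gm.append(labels[root])
    return gm

ia = [-4, -1, -3, 2]
ib = [-5, -2, 1, 3]
heights = [0.2236068, 0.31622776, 1.4560219, 2.94108844]
gm = cut_tree(ia, ib, heights, 0.5)
print(gm)
```

With the example values and ht = 0.5, only the first two merges are applied, which gives the same memberships as hc.cut(0.5) in the example.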
[Amap] amap: Another Multidimensional Analysis Package, http://cran.r-project.org/web/packages/amap/index.html