Graph kclique_percolation¶
-
kclique_percolation
(self, clique_size, community_property_label)¶ [ALPHA] Find groups of vertices with similar attributes.
Parameters: clique_size : int32
The sizes of the cliques used to form communities. Larger values of clique size result in fewer, smaller communities that are more connected. Must be at least 2.
community_property_label : unicode
Name of the community property of vertex that will be updated/created in the graph. This property will contain for each vertex the set of communities that contain that vertex.
Returns: : dict
Dictionary of vertex label and frame, Execution time.
Community Detection Using the K-Clique Percolation Algorithm
Overview
Modeling data as a graph captures relations, for example, friendship ties between social network users or chemical interactions between proteins. Analyzing the structure of the graph reveals collections (often termed ‘communities’) of vertices that are more likely to interact amongst each other. Examples could include a community of friends in a social network or a collection of highly interacting proteins in a cellular process.
Trusted Analytics Platform provides community detection using the k-Clique percolation method first proposed by Palla et. al. [R1] that has been widely used in many contexts.
K-Clique Percolation
K-clique percolation is a method for detecting community structure in graphs. Here we provide mathematical background on how communities are defined in the context of the k-clique percolation algorithm.
A clique is a group of vertices in which every vertex is connected (via undirected edge) with every other vertex in the clique. This graphically looks like a triangle or a structure composed of triangles:
A clique is certainly a community in the sense that its vertices are all connected, but, it is too restrictive for most purposes, since it is natural some members of a community may not interact.
Mathematically, a k-clique has vertices, each with common edges, each of which connects to another vertex in the k-clique. The k-clique percolation method forms communities by taking unions of k-cliques that have vertices in common.
K-Clique Example
In the graph below, the cliques are the sections defined by their triangular appearance and the 3-clique communities are {1, 2, 3, 4} and {4, 5, 6, 7, 8}. The vertices 9, 10, 11, 12 are not in 3-cliques, therefore they do not belong to any community. Vertex 4 belongs to two distinct (but overlapping) communities.
Distributed Implementation of K-Clique Community Detection
The implementation of k-clique community detection in Trusted Analytics Platform is a fully distributed implementation that follows the map-reduce algorithm proposed in Varamesh et. al. [R2] .
It has the following steps:
- All k-cliques are enumerated.
- k-cliques are used to build a “clique graph” by declaring each k-clique to be a vertex in a new graph and placing edges between k-cliques that share k-1 vertices in the base graph.
- A connected component analysis is performed on the clique graph. Connected components of the clique graph correspond to k-clique communities in the base graph.
- The connected components information for the clique graph is projected back down to the base graph, providing each vertex with the set of k-clique communities to which it belongs.
Notes
Spawns a number of Spark jobs that cannot be calculated before execution (it is bounded by the diameter of the clique graph derived from the input graph). For this reason, the initial loading, clique enumeration and clique-graph construction steps are tracked with a single progress bar (this is most of the time), and then successive iterations of analysis of the clique graph are tracked with many short-lived progress bars, and then finally the result is written out.
Footnotes
[R1] G. Palla, I. Derenyi, I. Farkas, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435:814, 2005 ( See http://hal.elte.hu/cfinder/wiki/papers/communitylettm.pdf ) [R2] Varamesh, A.; Akbari, M.K.; Fereiduni, M.; Sharifian, S.; Bagheri, A., “Distributed Clique Percolation based community detection on social networks using MapReduce,” Information and Knowledge Technology (IKT), 2013 5th Conference on, vol., no., pp.478,483, 28-30 May 2013