Graph graphx_connected_components¶
-
graphx_connected_components
(self, output_vertex_property_name='connectedComponentId')¶ Implements the connected components computation on a graph by invoking graphx api.
Parameters: output_vertex_property_name : unicode (default=connectedComponentId)
The name of the column containing the connected component value.
Returns: : dict
- Dictionary containing the vertex type as the key and the corresponding
vertex’s frame with a connected component column. Call dictionary_name[‘label’] to get the handle to frame whose vertex type is label.
Pulls graph from underlying store, sends it off to the ConnectedComponentGraphXDefault, and then writes the output graph back to the underlying store.
Connected Components (CC)
Connected components are disjoint subgraphs in which all vertices are connected to all other vertices in the same component via paths, but not connected via paths to vertices in any other component. The connected components algorithm uses message passing along a specified edge type to find all of the connected components of a graph and label each edge with the identity of the component to which it belongs. The algorithm is specific to an edge type, hence in graphs with several different types of edges, there may be multiple, overlapping sets of connected components.
The algorithm works by assigning each vertex a unique numerical index and passing messages between neighbors. Vertices pass their indices back and forth with their neighbors and update their own index as the minimum of their current index and all other indices received. This algorithm continues until there is no change in any of the vertex indices. At the end of the algorithm, the unique levels of the indices denote the distinct connected components. The complexity of the algorithm is proportional to the diameter of the graph.
Examples
>>> graph = ta.Graph()
>>> graph.define_vertex_type('source') [===Job Progress===] >>> graph.vertices['source'].add_vertices(vertex_frame, 'source', 'label') [===Job Progress===] >>> graph.define_edge_type('edges','source', 'source', directed=False) [===Job Progress===] >>> graph.edges['edges'].add_edges(edge_frame, 'source', 'dest', ['weight']) [===Job Progress===] >>> result = graph.graphx_connected_components() [===Job Progress===] >>> result['source'].inspect() [#] _vid _label source label connectedComponentId ====================================================== [0] 5 source 5 5.0 1 [1] 1 source 1 1.0 1 [2] 2 source 2 1.0 1 [3] 3 source 3 5.0 1 [4] 4 source 4 5.0 1
>>> graph.edges['edges'].inspect() [#] _eid _src_vid _dest_vid _label weight ============================================== [0] 6 1 2 edges 1 [1] 7 1 3 edges 1 [2] 9 1 4 edges 1 [3] 8 2 3 edges 1 [4] 10 4 5 edges 1