Commands graph/graphx_pagerank¶
Determine which vertices are the most important.
POST /v1/commands/¶
GET /v1/commands/:id¶
Request¶
Route
POST /v1/commands/
Body
name: | graph/graphx_pagerank |
---|---|
arguments: | graph : Graph
output_property : unicode
input_edge_labels : list (default=None)
max_iterations : int32 (default=None)
reset_probability : float64 (default=None)
convergence_tolerance : float64 (default=None)
|
Headers
Authorization: test_api_key_1
Content-type: application/json
Description
Pulls graph from underlying store, sends it off to the PageRankRunner, and then writes the output graph back to the underlying store.
** Experimental Feature **
Basics and Background
PageRank is a method for determining which vertices in a directed graph are the most central or important. PageRank gives each vertex a score which can be interpreted as the probability that a person randomly walking along the edges of the graph will visit that vertex.
The calculation of PageRank is based on the supposition that if a vertex has many vertices pointing to it, then it is “important”, and that a vertex grows in importance as more important vertices point to it. The calculation is based only on the network structure of the graph and makes no use of any side data, properties, user-provided scores or similar non-topological information.
PageRank was most famously used as the core of the Google search engine for many years, but as a general measure of centrality in a graph, it has other uses to other problems, such as recommendation systems and analyzing predator-prey food webs to predict extinctions.
Background references
- Basic description and principles: Wikipedia: PageRank
- Applications to food web analysis: Stanford: Applications of PageRank
- Applications to recommendation systems: PLoS: Computational Biology
Mathematical Details of PageRank Implementation
The Trusted Analytics Platform implementation of PageRank satisfies the following equation at each vertex of the graph:
- Where:
- — a vertex— outbound degree of the vertex v— PageRank score of the vertex v— set of vertices pointing to the vertex v— total number of vertices in the graph— user specified damping factor (also known as reset probability)
Termination is guaranteed by two mechanisms.
- The user can specify a convergence threshold so that the algorithm will terminate when, at every vertex, the difference between successive approximations to the PageRank score falls below the convergence threshold.
- The user can specify a maximum number of iterations after which the algorithm will terminate.
Response¶
Status
200 OK
Body
Returns information about the command. See the Response Body for Get Command here below. It is the same.
GET /v1/commands/:id¶
Request¶
Route
GET /v1/commands/18
Body
(None)
Headers
Authorization: test_api_key_1
Content-type: application/json
Response¶
Status
200 OK
Body
dict
dict((vertex_dictionary, (label, Frame)), (edge_dictionary,(label,Frame))).
Dictionary containing dictionaries of labeled vertices and labeled edges.
For the vertex_dictionary the vertex type is the key and the corresponding vertex’s frame with a new column storing the page rank value for the vertex. Call vertex_dictionary[‘label’] to get the handle to frame whose vertex type is label.
For the edge_dictionary the edge type is the key and the corresponding edge’s frame with a new column storing the page rank value for the edge. Call edge_dictionary[‘label’] to get the handle to frame whose edge type is label.