Frames EdgeFrame

class EdgeFrame

A list of Edges owned by a Graph.

An EdgeFrame is similar to a Frame but with a few important differences:

EdgeFrames are not instantiated directly by the user; instead, they are created by defining an edge type in a graph
Each row of an EdgeFrame represents an edge in a graph
EdgeFrames have many of the same methods as Frames but not all
EdgeFrames have extra methods not found on Frames (e.g. add_edges())
EdgeFrames depend on one or two VertexFrames (adding an edge to an EdgeFrame requires either that its vertices already be present or that the user specify create_missing_vertices=True)
EdgeFrames have special system columns (_eid, _label, _src_vid, _dest_vid) that are maintained automatically by the system and cannot be modified by the user
“Columns” on an EdgeFrame can also be thought of as “properties” on Edges
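
The vertex dependency and the read-only system columns described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: ToyEdgeFrame, set_value and the _ids counter are hypothetical names made up for this illustration, and the real EdgeFrame is server-backed and may behave differently in detail.

```python
import itertools

# Toy, in-memory model of the rules listed above. This is NOT the
# trustedanalytics implementation: ToyEdgeFrame, set_value and _ids are
# hypothetical names invented for this illustration.
SYSTEM_COLUMNS = ("_eid", "_label", "_src_vid", "_dest_vid")
_ids = itertools.count(1)  # stand-in for system-assigned ids


class ToyEdgeFrame(object):
    def __init__(self, label, src_vertices, dest_vertices):
        self.label = label
        self.src_vertices = src_vertices    # dict: user-facing id -> vertex id
        self.dest_vertices = dest_vertices  # dict: user-facing id -> vertex id
        self.rows = []                      # one dict per edge

    @staticmethod
    def _resolve(vertices, key, create_missing_vertices):
        """Return the vertex id for key, creating the vertex only if allowed."""
        if key not in vertices:
            if not create_missing_vertices:
                raise ValueError("no vertex with id %r" % (key,))
            vertices[key] = next(_ids)
        return vertices[key]

    def add_edges(self, source_rows, src_col, dest_col, property_cols=(),
                  create_missing_vertices=False):
        """Turn each source row into one edge row with system columns filled in."""
        for row in source_rows:
            src_vid = self._resolve(self.src_vertices, row[src_col],
                                    create_missing_vertices)
            dest_vid = self._resolve(self.dest_vertices, row[dest_col],
                                     create_missing_vertices)
            edge = {"_eid": next(_ids), "_label": self.label,
                    "_src_vid": src_vid, "_dest_vid": dest_vid}
            for col in property_cols:
                edge[col] = row[col]  # user columns act as edge properties
            self.rows.append(edge)

    def set_value(self, row_index, column, value):
        """Update a property; system columns are read-only."""
        if column in SYSTEM_COLUMNS:
            raise ValueError("%s is a system column" % column)
        self.rows[row_index][column] = value


users = {1: next(_ids)}
movies = {10: next(_ids)}
ratings = ToyEdgeFrame("ratings", users, movies)

# Both endpoints exist, so the edge is accepted.
ratings.add_edges([{"user_id": 1, "movie_id": 10, "rating": "5"}],
                  "user_id", "movie_id", ["rating"])

# User 2 has no vertex yet, so the default call is rejected.
new_row = [{"user_id": 2, "movie_id": 10, "rating": "3"}]
try:
    ratings.add_edges(new_row, "user_id", "movie_id", ["rating"])
    rejected = False
except ValueError:
    rejected = True

# With create_missing_vertices=True the missing vertex is created first.
ratings.add_edges(new_row, "user_id", "movie_id", ["rating"],
                  create_missing_vertices=True)
print(len(ratings.rows), sorted(users))
```

In this model each edge row carries the four system columns alongside the user-supplied "rating" property, which mirrors the "columns as properties" view in the list above.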

Attributes

column_names	Column identifications in the current frame.
last_read_date	Last time this frame’s data was accessed.
name	Set or get the name of the frame object.
row_count	Number of rows in the current frame.
schema	Current frame column names and types.
status	Current frame life cycle status.

Methods

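__init__(self[, graph, label, src_vertex_label, ...])	Initialize an EdgeFrame (normally obtained by defining an edge type in a graph rather than constructed directly).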
add_columns(self, func, schema[, columns_accessed])	Add columns to current frame.
add_edges(self, source_frame, column_name_for_source_vertex_id, ...[, ...])	Add edges to a graph.
assign_sample(self, sample_percentages[, sample_labels, ...])	Randomly group rows into user-defined classes.
bin_column(self, column_name, cutoffs[, include_lowest, strict_binning, ...])	Classify data into user-defined groups.
bin_column_equal_depth(self, column_name[, num_bins, ...])	Classify column into groups with the same frequency.
bin_column_equal_width(self, column_name[, num_bins, ...])	Classify column into same-width groups.
categorical_summary(self, *column_inputs)	[ALPHA] Compute a summary of the data in one or more columns for categorical or numerical data types.
classification_metrics(self, label_column, pred_column[, ...])	Model statistics of accuracy, precision, and others.
column_median(self, data_column[, weights_column])	Calculate the (weighted) median of a column.
column_mode(self, data_column[, weights_column, max_modes_returned])	Evaluate the weights assigned to rows.
column_summary_statistics(self, data_column[, ...])	Calculate multiple statistics for a column.
copy(self[, columns, where, name])	Create new frame from current frame.
correlation(self, data_column_names)	Calculate correlation for two columns of current frame.
correlation_matrix(self, data_column_names[, matrix_name])	Calculate correlation matrix for two or more columns.
count(self, where)	Counts the number of rows which meet given criteria.
covariance(self, data_column_names)	Calculate covariance for exactly two columns.
covariance_matrix(self, data_column_names[, matrix_name])	Calculate covariance matrix for two or more columns.
cumulative_percent(self, sample_col)	[BETA] Add column to frame with cumulative percent sum.
cumulative_sum(self, sample_col)	[BETA] Add column to frame with cumulative sum.
daal_pca(self, column_names[, method])	[ALPHA] <Missing Doc>
dot_product(self, left_column_names, right_column_names, ...[, ...])	[ALPHA] Calculate dot product for each row in current frame.
download(self[, n, offset, columns])	Download frame data from the server into the client workspace as a pandas DataFrame.
drop_columns(self, columns)	Remove columns from the frame.
drop_duplicates(self[, unique_columns])	Modify the current frame, removing duplicate rows.
drop_rows(self, predicate)	Erase any row in the current frame which qualifies.
ecdf(self, column[, result_frame_name])	Builds new frame with columns for data and distribution.
entropy(self, data_column[, weights_column])	Calculate the Shannon entropy of a column.
export_to_csv(self, folder_name[, separator, count, offset])	Write current frame to HDFS in csv format.
export_to_hbase(self, table_name[, key_column_name, family_name])	Write current frame to HBase table.
export_to_hive(self, table_name)	Write current frame to Hive table.
export_to_jdbc(self, table_name[, connector_type, url, driver_name, ...])	Write current frame to JDBC table.
export_to_json(self, folder_name[, count, offset])	Write current frame to HDFS in JSON format.
filter(self, predicate)	Select all rows which satisfy a predicate.
flatten_column(self, column[, delimiter])	[DEPRECATED] Note that flatten_column() has been deprecated. Use flatten_columns() instead.
flatten_columns(self, columns[, delimiters])	Spread data to multiple rows based on cell data.
get_error_frame(self)	Get a frame with error recordings.
group_by(self, group_by_columns, *aggregation_arguments)	[BETA] Create summarized frame.
histogram(self, column_name[, num_bins, weight_column_name, bin_type])	[BETA] Compute the histogram for a column in a frame.
inspect(self[, n, offset, columns, wrap, truncate, round, width, margin, ...])	Pretty-print the frame data.
join(self, right, left_on[, right_on, how, name])	[BETA] Join operation on one or two frames, creating a new frame.
quantiles(self, column_name, quantiles)	New frame with Quantiles and their values.
rename_columns(self, names)	Rename columns for edge frame.
sort(self, columns[, ascending])	[BETA] Sort the data in a frame.
sorted_k(self, k, column_names_and_ascending[, reduce_tree_depth])	[ALPHA] Get a sorted subset of the data.
take(self, n[, offset, columns])	Get data subset.
tally(self, sample_col, count_val)	[BETA] Count number of times a value is seen.
tally_percent(self, sample_col, count_val)	[BETA] Compute a cumulative percent count.
top_k(self, column_name, k[, weights_column])	Most or least frequent column values.
unflatten_column(self, columns[, delimiter])	[DEPRECATED] Note that unflatten_column() has been deprecated. Use unflatten_columns() instead.
unflatten_columns(self, columns[, delimiter])	Compacts data from multiple rows based on cell data.

__init__(self, graph=None, label=None, src_vertex_label=None, dest_vertex_label=None, directed=None)

Parameters:

graph : (default=None)

label : (default=None)

src_vertex_label : (default=None)

dest_vertex_label : (default=None)

directed : (default=None)

Examples

Given a data file /movie.csv, create a frame that matches its schema and load the data into it. Then create an empty graph and define some vertex and edge types.

>>> import trustedanalytics as ta  # assumes the usual client import alias
>>> my_csv = ta.CsvFile("/movie.csv", schema=[('user_id', ta.int32),
...                                           ('user_name', str),
...                                           ('movie_id', ta.int32),
...                                           ('movie_title', str),
...                                           ('rating', str)])

>>> my_frame = ta.Frame(my_csv)
>>> my_graph = ta.Graph()
>>> my_graph.define_vertex_type('users')
>>> my_graph.define_vertex_type('movies')
>>> my_graph.define_edge_type('ratings', 'users', 'movies', directed=True)

Add data to the graph from the frame:

>>> my_graph.vertices['users'].add_vertices(my_frame, 'user_id', ['user_name'])
>>> my_graph.vertices['movies'].add_vertices(my_frame, 'movie_id', ['movie_title'])

Create an edge frame from the graph, and add edge data from the frame.

>>> my_edge_frame = my_graph.edges['ratings']
>>> my_edge_frame.add_edges(my_frame, 'user_id', 'movie_id', ['rating'])

Retrieve a previously defined graph and retrieve an EdgeFrame from it:

>>> my_old_graph = ta.get_graph("your_graph")
>>> my_new_edge_frame = my_old_graph.edges["your_label"]

Calling methods on an EdgeFrame:

>>> my_new_edge_frame.inspect(20)

Copy an EdgeFrame to a frame using the copy method:

>>> my_new_frame = my_new_edge_frame.copy()
