Dark v0.6.0 documentation

Shaping the data

The library is built mainly around the function cast(), whose output can be used, for example, to build HTML tables. For the interactive shell there is a wrapper function, cast_cons(), which passes its arguments to cast() and prints the result as an ASCII table.

dark.shaping.cast(basic_query, factor_names=None, pivot_factors=None, *aggregates)

Creates a table summarizing the data grouped by the given factors and calculates aggregated values for each group. If no aggregate is given, the items are counted. Pivoting (i.e. using factor levels as columns) is also supported.

The name “cast” stands for “casting molten data” and refers to Hadley Wickham’s reshape package for the R language, though internally the two packages have little in common.

Parameters:
  • basic_query – a Query instance (pre-filtered or not) on which the table is going to be built.
  • factor_names – optional list of keys by which the data will be grouped. Their names go into the table heading, and their values define the groups for which aggregated values are calculated. If more than one factor is specified, they are grouped hierarchically from left to right.
  • pivot_factors – optional list of keys whose values go into the table heading along with the factor names, so that an extra column of aggregated values is added for each possible factor level (key value).
  • aggregates – optional list of Aggregate instances. Some aggregates require a factor name (i.e. a key). Examples: Count(), Sum('price'). Aggregates are calculated for each combination of factors and for each pivoted factor level. If no aggregates are specified, a Count instance is used.
Returns:

a list of lists, i.e. a table.

See tests for usage examples.
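
The grouping and pivoting described above can be illustrated without the library. The following is only a minimal sketch of the idea, not Dark's implementation: cast_sketch, its aggregate and label arguments, and the sample data are hypothetical, and plain dictionaries stand in for a Query.

```python
def cast_sketch(rows, factor_names=(), pivot_factors=(), aggregate=len, label="count"):
    """Hypothetical stand-in for cast(): summarizes a list of plain dicts.

    `aggregate` receives the list of rows in one cell; the default, len,
    simply counts them.
    """
    factor_names = list(factor_names)
    pivot_factors = list(pivot_factors)

    # Bucket rows by (row factors, pivot level) and collect the pivot levels.
    buckets = {}
    pivot_levels = set()
    for row in rows:
        row_key = tuple(row[k] for k in factor_names)
        col_key = tuple(row[k] for k in pivot_factors)
        pivot_levels.add(col_key)
        buckets.setdefault(row_key, {}).setdefault(col_key, []).append(row)

    # Each pivot level becomes a column; without pivoting there is one column.
    pivot_levels = sorted(pivot_levels)
    if pivot_factors:
        columns = ["/".join(str(v) for v in level) for level in pivot_levels]
    else:
        columns = [label]

    # The heading comes first, then one line per combination of factor values.
    table = [factor_names + columns]
    for row_key in sorted(buckets):
        cells = [aggregate(buckets[row_key].get(level, [])) for level in pivot_levels]
        table.append(list(row_key) + cells)
    return table


people = [
    {"city": "Oslo", "gender": "f", "age": 30},
    {"city": "Oslo", "gender": "m", "age": 41},
    {"city": "Oslo", "gender": "f", "age": 25},
    {"city": "Riga", "gender": "m", "age": 37},
]

by_city = cast_sketch(people, ["city"])
# [['city', 'count'], ['Oslo', 3], ['Riga', 1]]

nested = cast_sketch(people, ["city", "gender"])
# [['city', 'gender', 'count'], ['Oslo', 'f', 2], ['Oslo', 'm', 1], ['Riga', 'm', 1]]

pivoted = cast_sketch(people, ["city"], ["gender"])
# [['city', 'f', 'm'], ['Oslo', 2, 1], ['Riga', 0, 1]]

age_sum = cast_sketch(people, ["city"],
                      aggregate=lambda rs: sum(r["age"] for r in rs),
                      label="sum(age)")
# [['city', 'sum(age)'], ['Oslo', 96], ['Riga', 37]]
```

The result has the same shape as the documented return value: a list of lists whose first item is the heading. How the real cast() orders rows and labels pivoted columns is not specified here, so the sorting and the "/" separator are choices made for this sketch only.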

dark.shaping.cast_cons(*args, **kwargs)

Wrapper for the cast() function for use from the console. Prints a simplified table using ASCII art.
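
To show what rendering a list of lists as an ASCII table can look like, here is a small sketch. The function name format_table and the exact border style are assumptions made for illustration; the layout printed by cast_cons() may differ.

```python
def format_table(table):
    """Render a list of lists (first row is the heading) as ASCII art.

    Assumes the table has at least a heading row and that every row
    has the same number of cells.
    """
    cells = [[str(c) for c in row] for row in table]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in cells) for i in range(len(cells[0]))]
    rule = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def line(row):
        return "|" + "|".join(" " + c.ljust(w) + " " for c, w in zip(row, widths)) + "|"

    out = [rule, line(cells[0]), rule]
    out.extend(line(row) for row in cells[1:])
    out.append(rule)
    return "\n".join(out)


table = [["city", "count"], ["Oslo", 3], ["Riga", 1]]
rendered = format_table(table)
print(rendered)
```

Returning a string and printing it separately keeps the formatting step easy to test; a console wrapper would only need to call print on the result.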

dark.shaping.stdev(query, key)

Prints the standard deviation for the given key in the given query.
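
As a point of reference, the same quantity can be computed over plain dictionaries with Python's standard statistics module. This is a sketch under assumptions: the sample data is made up, and the text above does not say whether stdev() uses the sample or the population formula, so both are shown.

```python
import statistics

# Hypothetical records; rows without the key are skipped.
rows = [{"price": p} for p in (2, 4, 4, 4, 5, 5, 7, 9)] + [{"name": "no price"}]
values = [row["price"] for row in rows if "price" in row]

# Sample standard deviation (divides by n - 1).
sample_sd = statistics.stdev(values)

# Population standard deviation (divides by n).
population_sd = statistics.pstdev(values)

print(sample_sd, population_sd)
```

For these values the population figure is 2.0, and the sample figure is slightly larger because of the n - 1 divisor.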

dark.shaping.summary(query, key)

Prints a summary for the given key in the given query (see the summary function in the R language).
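
For readers unfamiliar with R, summary applied to a numeric vector reports the minimum, first quartile, median, mean, third quartile and maximum. The sketch below reproduces those six figures with the standard library (Python 3.8 or later). The name summary_sketch and the returned dictionary are hypothetical; which figures Dark's summary() prints, and in what format, is not stated above.

```python
import statistics


def summary_sketch(values):
    """Return the six figures that R's summary() reports for numbers."""
    values = sorted(values)
    # method="inclusive" interpolates between data points, treating the
    # data as including its own minimum and maximum.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "Min.": values[0],
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.mean(values),
        "3rd Qu.": q3,
        "Max.": values[-1],
    }


result = summary_sketch([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in result.items():
    print(name, value)
```

The "inclusive" method is chosen on the assumption that it matches R's default quantile calculation; other quantile definitions give slightly different quartiles on small samples.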
