Backends for browsing aggregates of various data sources
SQL backend uses SQLAlchemy for generating queries. It supports all databases that the SQLAlchemy supports such as:
SnowflakeBrowser is a SQL-based AggregationBrowser implementation that can aggregate star and snowflake schemas without need of having explicit view or physical denormalized table.
Attributes:
Tuning:
Limitations:
Returns a built-in function for aggregate
Execute the statement, optionally log it. Returns the result cursor.
Get a single fact with key key_value from cube.
Number of SQL queries: 1.
Return all facts from cell, might be ordered and paginated.
Number of SQL queries: 1.
Return SQL features. Currently they are all the same for every cube, however in the future they might depend on the SQL engine or other factors.
Returns details for path in dimension. Can be used for multi-dimensional “breadcrumbs” in a used interface.
Number of SQL queries: 1.
Return aggregated result.
Arguments:
Query tuning:
Result is paginated by page_size and ordered by order.
Number of database queries:
Notes:
Return values for dimension with level depth depth. If depth is None, all levels are returned.
Number of database queries: 1.
Change the browser’s locale
Tests whether the statement can be constructed.
Validate physical representation of model. Returns a list of dictionaries with keys: type, issue, object.
Types might be: join or attribute.
The join issues are:
The attribute issues are:
Creates a SQL query statement builder object – a controller-like object that incrementally constructs the statement.
Result attributes:
Returns an expression that performs the aggregation of measure aggregate. The result’s label is the aggregate’s name. aggregate has to be MeasureAggregate instance.
If aggregate function is post-aggregation calculation, then None is returned.
Aggregation function names are case in-sensitive.
If coalesce_measure is True then selected measure column is wrapped in COALESCE(column, 0).
Builds a statement to aggregate the cell.
Algorithm description:
All the tables have one of the two relationship to the fact: master/match or detail. Every table connected to a table that has “detail” relationship is considered also in the “detail” relationship towards the fact. Therefore we have two join zones: all master or detail tables from the core, directly connected to the fact table and rest of the table connected to the core through outer detail relationship.
Depending on the query it is decided whether we are fine with just joining everything together into single join or we need to separate the fact master core from the outer details:
+------+ +-----+
| fact |--(match)--| dim +
+------+ +-----+
Master Fact |
===============|========================
Outer Details | +-----+
+------(detail)-| dim |
+-----+
The outer details part is RIGHT OUTER JOINed to the fact. Since there are no tables any more, the original table keys for joins to the outer details were exposed and specially labeled as __masterkeyXX where XX is a sequence number of the key. The join_expression JOIN constructing method receives the map of the keys and replaces the original tables with connections to the columns already selected in the master fact.
Note
Limitation: we can not have a Cut (condition) where keys (path elements) are from both join zones. Whole cut should be within one zone: either the master fact or outer details.
Appends condition to the generated statement.
Returns a set of attributes included in the cell.
Returns a list of tuples (cut, attributes) where attributes is list of attributes involved in the cut.
Returns list of expressions for aggregates from aggregates that are computed using the SQL statement.
Returns either a physical column for the attribute or a reference to a column from the master fact if it exists.
Returns a SQL condition for the cell.
Returns a Condition tuple (attributes, conditions, group_by) dimension dim point at path. It is a compound condition - one equality condition for each path element in form: level[i].key = path[i]
Constructs conditions for all cuts in the cell. Returns a list of SQL conditional expressions.
Builds a statement for denormalized view. whereclause is same as SQLAlchemy whereclause for sqlalchemy.sql.expression.select(). attributes is list of logical references to attributes to be selected. If it is None then all attributes are used. condition_attributes contains list of attributes that are not going to be selected, but are required for WHERE condition.
Set expand_locales to True to expand all localized attributes.
Selects only fact with given id
Returns a column that represents the fact key.
Returns a list of measures for aggregates. This method is used in constructing the master fact.
Prepares dimension members statement.
Returns a SQL statement which is ordered according to the order. If the statement contains attributes that have natural order specified, then the natural order is used, if not overriden in the order.
order sohuld be prepared using AggregationBrowser.prepare_order().
dimension_levels is list of considered dimension levels in form of tuples (dimension, hierarchy, levels). For each level it’s sort key is used.
Returns paginated statement if page is provided, otherwise returns the same statement.
Return a condition for a hierarchical range (from_path, to_path). Return value is a Condition tuple.
Returns an attribute from a semi-additive dimension, if defined for the cube. Cubes allows one semi-additive dimension.
A snowflake schema mapper for a cube. The mapper creates required joins, resolves table names and maps logical references to tables and respective columns.
Attributes:
mappings is a dictionary where keys are logical attribute references and values are table column references. The keys are mostly in the form:
The values might be specified as strings in the form table.column (covering most of the cases) or as a dictionary with keys schema, table and column for more customized references.
Returns physical reference as tuple for attribute, which should be an instance of cubes.model.Attribute. If there is no dimension specified in attribute, then fact table is assumed. The returned tuple has structure: (schema, table, column).
The algorithm to find physicl reference is as follows:
IF localization is requested:
IF is attribute is localizable:
IF requested locale is one of attribute locales
USE requested locale
ELSE
USE default attribute locale
ELSE
do not localize
IF mappings exist:
GET string for logical reference
IF locale:
append '.' and locale to the logical reference
IF mapping value exists for localized logical reference
USE value as reference
IF no mappings OR no mapping was found:
column name is attribute name
IF locale:
append '_' and locale to the column name
IF dimension specified:
# Example: 'date.year' -> 'date.year'
table name is dimension name
IF there is dimension table prefix
use the prefix for table name
ELSE (if no dimension is specified):
# Example: 'date' -> 'fact.date'
table name is fact table name
Convert attributes to physical attributes. If expand_locales is True then physical reference for every attribute locale is returned.
Get relevant joins to the attributes - list of joins that are required to be able to acces specified attributes. attributes is a list of three element tuples: (schema, table, attribute).
Return list of references to all tables. Keys are aliased tables: (schema, aliased_table_name) and values are real tables: (schema, table_name). Included is the fact table and all tables mentioned in joins.
To get list of all physical tables where aliased tablesare included only once:
finder = JoinFinder(cube, joins, fact_name)
tables = set(finder.table_map().keys())
Returns a list of tables – tuples (schema, table) that contain attributes.
Browser for another Slicer server.
Creates a Mixpanel aggregation browser.
Requirements and limitations:
Return SQL features. Currently they are all the same for every cube, however in the future they might depend on the SQL engine or other factors.