toytable.table module

The Table Class

The Table class (eztable.Table) is the most important component of this package. It is intended to represent a ‘normal’ table, that is, one which has no special columns and has not undergone any transformations (e.g. project, expand).

Table.__init__(schema, data=None)[source]

Every Table object has a schema. In its simplest form, the schema can be nothing more than a list of string column names. Specifying a schema this way will produce a non-typed table, in which any Python type can be stored in any column.

Alternatively, the schema can include type information. Instead of specifying a schema item as a string, use a (column_name, type) tuple, where type is a Python type or class object, for example int or str.

It is expected that most of the values stored in the table will be simple objects or native types such as numbers and strings. However, it is also possible to store any Python object, as long as it is hashable.

Parameters:
  • schema (list) – Column names as a sequence of strings, or of (‘col_name’, type) tuples
  • data (list of lists) – Optional rows of data to initialize the table.
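
The two schema forms described above can be compared with a small standalone sketch. It does not import eztable: normalise_schema is a hypothetical helper, and giving untyped columns the type object is an assumption (borrowed from the object default that expand uses for col_type), not documented behaviour.

```python
def normalise_schema(schema):
    """Turn a schema into a list of (name, type) tuples.

    Hypothetical helper, not part of eztable. A plain string item is
    treated as an untyped column and paired with `object` (assumption).
    """
    normalised = []
    for item in schema:
        if isinstance(item, str):
            normalised.append((item, object))
        else:
            name, col_type = item
            normalised.append((name, col_type))
    return normalised


# A non-typed schema: any value is acceptable in any column.
untyped = normalise_schema(['Pokemon', 'Level'])

# A typed schema: each column carries a type constraint.
typed = normalise_schema([('Pokemon', str), ('Level', int)])
```

Either form ends up as the list of (name, type) tuples that the schema property is described as returning.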
class eztable.Table(schema, data=None)[source]

Bases: object

The basic table class. Table objects can contain any Python data type; however, some features may be unavailable if the types are non-hashable.

add_index(cols)[source]

Create a new index on a set of columns.

Indexes are list-like objects which can be used to speed up access to rows of data. They improve the performance of operations such as joins.

The Table class only holds a weak reference to the index, so the caller must retain a reference to it in order to prevent it from being garbage collected.

Parameters:cols (List of strings) – Column names to be included in the index.
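
The weak-reference behaviour can be illustrated with the standard weakref module. This is a standalone sketch, not eztable's implementation: Index and Owner are hypothetical stand-ins, and the use of a WeakSet is an assumption about how such tracking could be done.

```python
import gc
import weakref


class Index:
    """Stand-in for an index object (hypothetical, not eztable's class)."""

    def __init__(self, cols):
        self.cols = tuple(cols)


class Owner:
    """Stand-in for a table that only tracks its indexes weakly."""

    def __init__(self):
        # A WeakSet does not keep its members alive on its own.
        self._indexes = weakref.WeakSet()

    def add_index(self, cols):
        index = Index(cols)
        self._indexes.add(index)
        return index

    def live_indexes(self):
        return len(self._indexes)


owner = Owner()
kept = owner.add_index(['Pokemon'])  # the caller retains a reference
owner.add_index(['Owner'])           # the result is discarded immediately
gc.collect()                         # make collection explicit
count_while_kept = owner.live_indexes()

del kept                             # drop the last strong reference
gc.collect()
count_after_del = owner.live_indexes()
```

Only the index bound to kept survives the first collection; once that name is deleted, nothing keeps the index alive.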
aggregate(keys, aggregations)[source]

Summarize a table by grouping by one or more keys, and then apply aggregation functions to generate additional summarized columns.

Aggregations are specified as a list of triples in the form: (column name (str), column type (type), column function (callable))

The column function should be a function that returns the type specified in the second element of the triple. Its input will be each of the sub-tables specified by the grouping keys.

>>> from eztable import table_literal
>>> t = table_literal('''
... | Attack(str)   | Pokemon(str) | Level Obtained(int) | Attack Type(str) |
... | Thunder Shock | Pikachu      | 1                   | Electric         |
... | Tackle        | Pikachu      | 1                   | Normal           |
... | Tail Whip     | Pikachu      | 1                   | Normal           |
... | Growl         | Pikachu      | 5                   | Normal           |
... | Quick Attack  | Pikachu      | 10                  | Normal           |
... | Thunder Wave  | Pikachu      | 13                  | Electric         |
... | Electro Ball  | Pikachu      | 18                  | Electric         |
... | Charm         | Pikachu      | 0                   | Fairy            |
... | Sweet Kiss    | Pikachu      | 0                   | Fairy            |
... ''')
>>>
>>> agg = t.aggregate(
...     keys=('Pokemon', 'Attack Type'),
...     aggregations = [
...         ('Count', int, lambda t:len(t))
...     ]
... )
>>>
>>> print(agg)
| Pokemon (str) | Attack Type (str) | Count (int) |
| Pikachu       | Normal            | 4           |
| Pikachu       | Electric          | 3           |
| Pikachu       | Fairy             | 2           |
Parameters:
  • keys (List of strings) – List of column names to group by
  • aggregations (list of tuples) – List of aggregations to calculate
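
The grouping step can also be sketched without eztable, which makes it easier to see how several aggregations are applied to each group. This is a simplified illustration only: aggregate_rows is a hypothetical helper, rows are plain tuples, and each aggregation function receives a list of rows rather than a sub-table.

```python
def aggregate_rows(rows, key_indexes, aggregations):
    """Group rows by the values at key_indexes, then summarise each group.

    `aggregations` uses the triple shape described above:
    (column name, column type, function).
    """
    groups = {}
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        groups.setdefault(key, []).append(row)

    summary = []
    for key, sub_rows in groups.items():
        values = tuple(
            col_type(fn(sub_rows)) for _name, col_type, fn in aggregations
        )
        summary.append(key + values)
    return summary


rows = [
    ('Thunder Shock', 'Pikachu', 1, 'Electric'),
    ('Tackle', 'Pikachu', 1, 'Normal'),
    ('Growl', 'Pikachu', 5, 'Normal'),
    ('Thunder Wave', 'Pikachu', 13, 'Electric'),
]

summary = aggregate_rows(
    rows,
    key_indexes=(1, 3),  # group by Pokemon and Attack Type
    aggregations=[
        ('Count', int, len),
        ('Max Level', int, lambda sub: max(r[2] for r in sub)),
    ],
)
```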
anti_project(*col_names)[source]

Returns a new DerivedTable in which the named columns have been removed.

Unless the new table is materialised, it shares the same data as the table it was made from. Extending or appending to the new table will therefore also modify the table it was derived from.

Ordering of the original columns will be retained, except that the specified columns will no longer be accessible.

Parameters:col_names (List of strings) – List of column names to remove
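
The difference between a derived table that shares data and a materialised copy can be sketched as follows. ColumnView is a hypothetical class written for this illustration; it is not eztable's DerivedTable, and storing columns as a shared dict of lists is an assumption.

```python
class ColumnView:
    """Hypothetical view over shared column lists."""

    def __init__(self, names, columns):
        self.names = list(names)
        self._columns = columns  # shared dict of name -> list, not copied

    def anti_project(self, *col_names):
        # Keep the original ordering and drop only the named columns.
        kept = [n for n in self.names if n not in col_names]
        return ColumnView(kept, self._columns)

    def copy(self):
        # Materialise: take private copies of the visible columns.
        copied = {n: list(self._columns[n]) for n in self.names}
        return ColumnView(self.names, copied)

    def rows(self):
        return list(zip(*(self._columns[n] for n in self.names)))


columns = {'Pokemon': ['Pikachu'], 'Level': [12], 'Owner': ['Ash Ketchum']}
base = ColumnView(['Pokemon', 'Level', 'Owner'], columns)
view = base.anti_project('Level')
snapshot = view.copy()

# Add a row to the shared data after the view and snapshot were made.
columns['Pokemon'].append('Raichu')
columns['Level'].append(23)
columns['Owner'].append('Lt. Surge')

view_rows = view.rows()          # sees the new row
snapshot_rows = snapshot.rows()  # does not
```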
append(row)[source]

Append a single row to this table. The row must match the table’s schema. Typically this means that the row should have the same number of items as the schema has columns. If types were specified, each positional element must also conform to the type of the corresponding schema column.

Parameters:row (List of objects, types must correspond with schema) – A single table row to be added
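
The two conditions above (matching length, matching types) can be sketched as a standalone check. check_row and fits are hypothetical helpers; the exceptions eztable actually raises, and its treatment of values such as None, are not documented here and may differ.

```python
def check_row(schema, row):
    """Raise if `row` does not fit `schema`, a list of (name, type) tuples."""
    if len(row) != len(schema):
        raise ValueError(
            'expected %d items, got %d' % (len(schema), len(row))
        )
    for (name, col_type), value in zip(schema, row):
        if not isinstance(value, col_type):
            raise TypeError(
                'column %r expects %s' % (name, col_type.__name__)
            )


def fits(schema, row):
    """Return True if `row` passes check_row, False otherwise."""
    try:
        check_row(schema, row)
    except (ValueError, TypeError):
        return False
    return True


typed_schema = [('Pokemon', str), ('Level', int)]
untyped_schema = [('Pokemon', object), ('Level', object)]

ok = fits(typed_schema, ['Pikachu', 12])
too_short = fits(typed_schema, ['Pikachu'])
wrong_type = fits(typed_schema, ['Pikachu', 'twelve'])
untyped_ok = fits(untyped_schema, ['Pikachu', 'twelve'])
```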
column_names

Get the table’s column names as a list of strings.

column_types

Get the table’s column types as a list of types.

copy()[source]

Create a ‘materialised’ copy of this table.

This converts all dynamically generated columns into StaticColumn objects.

expand(name, input_columns, fn, col_type=<class 'object'>)[source]

Returns a new DerivedTable in which a new calculated column has been added.

This column’s value is determined by a function and a set of input columns.

Parameters:
  • name (str) – The name of the new derived column.
  • input_columns (list of str) – The input column names.
  • fn (function or lambda) – A function or lambda that calculates the new column’s value from the input columns.
  • col_type (type) – Optional, constrain the value of this column by type.
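
A calculated column can be sketched in plain Python as below. This does not use eztable: the expand function here is a hypothetical stand-in that computes values eagerly, and passing the input column values to fn as positional arguments is an assumption about the calling convention.

```python
def expand(names, rows, name, input_columns, fn, col_type=object):
    """Return (new_names, new_rows) with one calculated column appended."""
    positions = [names.index(col) for col in input_columns]
    new_rows = []
    for row in rows:
        # Feed the selected column values to fn, in the order requested.
        value = fn(*(row[pos] for pos in positions))
        if not isinstance(value, col_type):
            raise TypeError('value is not of type %s' % col_type.__name__)
        new_rows.append(tuple(row) + (value,))
    return list(names) + [name], new_rows


names = ['Pokemon', 'Level']
rows = [('Pikachu', 12), ('Raichu', 23)]

new_names, new_rows = expand(
    names, rows,
    name='Label',
    input_columns=['Pokemon', 'Level'],
    fn=lambda pokemon, level: '%s (L%d)' % (pokemon, level),
    col_type=str,
)
```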

expand_const(name, value, type=<class 'object'>)[source]

Returns a new DerivedTable in which a single column of static data has been added.

Parameters:
  • name (str) – The name of the new column to be added
  • value (object) – The constant value of the new column
  • type (type) – Optional, specify a type constraint for the new column
extend(iterable)[source]

Append all rows in iterable to this table. Each row must conform to this table’s schema.

Parameters:iterable (iterable) – Iterator from which to extract rows
get_row(key)[source]

Get a single row from the table.

Parameters:key (int) – Row index
hash(name, input_columns)[source]

A convenience function that expands the table with a new hash column.

inner_join(keys, other, other_keys=None)[source]

Inner join the other table onto this one and return the resulting table.

Parameters:
  • keys – List of column names which will be matched.
  • other – the other table to join on to this table.
  • other_keys – Optional list of foreign keys
left_join(keys, other, other_keys=None)[source]

Left join the other table onto this one and return the resulting table.

Parameters:
  • keys – List of column names which will be matched.
  • other – the other table to join on to this table.
  • other_keys – Optional list of foreign keys
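
The difference between the two joins can be sketched with plain dictionaries. This illustration makes several assumptions: join_rows is a hypothetical helper, it matches on a single key rather than a list of keys, and it fills unmatched right-hand columns with None in the left join (the usual SQL convention, not confirmed for eztable).

```python
def join_rows(left, right, key, other_key=None, how='inner', fill=None):
    """Join two lists of dict rows on one key column."""
    other_key = key if other_key is None else other_key

    # Build a lookup from key value to matching right-hand rows.
    lookup = {}
    for row in right:
        lookup.setdefault(row[other_key], []).append(row)
    right_only = [c for c in (right[0] if right else {}) if c != other_key]

    joined = []
    for row in left:
        matches = lookup.get(row[key], [])
        if not matches and how == 'left':
            # Keep the unmatched left row, padding the right-hand columns.
            matches = [dict.fromkeys(right_only, fill)]
        for match in matches:
            merged = dict(row)
            for col in right_only:
                merged[col] = match[col]
            joined.append(merged)
    return joined


pokemon = [
    {'Pokemon': 'Pikachu', 'Type': 'Electric'},
    {'Pokemon': 'Bulbasaur', 'Type': 'Grass'},
]
weaknesses = [{'Type': 'Electric', 'Weak To': 'Ground'}]

inner = join_rows(pokemon, weaknesses, 'Type', how='inner')
left = join_rows(pokemon, weaknesses, 'Type', how='left')
```

The inner join drops Bulbasaur because no row in weaknesses matches its type; the left join keeps it with the missing column filled in.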
normalize(normalizations)[source]

Return a version of the table with columns normalized.

Parameters:normalizations – dict mapping column names to their normalized range (typically 1).
Returns:A derived eztable.Table with the normalizations applied.
project(*col_names)[source]

Returns a new DerivedTable in which only the named columns remain in the order specified by col_names.

Parameters:col_names (List of strings) – List of column names to keep
rename(old_names, new_names)[source]

Rename columns in the table. Does not affect the order of columns.

Parameters:
  • old_names – list of column names to rename.
  • new_names – list of the new names to assign to the renamed columns.
restrict(col_names, fn=None)[source]

Return a new DerivedTable object in which all visible rows satisfy some kind of logical constraint given by fn.

Parameters:
  • col_names (list of strings) – List of column names to feed into fn
  • fn (function or lambda) – Should return True for any retained row.
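
A row filter of this kind can be sketched as a generator. This restrict function is a hypothetical stand-in rather than eztable's, and it assumes that fn receives the values of col_names as positional arguments. The behaviour of the real method when fn is None is not covered.

```python
def restrict(names, rows, col_names, fn):
    """Yield only the rows for which fn returns a true value."""
    positions = [names.index(col) for col in col_names]
    for row in rows:
        if fn(*(row[pos] for pos in positions)):
            yield row


names = ['Pokemon', 'Level']
rows = [('Pikachu', 12), ('Bulbasaur', 16), ('Raichu', 23)]

# Keep rows whose Level is above 15; rows are produced lazily.
strong = list(restrict(names, rows, ['Level'], lambda level: level > 15))
```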
schema

Get the table’s schema. This is a list of (name (string), type) tuples.

split()[source]
standardize(standardizations)[source]
to_csv(output_file, dialect='excel', descriptions=False)[source]

Save this table to a file in CSV format (or any dialect variation supported by Python’s CSV library).

Parameters:
  • output_file – A file or file-like object (not a filename)
  • dialect – Any previously registered CSV writer dialect name
  • descriptions – Set to True if you want to include column descriptions rather than column-names
Returns:None
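
The output_file and dialect parameters correspond to concepts in Python’s csv module. The sketch below uses only the standard library to show what a file-like object and a registered dialect name look like; how to_csv uses them internally is not shown in this documentation, so treat this as background rather than as its implementation.

```python
import csv
import io

names = ['Pokemon', 'Level']
rows = [('Pikachu', 12), ('Raichu', 23)]

# Any object with a write() method will do; an open file works the same way.
output_file = io.StringIO()

# 'excel' is registered by default; list_dialects() shows the others.
available = csv.list_dialects()
writer = csv.writer(output_file, dialect='excel')
writer.writerow(names)
writer.writerows(rows)

text = output_file.getvalue()
```

When writing to a real file, the csv documentation recommends opening it with newline='' so that line endings are not translated twice.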

Table Literals

eztable.table_literal(repr_string, default_type=<class 'str'>)[source]

Create an eztable.Table object from a multi-line string expression. The input format is exactly the same as the Table class’s repr format:

>>> from eztable import table_literal
>>> pokedex = table_literal("""
...     | Pokemon (str) | Level (int) | Owner (str) | Type (str) |
...     | Pikachu       | 12          | Ash Ketchum | Electric   |
...     | Bulbasaur     | 16          | Ash Ketchum | Grass      |
...     | Charmander    | 19          | Ash Ketchum | Fire       |
...     | Raichu        | 23          | Lt. Surge   | Electric   |
...     """)
>>>
>>> print(pokedex.column_names)
['Pokemon', 'Level', 'Owner', 'Type']
>>> print(pokedex.column_types)
[<class 'str'>, <class 'int'>, <class 'str'>, <class 'str'>]
>>> print(list(pokedex.Pokemon))
['Pikachu', 'Bulbasaur', 'Charmander', 'Raichu']

Since table literal expressions are strings, a column’s type must be specified explicitly in its header if the values are to be converted to anything else. Columns with no type given are presumed to be strings, unless default_type says otherwise.

The column type can be any importable object or function capable of building the contents of the column from the string values of each element in the column.

Parameters:
  • repr_string (str) – table definition
  • default_type (type) – optional type to apply to columns where no type is given
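
How a literal of this shape could be turned into a schema and typed rows is sketched below. This is a standalone illustration, not eztable's parser: parse_literal and TYPES are hypothetical, and only a few built-in type names are recognised, whereas the real function is described as accepting any importable object.

```python
import re

# Sketch only: map a handful of type names to built-in constructors.
TYPES = {'str': str, 'int': int, 'float': float}


def cells(line):
    """Split '| a | b |' into ['a', 'b']."""
    return [cell.strip() for cell in line.strip('|').split('|')]


def parse_literal(text, default_type=str):
    """Return (schema, rows) parsed from a pipe-delimited table string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]

    schema = []
    for header in cells(lines[0]):
        match = re.match(r'^(.*?)\s*\((\w+)\)$', header)
        if match:
            schema.append((match.group(1), TYPES[match.group(2)]))
        else:
            # No type in the header: fall back to default_type.
            schema.append((header, default_type))

    rows = [
        [col_type(value) for (_, col_type), value in zip(schema, cells(line))]
        for line in lines[1:]
    ]
    return schema, rows


schema, rows = parse_literal("""
    | Pokemon (str) | Level (int) | Owner       |
    | Pikachu       | 12          | Ash Ketchum |
    | Raichu        | 23          | Lt. Surge   |
""")
```

Each column’s type is called on the raw string of every cell, which is why any callable that builds a value from a string can serve as a column type.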