Define Models

A Basic Definition

Every model inherits from BaseModel, and needs at least a hash key:

>>> from bloop import BaseModel, Column, UUID

>>> class User(BaseModel):
...     id = Column(UUID, hash_key=True)
...
>>> User
<Model[User]>
>>> User.id
<Column[User.id=hash]>

Let's add some columns, a range key, and a GSI:

>>> from bloop import (
...     BaseModel, Boolean, Column, DateTime,
...     GlobalSecondaryIndex, String, UUID)
...
>>> class User(BaseModel):
...     id = Column(UUID, hash_key=True)
...     version = Column(String, range_key=True)
...     email = Column(String)
...     created_on = Column(DateTime)
...     verified = Column(Boolean)
...     profile = Column(String)
...     by_email = GlobalSecondaryIndex(projection="keys", hash_key="email")
...
>>> User
<Model[User]>
>>> User.by_email
<GSI[User.by_email=keys]>

Then create the table in DynamoDB:

>>> from bloop import Engine
>>> engine = Engine()
>>> engine.bind(User)

Hint

Alternatively, we could have called engine.bind(BaseModel) to bind all non-abstract models that subclass BaseModel. If any model doesn't match its backing table, TableMismatch is raised.

Note

Models must be hashable. If you implement __eq__ without __hash__, Bloop will inject the first hash method it finds by walking the model's class.mro().
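
The Python rule behind this note can be checked without Bloop. The sketch below is plain Python and is not Bloop's implementation: find_hash is a hypothetical helper that walks the class's __mro__ and returns the first usable __hash__ it finds.

```python
class Base:
    def __hash__(self):
        return 42


class Child(Base):
    # Defining __eq__ without __hash__ makes Python set Child.__hash__ to None.
    def __eq__(self, other):
        return isinstance(other, Child)


def find_hash(cls):
    """Return the first non-None __hash__ found while walking cls.__mro__."""
    for klass in cls.__mro__:
        hash_method = klass.__dict__.get("__hash__")
        if hash_method is not None:
            return hash_method
    raise TypeError(f"{cls.__name__} has no hash method")


unhashable = Child.__hash__ is None   # Child lost its inherited hash
inherited = find_hash(Child)          # falls through to Base.__hash__
```

Here the lookup skips Child, whose __hash__ was set to None, and lands on the method defined by Base.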

Creating Instances

The default __init__ takes **kwargs and applies them by each column's model name:

>>> import arrow, uuid
>>> user = User(
...     id=uuid.uuid4(),
...     version="1",
...     email="user@domain.com",
...     created_on=arrow.now())
>>> user.email
'user@domain.com'
>>> user
User(created_on=<Arrow [2016-10-29T22:08:08.930137-07:00]>, ...)
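
As a rough mental model of a keyword-based __init__, the following plain-Python sketch applies each keyword argument to the attribute with the same name. SketchUser and its column_names tuple are hypothetical stand-ins, not Bloop classes, and silently skipping unknown keywords is an assumption of this sketch rather than documented Bloop behavior.

```python
class SketchUser:
    # Hypothetical stand-in for the model names of the User columns.
    column_names = ("id", "version", "email", "created_on", "verified", "profile")

    def __init__(self, **kwargs):
        # Set only the attributes whose model name was passed in;
        # columns without a value are simply left unset.
        for name in self.column_names:
            if name in kwargs:
                setattr(self, name, kwargs[name])


user = SketchUser(version="1", email="user@domain.com")
has_id = hasattr(user, "id")  # no value was given for the hash key
```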

A local object's hash and range keys don't need values until you're ready to interact with DynamoDB:

>>> user = User(email="u@d.com", version="1")
>>> engine.save(user)
MissingKey: User(email='u@d.com') is missing hash_key: 'id'
>>> user.id = uuid.uuid4()
>>> engine.save(user)

Metadata

Table Configuration

You can provide an inner Meta class to configure the model's DynamoDB table:

>>> from bloop import BaseModel, Column, Integer

>>> class Tweet(BaseModel):
...     class Meta:
...         table_name = "custom-table-name"
...         read_units = 200
...     user = Column(Integer, hash_key=True)
...
>>> Tweet.Meta.read_units
200
>>> Tweet.Meta.keys
{<Column[Tweet.user=hash]>}
>>> Tweet.Meta.indexes
set()

Table configuration defaults are:

class Meta:
    abstract = False
    table_name = __name__  # model class name
    read_units = 1
    write_units = 1
    stream = None

If abstract is true, no backing table will be created in DynamoDB. Instances of abstract models can't be saved or loaded. Currently, abstract models and inheritance don't mix. In the future, abstract models may be usable as mixins.

The default table_name is simply the model's __name__. Overriding it is useful for mapping a model to an existing table, or mapping multiple models to the same table:

class Employee(BaseModel):
    class Meta:
        table_name = "employees-uk"
    ...

Default read_units and write_units are 1. These do not include provisioned throughput for any GlobalSecondaryIndex; each GSI has its own read_units and write_units.

Finally, stream can be used to enable DynamoDB Streams on the table. Streaming is not enabled by default, so stream is None. To enable a stream with both new and old images, use:

class Meta:
    stream = {
        "include": ["new", "old"]
    }

See the Streams section of the user guide to get started. Streams are awesome.

Model Introspection

When a new model is created, a number of attributes are computed and stored in Meta. These can be used to generalize conditions for any model, or find columns by their name in DynamoDB.

These top-level properties can be used to describe the model in broad terms:

  • model -- The model this Meta is attached to
  • columns -- The set of all columns in the model
  • keys -- The set of all table keys in the model (hash key, or hash and range keys)
  • indexes -- The set of all indexes (gsis, lsis) in the model

Additional properties break down the broad categories, such as splitting indexes into gsis and lsis:

  • hash_key -- The table hash key
  • range_key -- The table range key or None
  • gsis -- The set of all GlobalSecondaryIndex in the model
  • lsis -- The set of all LocalSecondaryIndex in the model
  • projection -- A pseudo-projection for the table, providing API parity with an Index

Here's the User model we just defined:

>>> User.Meta.hash_key
<Column[User.id=hash]>
>>> User.Meta.gsis
{<GSI[User.by_email=keys]>}
>>> User.Meta.keys
{<Column[User.version=range]>,
 <Column[User.id=hash]>}
>>> User.Meta.columns
{<Column[User.created_on]>,
 <Column[User.profile]>,
 <Column[User.verified]>,
 <Column[User.id=hash]>,
 <Column[User.version=range]>,
 <Column[User.email]>}

Using Generic Models

A common pattern involves saving an item only if it doesn't exist. Instead of creating a specific condition for every model, we can use keys to make a function for any model:

from bloop import Condition

def if_not_exist(obj):
    condition = Condition()
    for key in obj.Meta.keys:
        condition &= key.is_(None)
    return condition

Now, saving only when an object doesn't exist is as simple as:

engine.save(some_obj, condition=if_not_exist(some_obj))

(This is also available in the patterns section of the user guide.)

Columns

Every Column must have a Type that is used to load and dump values to and from DynamoDB. The typedef argument can be a type class, or a type instance. When you provide a class, the Column will create an instance by calling the constructor without args. This is a convenience for common types that do not require much configuration. The following are functionally equivalent:

Column(Integer)
Column(Integer())

Some types require an argument, such as Set. Sets must have an inner type so they can map to a string set, number set, or binary set. For example:

# FAILS: Set must have a type
Column(Set)

# GOOD: Set will instantiate the inner type
Column(Set(Integer))
Column(Set(Integer()))

To make a column the model's hash or range key, use hash_key=True or range_key=True. The usual rules apply: a column can't be both, there can't be more than one of each, and there must be a hash key.

class Impression(BaseModel):
    referrer = Column(String, hash_key=True)
    version = Column(Integer, range_key=True)
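
The key rules above can be written down as a small check. This is a plain-Python sketch, not Bloop's validation code: validate_keys and the (name, is_hash, is_range) tuples are hypothetical, and the ValueError is a stand-in for whatever exception a real library raises.

```python
def validate_keys(columns):
    """Check key rules for an iterable of (name, is_hash, is_range) tuples.

    Returns (hash_key_name, range_key_name_or_None).
    """
    columns = list(columns)
    hash_keys = [name for name, is_hash, _ in columns if is_hash]
    range_keys = [name for name, _, is_range in columns if is_range]

    both = set(hash_keys) & set(range_keys)
    if both:
        raise ValueError(f"column can't be both hash and range key: {sorted(both)}")
    if len(hash_keys) != 1:
        raise ValueError(f"expected exactly one hash key, found {len(hash_keys)}")
    if len(range_keys) > 1:
        raise ValueError(f"expected at most one range key, found {len(range_keys)}")
    return hash_keys[0], (range_keys[0] if range_keys else None)


# The Impression model: referrer is the hash key, version is the range key.
impression_keys = validate_keys([
    ("referrer", True, False),
    ("version", False, True),
])
```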

By default values will be stored in DynamoDB under the name of the column in the model definition (its model_name). If you want to conserve read and write units, you can use shorter names for attributes in DynamoDB (attribute names are counted against your provisioned throughput). Like the table_name in Meta, the optional name parameter lets you use descriptive model names without binding you to those names in DynamoDB. This is also convenient when mapping an existing table, or multi-model tables where an attribute can be interpreted multiple ways.

The following model is identical to the one just defined, except that each attribute is stored using a short name:

class Impression(BaseModel):
    referrer = Column(String, hash_key=True, name="ref")
    version = Column(Integer, range_key=True, name="v")

Locally, the model names "referrer" and "version" are still used. An instance would be constructed as usual:

>>> click = Impression(
...     referrer="google.com",
...     version=get_current_version())
>>> engine.save(click)
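
To make the split between model names and DynamoDB names concrete, here is a plain-Python sketch of the renaming step. DYNAMO_NAMES and to_dynamo_names are hypothetical; Bloop performs this translation for you, and real values also pass through each column's type before they are stored.

```python
# Model name -> attribute name stored in DynamoDB, taken from the
# name= arguments in the Impression model above.
DYNAMO_NAMES = {"referrer": "ref", "version": "v"}


def to_dynamo_names(values):
    """Rename keys from model names to their DynamoDB attribute names."""
    return {DYNAMO_NAMES.get(name, name): value for name, value in values.items()}


stored = to_dynamo_names({"referrer": "google.com", "version": 3})
# stored == {"ref": "google.com", "v": 3}
```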

Indexes

Indexes provide additional ways to query and scan your data. If you have not used indexes, you should first read the Developer's Guide on Improving Data Access with Secondary Indexes.

GlobalSecondaryIndex

Every GlobalSecondaryIndex must declare a projection, which describes the columns projected into the index. Only projected columns are loaded from queries and scans on the index, and non-projected columns can't be used in filter expressions. A projection can be "all" for all columns in the model; "keys" for the hash and range columns of the model and the index; or a list of Column objects or their model names. If you specify a list of columns, key columns will always be included.

class HeavilyIndexed(BaseModel):
    ...
    by_email = GlobalSecondaryIndex("all", hash_key="email")
    by_username = GlobalSecondaryIndex("keys", hash_key="username")
    by_create_date = GlobalSecondaryIndex(
        ["email", "username"], hash_key="created_on")
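
The three projection forms can be sketched as a function that returns the set of projected column names. This is an illustration in plain Python, not Bloop's code: resolve_projection is a hypothetical helper, and the column set below is made up for the example because the model body above is elided.

```python
def resolve_projection(projection, all_columns, table_keys, index_keys):
    """Return the set of column model names projected into an index."""
    keys = set(table_keys) | set(index_keys)
    if projection == "all":
        return set(all_columns)
    if projection == "keys":
        return keys
    # A list of names: key columns are always included.
    return set(projection) | keys


# Hypothetical columns; "id" is assumed to be the table's hash key.
columns = {"id", "email", "username", "created_on", "profile"}

by_create_date = resolve_projection(
    ["email", "username"], columns,
    table_keys={"id"}, index_keys={"created_on"})
by_username = resolve_projection(
    "keys", columns, table_keys={"id"}, index_keys={"username"})
```

With these inputs, by_create_date contains id and created_on in addition to the two listed columns, while by_username contains only id and username.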

A GlobalSecondaryIndex must have a hash_key, and can optionally have a range_key. Each can be either the model_name of a column, or the column object itself:

class Impression(BaseModel):
    id = Column(UUID, hash_key=True)
    referrer = Column(String)
    version = Column(Integer)
    created_on = Column(DateTime)

    by_referrer = GlobalSecondaryIndex("all", hash_key=referrer)
    by_version = GlobalSecondaryIndex("keys", hash_key="version")

Unlike LocalSecondaryIndex, a GSI does not share its throughput with the table. You can specify the read_units and write_units of the GSI. Both default to 1:

GlobalSecondaryIndex("all", hash_key=version, read_units=500, write_units=20)
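
Because each GSI is provisioned separately, the capacity you provision in total is the table's units plus the units of every GSI. A minimal arithmetic sketch, with total_provisioned as a hypothetical helper:

```python
def total_provisioned(table_units, gsi_units):
    """Sum (read, write) units across a table and its GSIs.

    table_units is a (read, write) pair; gsi_units is a list of such pairs.
    """
    reads = table_units[0] + sum(read for read, _ in gsi_units)
    writes = table_units[1] + sum(write for _, write in gsi_units)
    return reads, writes


# Table at the default 1/1, plus one GSI at 500 reads and 20 writes.
total = total_provisioned((1, 1), [(500, 20)])
# total == (501, 21)
```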

As with Column, you can provide a name for the GSI in DynamoDB. This can be used to map to an existing index while still using a Pythonic model name locally:

class Impression(BaseModel):
    ...
    by_email = GlobalSecondaryIndex("keys", hash_key=email, name="index_email")

See also

Global Secondary Indexes in the DynamoDB Developer Guide

LocalSecondaryIndex

LocalSecondaryIndex is similar to GlobalSecondaryIndex in its use, but has different requirements. LSIs always have the same hash key as the model, and it can't be changed. The model must have a range key, and the LSI must specify a range_key:

LocalSecondaryIndex("all", range_key=created_on)

You can specify a name to use in DynamoDB, just like Column and GSI:

class Impression(BaseModel):
    url = Column(String, hash_key=True)
    user_agent = Column(String, range_key=True, name="ua")
    visited_at = Column(DateTime, name="at")

    by_date = LocalSecondaryIndex(
        "keys", range_key=visited_at, name="index_date")

The final optional parameter is strict, which defaults to True. This controls whether DynamoDB may incur additional reads on the table when querying the LSI for columns outside the projection. With strict=True, Bloop checks the key, filter, and projection conditions against the index's projected columns and raises an exception if it finds any non-projected columns.

It is recommended that you leave strict=True, to prevent accidentally consuming twice as many read units with an errant projection or filter condition. Since this is local to Bloop and not part of the index definition in DynamoDB, you can always disable and re-enable it in the future.
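
The strict check amounts to a set difference between the columns a query touches and the columns the index projects. The sketch below is plain Python under that assumption: check_strict is a hypothetical helper, ValueError stands in for the exception Bloop raises, and referrer is a made-up column that is not part of the projection.

```python
def check_strict(requested, projected, strict=True):
    """Return the requested columns that fall outside the projection.

    With strict=True, any such column raises instead of being returned.
    """
    extra = set(requested) - set(projected)
    if strict and extra:
        raise ValueError(f"not projected into the index: {sorted(extra)}")
    return extra


# A "keys" projection of by_date: the table keys plus the index range key.
projected = {"url", "user_agent", "visited_at"}

allowed = check_strict({"url", "visited_at"}, projected)
needs_table_read = check_strict({"url", "referrer"}, projected, strict=False)
```

With strict=False the second call succeeds, and the returned column is the one that would cost an additional read on the table.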

See also

Local Secondary Indexes in the DynamoDB Developer Guide