In this tutorial we will demonstrate what goes into creating an application that uses u1db as a backend. We will use code samples from the simple todo list application ‘Cosas’ as our example. The full source code to Cosas can be found in the u1db source tree. It comes with a user interface, but we will only focus on the code that interacts with u1db here.
First we need to define what we’ll actually store in u1db. For a todo list application, it makes sense to have each todo item or task be a single document in the database, so that we can use indexes to find individual tasks with specific properties.
We’ll subclass Document, and define some properties that we think our tasks need to have. There are no schemas in u1db, which means we can always change the structure of the underlying JSON document at a later time. (Though that likely means we will have to migrate older documents for them to still work with the new code.)
Let’s give our Task objects a title, a (boolean) done property, and a list of tags, so that the JSON representation of a task would look something like this:
'{"title": "the task at hand",
"done": false,
"tags": ["urgent", "priority 1", "today"]}'
We can define Task as follows:
import u1db


class Task(u1db.Document):
    """A todo item."""

    def _get_title(self):
        """Get the task title."""
        return self.content.get('title')

    def _set_title(self, title):
        """Set the task title."""
        self.content['title'] = title

    title = property(_get_title, _set_title, doc="Title of the task.")

    def _get_done(self):
        """Get the status of the task."""
        return self.content.get('done', False)

    def _set_done(self, value):
        """Set the done status."""
        self.content['done'] = value

    done = property(_get_done, _set_done, doc="Done flag.")

    def _get_tags(self):
        """Get tags associated with the task."""
        return self.content.setdefault('tags', [])

    def _set_tags(self, tags):
        """Set tags associated with the task."""
        self.content['tags'] = list(set(tags))

    tags = property(_get_tags, _set_tags, doc="Task tags.")
As you can see, Document objects come with a .content property, which is a Python dictionary. This is where we look up or store all data pertaining to the task.
We can now create tasks, set their titles:
>>> example_task = Task()
>>> example_task.title = "Create a Task class."
>>> example_task.title
'Create a Task class.'
their tags:
>>> example_task.tags
[]
>>> example_task.tags = ['development']
>>> example_task.tags
['development']
and their done status:
>>> example_task.done
False
>>> example_task.done = True
>>> example_task.done
True
This is all we need the task object to do: as long as we have a way to store all its data in the .content dictionary, the super class will take care of converting that into JSON so it can be stored in the database.
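To make that round trip concrete, here is a minimal sketch that does not use u1db at all. `SketchDocument` is a hypothetical stand-in that only holds a `content` dictionary, and `to_json` is an assumed helper name; it approximates the conversion with the standard `json` module rather than reproducing what the real super class does.

```python
import json


class SketchDocument(object):
    """Hypothetical stand-in for u1db.Document: it only holds content."""

    def __init__(self, content=None):
        # Copy the dict so callers cannot mutate our state by accident.
        self.content = dict(content or {})

    def to_json(self):
        """Serialize the content dictionary (assumed helper name)."""
        return json.dumps(self.content, sort_keys=True)


doc = SketchDocument()
doc.content['title'] = "Create a Task class."
doc.content['done'] = True
doc.content['tags'] = ['development']

serialized = doc.to_json()
# Parsing the JSON again gives back an equal dictionary.
restored = json.loads(serialized)
```

Anything that `json` can serialize (strings, numbers, booleans, lists and nested dictionaries) survives this round trip unchanged, which is why the properties on Task only ever need to read and write plain values in `.content`.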
For convenience, we can create a function that returns a fresh copy of the content that would make up an empty task:
import copy

EMPTY_TASK = {"title": "", "done": False, "tags": []}
get_empty_task = lambda: copy.deepcopy(EMPTY_TASK)
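The deep copy matters because the template contains a mutable list. The sketch below repeats the two definitions (written as a regular function) so that it runs on its own, and shows that changing one task's tags leaves the template untouched. The names `shallow_source` and `shallow` are only there for illustration.

```python
import copy

EMPTY_TASK = {"title": "", "done": False, "tags": []}


def get_empty_task():
    """Return an independent copy of the empty task template."""
    return copy.deepcopy(EMPTY_TASK)


first = get_empty_task()
first["tags"].append("urgent")
# The template and any later copies are unaffected by the change above.
second = get_empty_task()

# A shallow copy, by contrast, shares the inner list with its source.
shallow_source = {"tags": []}
shallow = dict(shallow_source)
shallow["tags"].append("oops")
```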
Now that we have tasks defined, we will probably want to query the database using their properties. To that end, we will need to use indexes. Let’s define two for now, one to query by tags, and one to query by done status. We’ll define some global constants with the name and the definition of the indexes, which will make them easier to refer to in the rest of the code:
TAGS_INDEX = 'tags'
DONE_INDEX = 'done'
INDEXES = {
    TAGS_INDEX: ['tags'],
    DONE_INDEX: ['bool(done)'],
}
INDEXES is just a regular dictionary, with the names of the indexes as keys and the index definitions, which are lists of expressions, as values. (We chose to use lists since an index can be defined on multiple fields, though both of the indexes defined above only index a single field.)
The tags index will index any document that has a top level field tags and index its value. Our tasks will have a list value under tags which means that u1db will index each task for each of the values in the list in this index. So a task with the following content:
{
"title": "Buy sausages and vimto",
"tags": ["shopping", "food"],
"done": false
}
would be indexed under both "food" and "shopping".
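As a rough mental model only (this is not how u1db implements indexes internally), the expansion of a list value into one index entry per element can be sketched in plain Python. The function `index_entries` and the document id `task-1` are made up for the illustration.

```python
def index_entries(doc_id, content, field):
    """Return sorted (key, doc_id) pairs for one top-level field.

    A list value contributes one entry per element; a missing field
    contributes nothing. Only strings and lists of strings are modelled.
    """
    value = content.get(field)
    if value is None:
        return []
    values = value if isinstance(value, list) else [value]
    return sorted((item, doc_id) for item in values)


task = {
    "title": "Buy sausages and vimto",
    "tags": ["shopping", "food"],
    "done": False,
}
entries = index_entries("task-1", task, "tags")
```

Here `entries` holds one pair for "food" and one for "shopping", both pointing at the same document, which is what lets a query on either tag find the task.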
The done index will index any document that has a boolean value in a top level field with the name done.
We will see how the indexes are actually created and queried below.
To store and retrieve our task objects we’ll need a u1db Database. We can make a little helper function to get a reference to our application’s database, and create it if it doesn’t already exist:
import os

from dirspec.basedir import save_data_path


def get_database():
    """Open the application's database, creating it if necessary."""
    return u1db.open(
        os.path.join(save_data_path("cosas"), "cosas.u1db"), create=True,
        document_factory=Task)
There are a few things to note here: First of all, we use lp:dirspec to handle where to find or put the database in a way that works across platforms. This is not something specific to u1db, so you could choose to use it for your own application or not: u1db.open() will happily take any filesystem path. Secondly, we pass our Task class into the document_factory argument of u1db.open(). This means that any time we get documents from the database, it will return Task objects, so we don’t have to do the conversion in our code.
Now we create a TodoStore class that will handle all interactions with the database:
class TodoStore(object):
    """The todo application backend."""

    def __init__(self, db):
        self.db = db

    def initialize_db(self):
        """Initialize the database."""
        # Ask the database for currently existing indexes.
        db_indexes = dict(self.db.list_indexes())
        # Loop through the indexes we expect to find.
        for name, expression in INDEXES.items():
            if name not in db_indexes:
                # The index does not yet exist.
                self.db.create_index(name, *expression)
                continue
            if expression == db_indexes[name]:
                # The index exists and is up to date.
                continue
            # The index exists but the definition is not what we expect, so
            # we delete it and add the proper index expression.
            self.db.delete_index(name)
            self.db.create_index(name, *expression)
The initialize_db() method checks whether the database already has the indexes we defined above and if it doesn’t or if the definition is different than the one we have, the index is (re)created. We will call this method every time we start the application, to make sure all the indexes are up to date. Creating an index is a matter of calling create_index() with a name and the expressions that define the index. This will immediately index all documents already in the database, and afterwards any that are added or updated.
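To see the three cases (missing, up to date, out of date) in action without a real database, the sketch below runs the same reconciliation logic against `FakeDatabase`, a hypothetical in-memory stand-in that only remembers index definitions and records which calls were made. It mimics just the three methods used above and nothing else of the u1db API.

```python
class FakeDatabase(object):
    """Hypothetical stand-in that only tracks index definitions."""

    def __init__(self, indexes=None):
        self._indexes = dict(indexes or {})
        self.calls = []

    def list_indexes(self):
        return [(name, list(expr)) for name, expr in self._indexes.items()]

    def create_index(self, name, *expressions):
        self.calls.append(('create', name))
        self._indexes[name] = list(expressions)

    def delete_index(self, name):
        self.calls.append(('delete', name))
        del self._indexes[name]


INDEXES = {'tags': ['tags'], 'done': ['bool(done)']}


def initialize_db(db):
    """Same reconciliation logic as TodoStore.initialize_db()."""
    db_indexes = dict(db.list_indexes())
    for name, expression in INDEXES.items():
        if name not in db_indexes:
            db.create_index(name, *expression)
        elif expression != db_indexes[name]:
            db.delete_index(name)
            db.create_index(name, *expression)


# 'tags' is current, 'done' is stale (defined without the bool() wrapper).
db = FakeDatabase({'tags': ['tags'], 'done': ['done']})
initialize_db(db)
first_run = list(db.calls)
initialize_db(db)  # A second run finds nothing left to change.
```

Because the second run makes no further calls, it is safe to run the initialization on every application start.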
    def get_all_tags(self):
        """Get all tags in use in the entire database."""
        return [key[0] for key in self.db.get_index_keys(TAGS_INDEX)]
The get_index_keys() method gets a list of all indexed values from an index. In this case it will give us a list of all tags that have been used in the database, which can be useful if we want to present them in the user interface of our application.
    def get_tasks_by_tags(self, tags):
        """Get all tasks that have every tag in tags."""
        if not tags:
            # No tags specified, so return all tasks.
            return self.get_all_tasks()
        # Get all tasks for the first tag.
        results = dict(
            (doc.doc_id, doc) for doc in
            self.db.get_from_index(TAGS_INDEX, tags[0]))
        # Now loop over the rest of the tags (if any) and remove from the
        # results any document that does not have that particular tag.
        for tag in tags[1:]:
            # Get the ids of all documents with this tag.
            ids = [
                doc.doc_id for doc in self.db.get_from_index(TAGS_INDEX, tag)]
            # Iterate over a copy of the keys, since we delete from the
            # dictionary inside the loop.
            for key in list(results.keys()):
                if key not in ids:
                    # Remove the document from result, because it does not have
                    # this particular tag.
                    del results[key]
            if not results:
                # If results is empty, we're done: there are no
                # documents with all tags.
                return []
        return list(results.values())
This method gives us a way to query the database by a set of tags. We loop through the tags one by one and then filter out any documents that don’t have that particular tag.
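The same filtering can also be phrased as a set intersection over document ids. The sketch below models the tags index as a plain dictionary from tag to ids, which is an assumption made so that the example runs without a database; in the application each set would be built from a `get_from_index(TAGS_INDEX, tag)` call. The helper name `ids_with_all_tags` is made up, and the empty tag list is left out of scope.

```python
def ids_with_all_tags(tag_index, tags):
    """Return the ids of documents that carry every tag in tags."""
    if not tags:
        raise ValueError("this sketch needs at least one tag")
    result = set(tag_index.get(tags[0], ()))
    for tag in tags[1:]:
        result &= set(tag_index.get(tag, ()))
        if not result:
            break  # No document can match any more.
    return result


# Hypothetical index contents: tag -> ids of the tasks that have it.
tag_index = {
    "shopping": ["t1", "t2"],
    "food": ["t1", "t3"],
    "urgent": ["t2"],
}
both = ids_with_all_tags(tag_index, ["shopping", "food"])
```

Using sets makes each membership test cheap, which starts to matter once a tag matches many documents.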
    def get_task(self, doc_id):
        """Get a task from the database."""
        task = self.db.get_doc(doc_id)
        if task is None:
            # No document with that id exists in the database.
            raise KeyError("No task with id '%s'." % (doc_id,))
        if task.is_tombstone():
            # The document id exists, but the document's content was previously
            # deleted.
            raise KeyError("Task with id %s was deleted." % (doc_id,))
        return task
get_task is a thin wrapper around get_doc() that takes care of raising appropriate exceptions when a document does not exist or has been deleted. (Deleted documents leave a ‘tombstone’ behind, which is necessary to make sure that synchronisation of the database with other replicas does the right thing.)
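For callers, this means both failure modes can be handled with a single `except KeyError`. The sketch below illustrates the calling pattern with `SketchStore`, a hypothetical stand-in that maps ids to content and uses `None` to mark a deleted task; it is not the real TodoStore, and `title_or_default` is an assumed helper name.

```python
class SketchStore(object):
    """Hypothetical stand-in: maps ids to content, None marks a tombstone."""

    def __init__(self, docs):
        self._docs = dict(docs)

    def get_task(self, doc_id):
        if doc_id not in self._docs:
            raise KeyError("No task with id '%s'." % (doc_id,))
        content = self._docs[doc_id]
        if content is None:
            raise KeyError("Task with id %s was deleted." % (doc_id,))
        return content


def title_or_default(store, doc_id, default="(missing task)"):
    """Look up a task title, falling back when the task is unavailable."""
    try:
        return store.get_task(doc_id)["title"]
    except KeyError:
        return default


store = SketchStore({"t1": {"title": "Write docs"}, "t2": None})
```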
    def new_task(self, title=None, tags=None):
        """Create a new task document."""
        if tags is None:
            tags = []
        # We make a fresh copy of a pristine task with no title.
        content = get_empty_task()
        # If we were passed a title or tags, or both, we set them in the object
        # before storing it in the database.
        if title or tags:
            content['title'] = title
            content['tags'] = tags
        # Store the document in the database. Since we did not set a document
        # id, the database will store it as a new document, and generate
        # a valid id.
        return self.db.create_doc(content)
Here we use the convenience function defined above to initialize the content, and then set the properties that were passed into new_task. We call create_doc() to create a new document from the content. This creates the document in the database, assigns it a new unique id (unless we pass one in), and returns a fully initialized Task object, since we made Task the database’s document factory.
    def get_all_tasks(self):
        return self.db.get_from_index(DONE_INDEX, "*")
Since the DONE_INDEX indexes anything that has a value in the field “done”, and all tasks do (either True or False), it’s a good way to get all tasks out of the database, especially since it will sort them by done status, so we’ll get all the active tasks first.
Synchronisation has to be initiated by the application, either periodically, while it’s running, or by having the user initiate it. Any u1db.Database can be synchronised with any other, either by file path or URL. Cosas gives the user the choice between manually synchronising or having it happen automatically, every 30 minutes, for as long as it is running.
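How Cosas schedules its automatic synchronisation is not shown here. As one possible approach, the sketch below uses `threading.Timer` from the standard library to call a function at a fixed interval; the class name `PeriodicSync` and the callback `fake_sync` are hypothetical. In a real application the callback may need to hand the actual sync over to the thread that owns the database and the user interface.

```python
import threading


class PeriodicSync(object):
    """Call sync_callback every interval seconds until stopped."""

    def __init__(self, sync_callback, interval=30 * 60):
        self._callback = sync_callback
        self._interval = interval
        self._timer = None
        self._stopped = False

    def _schedule(self):
        # Cancel any pending timer so at most one is ever waiting.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self._interval, self._tick)
        self._timer.daemon = True  # Do not keep the process alive.
        self._timer.start()

    def _tick(self):
        if self._stopped:
            return
        self._callback()
        self._schedule()

    def start(self):
        self._stopped = False
        self._schedule()

    def stop(self):
        self._stopped = True
        if self._timer is not None:
            self._timer.cancel()


def fake_sync():
    """Placeholder for the real synchronisation call."""


# syncer.start() would begin the 30 minute cycle; syncer.stop() ends it.
syncer = PeriodicSync(fake_sync, interval=30 * 60)
```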
from ubuntuone.platform.credentials import CredentialsManagementTool

    def get_ubuntuone_credentials(self):
        cmt = CredentialsManagementTool()
        return cmt.find_credentials()

    def _synchronize(self, creds=None):
        target = self.sync_target
        assert target.startswith('http://') or target.startswith('https://')
        if creds is not None:  # convert into expected form
            creds = {'oauth': {
                'token_key': creds['token'],
                'token_secret': creds['token_secret'],
                'consumer_key': creds['consumer_key'],
                'consumer_secret': creds['consumer_secret']
                }}
        self.store.db.sync(target, creds=creds)
        # refresh the UI to show changed or new tasks
        self.refresh_filter()

    def synchronize(self, finalize):
        if self.sync_target == 'https://u1db.one.ubuntu.com/~/cosas':
            d = self.get_ubuntuone_credentials()
            d.addCallback(self._synchronize)
            d.addCallback(finalize)
        else:
            self._synchronize()
            finalize()
When synchronising over http(s), servers can (and usually will) require OAuth authentication. The code above shows how to acquire and pass in the OAuth credentials for the Ubuntu One server, in case you want your application to synchronise with that.
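The credential conversion in `_synchronize()` can be pulled out into a small function, which makes it easy to check on its own. This is a sketch: `to_oauth_creds` is a made-up name, and the values in `flat` are placeholders rather than real credentials.

```python
def to_oauth_creds(creds):
    """Convert a flat credentials dict into the nested form used above."""
    if creds is None:
        return None
    return {'oauth': {
        'token_key': creds['token'],
        'token_secret': creds['token_secret'],
        'consumer_key': creds['consumer_key'],
        'consumer_secret': creds['consumer_secret'],
    }}


# Placeholder values only; real ones come from the credentials tool.
flat = {
    'token': 'example-token',
    'token_secret': 'example-token-secret',
    'consumer_key': 'example-consumer-key',
    'consumer_secret': 'example-consumer-secret',
}
nested = to_oauth_creds(flat)
```

Note that the only key that changes name is `token`, which becomes `token_key` inside the `oauth` dictionary.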
After synchronising with another replica, it is possible that one or more conflicts have arisen, if both replicas independently made changes to the same document. Your application should probably check for conflicts after every synchronisation, and offer the user a way to resolve them.
Look at the Conflicts class in cosas/ui.py to see an example of how this could be presented to the user. The idea is that you show the conflicting versions to the user, let them pick one, and then call resolve_doc() with the preferred version, and all the revisions of the conflicting versions it is meant to resolve.
    def resolve(self, doc, revs):
        self.store.db.resolve_doc(doc, revs)
        # refresh the UI to show the resolved version
        self.refresh_filter()
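The step just before that call, turning the user's choice into the two arguments, can be sketched without a database. `Version` is a hypothetical stand-in for a conflicting document version, the revision strings are made up, and `prepare_resolution` is an assumed helper name; in the application the versions would come from the database (u1db provides `get_doc_conflicts()` for this).

```python
from collections import namedtuple

# Hypothetical stand-in for a document version: a revision id plus content.
Version = namedtuple("Version", ["rev", "content"])


def prepare_resolution(conflicts, chosen_index):
    """Return the version the user picked and every revision it supersedes.

    conflicts is the list of conflicting versions shown to the user, and
    chosen_index is the position of the one they selected.
    """
    chosen = conflicts[chosen_index]
    revs = [version.rev for version in conflicts]
    return chosen, revs


conflicts = [
    Version("replica-a:2", {"title": "Buy milk", "done": True}),
    Version("replica-b:1", {"title": "Buy oat milk", "done": False}),
]
chosen, revs = prepare_resolution(conflicts, 1)
```

The revision list includes the chosen version's own revision as well, so that the resolved document is recorded as superseding all of the versions that were in conflict.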