Flask-CouchDB¶

Flask-CouchDB makes it easy to use the powerful CouchDB database with Flask.

Installation¶

First, you need CouchDB. If you’re on Linux, your distribution’s package manager probably has a CouchDB package to install. (On Debian, Ubuntu, and Fedora, the package is simply called couchdb. On other distros, search your distribution’s repositories.) Windows and Mac have some unofficial installers available, so check CouchDB: The Definitive Guide (see the Additional Reference section) for details. On any other environment, you will probably need to build from source.

Once you have the actual CouchDB database installed, you can install Flask-CouchDB. If you have pip (recommended), use

$ pip install Flask-CouchDB

On the other hand, if you can only use easy_install, use

$ easy_install Flask-CouchDB

Both commands automatically install the couchdb-python library, which Flask-CouchDB depends on, if a suitable version is not already installed.

Getting Started¶

To get started, create an instance of the CouchDBManager class. This is used to set up the connection on every request and to ensure that your database exists, design documents are synced, and the like. Then, you call its setup() method with the app to register the necessary handlers.

manager = CouchDBManager()
# ...add document types and view definitions...
manager.setup(app)

The database to connect with is determined by the configuration. The COUCHDB_SERVER application config value indicates the actual server to connect to (for example, http://localhost:5984/), and COUCHDB_DATABASE indicates the database to use on the server (for example, webapp).
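
As a minimal sketch, the two values could live in a small settings module that the application loads with app.config.from_object. The module name settings and the values shown are only examples taken from the paragraph above, not defaults of the extension.

```python
# settings.py: hypothetical module name, loaded with
# app.config.from_object('settings') before manager.setup(app) is called.

# The CouchDB server to connect to.
COUCHDB_SERVER = 'http://localhost:5984/'

# The database to use on that server.
COUCHDB_DATABASE = 'webapp'
```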

By default, the database will be checked to see if it exists - and views will be synchronized to their design documents - on every request. However, this can (and should) be changed - see Database Sync Behavior for more details.

Since the manager does not actually do anything until it is set up, it is safe (and useful) for it to be created in advance, separately from the application it is used with.

Basic Use¶

On every request, the database is available as g.couch. You will, of course, want to check couchdb-python’s documentation of the couchdb.client.Database class for more detailed instructions, but some of the most useful methods are:

# creating
document = dict(title="Hello", content="Hello, world!")
g.couch[some_id] = document

# retrieving
document = g.couch[some_id]     # raises error if it doesn't exist
document = g.couch.get(some_id) # returns None if it doesn't exist

# updating
g.couch.save(document)

If you use this style of DB manipulation a lot, it might be useful to create your own LocalProxy, as some people (myself included) find the g. prefix annoying or inelegant.

from werkzeug.local import LocalProxy
couch = LocalProxy(lambda: g.couch)

You can then use couch just like you would use g.couch.

Writing Views¶

If you register views with the CouchDBManager, they will be synchronized to the database, so you can always be sure they can be called properly. They are created with the ViewDefinition class.

A view can have two parts: a map function and a reduce function. The “map” function takes a document and emits any number of rows. Each row has a key, a value, and the ID of the document that generated it. The “reduce” function takes a list of keys, a list of values, and a parameter describing whether it is in “rereduce” mode. It should return a single value that summarizes all the values it was given. For maximum portability, most view functions are written in JavaScript, though other view servers (including a Python one) can be installed on the server.

The ViewDefinition class works like this:

active_users_view = ViewDefinition('users', 'active', '''\
    function (doc) {
        if (doc.active) {
            emit(doc.username, doc)
        };
    }''')

'users' is the design document this view is a part of, and 'active' is the name of the view. This particular view only has a map function. If you had a reduce function, you would pass it after the map function:

tag_counts_view = ViewDefinition('blog', 'tag_counts', '''\
    function (doc) {
        doc.tags.forEach(function (tag) {
            emit(tag, 1);
        });
    }''', '''\
    function (keys, values, rereduce) {
        return sum(values);
    }''', group=True)

The group=True is a default option. You can pass it when calling the view, but since it causes only one row to be created for each unique key value, it makes sense as the default for our view.
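
To see what grouping does, here is a rough pure-Python imitation of the map and reduce functions from tag_counts_view. It is only a sketch: the run_view helper and the sample documents are made up for illustration, and a real CouchDB server passes richer keys to the reduce function and computes the view incrementally.

```python
from collections import defaultdict


def map_fun(doc):
    """Imitates the JavaScript map function: one (key, value) row per tag."""
    for tag in doc.get('tags', []):
        yield tag, 1


def reduce_fun(keys, values, rereduce=False):
    """Imitates the JavaScript reduce function: add up the values."""
    return sum(values)


def run_view(docs, group=True):
    """Hypothetical helper: run map, then reduce, over in-memory documents."""
    rows = [row for doc in docs for row in map_fun(doc)]
    if not group:
        # Without grouping, everything is reduced into a single row.
        keys = [key for key, _ in rows]
        values = [value for _, value in rows]
        return [(None, reduce_fun(keys, values))]
    # With grouping, rows are reduced separately for each unique key.
    grouped = defaultdict(list)
    for key, value in rows:
        grouped[key].append(value)
    return [(key, reduce_fun([key] * len(values), values))
            for key, values in sorted(grouped.items())]


docs = [
    {'_id': 'a', 'tags': ['flask', 'python']},
    {'_id': 'b', 'tags': ['flask']},
]
grouped_rows = run_view(docs)             # one row per tag
total_rows = run_view(docs, group=False)  # one row for the whole view
```

With group=True each tag gets its own count, which is what a tag cloud needs. Without it, the view collapses into a single total.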

To get the results of a view, you can call its definition. Within a request, it will automatically use g.couch, but you can still pass in a database explicitly. The call returns a couchdb.client.ViewResults object, which only fetches the results once it is iterated over. You can also use getitem and getslice notation to apply a range of keys. For example:

active_users_view()         # rows for every active user
active_users_view['a':'b']  # rows for every user between a and b
tag_counts_view()           # rows for every tag
tag_counts_view['flask']    # one row for just the 'flask' tag
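
The slice notation corresponds to a start key and an end key on the view. As an illustrative sketch only, the selection can be imitated on an in-memory list of rows. The rows_in_range helper and the sample data are hypothetical, and plain Python string comparison stands in for CouchDB's own collation rules, which differ in some cases.

```python
def rows_in_range(rows, startkey=None, endkey=None):
    """Return the (key, value) rows whose key lies between startkey and
    endkey, both inclusive, in key order."""
    selected = []
    for key, value in sorted(rows, key=lambda row: row[0]):
        if startkey is not None and key < startkey:
            continue
        if endkey is not None and key > endkey:
            continue
        selected.append((key, value))
    return selected


users = [('carol', 3), ('bob', 2), ('alice', 1), ('b', 4)]
between_a_and_b = rows_in_range(users, 'a', 'b')
```

Note that 'bob' is not selected, because it sorts after the end key 'b'.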

To make sure that you can call the views, though, you need to add them to the CouchDBManager with the add_viewdef method.

manager.add_viewdef((active_users_view, tag_counts_view))

This does not cover writing views in detail. A good reference for writing views is the Introduction to CouchDB views page on the CouchDB wiki.

Mapping Documents to Objects¶

With the Document class, you can map raw JSON objects to Python objects, which can make it easier to work with your data. You create a document class in a similar manner to ORMs such as Django, Elixir, and SQLObject.

class BlogPost(Document):
    doc_type = 'blogpost'

    title = TextField()
    content = TextField()
    author = TextField()
    created = DateTimeField(default=datetime.datetime.now)
    tags = ListField(TextField())
    comments_allowed = BooleanField(default=True)

You can then create and edit documents just like you would a plain old object, and then save them back to the database with the store method.

post = BlogPost(title='Hello', content='Hello, world!', author='Steve')
post.id = uuid.uuid4().hex
post.store()

To retrieve a document, use the load method. It will return None if the document with the given ID could not be found.

post = BlogPost.load(some_id)
if post is None:
    abort(404)
return render_template('post.html', post=post)  # 'post.html' is a placeholder template name

If a doc_type attribute is set on the class, all documents created with that class will have their doc_type field set to its value. You can use this to tell different document types apart in view functions (see Adding Views for examples).

Complex Fields¶

One advantage of using JSON objects is that you can include complex data structures right in your document classes. For example, the tags field in the example above uses a ListField:

tags = ListField(TextField())

This lets you have a list of tags, as strings. You can also use DictField. If you provide a mapping to the dict field (probably using the Mapping.build method), it lets you have another, nested data structure, for example:

author = DictField(Mapping.build(
    name=TextField(),
    email=TextField()
))

And you can use it just like it’s a nested document:

post.author.name = 'Steve Person'
post.author.email = 'sperson@example.com'

On the other hand, if you use it with no mapping, it’s just a plain old dict:

metadata = DictField()

You can combine the two fields, as well. For example, if you wanted to include comments on the post:

comments = ListField(DictField(Mapping.build(
    text=TextField(),
    author=TextField(),
    approved=BooleanField(default=False),
    published=DateTimeField(default=datetime.datetime.now)
)))
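
For orientation, this is roughly the shape of JSON document that such a class describes. It is a hand-written sketch using plain dictionaries: the ID and field values are invented, and the exact output of the library may differ in detail.

```python
import json

# A hypothetical blog post, written out as the nested structure that the
# ListField and DictField declarations above describe.
post = {
    '_id': 'hello-world',
    'doc_type': 'blogpost',
    'title': 'Hello',
    'tags': ['flask', 'couchdb'],                  # ListField(TextField())
    'author': {                                    # DictField with a mapping
        'name': 'Steve Person',
        'email': 'sperson@example.com',
    },
    'comments': [                                  # ListField(DictField(...))
        {
            'text': 'Nice post!',
            'author': 'Joe',
            'approved': False,
            'published': '2010-05-01T12:00:00Z',   # date/time stored as a string
        },
    ],
}

encoded = json.dumps(post, sort_keys=True)
```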

Adding Views¶

The ViewField class can be used to add views to your document classes. You create it just like you do a normal ViewDefinition, except you attach it to the class and you don’t have to give it a name, just a design document (it will automatically take the name of its attribute):

tagged = ViewField('blog', '''\
    function (doc) {
        if (doc.doc_type == 'blogpost') {
            doc.tags.forEach(function (tag) {
                emit(tag, doc);
            });
        };
    }''')

When you access it, either from the class or from an instance, it will return a ViewDefinition, which you can call like normal. The results will automatically be wrapped in the document class.

BlogPost.tagged()           # a row for every tag on every document
BlogPost.tagged['flask']    # a row for every document tagged 'flask'

If the value of your view is not a document (for example, in most reduce views), you can pass Row as the wrapper. A Row has attributes for the key, value, and id of a row.

tag_counts = ViewField('blog', '''\
    function (doc) {
        if (doc.doc_type == 'blogpost') {
            doc.tags.forEach(function (tag) {
                emit(tag, 1);
            });
        };
    }''', '''\
    function (keys, values, rereduce) {
        return sum(values);
    }''', wrapper=Row, group=True)

With that view attached to the BlogPost class, for example, you could use:

# print all tag counts
for row in BlogPost.tag_counts():
    print('%d posts tagged %s' % (row.value, row.key))

# print a single tag's count
row = BlogPost.tag_counts[some_tag].rows[0]
print('%d posts tagged %s' % (row.value, row.key))

To schedule all of the views on a document class for synchronization, use the CouchDBManager.add_document method. All the views will be added when the database syncs.

manager.add_document(BlogPost)

Pagination¶

In any Web application with large datasets, you are going to want to paginate your results. The paginate function lets you do this.

The particular style of pagination used is known as linked-list pagination. This means that instead of a page number, the page is indicated by a reference to a particular item (the first one on the page). The advantages of linked-list paging include:

Much more efficient on CouchDB by a wide margin - numbered paging scales poorly on large datasets
The items won’t change on the user: if another item is added at the beginning of the dataset, and the user clicks Next, an item from the previous page won’t get pushed onto the next one

Unfortunately, there are also drawbacks:

The only way to navigate through is with next/previous links - you can’t “skip ahead” without precomputing the page references
The start reference is more obtrusive in a URL than the page number

In this case, however, the efficiency issue is the major deciding factor.

To paginate, you need a ViewResults instance, like the one you would get from calling or slicing a ViewDefinition or ViewField. Then, you call paginate with the view results, the number of items per page, and the start value given for that page (if there is one).

page = paginate(BlogPost.tagged[tag], 10, request.args.get('start'))

It will return a Page instance. That contains the items, as well as the start values of the next and previous pages (if there are any). As noted in the above example, the best practice is to put the start reference in the query string. You can display the page in the template with something like:

<ul>
{% for item in page.items %}
    display item...
{% endfor %}
</ul>

{% if page.prev %}<a href="{{ url_for('display', start=page.prev) }}">Previous</a>{% endif %}
{% if page.next %}<a href="{{ url_for('display', start=page.next) }}">Next</a>{% endif %}

taking advantage of the fact that url_for converts unknown parameters into query string arguments.
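
The idea behind linked-list pagination can be sketched in plain Python. This is not the extension's implementation: the paginate_keys function is hypothetical, and a sorted in-memory list of unique keys stands in for the view. On a real view, the same effect would come from querying with a start key and fetching one row more than the page size.

```python
import json
from bisect import bisect_left


def paginate_keys(keys, count, start=None):
    """Return (items, next_start, prev_start) for one page.

    keys is a sorted list of unique row keys, and start is the JSON-encoded
    key of the first item on the page (None for the first page).
    """
    first = 0
    if start is not None:
        first = bisect_left(keys, json.loads(start))
    items = keys[first:first + count]
    # The row just past this page, if any, is where the next page starts.
    next_start = None
    if first + count < len(keys):
        next_start = json.dumps(keys[first + count])
    # The previous page starts count rows earlier (or at the very beginning).
    prev_start = None
    if first > 0:
        prev_start = json.dumps(keys[max(first - count, 0)])
    return items, next_start, prev_start


keys = ['a', 'b', 'c', 'd', 'e']
page1 = paginate_keys(keys, 2)
page2 = paginate_keys(keys, 2, page1[1])
page3 = paginate_keys(keys, 2, page2[1])
```

Because each page is addressed by the key of its first item rather than by an offset, adding a new item at the front of the dataset does not shift what the next link points to.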

If you really need numbered paging using limit/skip in your application, it’s easy enough to implement. (For example, browsing through the posts in a forum thread would get tiresome if you had to click through five next links just to reach the last post.) A good implementation of numbered paging is in the Flask-SQLAlchemy extension (specifically, the BaseQuery.paginate method and Pagination object), so you can look there for some ideas as to the mechanics. The mechanics of using the limit and skip options are described on the CouchDB wiki.

If you choose to go this route, though:

Only use this for datasets that aren’t likely to grow infinitely. For example, posts in a particular forum thread aren’t likely to keep on going forever. (Even in a gigantic forum like, say, Ubuntu Forums, you’re not going to get more than 5000 posts per thread.) The number of threads in a single board, though, might grow ad infinitum (and threads are better located with searching anyway), so they are probably not the best choice for numbered pages.
Use a separate “counting” view with a reduce query to determine the total number of items, instead of fetching the entire result set from your main view. (See the tag_counts view in the Adding Views section for an example of how to do this.)
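
The arithmetic for numbered paging is small enough to sketch here. The page_options helper is hypothetical. It assumes the total comes from a counting view as described above, and it only computes the limit and skip options to pass when querying the main view.

```python
def page_options(page, per_page, total):
    """Return (view options, number of pages) for a 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError('page and per_page must be positive')
    # Ceiling division, with at least one (possibly empty) page.
    pages = max(1, -(-total // per_page))
    options = {'limit': per_page, 'skip': (page - 1) * per_page}
    return options, pages


options, pages = page_options(page=3, per_page=10, total=45)
```

The skip value grows with the page number, which is why this approach is best kept to datasets of bounded size.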

Database Sync Behavior¶

By default, the database is “synced” by a callback on every request. During the sync:

The manager checks whether the database exists, and if it does not, it creates it.
All the view definitions registered on Document classes or just on their own are synchronized to their design documents.
Any on_sync callbacks are run.

The default behavior is intended to ensure a minimum of effort to get up and running correctly. However, it is very inefficient, as a number of possibly unnecessary HTTP requests may be made during the sync. As such, you can turn automatic syncing off.

If you don’t want to disable it at the code level, it can be disabled at the configuration level. After you have run a single request, or synced manually, you can set the DISABLE_AUTO_SYNCING config option to True. It will prevent the database from syncing on every request, even if it is enabled in the code.

A more prudent method is to pass the auto_sync=False option to the CouchDBManager constructor. This will prevent per-request syncing even if it is not disabled in the config. Then, you can manually call the sync() method (with an app) and it will sync at that time. (You have to have set up the app before then, so it’s best to put this either in an app factory function or the server script - somewhere you can guarantee the app has already been configured.) For example:

app = Flask(__name__)
# ...configure the app...
manager.setup(app)
manager.sync(app)
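
The interaction between the two switches can be summarized as a small truth function. This is only a model of the behavior described above, not code from the extension, and the function name is made up.

```python
def syncs_on_request(auto_sync, config):
    """Return True if the database would be synced on every request.

    auto_sync is the value passed to the CouchDBManager constructor, and
    config is the application configuration (any mapping).
    """
    return bool(auto_sync) and not config.get('DISABLE_AUTO_SYNCING', False)


default_behavior = syncs_on_request(True, {})
disabled_in_config = syncs_on_request(True, {'DISABLE_AUTO_SYNCING': True})
disabled_in_code = syncs_on_request(False, {})
```

Either switch on its own is enough to stop per-request syncing.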

API Documentation¶

This documentation is automatically generated from the source code. It covers the entire public API (i.e. everything that can be star-imported from flaskext.couchdb). Some of these have been directly imported from the original couchdb-python module.

The Manager¶

class flaskext.couchdb.CouchDBManager(auto_sync=True)¶

This manages connecting to the database on every request and synchronizing the view definitions to it.

Parameters:	auto_sync – Whether to automatically sync the database every request. (Defaults to `True`.)

add_document(dc)¶

This adds all the view definitions from a document class so they will be added to the database when it is synced.

Parameters:	dc – The class to add. It should be a subclass of `Document`.

add_viewdef(viewdef)¶

This adds one or more standalone view definitions to the manager. The argument should be a ViewDefinition instance or a list or tuple of them. They will be added to their design documents when the database is synced.

Parameters:	viewdef – The view definition to add. It can also be a tuple or list.

all_viewdefs()¶: This iterates through all the view definitions registered generally and the ones on specific document classes.

connect_db(app)¶

This connects to the database for the given app. It presupposes that the database has already been synced, and as such an error will be raised if the database does not exist.

Parameters:	app – The app to get the settings from.

on_sync(fn)¶

This adds a callback to run when the database is synced. The callbacks are passed the live database (so they should use that instead of relying on the thread locals), and can do pretty much whatever they want. The design documents have already been synchronized. Callbacks are called in the order they are added, but you shouldn’t rely on that.

If you can reliably detect whether it is necessary, this may be a good place to add default data. However, the callbacks could theoretically be run on every request, so it is a bad idea to insert the default data every time.

Parameters:	fn – The callback function to add.
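
As a sketch of a callback that is safe to run on every sync, the function below inserts a default document only when it is missing. The document ID 'settings' and its contents are invented for the example, and a plain dict stands in for the live database, which supports the same membership test and item assignment.

```python
def add_default_settings(db):
    """Insert a default settings document unless one already exists."""
    if 'settings' not in db:
        db['settings'] = {'doc_type': 'settings', 'site_title': 'My Site'}


# A dict stands in for the database in this sketch.
fake_db = {}
add_default_settings(fake_db)                 # first sync creates the document
fake_db['settings']['site_title'] = 'Edited'  # someone changes it later
add_default_settings(fake_db)                 # later syncs leave it alone
```

In an application, the function would be registered with manager.on_sync(add_default_settings).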

setup(app)¶

This method sets up the request/response handlers needed to connect to the database on every request.

Parameters:	app – The application to set up.

sync(app)¶

This syncs the database for the given app. It will first make sure the database exists, then synchronize all the views and run all the callbacks with the connected database.

It will run any callbacks registered with on_sync, and when the views are being synchronized, if a method called update_design_doc exists on the manager, it will be called before every design document is updated.

Parameters:	app – The application to synchronize with.

View Definition¶

class flaskext.couchdb.ViewDefinition(design, name, map_fun, reduce_fun=None, language='javascript', wrapper=None, **defaults)¶

get_doc(db)¶

Retrieve and return the design document corresponding to this view definition from the given database.

Parameters:	db – the `Database` instance
Returns:	a `client.Document` instance, or `None` if the design document does not exist in the database
Return type:	`Document`

sync(db)¶

Ensure that the view stored in the database matches the view defined by this instance.

Parameters:	db – the `Database` instance

static sync_many(db, views, remove_missing=False, callback=None)¶

Ensure that the views stored in the database that correspond to a given list of ViewDefinition instances match the code defined in those instances.

This function might update more than one design document. This is done using the CouchDB bulk update feature to ensure atomicity of the operation.

Parameters:

db – the Database instance
views – a sequence of ViewDefinition instances
remove_missing – whether views found in a design document that are not found in the list of ViewDefinition instances should be removed
callback – a callback function that is invoked when a design document gets updated; the callback gets passed the design document as only parameter, before that doc has actually been saved back to the database

class flaskext.couchdb.Row¶: Representation of a row as returned by database views.

Documents¶

class flaskext.couchdb.Document(*args, **kwargs)¶

This class can be used to represent a single “type” of document. You can use this to more conveniently represent a JSON structure as a Python object in the style of an object-relational mapper.

You populate a class with instances of Field for all the attributes you want to use on the class. In addition, if you set the doc_type attribute on the class, every document will have a doc_type field automatically attached to it with that value. That way, you can tell different document types apart in views.

id¶: The document ID

items()¶

Return the fields as a list of (name, value) tuples.

This method is provided to enable easy conversion to native dictionary objects, for example to allow use of mapping.Document instances with client.Database.update.

>>> class Post(Document):
...     title = TextField()
...     author = TextField()
>>> post = Post(id='foo-bar', title='Foo bar', author='Joe')
>>> sorted(post.items())
[('_id', 'foo-bar'), ('author', u'Joe'), ('title', u'Foo bar')]

Returns:	a list of `(name, value)` tuples

classmethod load(id, db=None)¶

This is used to retrieve a specific document from the database. If a database is not given, the thread-local database (g.couch) is used.

For compatibility with code written for the parameter ordering of the original couchdb-python library, the parameters can also be given in reverse order.

Parameters:

id – The document ID to load.
db – The database to use. Optional.

classmethod query(db, map_fun, reduce_fun, language='javascript', **options)¶

Execute a CouchDB temporary view and map the result values back to objects of this mapping.

Note that by default, any properties of the document that are not included in the values of the view will be treated as if they were missing from the document. If you want to load the full document for every row, set the include_docs option to True.

rev¶

The document revision.

Return type:	basestring

store(db=None)¶

This saves the document to the database. If a database is not given, the thread-local database (g.couch) is used.

Parameters:	db – The database to use. Optional.

classmethod view(db, viewname, **options)¶

Execute a CouchDB named view and map the result values back to objects of this mapping.

class flaskext.couchdb.Field(name=None, default=None)¶

Basic unit for mapping a piece of data between Python and JSON.

Instances of this class can be added to subclasses of Document to describe the mapping of a document.

class flaskext.couchdb.Mapping(**values)¶

Pagination¶

flaskext.couchdb.paginate(view, count, start=None)¶

This implements linked-list pagination. You pass in the view to use, the number of items per page, and the JSON-encoded start value for the page, and it will return a Page instance.

Since this is “linked-list” style pagination, it only allows direct navigation using next and previous links. However, it is also very fast and efficient.

You should probably use the start values as a query parameter (e.g. ?start=whatever).

Parameters:

view – A `ViewResults` instance. (You get this by calling, slicing, or subscripting a `ViewDefinition` or `ViewField`.)
count – The number of items to put on a single page.
start – The start value of the page, as a string.

class flaskext.couchdb.Page(items, next=None, prev=None)¶

This represents a single page of items. Pages are created by the paginate function.

items¶: A list of the actual items returned from the view.

next¶: The start value for the next page, if there is one. If not, this is None. It is JSON-encoded, but not URL-encoded.

prev¶: The start value for the previous page, if there is one. If not, this is None.
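
The difference between JSON encoding and URL encoding matters if you build links by hand instead of through url_for. The following sketch uses an invented compound key to show the round trip. Only the standard library is involved, and nothing here is part of the extension's API.

```python
import json

try:
    from urllib.parse import quote, unquote   # Python 3
except ImportError:
    from urllib import quote, unquote         # Python 2

# A hypothetical start key, as a view with compound keys might produce.
start_key = ['flask', '2010-05-01T12:00:00Z']

# JSON-encode the key (the form a start value takes), then URL-encode
# that string before placing it in a query string by hand.
start = json.dumps(start_key)
query = '?start=' + quote(start, safe='')

# Reversing both steps recovers the original key.
recovered = json.loads(unquote(query[len('?start='):]))
```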

Field Types¶

class flaskext.couchdb.TextField(name=None, default=None)¶: Mapping field for string values.

class flaskext.couchdb.IntegerField(name=None, default=None)¶: Mapping field for integer values.

class flaskext.couchdb.FloatField(name=None, default=None)¶: Mapping field for float values.

class flaskext.couchdb.LongField(name=None, default=None)¶: Mapping field for long integer values.

class flaskext.couchdb.DecimalField(name=None, default=None)¶: Mapping field for decimal values.

class flaskext.couchdb.BooleanField(name=None, default=None)¶: Mapping field for boolean values.

class flaskext.couchdb.DateTimeField(name=None, default=None)¶

Mapping field for storing date/time values.

>>> field = DateTimeField()
>>> field._to_python('2007-04-01T15:30:00Z')
datetime.datetime(2007, 4, 1, 15, 30)
>>> field._to_json(datetime(2007, 4, 1, 15, 30, 0, 9876))
'2007-04-01T15:30:00Z'
>>> field._to_json(date(2007, 4, 1))
'2007-04-01T00:00:00Z'

class flaskext.couchdb.DateField(name=None, default=None)¶

Mapping field for storing dates.

>>> field = DateField()
>>> field._to_python('2007-04-01')
datetime.date(2007, 4, 1)
>>> field._to_json(date(2007, 4, 1))
'2007-04-01'
>>> field._to_json(datetime(2007, 4, 1, 15, 30))
'2007-04-01'

class flaskext.couchdb.TimeField(name=None, default=None)¶

Mapping field for storing times.

>>> field = TimeField()
>>> field._to_python('15:30:00')
datetime.time(15, 30)
>>> field._to_json(time(15, 30))
'15:30:00'
>>> field._to_json(datetime(2007, 4, 1, 15, 30))
'15:30:00'

class flaskext.couchdb.ListField(field, name=None, default=None)¶

Field type for sequences of other fields.

>>> from couchdb import Server
>>> server = Server()
>>> db = server.create('python-tests')

>>> class Post(Document):
...     title = TextField()
...     content = TextField()
...     pubdate = DateTimeField(default=datetime.now)
...     comments = ListField(DictField(Mapping.build(
...         author = TextField(),
...         content = TextField(),
...         time = DateTimeField()
...     )))

>>> post = Post(title='Foo bar')
>>> post.comments.append(author='myself', content='Bla bla',
...                      time=datetime.now())
>>> len(post.comments)
1
>>> post.store(db) 
<Post ...>
>>> post = Post.load(db, post.id)
>>> comment = post.comments[0]
>>> comment['author']
'myself'
>>> comment['content']
'Bla bla'
>>> comment['time'] 
'...T...Z'

>>> del server['python-tests']

class flaskext.couchdb.DictField(mapping=None, name=None, default=None)¶

Field type for nested dictionaries.

>>> from couchdb import Server
>>> server = Server()
>>> db = server.create('python-tests')

>>> class Post(Document):
...     title = TextField()
...     content = TextField()
...     author = DictField(Mapping.build(
...         name = TextField(),
...         email = TextField()
...     ))
...     extra = DictField()

>>> post = Post(
...     title='Foo bar',
...     author=dict(name='John Doe',
...                 email='john@doe.com'),
...     extra=dict(foo='bar'),
... )
>>> post.store(db) 
<Post ...>
>>> post = Post.load(db, post.id)
>>> post.author.name
u'John Doe'
>>> post.author.email
u'john@doe.com'
>>> post.extra
{'foo': 'bar'}

>>> del server['python-tests']

Additional Reference¶

For actually getting started with CouchDB and finding out if you want to use it, you should read the official CouchDB Website.
The CouchDB wiki is another good source of information on the CouchDB API and how to write views.
Flask-CouchDB is based on the excellent couchdb-python library. Its documentation can help you understand what is really going on behind the scenes.
CouchDB - The Definitive Guide is a book published by O’Reilly and made freely available on its Web site. It doesn’t cover developing with CouchDB using client libraries very much, but it contains a good amount of insight into how CouchDB works.

Changelog¶

Version 0.2¶

Added paginate and Page.
Added doc_type.

Backwards Compatibility: Nothing introduced in this release breaks backwards compatibility in itself. However, if you add a doc_type attribute to your class and use it in your views, it won’t update your existing data to match. You will have to add the doc_type field to all the documents already in your database, either by hand or using a script, so they will still show up in your view results.
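
A migration script for this could follow the pattern sketched below. It works on a list of plain dictionaries and assumes every document in the list belongs to the class being migrated. The add_doc_type function is hypothetical, and in a real script each changed document would still have to be saved back to the database.

```python
def add_doc_type(docs, doc_type):
    """Set doc_type on every document that lacks it.

    Returns the documents that were changed, so the caller knows which
    ones need to be saved back to the database.
    """
    changed = []
    for doc in docs:
        if 'doc_type' not in doc:
            doc['doc_type'] = doc_type
            changed.append(doc)
    return changed


old_posts = [
    {'_id': 'first', 'title': 'Hello'},
    {'_id': 'second', 'title': 'Again', 'doc_type': 'blogpost'},
]
to_save = add_doc_type(old_posts, 'blogpost')
```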

Version 0.1.1¶

Fixed a bug preventing synchronization of multiple views from Document classes.
Removed a leftover print statement in the after-request code.