.. _high-level-api:

The high-level API
##################

The U1DB API has three separate sections: document storage and retrieval,
querying, and sync. Here we describe the high-level API. Remember that you will
need to choose an implementation, and that exactly how this API is defined is
implementation-specific, so that it fits each language's conventions.

Document storage and retrieval
------------------------------

U1DB stores documents. A document is a set of nested key-values; basically,
anything you can express with JSON. Implementations are likely to provide
a Document object "wrapper" for these documents; exactly how the wrapper works
is implementation-defined.

Creating documents
^^^^^^^^^^^^^^^^^^

To create a document, use :py:meth:`~u1db.Database.create_doc`, which takes
a dictionary-like object, or :py:meth:`~u1db.Database.create_doc_from_json`,
which takes a JSON string. The code examples below are from
:ref:`reference-implementation` in Python.

.. testsetup ::

    import os, tempfile
    old_dir = os.path.realpath('.')
    tmp_dir = tempfile.mkdtemp()
    os.chdir(tmp_dir)

.. doctest ::

    >>> import u1db
    >>> db = u1db.open("mydb1.u1db", create=True)
    >>> doc = db.create_doc({"key": "value"}, doc_id="testdoc")
    >>> doc.content
    {'key': 'value'}
    >>> doc.doc_id
    'testdoc'


Retrieving documents
^^^^^^^^^^^^^^^^^^^^

The simplest way to retrieve documents from a u1db is by calling
:py:meth:`~u1db.Database.get_doc` with a ``doc_id``. This will return a
:py:class:`~u1db.Document` object [#]_.

.. doctest ::

    >>> import u1db
    >>> db = u1db.open("mydb4.u1db", create=True)
    >>> doc = db.create_doc({"key": "value"}, doc_id="testdoc")
    >>> doc1 = db.get_doc("testdoc")
    >>> doc1.content
    {u'key': u'value'}
    >>> doc1.doc_id
    'testdoc'

It is also possible to retrieve several documents at once by passing a list of
``doc_id`` values to :py:meth:`~u1db.Database.get_docs`.

.. doctest ::

    >>> import u1db
    >>> db = u1db.open("mydb5.u1db", create=True)
    >>> doc1 = db.create_doc({"key": "value"}, doc_id="testdoc1")
    >>> doc2 = db.create_doc({"key": "value"}, doc_id="testdoc2")
    >>> for doc in db.get_docs(["testdoc2","testdoc1"]):
    ...     print doc.doc_id
    testdoc2
    testdoc1

Note that :py:meth:`u1db.Database.get_docs` returns the documents in the order
specified.

Editing existing documents
^^^^^^^^^^^^^^^^^^^^^^^^^^

Editing an *existing* document is done with ``put_doc()``. This is separate
from ``create_doc()`` so as to avoid accidental overwrites. ``put_doc()`` takes
a ``Document`` object, because the object encapsulates revision information for
a particular document. This revision information must match what is stored in
the database, so that U1DB can make sure you are not overwriting another
version of the document that you don't know about (e.g., a newer version that
arrived from a background sync while you were editing your copy).

.. doctest ::

    >>> import u1db
    >>> db = u1db.open("mydb2.u1db", create=True)
    >>> doc1 = db.create_doc({"key1": "value1"}, doc_id="doc1")

    >>> # the next line should fail because it's creating a doc that already exists
    >>> db.create_doc({"key1fail": "value1fail"}, doc_id="doc1")
    Traceback (most recent call last):
        ...
    RevisionConflict

    >>> # Now editing the doc with the doc object we got back...
    >>> doc1.content["key1"] = "edited"
    >>> db.put_doc(doc1) # doctest: +ELLIPSIS
    '...'
    >>> doc2 = db.get_doc(doc1.doc_id)
    >>> doc2.content
    {u'key1': u'edited'}
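
The revision check described above can be pictured with a toy model. This is
a minimal sketch and not the reference implementation: ``ToyStore``, ``ToyDoc``
and the integer revision counter are assumptions made for illustration, and
the ``RevisionConflict`` exception is defined locally rather than imported
from ``u1db``.

.. code-block:: python

    class RevisionConflict(Exception):
        """Raised when a write is based on a stale revision (toy version)."""


    class ToyDoc(object):
        """Hypothetical document wrapper: id, revision and content."""

        def __init__(self, doc_id, rev, content):
            self.doc_id = doc_id
            self.rev = rev
            self.content = content


    class ToyStore(object):
        """Hypothetical in-memory store illustrating the revision check."""

        def __init__(self):
            self._docs = {}  # doc_id -> (rev, content)

        def create_doc(self, content, doc_id):
            # Creating never overwrites: an existing id is rejected.
            if doc_id in self._docs:
                raise RevisionConflict(doc_id)
            self._docs[doc_id] = (1, dict(content))
            return ToyDoc(doc_id, 1, dict(content))

        def get_doc(self, doc_id):
            rev, content = self._docs[doc_id]
            return ToyDoc(doc_id, rev, dict(content))

        def put_doc(self, doc):
            # The caller's revision must match the stored one.
            stored_rev, _ = self._docs[doc.doc_id]
            if doc.rev != stored_rev:
                raise RevisionConflict(doc.doc_id)
            doc.rev = stored_rev + 1
            self._docs[doc.doc_id] = (doc.rev, dict(doc.content))
            return doc.rev


    store = ToyStore()
    mine = store.create_doc({"key1": "value1"}, doc_id="doc1")
    theirs = store.get_doc("doc1")

    # Someone else edits and saves first; their revision still matches.
    theirs.content["key1"] = "changed elsewhere"
    store.put_doc(theirs)

    # Our copy now carries a stale revision, so the write is refused.
    mine.content["key1"] = "edited"
    try:
        store.put_doc(mine)
        stale_write_rejected = False
    except RevisionConflict:
        stale_write_rejected = True

In this model the stale writer would need to fetch the document again and
reapply the edit before saving.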


Finally, deleting a document is done with :py:meth:`~u1db.Database.delete_doc`.

.. doctest ::

    >>> import u1db
    >>> db = u1db.open("mydb3.u1db", create=True)
    >>> doc = db.create_doc({"key": "value"})
    >>> db.delete_doc(doc) # doctest: +ELLIPSIS
    '...'
    >>> db.get_doc(doc.doc_id)
    >>> doc = db.get_doc(doc.doc_id, include_deleted=True)
    >>> doc.content

Document functions
^^^^^^^^^^^^^^^^^^

* :py:meth:`~u1db.Database.create_doc`
* :py:meth:`~u1db.Database.create_doc_from_json`
* :py:meth:`~u1db.Database.put_doc`
* :py:meth:`~u1db.Database.get_doc`
* :py:meth:`~u1db.Database.get_docs`
* :py:meth:`~u1db.Database.get_all_docs`
* :py:meth:`~u1db.Database.delete_doc`
* :py:meth:`~u1db.Database.whats_changed`

Querying
--------

To retrieve documents other than by ``doc_id``, you query the database.
Querying a U1DB is done by means of an index. To retrieve only some documents
from the database based on certain criteria, you must first create an index,
and then query that index.

An index is created from *index expressions*. An index expression names one
or more fields in the document. A simple example follows; more examples appear
in the *Index expressions* section below.

Given a database with the following documents:

.. doctest ::

    >>> import u1db
    >>> db1 = u1db.open("mydb6.u1db", create=True)
    >>> jb = db1.create_doc({"firstname": "John", "surname": "Barnes", "position": "left wing"})
    >>> jm = db1.create_doc({"firstname": "Jan", "surname": "Molby", "position": "midfield"})
    >>> ah = db1.create_doc({"firstname": "Alan", "surname": "Hansen", "position": "defence"})
    >>> jw = db1.create_doc({"firstname": "John", "surname": "Wayne", "position": "filmstar"})

an index expression of ``"firstname"`` will create an index that looks
(conceptually) like this:

====================== ========
index expression value document
====================== ========
Alan                   ah
Jan                    jm
John                   jb
John                   jw
====================== ========

and that index is created with:

.. doctest ::

    >>> db1.create_index("by-firstname", "firstname")
    >>> sorted(db1.get_index_keys('by-firstname'))
    [(u'Alan',), (u'Jan',), (u'John',)]

That is, an index is created from a name and one or more index expressions.
(Exactly how to pass the name and the list of index expressions is something
specific to each implementation.)

Index expressions
^^^^^^^^^^^^^^^^^

An index expression describes how to get data from a document; you can think of
it as describing a function which, when given a document, returns a value,
which is then used as the index key.

**Name a field.** A basic index expression is a dot-delimited list of nested
field names, so the index expression ``field.sub1.sub2`` applied to a document
with the content below:

.. doctest ::

    >>> import u1db
    >>> db = u1db.open('mydb7.u1db', create=True)
    >>> db.create_index('by-subfield', 'field.sub1.sub2')
    >>> doc1 = db.create_doc({"field": {"sub1": {"sub2": "hello", "sub3": "not selected"}}})
    >>> db.get_index_keys('by-subfield')
    [(u'hello',)]

gives the index key "hello", and therefore an entry in the index of

========= ====
Index key doc
========= ====
hello     doc1
========= ====

**Name a list.** If an index expression names a field whose content is a list
of strings, the document will have multiple entries in the index, one per entry
in the list. So, the index expression ``field.tags`` applied to a document with
content:

.. doctest ::

    >>> import u1db
    >>> db = u1db.open('mydb8.u1db', create=True)
    >>> db.create_index('by-tags', 'field.tags')
    >>> doc2 = db.create_doc({"field": {"tags": [ "tag1", "tag2", "tag3" ]}})
    >>> sorted(db.get_index_keys('by-tags'))
    [(u'tag1',), (u'tag2',), (u'tag3',)]

gives index entries

========= ====
Index key doc
========= ====
tag1      doc2
tag2      doc2
tag3      doc2
========= ====

**Subfields of objects in a list.** If an index expression points at subfields
of objects in a list, the document will have multiple entries in the index, one
for each object in the list that specifies the denoted subfield. For instance
the index expression ``managers.phone_number`` applied to a document
with content:

.. doctest ::

    >>> import u1db
    >>> db = u1db.open('mydb9.u1db', create=True)
    >>> db.create_index('by-phone-number', 'managers.phone_number')
    >>> doc3 = db.create_doc(
    ...    {"department": "department of redundancy department",
    ...    "managers": [
    ...        {"name": "Mary", "phone_number": "12345"},
    ...        {"name": "Katherine"},
    ...        {"name": "Rob", "phone_number": "54321"}]})
    >>> sorted(db.get_index_keys('by-phone-number'))
    [(u'12345',), (u'54321',)]


would give index entries:

========= ====
Index key doc
========= ====
12345     doc3
54321     doc3
========= ====
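
The three cases above (a nested field, a list of strings, and subfields of
objects in a list) can be modelled by a single function from document content
to a list of index keys. The sketch below is a conceptual illustration only:
``index_keys`` is a hypothetical helper, not part of U1DB, and an actual
implementation may treat edge cases differently.

.. code-block:: python

    def index_keys(content, expression):
        """Return the index keys a dotted expression yields for a document."""
        values = [content]
        for name in expression.split("."):
            next_values = []
            for value in values:
                # Objects inside a list are each searched for the subfield.
                candidates = value if isinstance(value, list) else [value]
                for candidate in candidates:
                    if isinstance(candidate, dict) and name in candidate:
                        next_values.append(candidate[name])
            values = next_values
        keys = []
        for value in values:
            # A list of strings contributes one key per entry.
            items = value if isinstance(value, list) else [value]
            keys.extend(item for item in items if isinstance(item, str))
        return keys


    nested = {"field": {"sub1": {"sub2": "hello", "sub3": "not selected"}}}
    tagged = {"field": {"tags": ["tag1", "tag2", "tag3"]}}
    staffed = {"managers": [
        {"name": "Mary", "phone_number": "12345"},
        {"name": "Katherine"},
        {"name": "Rob", "phone_number": "54321"}]}

    nested_keys = index_keys(nested, "field.sub1.sub2")
    tag_keys = index_keys(tagged, "field.tags")
    phone_keys = index_keys(staffed, "managers.phone_number")

A document that lacks the named field simply yields no keys, which matches
the manager without a phone number in the example above.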

**Transformation functions.** An index expression may be wrapped in any number
of transformation functions. A function transforms the result of the contained
index expression: for example, if an expression ``name.firstname`` generates
"John" when applied to a document, then ``lower(name.firstname)`` generates
"john".

Available transformation functions are:

* ``lower(index_expression)`` - lowercase the value
* ``split_words(index_expression)`` - split the value on whitespace; will act
  like a list and add multiple entries to the index
* ``number(index_expression, width)`` - takes an integer value, and turns it
  into a string, left padded with zeroes, to make it at least as wide as
  width; or nothing if the field type is not an integer.
* ``bool(index_expression)`` - takes a boolean value and turns it into '0' if
  false and '1' if true, or nothing if the field type is not boolean.
* ``combine(index_expression1, index_expression2, ...)`` - Combine the values
  of an arbitrary number of sub expressions into a single index.

So, the index expression ``split_words(lower(field.name))`` applied to
a document with content:

.. doctest ::

    >>> import u1db
    >>> db = u1db.open('mydb10.u1db', create=True)
    >>> db.create_index('by-split-lower', 'split_words(lower(field.name))')
    >>> doc4 = db.create_doc({"field": {"name": "Bruce David Grobbelaar"}})
    >>> sorted(db.get_index_keys('by-split-lower'))
    [(u'bruce',), (u'david',), (u'grobbelaar',)]

gives index entries

========== ====
Index key  doc
========== ====
bruce      doc4
david      doc4
grobbelaar doc4
========== ====
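
To make the behaviour of the transformation functions concrete, here is a
rough sketch of four of them as plain Python functions that take and return
lists of values. The names and the list-in, list-out convention are
assumptions for illustration (``bool_`` is renamed to avoid shadowing the
built-in); this is not how any U1DB implementation is required to work.

.. code-block:: python

    def lower(values):
        """Lowercase every value."""
        return [value.lower() for value in values]


    def split_words(values):
        """Split every value on whitespace, yielding one entry per word."""
        words = []
        for value in values:
            words.extend(value.split())
        return words


    def number(values, width):
        """Zero-pad integers to at least ``width``; drop non-integers."""
        return [str(value).zfill(width) for value in values
                # bool is a subclass of int in Python, so exclude it here.
                if isinstance(value, int) and not isinstance(value, bool)]


    def bool_(values):
        """Map booleans to '1' or '0'; drop non-booleans."""
        return ["1" if value else "0" for value in values
                if isinstance(value, bool)]


    name_keys = split_words(lower(["Bruce David Grobbelaar"]))
    padded = number([7, "seven"], 5)
    flags = bool_([True, False, "yes"])

Nesting the calls mirrors nesting in the index expression: the innermost
function is applied first.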


Querying an index
^^^^^^^^^^^^^^^^^

Pass an index key or a tuple of index keys (if the index is on multiple fields)
to ``get_from_index``; the last index key in each tuple (and *only* the last
one) can end with an asterisk, which matches initial substrings. So, querying
our ``by-firstname`` index from above:

.. doctest ::

    >>> johns = [d.doc_id for d in db1.get_from_index("by-firstname", "John")]
    >>> assert(jw.doc_id in johns)
    >>> assert(jb.doc_id in johns)
    >>> assert(jm.doc_id not in johns)

will return the documents ``jw`` and ``jb``.

``get_from_index("by-firstname", "J*")`` will match all index keys beginning
with "J", and so will return the documents ``jw``, ``jb`` and ``jm``.

.. doctest ::

    >>> js = [d.doc_id for d in db1.get_from_index("by-firstname", "J*")]
    >>> assert(jw.doc_id in js)
    >>> assert(jb.doc_id in js)
    >>> assert(jm.doc_id in js)
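
The trailing-asterisk rule amounts to a prefix match on the index key. The
following sketch models that over the conceptual ``by-firstname`` index shown
earlier; ``matches`` and ``query`` are hypothetical helpers, and the keys are
plain strings here rather than the tuples the reference implementation
returns.

.. code-block:: python

    def matches(index_key, pattern):
        """Exact match, or prefix match when the pattern ends with '*'."""
        if pattern.endswith("*"):
            return index_key.startswith(pattern[:-1])
        return index_key == pattern


    def query(index, pattern):
        """Return the documents whose index key matches the pattern."""
        return [doc for key, doc in index if matches(key, pattern)]


    # Conceptual index: (index key, document) pairs.
    by_firstname = [("Alan", "ah"), ("Jan", "jm"), ("John", "jb"), ("John", "jw")]

    johns = query(by_firstname, "John")
    js = query(by_firstname, "J*")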

Index functions
^^^^^^^^^^^^^^^

* :py:meth:`~u1db.Database.create_index`
* :py:meth:`~u1db.Database.delete_index`
* :py:meth:`~u1db.Database.get_from_index`
* :py:meth:`~u1db.Database.get_range_from_index`
* :py:meth:`~u1db.Database.get_index_keys`
* :py:meth:`~u1db.Database.list_indexes`

Synchronising
-------------

U1DB is a syncable database. Any U1DB can be synced with any U1DB server; most
U1DB implementations are capable of being run as a server. Synchronising brings
both the server and the client up to date with one another; save data into a
local U1DB whether online or offline, and then sync when online.

Pass an HTTP URL to sync with that server.

Synchronising databases which have been independently changed may produce
conflicts.  Read about the U1DB conflict policy and more about synchronising at
:ref:`conflicts`.

Running your own U1DB server is implementation-specific.
:ref:`reference-implementation` can be run as a server.

Dealing with conflicts
----------------------

Synchronising a database can result in conflicts; if your user changes the same
document in two different places and then syncs again, that document will be
*in conflict*, meaning that it has incompatible changes. If this is the case,
:py:attr:`~u1db.Document.has_conflicts` will be true, and calling ``put_doc``
on a conflicted document will give a ``ConflictedDoc`` error. To get a list of
the conflicted versions of the document, call
:py:meth:`~u1db.Database.get_doc_conflicts`. Deciding what the final
unconflicted document should look like is specific to the user's application;
once decided, call :py:meth:`~u1db.Database.resolve_doc` to resolve the
conflict and set the final resolved content.
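
Because the merge policy is left to the application, any example can only be
illustrative. The sketch below shows one possible policy, assuming the
conflicted versions are available as a list of content dictionaries with the
preferred version first: keep every key that appears in any version, and let
the preferred version win where keys clash. ``merge_contents`` is a
hypothetical helper and does not call U1DB.

.. code-block:: python

    def merge_contents(versions):
        """Merge content dicts; earlier entries win on clashing keys."""
        merged = {}
        # Apply the least preferred version first so later updates override it.
        for content in reversed(versions):
            merged.update(content)
        return merged


    preferred = {"title": "Shopping", "items": ["milk"]}
    other = {"title": "Groceries", "done": False}
    resolved_content = merge_contents([preferred, other])

The merged result would then become the content that is passed on when
resolving the document.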

Synchronising functions
^^^^^^^^^^^^^^^^^^^^^^^

* :py:meth:`~u1db.Database.sync`
* :py:meth:`~u1db.Database.get_doc_conflicts`
* :py:meth:`~u1db.Database.resolve_doc`

.. rubric:: footnotes

.. [#] Alternatively if a factory function was passed into
    :py:func:`u1db.open`, :py:meth:`~u1db.Database.get_doc` will return
    whatever type of object the factory function returns.

.. testcleanup ::

    os.chdir(old_dir)
    os.remove(os.path.join(tmp_dir, "mydb1.u1db"))
    os.remove(os.path.join(tmp_dir, "mydb2.u1db"))
    os.remove(os.path.join(tmp_dir, "mydb3.u1db"))
    os.remove(os.path.join(tmp_dir, "mydb4.u1db"))
    os.remove(os.path.join(tmp_dir, "mydb5.u1db"))
    os.remove(os.path.join(tmp_dir, "mydb6.u1db"))
    os.remove(os.path.join(tmp_dir, "mydb7.u1db"))
    os.remove(os.path.join(tmp_dir, "mydb8.u1db"))
    os.remove(os.path.join(tmp_dir, "mydb9.u1db"))
    os.remove(os.path.join(tmp_dir, "mydb10.u1db"))
    os.rmdir(tmp_dir)