Querying Views¶
View
Object¶
-
class
couchbase.views.iterator.
View
[source]¶ -
__init__
(parent, design, view, row_processor=None, include_docs=False, query=None, streaming=True, **params)[source]¶ Construct an iterable which can be used to iterate over view query results.
Parameters:
- parent (Bucket) – The parent Bucket object
- design (string) – The design document
- view (string) – The name of the view within the design document
- row_processor (callable) – See row_processor for more details.
- include_docs (boolean) – If set, the document itself will be retrieved for each row in the result. The default algorithm uses get_multi() for each page (i.e. every streaming results). The reduce family of attributes must not be active, as results from reduce views do not have corresponding doc IDs (as these are aggregation functions).
- query – If set, should be a Query or SpatialQuery object. It is illegal to use this in conjunction with additional params
- params – Extra view options. This may be used to pass view arguments (as defined in Query) without explicitly constructing a Query object. It is illegal to use this together with the query argument. If you wish to ‘inline’ additional arguments to the provided query object, use the query’s update() method instead.
This object is an iterator - it does not send out the request until the first item from the iterator is requested. See __iter__() for more details on what this object returns.

Simple view query, with no extra options:

from couchbase.views.iterator import View
# c is the Bucket object.
for result in View(c, "beer", "brewery_beers"):
    print("emitted key: {0}, doc_id: {1}".format(result.key, result.docid))

Execute a view with extra query options:

# Implicitly creates a Query object
view = View(c, "beer", "by_location", limit=4, reduce=True, group_level=2)

Execute a spatial view:

from couchbase.views.params import SpatialQuery
# ....
q = SpatialQuery()
q.start_range = [-119.9556, 38.7056]
q.end_range = [-118.8122, 39.7086]
view = View(c, 'geodesign', 'spatialview', query=q)
for row in view:
    print('Location is {0}'.format(row.geometry))

Pass a Query object:

from couchbase.views.params import Query
q = Query(
    stale=False,
    inclusive_end=True,
    mapkey_range=[
        ["21st_ammendment_brewery_cafe"],
        ["21st_ammendment_brewery_cafe", Query.STRING_RANGE_END]
    ]
)
view = View(c, "beer", "brewery_beer", query=q)

Add extra parameters to query object for single call:

view = View(c, "beer", "brewery_beer", query=q.update(debug=True, copy=True))

Include documents with query:

view = View(c, "beer", "brewery_beer", query=q, include_docs=True)
for result in view:
    print("Emitted key: {0}, Document: {1}".format(result.key, result.doc.value))
-
__iter__
()[source]¶ Returns a row for each query. The type of the row depends on the row_processor being used.
Raise: ViewEngineError
If an error was encountered while processing the view, and the on_error attribute was not set to continue. If continue was specified, a warning message is printed to the screen (via warnings.warn) and operation continues. To inspect the error, examine errors
Raise: AlreadyQueriedError
If this object was already iterated over and the last result was already returned.
-
Attributes¶
couchbase.views.iterator.
errors
¶Errors returned from the view engine itself
couchbase.views.iterator.
indexed_rows
¶Number of total rows indexed by the view. This is the number of results before any filters or limitations applied. This is only valid once the iteration has started
couchbase.views.iterator.
row_processor
¶An object to handle a single page of the paginated results. This object should be an instance of a class conforming to the RowProcessor interface. By default, it is an instance of RowProcessor itself.
couchbase.views.iterator.
raw
¶The actual
couchbase.bucket.HttpResult
object. Note that this is only the last result returned. If using paginated views, the view comprises several such objects, and is cleared each time a new page is fetched.
couchbase.views.iterator.
design
¶Name of the design document being used
couchbase.views.iterator.
view
¶Name of the view being queried
couchbase.views.iterator.
include_docs
¶Whether documents are fetched along with each row
Row Processing¶
-
class
couchbase.views.iterator.
RowProcessor
[source]¶ -
handle_rows
(rows, *_)[source]¶ Preprocesses a page of rows.
Parameters:
- rows (list) – A list of rows. Each row is a JSON object containing the decoded JSON of the view as returned from the server
- connection – The connection object (passed to the View constructor)
- include_docs – Whether to include documents in the return value. This is True or False depending on what was passed to the View constructor
Returns: an iterable. When the iterable is exhausted, this method will be called again with a new ‘page’.
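As a rough sketch of the interface described above, a custom row processor only needs a handle_rows method that accepts a page of rows and returns an iterable. The class below is hypothetical and deliberately avoids importing the couchbase package so that it runs on its own. The "id", "key" and "value" fields are assumed to be present in each decoded row.

```python
class KeyValueRowProcessor(object):
    """Hypothetical processor that turns each raw row into a (key, value) tuple."""

    def handle_rows(self, rows, connection, include_docs):
        # `rows` is one page of decoded JSON rows (plain dicts).
        # `connection` and `include_docs` are accepted but unused in this sketch.
        for row in rows:
            yield row.get("key"), row.get("value")


# Stand-in for one page of results, shaped like decoded view rows (assumed).
page = [
    {"id": "air_races_rno", "key": ["USA", "NV", "Reno"], "value": "Air Races"},
    {"id": "rodeo_rno", "key": ["USA", "NV", "Reno"], "value": "Reno Rodeo"},
]

processed = list(KeyValueRowProcessor().handle_rows(page, None, False))
```

An instance of such a class would then be passed as the row_processor argument of the View constructor.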
-
-
class
couchbase.views.iterator.
ViewRow
¶ This is the default class returned by the
RowProcessor
-
key
¶ The key emitted by the view’s map function (first argument to emit)
-
value
¶ The value emitted by the view’s map function (second argument to emit). If the view was queried with reduce enabled, then this contains the reduced value after being processed by the reduce function.
-
docid
¶ This is the document ID for the row. This is always None if reduce was specified. Otherwise it may be passed to one of the get or set methods to retrieve or otherwise access the underlying document. Note that if include_docs was specified, the doc already contains the document
-
doc
¶ If include_docs was specified, contains the actual couchbase.bucket.Result object for the document.
-
Query
Object¶
-
class
couchbase.views.params.
Query
[source]¶ -
__init__
(passthrough=False, unrecognized_ok=False, **params)¶ Create a new Query object.
A Query object is used as a container for the various view options. It can be used as a standalone object to encode queries but is typically passed as the query value to View.
Parameters:
- passthrough (boolean) – Whether passthrough mode is enabled
- unrecognized_ok (boolean) – Whether unrecognized options are acceptable. See Circumventing Parameter Constraints.
- params – Key-value pairs for view options. See View Options for a list of acceptable options and their values.
Raise: couchbase.exceptions.ArgumentError
if a view option or a combination of view options were deemed invalid.
-
update
(copy=False, **params)¶ Chained assignment operator.
This may be used to quickly assign extra parameters to the Query object.

Example:

q = Query(reduce=True, full_set=True)
# Someplace later (c is the Bucket object)
v = View(c, design, view, query=q.update(mapkey_range=["foo"]))

Its primary use is to easily modify the query object (in-place).
Parameters: - copy (boolean) – If set to true, the original object is copied before new attributes are added to it
- params – Extra arguments. These must be valid query options.
Returns: A Query object. If copy was set to true, this will be a new instance, otherwise it is the same instance on which this method was called
-
encoded
¶ Returns an encoded form of the query
-
View Options¶
This document explains the various view options, and how they are treated by the Couchbase library.
Many of the view options correspond to those listed here http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-views-querying-rest-api.html
Note that these explain the view options and their values as they are passed along to the server.
These attributes are available as properties (with get and set) and can also be used as keys within a constructor.
Result Range and Sorting Properties¶
The following properties allow you to
- Define a range to limit your results (i.e. between foo and bar)
- Define a specific subset of keys for which results should be yielded
- Reverse the sort order
-
class
couchbase.views.params.
Query
[source] -
mapkey_range
¶ Specify the range based on the contents of the keys emitted by the view’s map function.
Server Option: Maps to both startkey and endkey
Value Type: Range Value of JSON Value elements
The result output depends on the type of keys and ranges used.
One may specify a “full” range (that is, an exact match of the first and/or last key to use), or a partial range where the start and end ranges specify a subset of the key to be used as the start and end. In such a case, the results begin with the first key which matches the partial start key, and end with the first key that matches the partial end key.
Additionally, keys may be compound keys, i.e. complex data types such as lists.
You may use the STRING_RANGE_END to specify a wildcard for an end range.

Match all keys that start with “a” through keys starting with “f”:

q.mapkey_range = ["a", "f" + q.STRING_RANGE_END]
q.inclusive_end = True

If you have a view function that looks something like this:

function(doc, meta) {
    if (doc.city && doc.event) {
        emit([doc.country, doc.state, doc.city], doc.event)
    }
}

Then you may query for all events in a specific state by using:

q.mapkey_range = [
    ["USA", "NV", ""],
    ["USA", "NV", q.STRING_RANGE_END]
]

While the first two elements are an exact match (i.e. only keys which have ["USA","NV", ...] in them), the third element should accept anything, and thus has its start value as the empty string (i.e. lowest range) and the magic q.STRING_RANGE_END as its highest value.

As such, the results may look like:

ViewRow(key=[u'USA', u'NV', u'Reno'], value=u'Air Races', docid=u'air_races_rno', doc=None)
ViewRow(key=[u'USA', u'NV', u'Reno'], value=u'Reno Rodeo', docid=u'rodeo_rno', doc=None)
ViewRow(key=[u'USA', u'NV', u'Reno'], value=u'Street Vibrations', docid=u'street_vibrations_rno', doc=None)
# etc.
-
STRING_RANGE_END
= u'\u0fff'¶
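To make the range semantics above more concrete, here is a small local simulation that does not contact a server. The in_range helper is hypothetical, and it relies on Python's own ordering, which only approximates the server's collation for plain ASCII strings and lists of such strings.

```python
# Same value as the STRING_RANGE_END constant documented above.
STRING_RANGE_END = u"\u0fff"


def in_range(key, key_range):
    """Return True if `key` falls inside the inclusive [start, end] range.

    Uses Python's own ordering, which only approximates the server's
    collation for simple ASCII strings and lists of such strings.
    """
    start, end = key_range
    return start <= key <= end


# Simple string keys: everything from "a" through keys starting with "f".
keys = ["21st_amendment", "abbey", "bravo", "fig", "golf"]
simple_range = ["a", "f" + STRING_RANGE_END]
matched = [k for k in keys if in_range(k, simple_range)]

# Compound keys: exact match on the first two elements, anything for the third.
state_range = [["USA", "NV", ""], ["USA", "NV", STRING_RANGE_END]]
reno = in_range(["USA", "NV", "Reno"], state_range)
napa = in_range(["USA", "CA", "Napa"], state_range)
```

Here "fig" is matched because any ordinary character sorts below STRING_RANGE_END, which is why the constant works as an end-of-prefix wildcard.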
-
dockey_range
¶ Server Option: Maps to both startkey_docid and endkey_docid
Value Type: Range Value of String elements.
Specify the range based on the contents of the keys as they are stored by upsert(). These are returned as the “Document IDs” in each view result.
You must use this attribute in conjunction with the mapkey_range option. Additionally, this option only has any effect if you are emitting duplicate keys for different document IDs. An example of this follows:

Documents:

c.upsert("id_1", {"type": "dummy"})
c.upsert("id_2", {"type": "dummy"})
# ...
c.upsert("id_9", {"type": "dummy"})

View:

// This will emit "dummy" for ids 1..9
function map(doc, meta) {
    emit(doc.type);
}

Only get information about "dummy" docs for IDs 3 through 6:

q = Query()
q.mapkey_range = ["dummy", "dummy" + Query.STRING_RANGE_END]
q.dockey_range = ["id_3", "id_6"]
q.inclusive_end = True

Warning
Apparently, only the first element of this parameter has any effect. Currently the example above will start returning rows from id_3 (as expected), but does not stop after reaching id_6.
-
key
¶
-
mapkey_single
¶ Server Option: key
Value Type: JSON Value
Limit the view results to those keys which match the value of this option exactly.

View:

function(doc, meta) {
    if (doc.type == "brewery") {
        emit([meta.id]);
    } else {
        emit([doc.brewery_id, meta.id]);
    }
}

Example:

# The view emits single-element arrays for breweries, so the key is a list
q.mapkey_single = ["abbaye_de_maredsous"]

Note that as the map function can return more than one result with the same key, you may still get more than one result back.
-
keys
¶
-
mapkey_multi
¶ Server Option: keys
Value Type: JSON Array
Like mapkey_single, but specify a sequence of keys. Only rows whose emitted keys match any of the keys specified here will be returned.

Example:

q.mapkey_multi = [
    ["abbaye_de_maredsous"],
    ["abbaye_de_maredsous", "abbaye_de_maredsous-8"],
    ["abbaye_de_maredsous", "abbaye_de_maredsous-10"]
]
-
inclusive_end
¶ Server Option: inclusive_end
Value Type: Boolean Type
Declare that the end key of the range parameters (e.g. mapkey_range and dockey_range) should also be returned for rows that match it. By default, the resultset is terminated once the first key matching the end range is found.
-
descending
¶ Server Option: descending
Value Type: Boolean Type
-
Reduce Function Parameters¶
These options are valid only for views which have a reduce
function,
and for which the reduce
value is enabled
-
class
couchbase.views.params.
Query
[source] -
reduce
¶ Server Option: reduce
Value Type: Boolean Type Note that if the view specified in the query (to e.g.
couchbase.bucket.Bucket.query()
) does not have a reduce function specified, an exception will be thrown once the query begins.
-
group
¶ Server Option: group
Value Type: Boolean Type
Specify this option to have the results contain a breakdown of the reduce function based on keys produced by map. By default, only a single row is returned indicating the aggregate value from all the reduce invocations.
Specifying this option will show a breakdown of the aggregate reduce value based on keys. Each unique key in the result set will have its own value.
Setting this property will also set reduce to True
-
group_level
¶ Server Option: group_level
Value Type: Numeric Type
This is analogous to group, except that it places a constraint on how many elements of the compound key produced by map should be displayed in the summary. For example if this parameter is set to 1 then the results are returned for each unique first element in the mapped keys.
Setting this property will also set reduce to True
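The effect of group_level can be illustrated with a small local simulation. This is only a sketch of the grouping idea, assuming a counting reduce function. The reduce_count helper and the sample rows are hypothetical, and the real grouping is performed by the server.

```python
from collections import OrderedDict


def reduce_count(rows, group_level):
    """Count rows per key prefix, mimicking a counting reduce with group_level."""
    totals = OrderedDict()
    for key, _value in rows:
        # Only the first `group_level` elements of the compound key are kept.
        prefix = tuple(key[:group_level])
        totals[prefix] = totals.get(prefix, 0) + 1
    return [(list(prefix), count) for prefix, count in totals.items()]


# Stand-in for (key, value) pairs emitted by a map function (assumed data).
rows = [
    (["USA", "CA", "Napa"], "Wine Fair"),
    (["USA", "NV", "Reno"], "Air Races"),
    (["USA", "NV", "Reno"], "Reno Rodeo"),
    (["USA", "NV", "Vegas"], "Expo"),
]

by_country = reduce_count(rows, 1)
by_state = reduce_count(rows, 2)
```

With a level of 1 all four rows collapse into a single ["USA"] entry, while a level of 2 produces one entry per state.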
-
Pagination and Sampling¶
These options limit or paginate through the results
-
class
couchbase.views.params.
Query
[source] -
skip
¶ Server Option: skip
Value Type: Numeric Type Warning
Consider using
mapkey_range
instead. Using this property with high values is typically inefficient.
-
limit
¶ Server Option: limit
Value Type: Numeric Type
Set an absolute limit on how many rows should be returned in this query. The number of rows returned will always be less than or equal to this number.
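One common way to follow the advice about avoiding large skip values is to page by key: request one row more than the page size and use that extra row's key as the start of the next request. The sketch below simulates this against an in-memory sorted list. The fetch function is a hypothetical stand-in for a query that uses a start key together with limit, and the keys are assumed to be unique.

```python
import bisect

# Stand-in for the sorted keys of a view index (assumed data, unique keys).
SORTED_KEYS = ["a", "b", "c", "d", "e", "f", "g"]


def fetch(start_key, limit):
    """Hypothetical stand-in for a view query with a start key and a limit."""
    begin = 0 if start_key is None else bisect.bisect_left(SORTED_KEYS, start_key)
    return SORTED_KEYS[begin:begin + limit]


def paginate(page_size):
    """Yield pages, using one extra row per request as the next start key."""
    start_key = None
    while True:
        rows = fetch(start_key, page_size + 1)
        yield rows[:page_size]
        if len(rows) <= page_size:
            return  # no extra row came back, so this was the last page
        start_key = rows[page_size]


pages = list(paginate(3))
```

If several documents can emit the same key, the start key alone is ambiguous and the document ID of the extra row would be needed as well.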
-
Control Options¶
These do not particularly affect the actual query behavior, but may control some other behavior which may indirectly impact performance or indexing operations.
-
class
couchbase.views.params.
Query
[source] -
stale
¶ Server Option: stale
Specify the (re)-indexing behavior for the view itself. Views return results based on indexes - which are not updated for each query by default. Updating the index for each query would cause significant performance issues. However it is sometimes desirable to ensure consistency of data (as sometimes there may be a delay between recently-updated keys and the view index).
This option allows you to specify indexing behavior. It accepts a string which can have one of the following values:
ok
Stale indexes are allowable. This is the default. The constant STALE_OK may be used instead.
false
Stale indexes are not allowable. Re-generate the index before returning the results. Note that if there are many results, this may take a considerable amount of time (on the order of several seconds, typically). The constant STALE_UPDATE_BEFORE may be used instead.
update_after
Return stale indexes for this result (so that the query does not take a long time), but re-generate the index immediately after returning. The constant STALE_UPDATE_AFTER may be used instead.
A Boolean Type may be used as well, in which case True is converted to "ok", and False is converted to "false"
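The conversion rules above can be summarized in a few lines. The normalize_stale helper below is hypothetical and is not the library's own code. The constants are redefined locally, with the values listed under Convenience Constants, so that the sketch is self-contained.

```python
# Local copies of the documented constant values.
STALE_OK = "ok"
STALE_UPDATE_BEFORE = "false"
STALE_UPDATE_AFTER = "update_after"


def normalize_stale(value):
    """Hypothetical helper mirroring the conversion rules described above."""
    if value is True:
        return STALE_OK
    if value is False:
        return STALE_UPDATE_BEFORE
    if value in (STALE_OK, STALE_UPDATE_BEFORE, STALE_UPDATE_AFTER):
        return value
    raise ValueError("unsupported stale value: {0!r}".format(value))
```

Note that passing False therefore requests the strictest behavior: the index is brought up to date before results are returned.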
-
on_error
¶ Server Option: on_error
Value Type: A string of either "stop" or "continue". You may use the symbolic constants ONERROR_STOP or ONERROR_CONTINUE
-
connection_timeout
¶ This parameter is a server-side option indicating how long a given node should wait for another node to respond. This does not directly set the client-side timeout.
Server Option: connection_timeout
Value Type: Numeric Type
-
debug
¶ Server Option: debug
Value Type: Boolean Type If enabled, various debug output will be dumped in the resultset.
-
full_set
¶ Server Option: full_set
Value Type: Boolean Type If enabled, development views will operate over the entire data within the bucket (and not just a limited subset).
-
Value Type For Options¶
Different options accept different types, which shall be enumerated here
Boolean Type¶
Options which accept booleans may accept the following Python types:
- Standard python bool types, like True and False
- Numeric values which evaluate to booleans
- Strings containing either "true" or "false"
Other options passed as booleans will raise an error, as it is assumed that perhaps it was passed accidentally due to a bug in the application.
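A minimal sketch of these rules, assuming Python 3 (where str covers text strings). The to_bool function is hypothetical and only illustrates the accepted and rejected inputs listed above. The library's own validation may differ in detail.

```python
def to_bool(value):
    """Hypothetical validator following the accepted types listed above."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # Numeric values are evaluated for truthiness.
        return bool(value)
    if isinstance(value, str) and value in ("true", "false"):
        return value == "true"
    # Anything else (None, lists, arbitrary objects) is treated as a likely bug.
    raise TypeError("not a boolean value: {0!r}".format(value))
```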
Numeric Type¶
Options which accept numeric values accept the following Python types:
- int, long and float objects
- Strings which contain values convertible to said native numeric types
It is an error to pass a bool as a number, despite the fact that in Python, bool is actually a subclass of int.
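Because bool is a subclass of int, a validator has to test for bool before testing for int. The to_number function below is a hypothetical sketch of that ordering, written for Python 3, where the separate long type no longer exists.

```python
def to_number(value):
    """Hypothetical validator: accept numbers and numeric strings, reject bool."""
    # This check must come first, since isinstance(True, int) is True.
    if isinstance(value, bool):
        raise TypeError("bool is not accepted as a number")
    if isinstance(value, (int, float)):
        return value
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            # Raises ValueError again if the string is not numeric at all.
            return float(value)
    raise TypeError("not a numeric value: {0!r}".format(value))
```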
JSON Value¶
Options which accept JSON values accept native Python types (and any user-defined classes) which can successfully be passed through json.dumps.
Do not pass an already-encoded JSON string, and do not URI-escape the string either - as this will be done by the option handling layer (but see Circumventing Parameter Constraints for a way to circumvent this)
Note that it is perfectly acceptable to pass JSON primitives (such as numbers, strings, and booleans).
JSON Array¶
Options which accept JSON array values should be passed a Python type which can be converted to a JSON array. This typically means any ordered Python sequence (such as list and tuple). Like JSON Value, the contents of the list should not be URI-escaped, as this will be done at the option handling layer
String¶
Options which accept strings accept so-called “semantic strings”; specifically, the following Python types are acceptable:
- str and unicode objects
- int and long objects
Note that bool, None and other objects are not accepted - this is to ensure that random objects passed don’t simply end up being repr()'d and causing confusion in your view results.
If you have a custom object which has a __str__ method and would like to use it as a string, you must explicitly do so prior to passing it as an option.
Range Value¶
Range specifiers take a sequence (list or tuple) of one or two elements.
If the sequence contains two items, the first is taken to be the start of the range, and the second is taken to be its (non-inclusive) end.
If the sequence contains only a single item, it is taken to be the start of the range, and no end will be specified.
To specify a range which has an end but not a start, pass a two-element
sequence with the first element being an UNSPEC
value.
The type of each element is parameter-specific.
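The rules above can be sketched as a function that splits a range into a start option and an end option. Everything here is hypothetical: range_to_options is not part of the library, UNSPEC is a local stand-in for the library's placeholder, and startkey and endkey are used as option names only because mapkey_range is documented as mapping to them.

```python
UNSPEC = object()  # local stand-in for the library's UNSPEC placeholder


def range_to_options(value, start_name="startkey", end_name="endkey"):
    """Hypothetical helper: split a one- or two-element range into named options."""
    if not isinstance(value, (list, tuple)) or not 1 <= len(value) <= 2:
        raise ValueError("a range must be a sequence of one or two elements")
    options = {}
    if value[0] is not UNSPEC:
        options[start_name] = value[0]
    if len(value) == 2 and value[1] is not UNSPEC:
        options[end_name] = value[1]
    return options


both = range_to_options(["a", "f"])
start_only = range_to_options(["a"])
end_only = range_to_options([UNSPEC, "f"])
```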
Unspecified Value¶
Conventionally, it is common for APIs to treat the value None
as being
a default parameter of some sort. Unfortunately since view queries deal with
JSON, and None
maps to a JSON null
, it is not possible for the view
processing functions to ignore None
.
As an alternative, a special constant is provided as
UNSPEC
. You may use this as a placeholder value for any
option. When the view processing code encounters this value, it will
discard the option-value pair.
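The difference between None and the placeholder can be shown with a short sketch. The encode_options function and the local UNSPEC object are hypothetical stand-ins that only model the behavior described above: None is encoded as a JSON null, while the placeholder causes the option to be dropped.

```python
import json

UNSPEC = object()  # local stand-in for the library's UNSPEC placeholder


def encode_options(**options):
    """Hypothetical encoder: drop UNSPEC values and JSON-encode the rest."""
    return dict(
        (name, json.dumps(value))
        for name, value in options.items()
        if value is not UNSPEC
    )


encoded = encode_options(key=None, limit=UNSPEC)
```

Here the key option survives with the value null, because None is a legitimate JSON value, while limit is omitted entirely.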
Convenience Constants¶
These are convenience value constants for some of the options
-
params.
ONERROR_CONTINUE
= 'continue'¶
-
params.
ONERROR_STOP
= 'stop'¶
-
params.
STALE_OK
= 'ok'¶
-
params.
STALE_UPDATE_BEFORE
= 'false'¶
-
params.
STALE_UPDATE_AFTER
= 'update_after'¶
-
params.
UNSPEC
= <Placeholder>¶
Circumventing Parameter Constraints¶
Sometimes it may be necessary to circumvent existing constraints placed by the client library regarding view option validation.
For this, there are passthrough
and allow_unrecognized
options
which may be set in order to allow the client to be more lax in its conversion
process.
These options are present under various names in the various view query functions.
Passthrough
Passthrough removes any conversion functions applied. It simply assumes values for all options are strings, and then encodes them rather simply.
Allowing Unrecognized Options
If a newer version of the server adds a new option, older versions of this library will not know about it, and will raise an error when it is used. In this scenario, one can use the ‘allow unrecognized’ mode to add extra options, with their values being treated as simple strings.
This has the benefit of providing normal behavior for known options.
Geospatial Views¶
Geospatial views are views which can index and filter items based on one or more independent axes or coordinates. This allows greater flexibility at query time to filter based on more than a single attribute.
Filtering at query time is done through ranges. These ranges contain the start and end values for each key passed to the emit() in the map() function. Unlike Map-Reduce views and compound keys for startkey and endkey, each item in a spatial range is independent from any other, and is not sorted or evaluated in any particular order.
See GeoCouch (https://github.com/couchbase/geocouch/wiki/Spatial-Views-API) for more information.
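The independence of the dimensions can be illustrated with a small local check. This is only a sketch: in_spatial_range is a hypothetical helper, bounds are assumed to be inclusive, and None is taken to mean that a dimension is unbounded. The server itself works with indexed bounding boxes, not with a loop like this.

```python
def in_spatial_range(point, start_range, end_range):
    """Check each dimension independently; None leaves that bound open."""
    for value, low, high in zip(point, start_range, end_range):
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
    return True


# Longitude/latitude bounds around the Reno area (assumed sample data).
start_range = [-119.9556, 38.7056]
end_range = [-118.8122, 39.7086]

reno = in_spatial_range([-119.8138, 39.5296], start_range, end_range)
las_vegas = in_spatial_range([-115.1398, 36.1699], start_range, end_range)

# A third, unbounded dimension never excludes a point.
unbounded = in_spatial_range([10, 20, 5], [0, -90, None], [180, 90, None])
```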
Creating Geospatial Views¶
Creating a geospatial view may be done in a manner similar to creating
a normal view; except that the design document defines the spatial
view in the spatial
field, rather than in the views
field.
ddoc = {
    'spatial': {
        'geoview':
            '''
            if (doc.loc) {
                emit({
                    type: "Point",
                    geometry: doc.loc
                }, doc.name);
            }
            '''
    }
}
cb.bucket_manager().design_create('geo', ddoc)
The above snippet will create a geospatial design doc (geo
) with a single
view (called geoview
).
Querying Geospatial Views¶
To query a geospatial view, you must pass an instance of SpatialQuery
as the query
keyword argument to either the View
constructor, or
the Bucket.query()
method.
from couchbase.views.params import SpatialQuery
q = SpatialQuery(start_range=[0, -90, None], end_range=[180, 90, None])
# 'geo' and 'geoview' are the design document and view created above
for row in bkt.query('geo', 'geoview', query=q):
    print("Key:", row.key)
    print("Value:", row.value)
    print("Geometry", row.geometry)
-
class
couchbase.views.params.
SpatialQuery
[source]¶ -
__init__
(passthrough=False, unrecognized_ok=False, **params)¶ Create a new Query object.
A Query object is used as a container for the various view options. It can be used as a standalone object to encode queries but is typically passed as the query value to View.
Parameters:
- passthrough (boolean) – Whether passthrough mode is enabled
- unrecognized_ok (boolean) – Whether unrecognized options are acceptable. See Circumventing Parameter Constraints.
- params – Key-value pairs for view options. See View Options for a list of acceptable options and their values.
Raise: couchbase.exceptions.ArgumentError
if a view option or a combination of view options were deemed invalid.
-
start_range
¶ The starting range to query. If querying geometries, this should be the lower bounds of the longitudes and latitudes to filter. Use None to indicate that a given dimension should not be bounded.
q.start_range=[0, -90]
-
end_range
¶ The upper limit for the range. This contains the upper bounds for the ranges specified in start_range.
q.end_range = [180, 90]
-
skip
¶ See
Query.skip
-
limit
¶ See
Query.limit
-
stale
¶ See
Query.stale
-