Interface for a simple data store where the root and subtables are nested Python dict objects.
AUTHORS:
EXAMPLES:
from neuronpy.util import dictdb
import datetime
the_dict = {
    'sub1': {
        'key_a': {
            'key_a_a': 1,
            'time_stamp':
                datetime.datetime(2010, 7, 23, 18, 43, 36, 640692),
            'key_a_b': 'ABCDEFG'},
        'key_b': [1, 2, 3],
        'key_c': {
            'key_c_a': 2,
            'key_c_b': 'HIJKLMNO'}},
    'sub2': {
        'key_a': {
            'key_a_a': 2,
            'time_stamp':
                datetime.datetime(2010, 8, 23, 18, 43, 36, 640692),
            'key_a_b': 'XYZPDQ'},
        'key_b': [4, 5, 6],
        'key_c': {
            'key_c_a': 1,
            'key_c_b': 'ABCDEFG'}},
    'different_sub1': {
        'different_key_a': {
            'different_key_a_a': 3,
            'different_key_a_b': [1, 2, 3, 4, 5]},
        'different_key_b': None,
        'different_key_c': {
            'different_key_c_a': 2.0,
            'different_key_c_b': ['a', 'b', 'c']}}
}
Nodes 'sub1', 'sub2', and 'different_sub1' are trunk nodes, which may be thought of as records in the database. Notice that in this dict, sub-dictionaries can be of arbitrary depth and can contain any data type that can be put into a dict. In this example, records 'sub1' and 'sub2' share the same structure, while 'different_sub1' contains different keys and data types.
There are two related functions for retrieving records: filter_dict() and match(). Both approaches use user-defined lambda functions. The filter_dict() function is itself a generator function that operates on (key, value, parent_keys) tuples. This function therefore allows any of these values to be ignored and one can search for keys or values irrespective of their key associations.
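The (key, value, parent_keys) traversal described above can be sketched roughly as follows. This is an illustrative stand-in, not the dictdb implementation: the helper name iter_nested and the choice to record every key from the root down to the current key in parent_keys are assumptions made for this example.

```python
def iter_nested(d, parents=()):
    """Yield (key, value, parent_keys) for every item in a nested dict.

    Hypothetical helper, not part of dictdb. In this sketch parent_keys
    is the list of keys from the root down to and including the current key.
    """
    for key, value in d.items():
        path = list(parents) + [key]
        yield key, value, path
        if isinstance(value, dict):
            # Recurse into sub-dicts, carrying the path so far.
            for item in iter_nested(value, path):
                yield item


sample = {
    'sub1': {'key_a': {'key_a_b': 'ABCDEFG'}, 'key_b': [1, 2, 3]},
    'sub2': {'key_c': {'key_c_b': 'ABCDEFG'}},
}

# Search on values only; the key and the parent keys are ignored.
hits = [p for k, v, p in iter_nested(sample) if v == 'ABCDEFG']
```

Because the predicate sees all three elements of the tuple, the same traversal could just as well select on the key or on the path alone.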
The match() method permits multiple queries where a given key meets some condition. The key and the condition as a lambda function are provided in a tuple: ('key', lambda v: <some_operation_with_v>) where “some_operation_with_v” evaluates to True or False.
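As a rough illustration of the query-tuple idea, the sketch below tests each record against a single ('key', condition) tuple. The helper names find_values and record_matches are hypothetical and the traversal is an assumption; dictdb.match() may differ in detail.

```python
def find_values(d, key):
    """Yield every value stored under `key` at any depth of nested dict `d`."""
    for k, v in d.items():
        if k == key:
            yield v
        if isinstance(v, dict):
            for found in find_values(v, key):
                yield found


def record_matches(record, query):
    """Return True if any value under the queried key satisfies the condition."""
    key, condition = query
    return any(condition(v) for v in find_values(record, key))


records = {
    'sub1': {'key_a': {'key_a_a': 1}},
    'sub2': {'key_a': {'key_a_a': 2}},
}
query = ('key_a_a', lambda v: v == 2)

# Keep only the root records whose nested key meets the condition.
matched = dict((name, rec) for name, rec in records.items()
               if record_matches(rec, query))
```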
Filter a dict by some function on (key, value, parent_keys) tuples.
EXAMPLES:
Retrieve a subdict that contains any value with 'ABCDEFG'. In this case, k and p in the lambda function are ignored.
filtered = dict(filter_dict(the_dict, lambda k, v, p: v=='ABCDEFG'))
Retrieve a subdict where the 'time_stamp' is at least 30 days old.
def date_filter(k, v, p):
    return k == 'time_stamp' and \
        (datetime.datetime.now() - v).days >= 30
filtered = dict(filter_dict(the_dict, date_filter))
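Because the filter above calls datetime.datetime.now(), its result depends on the day it is run. One possible way to make such a predicate reproducible is to bind the reference time explicitly. The factory name make_age_filter below is hypothetical, and the predicate is exercised directly here rather than through filter_dict().

```python
import datetime


def make_age_filter(now, min_days=30):
    """Build a (k, v, p) predicate for time stamps at least `min_days` old."""
    def age_filter(k, v, p):
        # The key test comes first, so non-datetime values are never subtracted.
        return k == 'time_stamp' and (now - v).days >= min_days
    return age_filter


# A fixed reference time chosen only for this illustration.
reference = datetime.datetime(2010, 9, 1)
is_old = make_age_filter(reference)

july = datetime.datetime(2010, 7, 23, 18, 43, 36, 640692)
august = datetime.datetime(2010, 8, 23, 18, 43, 36, 640692)

old_july = is_old('time_stamp', july, ['key_a', 'time_stamp'])
old_august = is_old('time_stamp', august, ['key_a', 'time_stamp'])
```

With the reference time fixed at 1 September 2010, the July time stamp is older than 30 days and the August one is not.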
If a sub-key is not unique, we can also query the parent keys. In this case, p is a list of keys starting from the first sub-dict and ending with the current key, so k == p[-1]. This example searches for the string 'ABCDEFG', but requires that k == 'key_c_b' and that its parent key is 'key_c', that is, p[-2] == 'key_c'.
def f(k, v, p):
    return v == 'ABCDEFG' and k == 'key_c_b' and \
        len(p) >= 2 and p[-2] == 'key_c'
filtered = dict(filter_dict(the_dict, f))
Retrieve the root items of the_dict whose nested keys satisfy the conditions of one or more queries.
Returns: A sub-dict of the_dict whose nested keys match the conditions of the queries. The elements in the returned dict are copies of those in the_dict, which remains unaltered.
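The practical effect of returning copies can be illustrated with the standard copy module. Whether dictdb makes deep or shallow copies is not stated here, so the deep copy below is an assumption made for this sketch.

```python
import copy

source = {'sub2': {'key_a': {'key_a_a': 2}}}

# Assumption: copies are deep, so nested values are independent too.
result = dict((k, copy.deepcopy(v)) for k, v in source.items())

# Changing the result leaves the source untouched.
result['sub2']['key_a']['key_a_a'] = 99
```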
EXAMPLES:
To search for records where 'key_a_a' == 2:
subdict = dictdb.match(the_dict, ('key_a_a', lambda v: v == 2))
for key in subdict.keys():
    print(key)
This should print sub2.
To make compound queries, pass a list of query-tuples.
subdict = dictdb.match(the_dict, [
        ('key_a_a', lambda v: v == 2, ['key_a']),
        ('different_key_a_b', lambda v: sum(v) > 3, ['different_key_a'])
    ],
    mode='OR')
for key in subdict.keys():
    print(key)
This should print sub2 and different_sub1.
In this case, the query is a list of 3-element tuples. The last element of each tuple lists the parent keys of the key being searched for. Passing mode='OR' returns the records that satisfy either query.
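How a parent list and an AND/OR mode might combine can be sketched as below. All names here (iter_paths, query_hits, match_sketch) are hypothetical, and both the rule that the listed parents must be the innermost ancestors and the 'AND' default are assumptions for this example, not documented dictdb behavior.

```python
def iter_paths(d, parents=()):
    """Yield (key, value, ancestor_keys) for every item in nested dict `d`."""
    for k, v in d.items():
        yield k, v, list(parents)
        if isinstance(v, dict):
            for item in iter_paths(v, tuple(parents) + (k,)):
                yield item


def query_hits(record, query):
    """Evaluate one (key, condition) or (key, condition, parents) query."""
    key, condition = query[0], query[1]
    parents = list(query[2]) if len(query) > 2 else []
    n = len(parents)
    for k, v, ancestors in iter_paths(record):
        # Assumption: the listed parents must be the innermost ancestors.
        if k == key and (n == 0 or ancestors[-n:] == parents) and condition(v):
            return True
    return False


def match_sketch(records, queries, mode='AND'):
    """Keep records satisfying all (AND) or any (OR) of the queries."""
    combine = all if mode == 'AND' else any
    return dict((name, rec) for name, rec in records.items()
                if combine(query_hits(rec, q) for q in queries))


records = {
    'sub1': {'key_a': {'key_a_a': 1}},
    'sub2': {'key_a': {'key_a_a': 2}},
    'different_sub1': {'different_key_a': {'different_key_a_b': [1, 2, 3, 4, 5]}},
}
queries = [
    ('key_a_a', lambda v: v == 2, ['key_a']),
    ('different_key_a_b', lambda v: sum(v) > 3, ['different_key_a']),
]
either = match_sketch(records, queries, mode='OR')
both = match_sketch(records, queries, mode='AND')
```

In this sketch no single record satisfies both queries, so the AND result is empty while the OR result holds two records.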
For complex or custom data types and expressions, you can define your own function and pass that in as the second parameter in the query. The following example tests for a value in a list.
def my_func(x, y):
    return isinstance(x, list) and y in x
subdict = dictdb.match(the_dict,
    ('different_key_c_b', lambda v: my_func(v, 'b')))
for key in subdict.keys():
    print(key)
Here is another example that retrieves records with a timestamp older than 30 days.
subdict = dictdb.match(the_dict,
    ('time_stamp', lambda v: (datetime.datetime.now() - v).days >= 30))
for key in subdict.keys():
    print(key)