Marbl-Python

Marbl-Python is an implementation of the Marbl specification for normalized representations of Markov blankets in Bayesian networks.

It provides objects and methods for normalizing, serializing, and hashing Marbls (Markov blankets) and unordered collections of them.

Usage

Transition probability matrices (TPMs) are represented as p-dimensional nested lists of floats, where p is the number of the node’s parents. This makes lexicographic sorting trivial, since Python compares lists lexicographically, at the cost of some overhead when converting between NumPy arrays and Python lists.
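As a minimal illustration of the point above, plain nested lists can be compared and sorted directly, with no custom comparison logic. The two TPMs below are made-up values; converting from a NumPy array would typically go through ndarray.tolist(), which is left out here to keep the sketch dependency-free.

```python
# Two hypothetical 2-parent TPMs as nested lists of floats.
tpm_a = [[0.3, 0.4], [0.1, 0.3]]
tpm_b = [[0.3, 0.1], [0.4, 0.3]]

# Lists compare element by element, recursing into sublists:
# the first rows differ at 0.4 vs 0.1, so tpm_b orders first.
ordered = sorted([tpm_a, tpm_b])
smallest = min(tpm_a, tpm_b)
```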

Normalization

A Marbl represents a Markov blanket. By default, its TPMs are normalized upon initialization. This can be overridden by explicitly setting the normalize flag to False.

A MarblSet is similarly used to represent an unordered collection of Marbls.

If you need to get the normal form of a single TPM, rather than an entire Markov blanket, use normalize_tpm().
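To make the idea of a normal form concrete, here is a brute-force sketch. It assumes that the normal form is the lexicographically smallest nested list obtainable by reordering the parent dimensions; that is an assumption made for illustration, and the Marbl specification remains the authority on the actual rule. The names tpm_shape, permute_axes and normalize_tpm_sketch are hypothetical and not part of the library.

```python
from itertools import permutations

def tpm_shape(tpm):
    """Return the size of each dimension of a rectangular nested list."""
    shape = []
    while isinstance(tpm, list):
        shape.append(len(tpm))
        tpm = tpm[0]
    return shape

def permute_axes(tpm, order):
    """Return a copy of tpm whose dimension k is dimension order[k] of tpm."""
    shape = tpm_shape(tpm)

    def build(new_index):
        if len(new_index) == len(order):
            # Map the index in the new layout back to the original layout.
            old_index = [0] * len(order)
            for k, axis in enumerate(order):
                old_index[axis] = new_index[k]
            value = tpm
            for i in old_index:
                value = value[i]
            return value
        size = shape[order[len(new_index)]]
        return [build(new_index + [i]) for i in range(size)]

    return build([])

def normalize_tpm_sketch(tpm, track_parent_index=None):
    """Pick the lexicographically smallest reordering of the parent dimensions.

    Assumption for illustration only; not the library's implementation.
    """
    n_dims = len(tpm_shape(tpm))
    best = min(permutations(range(n_dims)),
               key=lambda order: permute_axes(tpm, order))
    normal = permute_axes(tpm, best)
    if track_parent_index is None:
        return normal
    # Report where the tracked dimension ended up in the normal form.
    return [best.index(track_parent_index), normal]
```

On the TPM used in the normalize_tpm() examples in the API section, this sketch yields the same normal form and tracked index as documented there. Agreement on one example does not show that it matches the specification in general. The search is factorial in the number of parents, and ties between orderings are broken by iteration order.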

Hashing

Use Python’s built-in hash() function on Marbls and MarblSets.
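Python lists are not hashable, so any hash over a nested-list normal form has to go through an immutable representation first. The sketch below shows one way that could work, using hypothetical helpers freeze and hash_normal_form; how the library computes its hashes internally is not stated here.

```python
def freeze(nested):
    """Recursively convert nested lists into nested tuples."""
    if isinstance(nested, list):
        return tuple(freeze(item) for item in nested)
    return nested

def hash_normal_form(normal_form):
    """Hash a nested-list normal form via its immutable tuple equivalent."""
    return hash(freeze(normal_form))

a = [[0.3, 0.1], [0.4, 0.3]]
b = [[0.3, 0.1], [0.4, 0.3]]  # equal content, distinct object
c = [[0.3, 0.4], [0.1, 0.3]]  # same entries in a different layout

same_hash = hash_normal_form(a) == hash_normal_form(b)
```

Equal normal forms always produce equal hashes under this scheme. Layouts that differ, such as a and c, freeze to different tuples, which is why comparing hashes is only meaningful once both inputs are in normal form.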

Serialization

Both Marbls and MarblSets can be serialized with pack(). Each object also has its own pack() method.

To deserialize, use unpack() for Marbls, and unpack_set() for MarblSets.
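The byte strings shown in the pack() examples in the API section appear consistent with MessagePack encoding (0x9N for a short array of N items, 0xcb for a 64-bit float, a single byte for a small non-negative integer). That is an inference from the bytes, not something this document states. Under that assumption, the following dependency-free sketch encodes and decodes nested lists of floats and small integers; encode_sketch and decode_sketch are hypothetical names, not library functions.

```python
import struct

def encode_sketch(obj):
    """Encode floats, ints in 0..127, and lists shorter than 16 items."""
    if isinstance(obj, float):
        return b"\xcb" + struct.pack(">d", obj)  # big-endian float64
    if isinstance(obj, int) and 0 <= obj < 128:
        return bytes([obj])                      # positive fixint
    if isinstance(obj, list) and len(obj) < 16:
        header = bytes([0x90 | len(obj)])        # fixarray header
        return header + b"".join(encode_sketch(item) for item in obj)
    raise TypeError("unsupported value: %r" % (obj,))

def _decode_at(data, pos):
    """Decode one value starting at pos; return (value, next position)."""
    tag = data[pos]
    if tag == 0xCB:
        return struct.unpack(">d", data[pos + 1:pos + 9])[0], pos + 9
    if tag < 0x80:
        return tag, pos + 1
    if 0x90 <= tag <= 0x9F:
        items = []
        pos += 1
        for _ in range(tag & 0x0F):
            item, pos = _decode_at(data, pos)
            items.append(item)
        return items, pos
    raise ValueError("unsupported tag: 0x%02x" % tag)

def decode_sketch(data):
    """Decode a byte string produced by encode_sketch."""
    value, end = _decode_at(data, 0)
    if end != len(data):
        raise ValueError("trailing bytes")
    return value

tpm = [[0.3, 0.1], [0.4, 0.3]]
packed = encode_sketch(tpm)
restored = decode_sketch(packed)
```

Reading the documented output with these tags in mind, a leading run such as \x92\x92\x92 marks three levels of two-element lists, and each \xcb is followed by the eight bytes of one probability.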

API

class marbl.Marbl(node_tpm, augmented_child_tpms, normalize=True)

A Markov blanket, in normal form by default.

Provides methods for serialization and hashing.

node_tpm list

The covered node’s p-dimensional transition probability matrix (where p is the number of the node’s parents), normalized by default.

augmented_child_tpms list

The augmented child TPMs, normalized by default. A normalized augmented child TPM contains the index of the dimension corresponding to the covered node in the child’s normalized TPM, and the normalized TPM itself.

Marbls are rendered into normal form upon initialization by default.

Parameters:
  • node_tpm (list) – The covered node’s un-normalized TPM.
  • augmented_child_tpms (Iterable) – Each augmented child TPM contains the index of the dimension corresponding to the covered node in the child’s TPM, and the TPM itself.
Keyword Arguments:
  • normalize (bool) – Flag to indicate whether TPMs should be normalized. Defaults to True.
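The augmented child TPM format described above is a two-element pair: a dimension index followed by the TPM. As a small sketch of how such a pair might be assembled, assume dimension k of a child’s TPM corresponds to the k-th entry in an ordered list of that child’s parents. The helper augment_child_tpm and the node labels are hypothetical.

```python
def augment_child_tpm(child_tpm, child_parents, covered_node):
    """Pair a child's TPM with the dimension index of the covered node."""
    return [child_parents.index(covered_node), child_tpm]

# Hypothetical child with parents "B" and "A"; "A" is the covered node.
child_tpm = [[0.3, 0.4],
             [0.1, 0.3]]
augmented = augment_child_tpm(child_tpm, ["B", "A"], "A")
```

Here "A" is the second parent, so the pair carries index 1 alongside the unchanged TPM.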

Warning

Incorrect use of the normalize flag can cause hashes to differ when they shouldn’t. Make sure you really don’t want the normal form if you pass False.

Examples

>>> import numpy as np
>>> tpm = np.array(
...       [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]])
>>> tpm2 = [[[0.3, 0.1],
...          [0.4, 0.3]],
...         [[0.4, 0.3],
...          [0.5, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [0, tpm]]
>>> marbl1 = Marbl(tpm, augmented_child_tpms)
>>> marbl2 = Marbl(tpm2, augmented_child_tpms)
>>> marbl1 == marbl2
True
>>> augmented_child_tpms2 = [[0, tpm], [1, tpm]]
>>> marbl2 = Marbl(tpm, augmented_child_tpms2)
>>> marbl1 == marbl2
False
>>> unnormalized = Marbl(tpm, augmented_child_tpms,
...                      normalize=False)
>>> unnormalized == marbl1
False

__hash__()

Return the canonical hash of the Marbl.

Two Marbls that are equivalent up to rearranging the labels of the covered node’s parents and the covered node’s children’s parents have the same hash, as long as both were normalized upon initialization. Conversely, equal hashes indicate such equivalence, barring hash collisions.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> hash(marbl)
482032824703719516

pack()

Serialize the Marbl.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbl.pack()
b'\x92\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\xcb?\xe0\x00\x00\x00\x00\x00\x00\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x91\x92\x00\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\xcb?\xe0\x00\x00\x00\x00\x00\x00\xcb?\xb9\x99\x99\x99\x99\x99\x9a'

class marbl.MarblSet(marbls)

An immutable multiset of Markov blankets: an unordered collection that may contain duplicates.

Provides methods for serialization and hashing.

Parameters:
  • marbls (Iterable) – The Marbls to include in the set.
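To illustrate what unordered with multiplicity means for equality and hashing, here is a small standard-library sketch. It counts hashable items and summarizes them in an order-independent way; multiset_key is a hypothetical helper and does not reflect how MarblSet is implemented.

```python
from collections import Counter

def multiset_key(items):
    """Return an order-independent, hashable summary of items with counts."""
    return frozenset(Counter(items).items())

# Nested tuples stand in for hashable normal forms.
x = ((0.3, 0.1), (0.4, 0.3))
y = ((0.2, 0.5), (0.6, 0.9))

same_members = multiset_key([x, y, x]) == multiset_key([x, x, y])
different_counts = multiset_key([x, y, x]) != multiset_key([x, y])
```

Order does not affect the key, but the number of copies of each member does, which is the distinction between a multiset and an ordinary set.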

__hash__()

Return the canonical hash of the multiset of Marbls.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbls = MarblSet([marbl]*3)
>>> hash(marbls)
170586149808347817

pack()

Serialize the multiset of Marbls.

Example

>>> tpm = [[0.3, 0.4],
...        [0.1, 0.3]]
>>> augmented_child_tpms = [[0, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbls = MarblSet([marbl]*2)
>>> marbls.pack()
b'\x92\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x91\x92\x01\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x91\x92\x01\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333'

marbl.normalize_tpm(tpm, track_parent_index=None)

Return the normal form of a TPM. Optionally, also return the new dimension index of a particular parent in the normalized TPM.

The TPM should be p-dimensional, where p is the number of parents. For example, with three parents, TPM[0][1][0] should give the transition probability if the state of the parents is (0,1,0).
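The indexing convention above can be sketched as a lookup that walks one level of the nested list per parent. The helper transition_probability is hypothetical, and the TPM values are the ones used in the examples in this document.

```python
def transition_probability(tpm, parent_state):
    """Index a nested-list TPM with one state per parent dimension."""
    value = tpm
    for state in parent_state:
        value = value[state]
    return value

tpm = [[[0.3, 0.4],
        [0.1, 0.3]],
       [[0.4, 0.5],
        [0.3, 0.1]]]

# Equivalent to tpm[0][1][0].
p = transition_probability(tpm, (0, 1, 0))
```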

Parameters:
  • tpm (list) – The child TPM to be normalized.
Keyword Arguments:
  • track_parent_index (int) – The zero-based index of the dimension corresponding to the covered node in the un-normalized child TPM. If this is not None, a normalized augmented child TPM is returned instead of just a normalized TPM.

Examples

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> equivalent = [[[0.3, 0.1],
...                [0.4, 0.3]],
...               [[0.4, 0.3],
...                [0.5, 0.1]]]
>>> normalize_tpm(tpm) == normalize_tpm(equivalent)
True
>>> answer = [2, [[[0.3, 0.1],
...                [0.4, 0.3]],
...               [[0.4, 0.3],
...                [0.5, 0.1]]]]
>>> normalize_tpm(tpm, track_parent_index=1) == answer
True

marbl.pack(obj)

Alias for Marbl.pack() and MarblSet.pack().

marbl.unpack(packed_marbl)

Deserialize a Marbl.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbl == unpack(pack(marbl))
True

marbl.unpack_set(packed_marbls)

Deserialize a multiset of Marbls.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbls = MarblSet([marbl]*3)
>>> marbls == unpack_set(pack(marbls))
True