Marbl-Python¶

Marbl-Python is an implementation of the Marbl specification for normalized representations of Markov blankets in Bayesian networks.

It provides objects and methods for normalizing, serializing and hashing Marbls (Markov blankets), and unordered collections of them.

Usage¶

Transition probability matrices (TPMs) are represented as p-dimensional nested lists of floats, where p is the number of the node’s parents. Python compares lists lexicographically, so sorting TPMs in this form is trivial. The cost is some overhead when converting between NumPy arrays and Python lists.
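
As a small illustration of the point about lexicographic ordering, the sketch below compares two nested-list TPMs using only Python's built-in list comparison. The variable names are made up for the example. NumPy is left out to keep the snippet dependency-free; converting from an array would typically go through ndarray.tolist().

```python
# Two 2-parent TPMs holding the same probabilities with the parent axes swapped.
tpm_a = [[0.3, 0.4],
         [0.1, 0.3]]
tpm_b = [[0.3, 0.1],
         [0.4, 0.3]]

# Lists compare element by element, recursing into sublists.
# The first difference is 0.4 vs 0.1 at position [0][1], so tpm_b orders first.
smallest = min(tpm_a, tpm_b)
ordered = sorted([tpm_a, tpm_b])
```

Here smallest is tpm_b and ordered is [tpm_b, tpm_a], with no custom comparison function needed.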

Normalization¶

A Marbl represents a Markov blanket. By default, its TPMs are normalized upon initialization. To override this, explicitly set the normalize flag to False.

A MarblSet is similarly used to represent an unordered collection of Marbls.

If you need to get the normal form of a single TPM, rather than an entire Markov blanket, use normalize_tpm().

Hashing¶

Just use the native Python hash function on Marbls and MarblSets.
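
Plain nested lists are not hashable in Python, so some hashable canonical form has to sit behind hash(). How Marbl-Python does this internally is not described on this page; the sketch below only illustrates the general idea, and the helper names to_hashable and multiset_key are hypothetical, not part of the library.

```python
def to_hashable(nested):
    """Recursively turn nested lists into nested tuples, which are hashable."""
    if isinstance(nested, list):
        return tuple(to_hashable(item) for item in nested)
    return nested


def multiset_key(items):
    """Order-independent key for a collection: sort the hashable forms
    and keep duplicates, so the result behaves like a multiset."""
    return tuple(sorted(to_hashable(item) for item in items))


tpm_a = [[0.3, 0.1], [0.4, 0.3]]
tpm_b = [[0.2, 0.5], [0.6, 0.9]]

# Equal contents give equal hashes.
same_hash = hash(to_hashable(tpm_a)) == hash(to_hashable([[0.3, 0.1], [0.4, 0.3]]))

# The order of members does not matter, but their multiplicity does.
order_free = multiset_key([tpm_a, tpm_b, tpm_a]) == multiset_key([tpm_a, tpm_a, tpm_b])
duplicates_count = multiset_key([tpm_a, tpm_b]) != multiset_key([tpm_a, tpm_a, tpm_b])
```

All three flags evaluate to True. Sorting before hashing is one simple way to make a collection's hash independent of insertion order; the library may use a different scheme.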

Serialization¶

Both Marbls and MarblSets can be serialized with pack(). Each object also has its own pack() method.

To deserialize, use unpack() for Marbls, and unpack_set() for MarblSets.
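
The byte strings shown in the pack() examples on this page look consistent with MessagePack encoding (0x92 for a two-element array, 0xcb followed by eight bytes for a float64). That is an inference from the example output, not something this page states. Under that assumption, the following standard-library sketch encodes and decodes just the subset of types that appears in those examples. The functions encode and decode are illustrative stand-ins, not the library's pack() and unpack(), and the [node_tpm, augmented_child_tpms] layout is likewise read off the example bytes.

```python
import struct


def encode(obj):
    """Encode short lists, floats and small non-negative ints, MessagePack-style."""
    if isinstance(obj, list):
        if len(obj) > 15:
            raise ValueError("sketch only supports arrays of up to 15 items")
        return bytes([0x90 | len(obj)]) + b"".join(encode(x) for x in obj)
    if isinstance(obj, float):
        return b"\xcb" + struct.pack(">d", obj)  # big-endian float64
    if isinstance(obj, int) and not isinstance(obj, bool) and 0 <= obj <= 127:
        return bytes([obj])  # positive fixint
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def decode(data, pos=0):
    """Decode one value starting at pos; return (value, next_position)."""
    tag = data[pos]
    if tag <= 0x7F:
        return tag, pos + 1
    if 0x90 <= tag <= 0x9F:
        items = []
        pos += 1
        for _ in range(tag & 0x0F):
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos
    if tag == 0xCB:
        (value,) = struct.unpack_from(">d", data, pos + 1)
        return value, pos + 9
    raise ValueError(f"unsupported tag: {tag:#x}")


# An already-normalized 2-parent TPM and a Marbl-like structure built from it.
normalized_tpm = [[0.3, 0.1], [0.4, 0.3]]
marbl_like = [normalized_tpm, [[1, normalized_tpm]]]

packed = encode([marbl_like, marbl_like])
restored, _ = decode(packed)
```

The bytes produced for normalized_tpm match the corresponding fragment of the MarblSet.pack() example output, and restored compares equal to the original nested structure.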

API¶

class marbl.Marbl(node_tpm, augmented_child_tpms, normalize=True)¶

A Markov blanket, in normal form by default.

Provides methods for serialization and hashing.

node_tpm list¶: The covered node’s p-dimensional transition probability matrix (where p is the number of the node’s parents), normalized by default.

augmented_child_tpms list¶: The augmented child TPMs, normalized by default. A normalized augmented child TPM contains the index of the dimension corresponding to the covered node in the child’s normalized TPM, and the normalized TPM itself.

Marbls are rendered into normal form upon initialization by default.

Parameters:	node_tpm (list) – The covered node’s un-normalized TPM.
	augmented_child_tpms (Iterable) – Each augmented child TPM contains the index of the dimension corresponding to the covered node in the child’s TPM, and the TPM itself.
	normalize (bool) – Flag to indicate whether TPMs should be normalized. Defaults to `True`.

Warning

Incorrect use of the normalize flag can cause hashes to differ when they shouldn’t. Make sure you really don’t want the normal form if you pass False.

Examples

>>> import numpy as np
>>> tpm = np.array(
...       [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]])
>>> tpm2 = [[[0.3, 0.1],
...          [0.4, 0.3]],
...         [[0.4, 0.3],
...          [0.5, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [0, tpm]]
>>> marbl1 = Marbl(tpm, augmented_child_tpms)
>>> marbl2 = Marbl(tpm2, augmented_child_tpms)
>>> marbl1 == marbl2
True
>>> augmented_child_tpms2 = [[0, tpm], [1, tpm]]
>>> marbl2 = Marbl(tpm, augmented_child_tpms2)
>>> marbl1 == marbl2
False
>>> unnormalized = Marbl(tpm, augmented_child_tpms,
...                      normalize=False)
>>> unnormalized == marbl1
False

__hash__()¶

Return the canonical hash of the Marbl.

Provided both were normalized upon initialization, two Marbls with the same hash are equivalent up to a relabeling of the covered node’s parents and of the parents of the covered node’s children.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> hash(marbl)
482032824703719516

pack()¶

Serialize the Marbl.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbl.pack()
b'\x92\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\xcb?\xe0\x00\x00\x00\x00\x00\x00\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x91\x92\x00\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\xcb?\xe0\x00\x00\x00\x00\x00\x00\xcb?\xb9\x99\x99\x99\x99\x99\x9a'

class marbl.MarblSet(marbls)¶

An immutable, unordered collection of not necessarily unique Markov blankets.

Provides methods for serialization and hashing.

Parameters:	marbls (Iterable) – The Marbls to include in the set.

__hash__()¶

Return the canonical hash of the multiset of Marbls.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbls = MarblSet([marbl]*3)
>>> hash(marbls)
170586149808347817

pack()¶

Serialize the multiset of Marbls.

Example

>>> tpm = [[0.3, 0.4],
...        [0.1, 0.3]]
>>> augmented_child_tpms = [[0, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbls = MarblSet([marbl]*2)
>>> marbls.pack()
b'\x92\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x91\x92\x01\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x91\x92\x01\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333'

marbl.normalize_tpm(tpm, track_parent_index=None)¶

Return the normal form of a TPM. Optionally, also return the new dimension index of a particular parent in the normalized TPM.

The TPM should be p-dimensional, where p is the number of parents. For example, with three parents, TPM[0][1][0] should give the transition probability if the state of the parents is (0,1,0).

Parameters:	tpm (list) – The child TPM to be normalized.
	track_parent_index (int) – The zero-based index of the dimension corresponding to the covered node in the un-normalized child TPM. If this is not `None`, a normalized augmented child TPM will be returned instead of just a normalized TPM.

Examples

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> equivalent = [[[0.3, 0.1],
...                [0.4, 0.3]],
...               [[0.4, 0.3],
...                [0.5, 0.1]]]
>>> normalize_tpm(tpm) == normalize_tpm(equivalent)
True
>>> answer = [2, [[[0.3, 0.1],
...                [0.4, 0.3]],
...               [[0.4, 0.3],
...                [0.5, 0.1]]]]
>>> normalize_tpm(tpm, track_parent_index=1) == answer
True
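
To make the idea of a normal form concrete, here is a brute-force sketch that reproduces the results of the examples above. It assumes the normal form is the lexicographically smallest nested list over all reorderings of the parent dimensions; that reading fits the examples on this page, but the Marbl specification may define it differently. All function names are hypothetical, and the approach is exponential in the number of parents, so it is meant for illustration only.

```python
from itertools import permutations


def shape_of(tpm):
    """Shape of a rectangular nested list."""
    shape = []
    while isinstance(tpm, list):
        shape.append(len(tpm))
        tpm = tpm[0]
    return tuple(shape)


def build(shape, fn, prefix=()):
    """Build a nested list of the given shape, calling fn(index) at each leaf."""
    if not shape:
        return fn(prefix)
    return [build(shape[1:], fn, prefix + (i,)) for i in range(shape[0])]


def permute_axes(tpm, order):
    """Reorder dimensions so that new axis k is old axis order[k]."""
    shape = shape_of(tpm)

    def lookup(new_index):
        old_index = [0] * len(order)
        for new_axis, old_axis in enumerate(order):
            old_index[old_axis] = new_index[new_axis]
        value = tpm
        for i in old_index:
            value = value[i]
        return value

    return build(tuple(shape[a] for a in order), lookup)


def brute_force_normal_form(tpm, track_parent_index=None):
    """Smallest axis permutation, optionally with a tracked parent's new index."""
    orders = permutations(range(len(shape_of(tpm))))
    # Ties are broken by iteration order here; the real library may differ.
    best = min(orders, key=lambda order: permute_axes(tpm, order))
    normal = permute_axes(tpm, best)
    if track_parent_index is None:
        return normal
    return [best.index(track_parent_index), normal]


tpm = [[[0.3, 0.4], [0.1, 0.3]], [[0.4, 0.5], [0.3, 0.1]]]
equivalent = [[[0.3, 0.1], [0.4, 0.3]], [[0.4, 0.3], [0.5, 0.1]]]
tracked = brute_force_normal_form(tpm, track_parent_index=1)
```

For these inputs, tpm and equivalent reduce to the same nested list, and tracked equals [2, equivalent], in line with the doctest results above.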

marbl.pack(obj)¶: Alias for Marbl.pack() and MarblSet.pack().

marbl.unpack(packed_marbl)¶

Deserialize a Marbl.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbl == unpack(pack(marbl))
True

marbl.unpack_set(packed_marbls)¶

Deserialize a multiset of Marbls.

Example

>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbls = MarblSet([marbl]*3)
>>> marbls == unpack_set(pack(marbls))
True