Marbl-Python
Marbl-Python is an implementation of the Marbl specification for normalized representations of Markov blankets in Bayesian networks.
It provides objects and methods for normalizing, serializing and hashing Marbls (Markov blankets), and unordered collections of them.
Usage
Transition probability matrices (TPMs) are represented as p-dimensional nested lists of floats, where p is the number of the node’s parents. Nested lists make lexicographic sorting trivial, since Python compares lists lexicographically, at the cost of some overhead when converting between NumPy arrays and Python lists.
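As a minimal sketch of this representation, using only the standard library: a node with three binary parents gets a 2×2×2 nested list that is indexed by parent state, and Python's built-in list comparison orders such lists lexicographically. The variable names and the `lookup` helper are illustrative assumptions, not part of Marbl-Python.

```python
# A TPM for a node with three binary parents: tpm[a][b][c] is the
# transition probability when the parents are in state (a, b, c).
tpm = [[[0.3, 0.4],
        [0.1, 0.3]],
       [[0.4, 0.5],
        [0.3, 0.1]]]


def lookup(tpm, parent_state):
    """Hypothetical helper: index a nested-list TPM by a parent state."""
    entry = tpm
    for state in parent_state:
        entry = entry[state]
    return entry


# The same probabilities with the last two parents swapped.
swapped = [[[0.3, 0.1],
            [0.4, 0.3]],
           [[0.4, 0.3],
            [0.5, 0.1]]]

# Nested lists compare element by element, so sorting needs no key function.
ordered = sorted([tpm, swapped])
```

If the TPM starts out as a NumPy array, its `tolist()` method produces the nested-list form used here.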
Normalization
A Marbl represents a Markov blanket. By default, its TPMs are normalized upon initialization; pass normalize=False to override this.
A MarblSet is similarly used to represent an unordered collection of Marbls.
If you need to get the normal form of a single TPM, rather than an entire Markov blanket, use normalize_tpm().
Hashing
Just use the native Python hash function on Marbls and MarblSets.
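For intuition about why the normal form matters for hashing, the sketch below shows one way a class can support `hash()` on nested-list data. `HashableTpm` and `to_tuple` are hypothetical names chosen for illustration; this is not Marbl-Python's implementation.

```python
def to_tuple(nested):
    """Recursively convert nested lists into hashable nested tuples."""
    if isinstance(nested, list):
        return tuple(to_tuple(item) for item in nested)
    return nested


class HashableTpm:
    """Hypothetical wrapper that hashes a TPM by its exact contents."""

    def __init__(self, tpm):
        self._key = to_tuple(tpm)

    def __eq__(self, other):
        return isinstance(other, HashableTpm) and self._key == other._key

    def __hash__(self):
        return hash(self._key)


a = HashableTpm([[0.3, 0.1], [0.4, 0.3]])
b = HashableTpm([[0.3, 0.1], [0.4, 0.3]])
c = HashableTpm([[0.3, 0.4], [0.1, 0.3]])  # same TPM with the parents swapped
```

Here `a` and `b` are equal and hash identically, while `c` does not compare equal to them even though it describes the same TPM up to relabeling. A content-based hash only identifies equivalent objects once they have all been brought into the same canonical layout.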
Serialization
Both Marbls and MarblSets can be serialized with pack(). Each object also has its own pack() method.
To deserialize, use unpack() for Marbls, and unpack_set() for MarblSets.
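The byte strings in the pack() examples on this page resemble MessagePack: `\x92` reads as a two-element array header and `\xcb` as the marker for a big-endian 64-bit float. That is an inference from the output, not something this documentation states. Under that assumption, a standard-library sketch of encoding and decoding nested lists of floats and small integers could look like this. `encode` and `decode` are hypothetical helpers that cover only the three MessagePack types needed here.

```python
import struct


def encode(obj):
    """Encode floats, small non-negative ints and short lists."""
    if isinstance(obj, float):
        # float64: marker byte followed by a big-endian double
        return b"\xcb" + struct.pack(">d", obj)
    if isinstance(obj, int) and 0 <= obj <= 0x7F:
        # positive fixint: the value is its own byte
        return bytes([obj])
    if isinstance(obj, list) and len(obj) <= 15:
        # fixarray: the low four bits of the header hold the length
        header = bytes([0x90 | len(obj)])
        return header + b"".join(encode(item) for item in obj)
    raise TypeError(f"unsupported value: {obj!r}")


def decode(data, pos=0):
    """Return (value, next_position) for the item starting at pos."""
    tag = data[pos]
    if tag == 0xCB:
        return struct.unpack_from(">d", data, pos + 1)[0], pos + 9
    if tag <= 0x7F:
        return tag, pos + 1
    if 0x90 <= tag <= 0x9F:
        items, pos = [], pos + 1
        for _ in range(tag & 0x0F):
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos
    raise ValueError(f"unsupported tag: {tag:#x}")


# An [index, tpm] pair shaped like an augmented child TPM.
augmented = [1, [[0.3, 0.1], [0.4, 0.3]]]
packed = encode(augmented)
unpacked, _ = decode(packed)
```

Floats survive the round trip exactly because the 64-bit representation is stored unchanged.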
API
- class marbl.Marbl(node_tpm, augmented_child_tpms, normalize=True)
A Markov blanket, in normal form by default.
Provides methods for serialization and hashing.
- node_tpm list
The covered node’s p-dimensional transition probability matrix (where p is the number of the node’s parents), normalized by default.
- augmented_child_tpms list
The augmented child TPMs, normalized by default. A normalized augmented child TPM is a pair: the index of the dimension corresponding to the covered node in the child’s normalized TPM, followed by the normalized TPM itself.
Marbls are rendered into normal form upon initialization by default.
Parameters: - node_tpm (list) – The covered node’s un-normalized TPM.
- augmented_child_tpms (Iterable) – Each augmented child TPM contains the index of the dimension corresponding to the covered node in the child’s TPM, and the TPM itself.
Keyword Arguments: normalize (bool) – Flag to indicate whether TPMs should be normalized. Defaults to True.
Warning
Incorrect use of the normalize flag can cause hashes to differ when they shouldn’t. Make sure you really don’t want the normal form if you pass False.
Examples
>>> import numpy as np
>>> from marbl import Marbl
>>> tpm = np.array(
...     [[[0.3, 0.4],
...       [0.1, 0.3]],
...      [[0.4, 0.5],
...       [0.3, 0.1]]])
>>> tpm2 = [[[0.3, 0.1],
...          [0.4, 0.3]],
...         [[0.4, 0.3],
...          [0.5, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [0, tpm]]
>>> marbl1 = Marbl(tpm, augmented_child_tpms)
>>> marbl2 = Marbl(tpm2, augmented_child_tpms)
>>> marbl1 == marbl2
True
>>> augmented_child_tpms2 = [[0, tpm], [1, tpm]]
>>> marbl2 = Marbl(tpm, augmented_child_tpms2)
>>> marbl1 == marbl2
False
>>> unnormalized = Marbl(tpm, augmented_child_tpms,
...                      normalize=False)
>>> unnormalized == marbl1
False
- __hash__()
Return the canonical hash of the Marbl.
If two Marbls have the same hash, they are equivalent up to rearranging the labels of the covered node’s parents and the covered node’s children’s parents, as long as they were both normalized upon initialization.
Example
>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> hash(marbl)
482032824703719516
- pack()
Serialize the Marbl.
Example
>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbl.pack()
b'\x92\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\xcb?\xe0\x00\x00\x00\x00\x00\x00\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x91\x92\x00\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\xcb?\xe0\x00\x00\x00\x00\x00\x00\xcb?\xb9\x99\x99\x99\x99\x99\x9a'
- class marbl.MarblSet(marbls)
An immutable, unordered collection of not necessarily unique Markov blankets.
Provides methods for serialization and hashing.
Parameters: marbls (Iterable) – The Marbls to include in the set.
- __hash__()
Return the canonical hash of the multiset of Marbls.
Example
>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbls = MarblSet([marbl]*3)
>>> hash(marbls)
170586149808347817
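One way to picture a hash that ignores order but respects multiplicity is to count the members and hash the counts. The sketch below uses plain tuples as stand-ins for Marbls. `multiset_key` is a hypothetical helper and makes no claim about how MarblSet actually computes its hash.

```python
from collections import Counter


def multiset_key(items):
    """Order-independent, multiplicity-sensitive key for hashable items."""
    # Counter maps each distinct item to how often it occurs; freezing the
    # (item, count) pairs gives a hashable value that ignores input order.
    return frozenset(Counter(items).items())


a = ((0.3, 0.1), (0.4, 0.3))  # stand-in for one Marbl
b = ((0.5, 0.5), (0.2, 0.8))  # stand-in for another

key1 = multiset_key([a, b, a])
key2 = multiset_key([a, a, b])  # same members, different order
key3 = multiset_key([a, b])     # one fewer copy of a
```

`key1` and `key2` are equal and therefore hash identically, while `key3` differs because the multiplicity of `a` changed.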
- pack()
Serialize the multiset of Marbls.
Example
>>> tpm = [[0.3, 0.4],
...        [0.1, 0.3]]
>>> augmented_child_tpms = [[0, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbls = MarblSet([marbl]*2)
>>> marbls.pack()
b'\x92\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x91\x92\x01\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x92\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333\x91\x92\x01\x92\x92\xcb?\xd3333333\xcb?\xb9\x99\x99\x99\x99\x99\x9a\x92\xcb?\xd9\x99\x99\x99\x99\x99\x9a\xcb?\xd3333333'
- marbl.normalize_tpm(tpm, track_parent_index=None)
Return the normal form of a TPM. Optionally, also return the new dimension index of a particular parent in the normalized TPM.
The TPM should be p-dimensional, where p is the number of parents. For example, with three parents, TPM[0][1][0] should give the transition probability if the state of the parents is (0,1,0).
Parameters: tpm (list) – The child TPM to be normalized.
Keyword Arguments: track_parent_index (int) – The zero-based index of the dimension corresponding to the covered node in the un-normalized child TPM. If this is not None, a normalized augmented child TPM is returned instead of just a normalized TPM.
Examples
>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> equivalent = [[[0.3, 0.1],
...                [0.4, 0.3]],
...               [[0.4, 0.3],
...                [0.5, 0.1]]]
>>> normalize_tpm(tpm) == normalize_tpm(equivalent)
True
>>> answer = [2, [[[0.3, 0.1],
...                [0.4, 0.3]],
...               [[0.4, 0.3],
...                [0.5, 0.1]]]]
>>> normalize_tpm(tpm, track_parent_index=1) == answer
True
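The example above is consistent with choosing, among all reorderings of the parent dimensions, the lexicographically smallest nested list. Whether the Marbl specification defines the normal form exactly this way is an assumption. The brute-force sketch below, with hypothetical helpers `shape_of`, `transpose` and `brute_force_normal_form`, only illustrates the idea and shows how a tracked dimension index follows its parent through the reordering.

```python
from itertools import permutations


def shape_of(tpm):
    """Lengths of each nesting level of a rectangular nested list."""
    shape = []
    while isinstance(tpm, list):
        shape.append(len(tpm))
        tpm = tpm[0]
    return shape


def transpose(tpm, axes):
    """Reorder dimensions so that new dimension i is old dimension axes[i]."""
    shape = shape_of(tpm)

    def build(new_index):
        depth = len(new_index)
        if depth == len(axes):
            # Map the finished new index back to the old index and look it up.
            old_index = [0] * len(axes)
            for new_dim, old_dim in enumerate(axes):
                old_index[old_dim] = new_index[new_dim]
            entry = tpm
            for i in old_index:
                entry = entry[i]
            return entry
        size = shape[axes[depth]]
        return [build(new_index + (i,)) for i in range(size)]

    return build(())


def brute_force_normal_form(tpm, track_parent_index=None):
    """Smallest transposition; tries all p! orderings of p parents."""
    p = len(shape_of(tpm))
    candidates = []
    for axes in permutations(range(p)):
        if track_parent_index is None:
            tracked = -1
        else:
            tracked = axes.index(track_parent_index)  # new position of the parent
        candidates.append((transpose(tpm, axes), tracked))
    # Ties between identical TPMs are broken by the smaller tracked index
    # (an arbitrary choice made for this sketch).
    best, tracked = min(candidates)
    return best if track_parent_index is None else [tracked, best]


tpm = [[[0.3, 0.4], [0.1, 0.3]],
       [[0.4, 0.5], [0.3, 0.1]]]
result = brute_force_normal_form(tpm, track_parent_index=1)
```

Trying every permutation is only practical for a handful of parents; it is meant to make the definition concrete, not to be efficient.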
- marbl.pack(obj)
Alias for Marbl.pack() and MarblSet.pack().
- marbl.unpack(packed_marbl)
Deserialize a Marbl.
Example
>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbl == unpack(pack(marbl))
True
- marbl.unpack_set(packed_marbls)
Deserialize a multiset of Marbls.
Example
>>> tpm = [[[0.3, 0.4],
...         [0.1, 0.3]],
...        [[0.4, 0.5],
...         [0.3, 0.1]]]
>>> augmented_child_tpms = [[0, tpm], [1, tpm]]
>>> marbl = Marbl(tpm, augmented_child_tpms)
>>> marbls = MarblSet([marbl]*3)
>>> marbls == unpack_set(pack(marbls))
True