This module provides tools for collecting and managing sets of samples generated by the library’s sampling functions. By averaging a series of samples, the program can approximate a joint probability distribution without performing the exact calculations, which may be useful in large networks.
This class is a machine for aggregating data from sample sequences. It contains the method aggregate.
The input sequence of samples (stored in the attribute seq).
The average of all the entries in seq, represented as a dict in which each vertex has an entry whose value is a dict of {key: value} pairs; each key is a possible outcome of that vertex and its value is the approximate frequency of that outcome.
Generate a sequence of samples using samplerstatement and return the average of its results.
This function stores the output of samplerstatement in the attribute seq, and then averages seq and stores the approximate distribution found in the attribute avg. It then returns avg.
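The averaging step described above can be sketched in a few lines. This is only an illustration of the idea, not the library's actual implementation: the function name average_samples and the vertex names "Rain" and "Sprinkler" are hypothetical, and the sketch assumes each sample is a dict mapping a vertex name to its sampled outcome.

```python
from collections import Counter, defaultdict


def average_samples(seq):
    """Return {vertex: {outcome: relative frequency}} for a list of sample dicts.

    Hypothetical helper for illustration; assumes every sample maps
    vertex names to outcomes.
    """
    # Count how often each outcome was observed for each vertex.
    counts = defaultdict(Counter)
    for sample in seq:
        for vertex, outcome in sample.items():
            counts[vertex][outcome] += 1

    # Divide each count by the number of samples to get a frequency.
    total = len(seq)
    return {
        vertex: {outcome: n / float(total) for outcome, n in outcomes.items()}
        for vertex, outcomes in counts.items()
    }


# Made-up samples over two made-up vertices.
samples = [
    {"Rain": "yes", "Sprinkler": "off"},
    {"Rain": "no", "Sprinkler": "on"},
    {"Rain": "no", "Sprinkler": "off"},
    {"Rain": "no", "Sprinkler": "off"},
]
avg = average_samples(samples)
print(avg["Rain"])
```

The result has the shape described for avg: one entry per vertex, each holding a dict from outcome to approximate frequency. With more samples, these frequencies should move closer to the true marginal probabilities of each vertex.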
Usage example: this would print the average of 10 data points:
import json
from libpgm.nodedata import NodeData
from libpgm.graphskeleton import GraphSkeleton
from libpgm.discretebayesiannetwork import DiscreteBayesianNetwork
from libpgm.sampleaggregator import SampleAggregator
# load nodedata and graphskeleton
nd = NodeData()
skel = GraphSkeleton()
nd.load("../tests/unittestdict.txt")
skel.load("../tests/unittestdict.txt")
# topologically order graphskeleton
skel.toporder()
# load bayesian network
bn = DiscreteBayesianNetwork(skel, nd)
# build aggregator
agg = SampleAggregator()
# average samples
result = agg.aggregate(bn.randomsample(10))
# output
print(json.dumps(result, indent=2))