cpd – Conditional probability distributions

A cpd (conditional probability distribution) specifies the probability of a variable given the values of its parents. Currently, pebl includes a pure-python and a C implementation of a multinomial cpd.

A cpd has only three public methods: the constructor, loglikelihood() and replace_data().

class pebl.cpd.CPD(data_)

Conditional probability distributions.

Currently, pebl only includes multinomial cpds and there are two versions: a pure-python and a fast C implementation. The C implementation will be used if available.

Create a CPD.

data_ should only contain data for the nodes involved in this CPD. The first column should be for the child node and the rest for its parents.

The Dataset.subset method can be used to create the required dataset:

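from pebl import data, network

d = data.fromfile("somedata.txt")
n = network.random_network(d.variables)
# child is the index of the node of interest; its column must come first
cpd_data = d.subset([child] + n.edges.parents(child))

loglikelihood()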

Calculates the loglikelihood of the data.

This method implements the log of the g function (equation 12) from:

Cooper, G. F. and Herskovits, E. A Bayesian Method for the Induction of Probabilistic Networks from Data. Machine Learning 9, 309-347 (1992).
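
As a rough illustration of what the log of the g function computes, here is a minimal standalone sketch. It is not pebl's implementation: the function name log_g, the rows argument (tuples laid out as child value first, then parent values) and the arity argument (the number of values the child can take) are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from math import lgamma


def log_g(rows, arity):
    """Log of the Cooper-Herskovits g function for one child node.

    rows  -- iterable of tuples (child_value, parent_value_1, ...)
    arity -- number of distinct values the child can take (hypothetical
             parameter; pebl derives this from the dataset itself)
    """
    # counts[parent_configuration][child_value] = number of matching rows
    counts = defaultdict(Counter)
    for row in rows:
        counts[tuple(row[1:])][row[0]] += 1

    total = 0.0
    for child_counts in counts.values():
        n_j = sum(child_counts.values())
        # log((r - 1)! / (N_j + r - 1)!), using lgamma(n + 1) == log(n!)
        total += lgamma(arity) - lgamma(n_j + arity)
        # log of the product of N_jk! over the child's values
        total += sum(lgamma(n + 1) for n in child_counts.values())
    return total
```

Parent configurations that never occur in the data contribute zero to the sum, so the sketch only visits the configurations it has actually counted.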

replace_data(oldrow, newrow)

Replaces a data row with a new one.

Missing values are handled using some form of sampling over the possible values, which requires making small changes to the data. Instead of recreating a CPD after every change, it’s far more efficient to update the existing CPD in place.
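
The idea of updating a CPD in place can be sketched with a count-based stand-in. This is only an assumption about how such an update might work, not pebl's actual code: the class CountingCPD, its arity parameter and the _add helper are hypothetical names introduced for the example.

```python
from collections import Counter, defaultdict
from math import lgamma


class CountingCPD:
    """Hypothetical count-based multinomial CPD (not pebl's implementation)."""

    def __init__(self, rows, arity):
        self.arity = arity  # number of values the child can take (assumed input)
        # counts[parent_configuration][child_value] = number of matching rows
        self.counts = defaultdict(Counter)
        for row in rows:
            self._add(row, +1)

    def _add(self, row, delta):
        # First column is the child, the remaining columns are its parents.
        self.counts[tuple(row[1:])][row[0]] += delta

    def replace_data(self, oldrow, newrow):
        # Adjust two counts instead of recounting the whole dataset.
        self._add(oldrow, -1)
        self._add(newrow, +1)

    def loglikelihood(self):
        total = 0.0
        for child_counts in self.counts.values():
            n_j = sum(child_counts.values())
            total += lgamma(self.arity) - lgamma(n_j + self.arity)
            total += sum(lgamma(n + 1) for n in child_counts.values())
        return total


cpd = CountingCPD([(0, 0), (1, 0)], arity=2)
cpd.replace_data((1, 0), (0, 0))  # e.g. a resampled missing value
```

After the replacement, the counts match those of a CPD built from scratch on the modified rows, so the score is the same while the update touches only two counters.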
