Pebl Introduction
Pebl is a python library and command line application for learning the
structure of a Bayesian network given prior knowledge and observations. Pebl
includes the following features:
- Can learn with observational and interventional data
- Handles missing values and hidden variables using exact and heuristic
methods
- Provides several learning algorithms; makes creating new ones simple
- Has facilities for transparent parallel execution
- Calculates edge marginals and consensus networks
- Presents results in a variety of formats
Concepts
All Pebl analysis include data, a learner and a result. They may also
include prior models and task controllers.
- Data
- This is the set of observations that is used to score a given network.
The data can include missing values and hidden/unobserved variables and
observations can be marked as being the result of specific
interventions. Data can be read from a file or created programatically.
- Learner
- A learner implements a specific learning algorithm. It is given some
data, prior model and a stopping criteria and returns a result
object.
- Result
- A result object contains a list of the top-scoring networks found
during a learner run and some statistics about the analysis. Results
from different learning runs with the same data can be merged and
visualized in various formats.
- Prior Models
- A key strength of Bayesian analysis is the ability to integrate
knowledge with observations. A Pebl prior model specifies the prior
belief about the set of possible networks and can include hard and soft
constraints.
- Task Controllers
- Pebl uses task controllers to run analyses in parallel. Users can
utilize multiple CPU cores or computational clusters without managing
any of the details related to parallel programming.