:mod:`evaluator` -- Network evaluators
======================================

.. module:: pebl.evaluator
    :synopsis: Network evaluators

Greedy learning algorithms work by scoring small, local changes to existing
networks. To do this efficiently, one must maintain state that eliminates
redundant computation and unnecessary cache retrievals; that is, only nodes
that have changed should be rescored. Maintaining this state makes an
efficient learning algorithm harder to implement than a naive one.

The classes in this module provide helpers that encapsulate the
state management required for efficient scoring. As long as callers make
changes to networks in a transactional manner (using the provided methods),
networks will be scored efficiently without redundant computation.

The main evaluator is the SmartNetworkEvaluator class. Its interface is
described below.

Note: Most users shouldn't need to use this module directly. All included
learners encapsulate this functionality. This module is really only needed
for writing custom learners.

LocalscoreCache
---------------

Although most users will never use the localscore cache directly, using pebl
with large datasets requires setting the maximum size of the cache to avoid
memory issues. There is only one relevant configuration parameter.

.. confparam:: localscore_cache.maxsize

    Maximum number of localscores to cache. A value of -1 means unlimited
    size.
    default=-1

SmartNetworkEvaluator
---------------------

.. autoclass:: SmartNetworkEvaluator
    :members:

Network Evaluators for use with Missing Values
----------------------------------------------

Scoring networks with missing values requires sampling algorithms that sample
over the space of possible completions for the missing values. Pebl provides
a few algorithms for this.

Configuration Parameters
^^^^^^^^^^^^^^^^^^^^^^^^

.. Autogenerated by pebl.config.paramdocs at Tue Apr 29 10:52:50 2008

.. confparam:: evaluator.missingdata_evaluator

    Evaluator to use for handling missing data. Choices include:

    * gibbs: Gibbs sampling
    * maxentropy_gibbs: Gibbs sampling over all completions of the missing
      values that result in maximum entropy discretization for the variables.
    * exact: exact enumeration of all possible missing values (only usable
      when there are few missing values)

    default=gibbs

.. Autogenerated by pebl.config.paramdocs at Tue Apr 29 10:33:33 2008

.. confparam:: gibbs.burnin

    Burn-in period for the Gibbs sampler (specified as a multiple of the
    number of missing values).
    default=10

.. confparam:: gibbs.stopping_criteria

    Stopping criteria for the Gibbs sampler. Should be a valid Python
    expression that evaluates to true when the sampler should stop. It can
    use the following variables:

    * iters: number of iterations
    * n: number of missing values

    Examples:

    * ``iters > n**2`` (for n-squared iterations)
    * ``iters > 100`` (for 100 iterations)

    default=iters > n**2

MissingDataNetworkEvaluator
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. autoclass:: MissingDataNetworkEvaluator
    :members:

.. autoclass:: MissingDataMaximumEntropyNetworkEvaluator
    :members:

.. autoclass:: MissingDataExactNetworkEvaluator
    :members:

Factory Functions
------------------

.. autofunction:: fromconfig
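
The ``gibbs.stopping_criteria`` parameter described earlier is a Python
expression over the variables ``iters`` and ``n``. The following is a minimal
sketch of how such an expression could be evaluated. The helper names
``should_stop`` and ``iterations_until_stop`` are hypothetical and are not
part of pebl; pebl's own evaluation of the expression may differ.

```python
def should_stop(criteria, iters, n):
    """Evaluate a stopping-criteria expression (hypothetical helper, not pebl API).

    Only the names ``iters`` and ``n`` are visible to the expression.
    """
    namespace = {"iters": iters, "n": n}
    # An empty __builtins__ mapping keeps built-in functions out of reach.
    return bool(eval(criteria, {"__builtins__": {}}, namespace))


def iterations_until_stop(criteria, n):
    """Count how many iterations pass before the criteria first holds."""
    iters = 0
    while not should_stop(criteria, iters, n):
        iters += 1
    return iters


# With 3 missing values, the default criteria stops once iters exceeds 9.
default_iters = iterations_until_stop("iters > n**2", 3)
fixed_iters = iterations_until_stop("iters > 100", 3)
```

Under these assumptions, the default ``iters > n**2`` scales the amount of
sampling with the square of the number of missing values, while
``iters > 100`` fixes it regardless of ``n``.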
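
The idea behind transactional scoring, as described in the module
introduction, can be illustrated with a toy evaluator. This is only a sketch
under stated assumptions: the class ``ToyEvaluator``, its methods, and the
scoring function ``toy_localscore`` are hypothetical and do not reflect the
interface of SmartNetworkEvaluator. It shows that adding one edge rescores
only the destination node, and that undoing the change needs no rescoring.

```python
class ToyEvaluator:
    """Toy transactional evaluator (hypothetical, not pebl's implementation)."""

    def __init__(self, parents, localscore):
        # parents maps each node to an iterable of its parent nodes.
        self.parents = {node: frozenset(ps) for node, ps in parents.items()}
        self.localscore = localscore  # function(node, parents) -> float
        self.cache = {}               # (node, parents) -> local score
        self.calls = 0                # number of real score computations
        self.scores = {node: self._score(node) for node in self.parents}
        self._undo = None

    def _score(self, node):
        key = (node, self.parents[node])
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.localscore(node, self.parents[node])
        return self.cache[key]

    def total(self):
        return sum(self.scores.values())

    def add_edge(self, src, dest):
        # Remember the previous state of the one node that changes.
        self._undo = (dest, self.parents[dest], self.scores[dest])
        self.parents[dest] = self.parents[dest] | {src}
        self.scores[dest] = self._score(dest)  # only dest is rescored
        return self.total()

    def restore(self):
        # Undo the last change without computing any score.
        dest, old_parents, old_score = self._undo
        self.parents[dest] = old_parents
        self.scores[dest] = old_score
        self._undo = None
        return self.total()


def toy_localscore(node, parents):
    # Stand-in score that penalizes each parent; a real score uses the data.
    return -float(len(parents))


ev = ToyEvaluator({"a": [], "b": [], "c": []}, toy_localscore)
before = ev.total()
after = ev.add_edge("a", "b")   # one new score computation, for "b" only
restored = ev.restore()         # no score computation at all
again = ev.add_edge("a", "b")   # served from the cache
```

In this sketch the three initial scores plus the single rescoring of ``b``
are the only real computations; the repeated change is answered from the
cache.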