Iterated Intelligence

Implements Uncoordinated PGT Intelligence for iterSemiNFG objects

Created on Wed Jan 2 16:33:36 2013

Copyright (C) 2013 James Bono

GNU Affero General Public License

pynfg.pgtsolutions.intelligence.iterated.iterated_MC(G, S, noise, X, M, innoise=1, delta=1, integrand=None, mix=False, satisfice=None)

Run Importance Sampling on policy sequences for PGT IQ Calculations

For examples, see below or PyNFG/bin/hideandseek.py

Parameters:
  • G (iterSemiNFG) – the game to be evaluated
  • S (int) – number of policy profiles to sample
  • noise (float) – the degree to which the proposal distribution is independent of the current value: 1 is fully independent, 0 applies no perturbation.
  • X (int) – number of samples of each policy profile
  • M (int) – number of random alt policies to compare
  • innoise (float) – the perturbation noise used in the inner loop of iq_calc when drawing alternative CPTs to compare against the current CPT.
  • delta (float) – the discount factor (ignored if SemiNFG)
  • integrand (func) – a user-supplied function of G that is evaluated for each s in S
  • mix (bool) – False if restricting sampling to pure strategies. True if mixed strategies are included in sampling. Default is False.
  • satisfice (iterSemiNFG) – game G such that the CPTs of G together with innoise determine the intelligence satisficing distribution.
Returns:

  • intel - a sample-keyed dictionary of basename-keyed timestep iq lists.
  • funcout - a sample-keyed dictionary of the output of the user-supplied integrand.
  • weight - a sample-keyed dictionary of basename-keyed importance weight dictionaries.

Warning

This will throw an error if there is a decision node in G.starttime that is not repeated throughout the net.

Note

This is an uncoordinated approach because intelligence is assigned to a decision node instead of players. As a result, it takes much longer to run than pynfg.pgtsolutions.intelligence.policy.policy_MC

Example:

def welfare(G):
    #calculate the welfare of a single sample of the iterSemiNFG G
    G.sample()
    w = 0
    for p in G.players:
        w += G.npv_reward(p, G.starttime, 1.0)
    return w

import copy
GG = copy.deepcopy(G) #G is an iterSemiNFG
S = 50 #number of MC samples
X = 10 #number of samples of utility of G in calculating iq
M = 20 #number of alternative strategies sampled in calculating iq
noise = .2 #noise in the perturbations of G for MC sampling

from pynfg.pgtsolutions.intelligence.iterated import iterated_MC

intelMC, funcoutMC, weightMC = iterated_MC(GG, S, noise, X, M,
                                           innoise=.2,
                                           delta=1,
                                           integrand=welfare,
                                           mix=False,
                                           satisfice=GG)
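The returned dictionaries can be combined into an importance-sampling estimate of an expectation of the integrand. The snippet below is a minimal sketch, not library code: it assumes weightMC[s] is a basename-keyed dict of importance weights and funcoutMC[s] is the integrand value for sample s (structure inferred from the return descriptions above), and it treats the product of a sample's per-basename weights as that sample's overall weight — an assumption, not a documented part of the API.

```python
import numpy as np

def is_estimate(funcout, weight):
    """Importance-weighted mean of the integrand over sampled profiles.

    funcout: sample-keyed dict of integrand values.
    weight: sample-keyed dict of basename-keyed importance weights
    (hypothetical structure inferred from the returns above).
    """
    samples = sorted(funcout)
    # collapse each sample's basename-keyed weights into one scalar
    w = np.array([np.prod(list(weight[s].values())) for s in samples])
    f = np.array([funcout[s] for s in samples])
    return float(np.sum(w * f) / np.sum(w))

# toy data: two samples, two decision-node basenames
funcout = {0: 1.0, 1: 3.0}
weight = {0: {'D1': 0.5, 'D2': 1.0}, 1: {'D1': 0.5, 'D2': 1.0}}
print(is_estimate(funcout, weight))  # -> 2.0
```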
pynfg.pgtsolutions.intelligence.iterated.iterated_MH(G, S, density, noise, X, M, innoise=1, delta=1, integrand=None, mix=False, satisfice=None)

Run Metropolis-Hastings on policy sequences for PGT IQ Calculations

For examples, see below or PyNFG/bin/hideandseek.py

Parameters:
  • G (iterSemiNFG) – the game to be evaluated
  • S (int) – number of MH iterations
  • density (func) – the function that assigns a PGT density weight to a dictionary of iqs.
  • noise (float) – the degree to which the proposal distribution is independent of the current value: 1 is fully independent, 0 applies no perturbation.
  • X (int) – number of samples of each policy profile
  • M (int) – number of random alt policies to compare
  • innoise (float) – the perturbation noise used in the inner loop of iq_calc when drawing alternative CPTs to compare against the current CPT.
  • delta (float) – the discount factor (ignored if SemiNFG)
  • integrand (func) – a user-supplied function of G that is evaluated for each s in S
  • mix (bool) – if True, the proposal distribution is over mixed CPTs. Default is False.
  • satisfice (iterSemiNFG) – game G such that the CPTs of G together with innoise determine the intelligence satisficing distribution.
Returns:

  • intel - a sample-keyed dictionary of basename-keyed timestep iq lists
  • funcout - a sample-keyed dictionary of the output of the user-supplied integrand.
  • dens - a list of the density values, one for each MH draw.

Warning

This will throw an error if there is a decision node in G.starttime that is not repeated throughout the net.

Note

This is an uncoordinated approach because intelligence is assigned to a decision node instead of players. As a result, it takes much longer to run than pynfg.pgtsolutions.intelligence.policy.policy_MH

Example:

import numpy as np

def density(iqdict):
    #calculate the PGT density for a given iqdict
    x = list(iqdict.values())
    y = np.power(x, 2)
    z = np.prod(y)
    return z

def welfare(G):
    #calculate the welfare of a single sample of the iterSemiNFG G
    G.sample()
    w = 0
    for p in G.players:
        w += G.npv_reward(p, G.starttime, 1.0)
    return w

import copy
GG = copy.deepcopy(G) #G is an iterSemiNFG
S = 50 #number of MH samples
X = 10 #number of samples of utility of G in calculating iq
M = 20 #number of alternative strategies sampled in calculating iq
noise = .2 #noise in the perturbations of G for MH sampling

from pynfg.pgtsolutions.intelligence.iterated import iterated_MH

intelMH, funcoutMH, densMH = iterated_MH(GG, S, density, noise, X, M,
                                         innoise=.2,
                                         delta=1,
                                         integrand=welfare,
                                         mix=False,
                                         satisfice=GG)
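Unlike the importance-sampling variant, Metropolis-Hastings draws come (asymptotically) from the target density itself, so an expectation of the integrand can be estimated by a plain average of funcoutMH over the post-burn-in draws. The sketch below assumes funcoutMH is a sample-keyed dict whose keys order the draws, as the return descriptions above suggest; the burn-in length is a user choice, not part of the API.

```python
import numpy as np

def mh_mean(funcout, burn=10):
    """Plain average of the integrand over MH draws after burn-in.

    funcout: sample-keyed dict of integrand values (keys assumed to
    order the draws). burn: number of initial draws to discard.
    """
    keys = sorted(funcout)[burn:]
    return float(np.mean([funcout[k] for k in keys]))

# toy chain: 15 draws valued 0..14; discarding the first 10 leaves
# draws 10..14, whose mean is 12.0
funcout = {i: float(i) for i in range(15)}
print(mh_mean(funcout, burn=10))  # -> 12.0
```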
pynfg.pgtsolutions.intelligence.iterated.iterated_calciq(bn, G, X, M, mix, delta, start, innoise, satisfice=None)

Estimate the IQ of a decision node’s policy at a given time step

Parameters:
  • bn (str) – the basename of the decision node whose intelligence is being evaluated.
  • G (iterSemiNFG) – the iterated semi-NFG to be evaluated
  • X (int) – number of samples of each policy profile
  • M (int) – number of random alt policies with which to compare
  • mix (bool) – if True, the proposal distribution is over mixed CPTs. Default is False.
  • delta (float) – the discount factor (ignored if SemiNFG)
  • start (int) – the time step at which the policy’s IQ is evaluated
  • innoise (float) – the perturbation noise for the inner loop used to draw alternative CPTs
  • satisfice (iterSemiNFG) – game G such that the CPTs of G together with innoise determine the intelligence satisficing distribution.
Returns:

an estimate of the fraction of alternative policies at the given time step that have a lower npv reward than the current policy.
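The returned quantity is easy to reconstruct in isolation: given the npv reward of the current policy and the rewards of the M sampled alternatives, the IQ estimate is the fraction of alternatives that do strictly worse. The toy sketch below illustrates this computation on synthetic rewards; it is not the library implementation, and the function name iq_fraction is hypothetical.

```python
import numpy as np

def iq_fraction(current_reward, alt_rewards):
    """Fraction of sampled alternative policies that earn a strictly
    lower npv reward than the current policy -- the quantity that
    iterated_calciq estimates (toy reconstruction, not library code).
    """
    alt = np.asarray(alt_rewards, dtype=float)
    return float(np.mean(alt < current_reward))

# the current policy beats 3 of 4 sampled alternatives
print(iq_fraction(5.0, [1.0, 2.0, 4.0, 7.0]))  # -> 0.75
```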
