Monte Carlo RL

Implements Monte Carlo Reinforcement Learning for iterSemiNFG objects

Created on Mon Feb 18 09:03:32 2013

Copyright (C) 2013 James Bono (jwbono@gmail.com)

GNU Affero General Public License

pynfg.rlsolutions.mcrl.ewma_mcrl(Game, bn, J, N, alpha, delta, eps, uni=False, pureout=False)

Use exponentially weighted moving average (EWMA) Monte Carlo RL to approximate the optimal CPT at node bn, given the game Game

Parameters:
  • Game (iterSemiNFG) – The iterated semi-NFG on which to perform the RL.
  • bn (str) – The basename of the node with the CPT to be trained.
  • J (int, list, or np.array) – The number of runs per training episode. If a schedule is desired, enter a list or np.array with size equal to N (see the sketch after this list).
  • N (int) – The number of training episodes.
  • alpha (int, list, or np.array) – The exponential weight for the moving average. If a schedule is desired, enter a list or np.array with size equal to N.
  • delta (float) – The discount factor.
  • eps (float) – The maximum step size for policy improvements.
  • uni (bool) – If True, training is initialized with a uniform policy. Default is False, which allows “seeding” with a different policy, e.g. the level k-1 policy.
  • pureout (bool) – If True, the policy is turned into a pure policy at the end of training by assigning probability 1 to argmax actions. Default is False.
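For instance, schedules for J and alpha can be built with NumPy before the call. This is a minimal sketch; the particular shapes (a decreasing run count and a decaying averaging weight) are illustrative assumptions, not recommendations:

import numpy as np

N = 50                                                   # number of training episodes
J = np.floor(np.linspace(300, 100, num=N)).astype(int)   # more runs early, fewer late (illustrative)
alpha = np.linspace(1.0, 0.1, num=N)                     # one EWMA weight per episode, decaying (illustrative)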

Example:

import copy
import numpy as np
from pynfg.rlsolutions.mcrl import ewma_mcrl

GG = copy.deepcopy(G)  # G: an existing iterSemiNFG with a decision node basename 'D1'
G1, Rseries = ewma_mcrl(GG, 'D1',
                        J=np.floor(np.linspace(300, 100, num=50)),
                        N=50, alpha=1, delta=0.8, eps=0.4,
                        pureout=True)
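As the variable names suggest, the call returns the trained game (here G1, with the updated CPT at 'D1') together with a series of rewards from training; deep-copying G first leaves the original game untouched. If matplotlib is installed, the reward series gives a quick convergence check:

import matplotlib.pyplot as plt

plt.plot(Rseries)   # reward per training episode
plt.xlabel('episode')
plt.ylabel('reward')
plt.show()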
