# Monte Carlo RL

Implements Monte Carlo reinforcement learning for iterSemiNFG objects.

Created on Mon Feb 18 09:03:32 2013

Copyright (C) 2013 James Bono (jwbono@gmail.com)

GNU Affero General Public License

`pynfg.rlsolutions.mcrl.ewma_mcrl(Game, bn, J, N, alpha, delta, eps, uni=False, pureout=False)`

Use exponentially weighted moving average (EWMA) Monte Carlo RL to approximate the optimal CPT at node bn, given Game.

Parameters:

- **Game** (iterSemiNFG) – the iterated semi-NFG on which to perform the RL
- **bn** (str) – the basename of the node with the CPT to be trained
- **J** (int, list, or np.array) – the number of runs per training episode. If a schedule is desired, enter a list or np.array with size equal to N.
- **N** (int) – the number of training episodes
- **alpha** (int, list, or np.array) – the exponential weight for the moving average. If a schedule is desired, enter a list or np.array with size equal to N.
- **delta** (float) – the discount factor
- **eps** (float) – the maximum step size for policy improvements
- **uni** (bool) – if True, training is initialized with a uniform policy. Default is False, to allow "seeding" with different policies, e.g. level k-1.
- **pureout** (bool) – if True, the policy is turned into a pure policy at the end of training by assigning probability 1 to the argmax actions. Default is False.

Example (assumes an existing iterSemiNFG `G` with a decision node whose basename is `'D1'`):

```python
import copy
import numpy as np
from pynfg.rlsolutions.mcrl import ewma_mcrl

GG = copy.deepcopy(G)  # train on a copy so G itself is left untouched
G1, Rseries = ewma_mcrl(GG, 'D1',
                        J=np.floor(np.linspace(300, 100, num=50)),
                        N=50, alpha=1, delta=0.8, eps=0.4,
                        pureout=True)
```
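Since `J` and `alpha` both accept per-episode schedules (a list or np.array of length N), the schedules can be precomputed with NumPy before calling `ewma_mcrl`. A minimal sketch, assuming N = 50 training episodes; the variable names here are illustrative, not part of the pynfg API:

```python
import numpy as np

N = 50

# Run count per episode: decay linearly from 300 runs down to 100,
# rounded down to whole runs (np.linspace includes both endpoints).
J_schedule = np.floor(np.linspace(300, 100, num=N)).astype(int)

# EWMA weight per episode: decay linearly from 1.0 to 0.1.
alpha_schedule = np.linspace(1.0, 0.1, num=N)
```

Passing `J=J_schedule, alpha=alpha_schedule, N=N` would then give early episodes more runs and a heavier moving-average weight, tapering both as training proceeds.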
