Bandits

flask_mab.bandits Module

class flask_mab.bandits.AnnealingSoftmaxBandit(tau=0)[source]

Bases: flask_mab.bandits.SoftmaxBandit
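
The class body carries no docstring. Conceptually, an annealing softmax bandit behaves like its SoftmaxBandit parent (documented below) but decays the temperature tau as pulls accumulate, so selection drifts from exploration toward exploitation over time. A minimal sketch of one common schedule (this exact formula is an assumption, not necessarily the library's code):

    import math

    def annealed_tau(total_pulls):
        """Temperature schedule for an annealing softmax bandit:
        tau is large for early pulls (near-uniform exploration) and
        shrinks as pulls accumulate (near-greedy exploitation).
        The 1e-7 term guards against log(1) = 0 on the first pull."""
        return 1.0 / math.log(total_pulls + 1 + 1e-7)

The resulting tau feeds the softmax selection rule sketched under SoftmaxBandit further down.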

class flask_mab.bandits.Bandit[source]

Bases: object

The primary bandit interface. Don’t use this unless you really want uniform random arm selection (which defeats the whole purpose, really).

Used as a control to test against and as an interface to define methods against.

add_arm(arm_id, value=None)[source]
classmethod fromdict(dict_spec)[source]
pull_arm(arm_id)[source]
reward_arm(arm_id, reward)[source]
suggest_arm()[source]

Uniform random selection for the default bandit.

Just uses random.choice to choose between the arms.
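
A minimal usage sketch against the interface listed above. What suggest_arm returns is not documented here, so the sketch avoids relying on its shape; the arm names and values are illustrative:

    from flask_mab.bandits import Bandit

    bandit = Bandit()
    # Arm values are arbitrary payloads, e.g. the button copy to render.
    bandit.add_arm("green_button", value="Buy now!")
    bandit.add_arm("red_button", value="Purchase")

    suggestion = bandit.suggest_arm()       # uniform random for the base class
    bandit.pull_arm("green_button")         # record an impression for an arm
    bandit.reward_arm("green_button", 1.0)  # record a conversion for that arm

The same four calls apply to every subclass below; only the selection strategy behind suggest_arm changes.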

class flask_mab.bandits.EpsilonGreedyBandit(epsilon=0.1)[source]

Bases: flask_mab.bandits.Bandit

Epsilon Greedy Bandit implementation. Aggressively favors the present winner.

Will assign the winning arm (1.0 - epsilon) of the time, and select uniformly at random between all arms epsilon of the time.

Will “exploit” the present winner more often with lower values of epsilon, “explore” other candidates more often with higher values of epsilon.

Parameters: epsilon (float) – The fraction of the time to “explore” other arms, e.g. a value of 0.1 will perform random assignment for 10% of traffic
suggest_arm()[source]

Get an arm according to the epsilon-greedy strategy
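
The rule described above is simple enough to sketch as a stand-alone function (the arm bookkeeping is simplified here; values stands in for whatever per-arm statistic the bandit tracks):

    import random

    def epsilon_greedy_suggest(values, epsilon=0.1):
        """Return an arm index: the best-valued arm with probability
        1 - epsilon, a uniformly random arm with probability epsilon."""
        if random.random() < epsilon:
            # Explore: uniform random over all arms.
            return random.randrange(len(values))
        # Exploit: the arm with the highest observed value.
        return max(range(len(values)), key=lambda i: values[i])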

class flask_mab.bandits.NaiveStochasticBandit[source]

Bases: flask_mab.bandits.Bandit

A naive weighted random Bandit. Favors the winner by giving it greater weight in random selection.

The winner will eventually drown out the losers if the margin is great enough.

suggest_arm()[source]

Get an arm according to the naive stochastic strategy
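
A sketch of the weighted random selection described above (simplified; not the library's exact code):

    import random

    def naive_stochastic_suggest(values):
        """Return an arm index drawn with probability proportional to
        each arm's observed value, so the winner is favored but losers
        still receive occasional traffic."""
        total = sum(values)
        if total == 0:
            # No rewards observed yet: fall back to uniform random.
            return random.randrange(len(values))
        r, cumulative = random.uniform(0, total), 0.0
        for i, v in enumerate(values):
            cumulative += v
            if r <= cumulative:
                return i
        return len(values) - 1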

class flask_mab.bandits.SoftmaxBandit(tau=1.0)[source]

Bases: flask_mab.bandits.NaiveStochasticBandit
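
The class body carries no docstring. A softmax bandit turns arm values into selection probabilities via the Boltzmann distribution: each arm is chosen with probability proportional to exp(value / tau). Small tau approaches greedy selection; large tau approaches uniform random. A sketch of the rule (simplified; not the library's exact code):

    import math
    import random

    def softmax_suggest(values, tau=1.0):
        """Return an arm index drawn from the Boltzmann (softmax)
        distribution over arm values at temperature tau."""
        weights = [math.exp(v / tau) for v in values]
        r, cumulative = random.uniform(0, sum(weights)), 0.0
        for i, w in enumerate(weights):
            cumulative += w
            if r <= cumulative:
                return i
        return len(weights) - 1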

class flask_mab.bandits.ThompsonBandit(prior=(1.0, 1.0))[source]

Bases: flask_mab.bandits.NaiveStochasticBandit

reward_arm(arm_id, reward)[source]
suggest_arm()[source]
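
Neither method carries a docstring. Thompson sampling with a Beta(1.0, 1.0) prior models each arm's success rate as a Beta posterior, draws one sample per arm, and plays the arm with the largest draw. A sketch of that rule (the per-arm success/failure bookkeeping that reward_arm would maintain is an assumption here):

    import random

    def thompson_suggest(successes, failures, prior=(1.0, 1.0)):
        """Return an arm index via Thompson sampling: one draw from each
        arm's Beta posterior, then play the arm with the largest draw."""
        draws = [
            random.betavariate(prior[0] + s, prior[1] + f)
            for s, f in zip(successes, failures)
        ]
        return max(range(len(draws)), key=lambda i: draws[i])

    # e.g. arm 0: 5 conversions / 10 misses, arm 1: 2 conversions / 1 miss;
    # thompson_suggest([5, 2], [10, 1]) will usually favor arm 1 early on.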
flask_mab.bandits.all_same(items)[source]
flask_mab.bandits.random() → x in the interval [0, 1). (The standard library’s random.random, surfaced here by the module’s imports.)