defaults: add e-greedy

Question

Closed this issue 7 years ago · 2 comments

We need an example default for e-greedy. Only the getaction code differs from e-first:

getcontext code: {} # Empty

getaction code:
e = .1
if binomial(p=e) == 1:
    # explore: choose an action at random
else:
    # exploit: choose the action with the highest mean reward

(Note, the action JSON object should include "propensity" = Pr(action). This can be computed using (1-e)*pr() in the above example code).
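A minimal Python sketch of the getaction step described above, for illustration only. The function name `get_action`, the `mean_rewards` dict, and the returned `explored` flag are hypothetical and not part of any existing default; it also assumes the explore step samples uniformly over all arms.

```python
import random

def get_action(mean_rewards, e=0.1, rng=random):
    """Epsilon-greedy choice; mean_rewards maps arm name -> estimated mean reward.

    Returns (action, explored). Hypothetical helper, not an existing API.
    """
    arms = list(mean_rewards)
    if rng.random() < e:
        # Explore: with probability e, pick any arm uniformly at random.
        return rng.choice(arms), True
    # Exploit: otherwise pick the arm with the highest estimated mean.
    return max(arms, key=mean_rewards.get), False
```

For example, `get_action({"a": 0.0, "b": 1.0}, e=0.0)` always takes the greedy branch and returns `("b", False)`.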

getreward code:
R_a ~ normal(0, 1)
R_b ~ normal(1, 1)

setreward code:
Update the mean reward estimate for the chosen action.
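A rough Python sketch of the getreward and setreward steps, assuming two actions named "a" and "b" and a per-action state of a count plus a running mean. The names `TRUE_MEANS`, `get_reward`, `set_reward`, and the state layout are made up for this example.

```python
import random

# Simulated reward distributions: R_a ~ normal(0, 1), R_b ~ normal(1, 1).
TRUE_MEANS = {"a": 0.0, "b": 1.0}

def get_reward(action, rng=random):
    """Draw a simulated reward for the chosen action (standard deviation 1)."""
    return rng.gauss(TRUE_MEANS[action], 1.0)

def set_reward(state, action, reward):
    """Update the running mean for one action.

    state maps action -> {"n": observation count, "mean": running mean}.
    """
    s = state[action]
    s["n"] += 1
    # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
    s["mean"] += (reward - s["mean"]) / s["n"]
    return state
```

The incremental form avoids storing past rewards; only the count and the current mean are kept per action.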

Correct

Answer 1 · 2017-09-25T12:31:04.000Z

The propensities are e*0.5 and (1-e) (instead of (1-e)*pr()), since in the greedy step you always take the max, right? @MKaptein
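One way to sanity-check these numbers, sketched in Python under the assumption that the explore step samples uniformly over all arms, including the greedy one. Under that assumption e*0.5 is the chance of exploring and landing on a given arm (with two arms), and (1-e) is the chance of taking the greedy step; if the logged propensity is meant to be the overall Pr(action), the greedy arm gets both contributions. The helper name `propensities` is hypothetical.

```python
def propensities(mean_rewards, e=0.1):
    """Overall Pr(action) per arm under epsilon-greedy with uniform exploration.

    Assumption: exploration draws uniformly from all k arms, so every arm
    gets e / k, and the greedy arm additionally gets (1 - e).
    """
    arms = list(mean_rewards)
    greedy = max(arms, key=mean_rewards.get)
    k = len(arms)
    return {arm: e / k + ((1 - e) if arm == greedy else 0.0) for arm in arms}
```

With two arms and e = 0.1, this gives roughly 0.05 for the non-greedy arm and 0.95 for the greedy arm, which sum to 1.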