jrobine/twm

Rollout Problem

baitingzbt opened this issue · 2 comments

When rolling out the policy here, the nested function policy(index) ALWAYS sees dreamer as None (i.e. it never reaches the else branch).

I think it's correct. You can set a breakpoint to be sure, but

def fun1():
  dreamer = None

  def fun2():
    nonlocal dreamer
    if dreamer is None:
      print("dreamer was none")
      dreamer = 1
    else:
      print("dreamer working!")
  return fun2

myfunc = fun1()
myfunc()
myfunc()
myfunc()

prints the following:

dreamer was none
dreamer working!
dreamer working!

which is as expected.

Hi, I think I see the problem now after more testing. The default config results in train_every = 0.76 < 1, which causes the trainer to create a new function object for the rollout on every step, so each new closure starts with dreamer set to None again.
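
Here is a minimal sketch of that effect, extending the snippet above. It is not the actual twm code: make_policy and the placeholder value 1 are hypothetical stand-ins, and the only assumption is that the rollout function is built by a factory that initializes dreamer to None.

```python
def make_policy():
    # Hypothetical stand-in for the code that builds the nested policy function.
    dreamer = None  # closure state, fresh for every call to make_policy()

    def policy():
        nonlocal dreamer
        if dreamer is None:
            dreamer = 1  # placeholder for creating the real object
            return "dreamer was none"
        return "dreamer working!"

    return policy


# Case 1: the closure is created once and reused, so its state persists.
reused = make_policy()
reused_results = [reused() for _ in range(3)]

# Case 2: a new closure is created on every step, so the state is lost each time.
recreated_results = [make_policy()() for _ in range(3)]

print(reused_results)
print(recreated_results)
```

In the first case only the first call takes the None branch. In the second case every call does, which matches what I was seeing.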