AI4Finance-Foundation/ElegantRL

Is there a problem with the std calculation in AgentBase.py?

Opened this issue · 2 comments

Around line 235 there is self.cri.state_std[:] = self.cri.state_std. This assigns cri.state_std to itself, so cri.state_std never changes.

Thanks for pointing this out. We checked the relevant function and did find the problem, as follows:

In the function update_avg_std_for_normalization, we update state_std:

state_avg = states.mean(dim=0, keepdim=True)
state_std = states.std(dim=0, keepdim=True)
self.act.state_avg[:] = self.act.state_avg * (1 - tau) + state_avg * tau
self.act.state_std[:] = self.cri.state_std * (1 - tau) + state_std * tau + 1e-4
self.cri.state_avg[:] = self.act.state_avg
self.cri.state_std[:] = self.cri.state_std
returns_avg = returns.mean(dim=0)
returns_std = returns.std(dim=0)
self.cri.value_avg[:] = self.cri.value_avg * (1 - tau) + returns_avg * tau
self.cri.value_std[:] = self.cri.value_std * (1 - tau) + returns_std * tau + 1e-4

It should be changed to:

        self.cri.state_avg[:] = self.act.state_avg
        self.cri.state_std[:] = self.act.state_std   # this line is the one that needs to change
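
To make the effect of the one-line change concrete, here is a minimal sketch that mirrors the state-statistics part of the update quoted above. It is not the ElegantRL code: plain Python lists stand in for tensors, and the names Net, update_state_stats, and the fixed flag are hypothetical, introduced only for illustration. statistics.stdev is used because it computes the sample standard deviation.

```python
from statistics import mean, stdev


class Net:
    """Hypothetical stand-in for the actor/critic; holds running normalization stats."""

    def __init__(self, dim):
        self.state_avg = [0.0] * dim
        self.state_std = [1.0] * dim


def update_state_stats(act, cri, states, tau, fixed):
    """Soft-update the actor's running mean/std, then sync them to the critic."""
    columns = list(zip(*states))  # one tuple of values per state dimension
    batch_avg = [mean(col) for col in columns]
    batch_std = [stdev(col) for col in columns]  # sample standard deviation

    act.state_avg[:] = [a * (1 - tau) + b * tau
                        for a, b in zip(act.state_avg, batch_avg)]
    # As in the quoted code, the old value is read from cri.state_std.
    act.state_std[:] = [s * (1 - tau) + b * tau + 1e-4
                        for s, b in zip(cri.state_std, batch_std)]

    cri.state_avg[:] = act.state_avg
    if fixed:
        cri.state_std[:] = act.state_std  # proposed fix: copy from the actor
    else:
        cri.state_std[:] = cri.state_std  # original line: self-assignment, a no-op


states = [[0.0, 10.0], [2.0, 14.0], [4.0, 18.0]]

act_bug, cri_bug = Net(2), Net(2)
update_state_stats(act_bug, cri_bug, states, tau=0.5, fixed=False)

act_fix, cri_fix = Net(2), Net(2)
update_state_stats(act_fix, cri_fix, states, tau=0.5, fixed=True)

print(cri_bug.state_std)                       # still the initial values
print(cri_fix.state_std == act_fix.state_std)  # critic now tracks the actor
```

Note that with the original line the stale cri.state_std also feeds back into the actor's update, so the actor's running std is always blended from the initial value rather than from its own previous estimate.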

#304

I submitted a PR that fixes the bug you mentioned. Thanks again.