StepNeverStop/RLs

Check that the code implementation is accurate and reasonable

StepNeverStop opened this issue · 2 comments

  • check and fix C51 [deaab73]
  • check qrdqn [deaab73]
  • check iqn
  • check and fix Rainbow
  • check on-policy buffer sampling
  • check function discounted_sum
  • check function calculate_td_error
  • check whether training works well with visual input
  • fix TRPO: step_size is sometimes NaN
  • check vdn and qmix
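  • check the dimension arguments (dim/axis) of operations in the code and set them to -1 wherever possible
  • fix the iterative update of the RNN hidden state when an exploration policy is used [abf6b0a]
  • implement policy updates at intervals of policy-environment interaction steps [abf6b0a]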
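
For the `discounted_sum` item, one way to check such a helper is to compare a backward-recursive version against a brute-force sum. The sketch below is only an illustration of that idea: the signature, the `dones` and `bootstrap` parameters, and the name `discounted_sum_reference` are assumptions and are not taken from this repository's code.

```python
def discounted_sum(rewards, gamma, dones=None, bootstrap=0.0):
    """Discounted return-to-go for every step of a trajectory.

    Hypothetical signature, not the repository's: `dones[t]` marks that the
    episode ended at step t, `bootstrap` is the value estimate after the
    last step.
    """
    if dones is None:
        dones = [False] * len(rewards)
    returns = [0.0] * len(rewards)
    running = bootstrap
    # Walk backwards so each step reuses the return of the following step.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # nothing flows back across an episode boundary
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def discounted_sum_reference(rewards, gamma):
    """Brute-force O(n^2) version for cross-checking (single episode, no bootstrap)."""
    return [
        sum(gamma ** k * r for k, r in enumerate(rewards[t:]))
        for t in range(len(rewards))
    ]
```

If the two versions disagree on a single-episode trajectory, the recursive one is the more likely place for an off-by-one or a misplaced reset at an episode boundary.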