TD3 vs SAC
HYDesmondLiu opened this issue · 0 comments
Hi,
First, thanks for sharing the repo.
I am really confused by the performance comparison between SAC and TD3.
In TD3's reported results (Table 1), TD3 beats SAC in every environment evaluated, measured by max average return after 1M timesteps. However, in your SAC paper (Fig. 1), TD3 beats SAC in almost no environment.
Is this because different noise was added in your experiments and theirs? Could you kindly provide some insight into this discrepancy?