Value network
Closed this issue · 4 comments
Hi,
I skimmed the authors' implementation, and it seems they don't use the value network; they only use the Q-networks. It looks like they removed it in this commit.
Thanks,
Lukas
Hmm...
You are correct: the authors don't use a value function.
See https://arxiv.org/pdf/1812.05905.pdf (the version with automatic entropy tuning).
They did use one in their earlier paper, https://arxiv.org/pdf/1801.01290.pdf, which is probably why I never removed it.
I agree with you that we should remove the value function.
It will also make sac.py a bit cleaner, since we don't use a value function in the deterministic case anyway.
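For reference, here is a minimal sketch of how the Q-network target can be computed once the separate value network is gone. The soft state value is formed implicitly from the target Q-estimates and the policy's log-probability. The function and argument names are illustrative assumptions, not taken from sac.py, and the sketch works on plain floats for a single transition rather than on batched tensors.

```python
def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  alpha=0.2, gamma=0.99):
    """Bellman target for the Q-networks without a separate value network.

    Hypothetical helper: all arguments are floats for one transition.
    q1_next / q2_next are the target Q-estimates at (s', a'), where a' is
    sampled from the current policy, and log_prob_next is log pi(a' | s').
    """
    # Implicit soft value: V(s') = min(Q1', Q2') - alpha * log pi(a' | s')
    soft_value_next = min(q1_next, q2_next) - alpha * log_prob_next
    # Do not bootstrap past a terminal state (done == 1.0).
    return reward + gamma * (1.0 - done) * soft_value_next


target = soft_q_target(reward=1.0, done=0.0, q1_next=2.0, q2_next=3.0,
                       log_prob_next=-1.0, alpha=0.5, gamma=0.9)
print(target)
```

In the deterministic case the entropy term would simply be dropped (alpha set to 0), which is one reason a separate value function is not needed there either.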
I have made some minor changes in the repo (nothing major, just renaming variables).
You are free to open a PR for this issue whenever you want.
Or I can make the changes.
Either is fine with me.
I already made the changes in my repo. You can copy them from there, or I can send you a pull request. Whichever you prefer :)
Got it.
I'll make the changes.