Value network

Question

Closed this issue 5 years ago · 4 comments

Hi,

I skimmed the authors' implementation, and it seems they don't use the value network; they only use the Q-networks. It looks like they removed it in this commit.

Thanks,

Lukas

Answer 1 · 2019-04-04T17:28:11.000Z

Hmm...
You are correct, the authors don't use a value function:
https://arxiv.org/pdf/1812.05905.pdf -> automatic entropy tuning
They did use it in their earlier paper, though: https://arxiv.org/pdf/1801.01290.pdf -> maybe that's why I never removed it.

I agree with you that we should remove the value function.
It will also make sac.py a bit cleaner (since we don't use a value function in the deterministic case anyway).
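
For reference, here is a rough sketch of what the Q-network target looks like once the value network is gone, following the formulation in https://arxiv.org/pdf/1812.05905.pdf. The soft value of the next state is estimated from the two target Q-networks plus the entropy term, rather than from a separate V network. The function name, argument names, and default values below are made up for illustration and are not taken from sac.py. It works on plain floats to keep it self-contained; a batched version would use an elementwise minimum instead of `min`.

```python
def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  alpha=0.2, gamma=0.99):
    """Bellman target for the Q-networks when no value network is used.

    All names here are hypothetical, chosen for this sketch only.

    q1_next, q2_next: target Q-network estimates for (next_state, next_action),
        with next_action sampled from the current policy.
    log_prob_next: log pi(next_action | next_state).
    done: 1.0 if the episode ended on this transition, else 0.0.
    alpha: entropy temperature; gamma: discount factor.
    """
    # Soft state value estimated from the two target critics,
    # taking the smaller estimate and adding the entropy bonus.
    soft_value_next = min(q1_next, q2_next) - alpha * log_prob_next
    # No bootstrapping past a terminal transition.
    return reward + gamma * (1.0 - done) * soft_value_next
```

Both Q-networks would then be regressed toward this target, so the separate value loss and the value target network are no longer needed.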

I have made some minor changes in the repo (nothing major, just renamed some variables).

You are free to make a PR regarding this issue whenever you want.

Answer 2 · 2019-04-05T21:17:50.000Z

Or I'll make the changes.
Whichever works for you.

Answer 3 · 2019-04-05T21:47:43.000Z

I already made the changes in my repo. You can copy them from there, or I can send you a pull request. Whichever you prefer :)

Answer 4 · 2019-04-05T21:57:17.000Z

Got it. 👍
I'll make the changes. 😁