pranz24/pytorch-soft-actor-critic

Value network

Closed this issue · 4 comments

Hi,

I skimmed the authors' implementation and it seems that they don't use the value network; they only use the Q-networks. It seems they removed it in this commit.

Thanks,

Lukas

Hmm...
You are correct, the authors don't use a value function.
https://arxiv.org/pdf/1812.05905.pdf -> automatic entropy tuning
Although they did use it in their earlier paper https://arxiv.org/pdf/1801.01290.pdf -> maybe that's why I never removed it.

I agree with you that we should remove the value function.
It will also make sac.py a bit cleaner, since we don't use a value function in the deterministic case anyway.
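
For reference, here is a rough sketch of what the Q-network target looks like once the value network is gone, assuming the formulation from the later paper (1812.05905). The function name `soft_q_target`, its parameters, and the default `gamma` and `alpha` values are hypothetical and only for illustration; this is not the code in sac.py, and it uses plain Python floats rather than tensors.

```python
def soft_q_target(reward, done, next_q1, next_q2, next_log_prob,
                  gamma=0.99, alpha=0.2):
    """Bootstrapped target for both Q-networks, without a value network.

    Hypothetical helper for illustration only.
    next_q1, next_q2: target-critic estimates at (s', a'), where a' is
                      sampled from the current policy at s'.
    next_log_prob:    log pi(a' | s') for that sampled action.
    done:             1.0 if s' is terminal, else 0.0.
    """
    # The soft state value is implied by the two critics, so no separate
    # V-network is needed: V(s') = min(Q1, Q2) - alpha * log pi(a' | s')
    soft_value = min(next_q1, next_q2) - alpha * next_log_prob
    # No bootstrapping past a terminal state.
    return reward + gamma * (1.0 - done) * soft_value
```

With this target, the value network's role (estimating the soft value of the next state) is taken over by the minimum of the two target critics minus the entropy term.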

I have made some minor changes in the repo. (Nothing major -> Just changing variable names)

You are free to make a PR regarding this issue whenever you want.

Or I'll make the changes.
Whatever is fine with you.

I already made the changes in my repo. You can copy it from there or I can send you a pull request. As you want :)

Got it. 👍
I'll make the changes.