The member value_c and entropy_c in A2CAgent

Question

Closed this issue 4 years ago · 4 comments

I can see the comment on the two members: "coefficients are used for the loss terms."
I can also see that they are used when calculating the loss values. What is the purpose of these two values, and how are they set? The blog article doesn't seem to mention them.

Answer 1 · 2020-07-09T15:56:22.000Z

Hello,
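They are scaling coefficients for the loss terms and can be treated as hyperparameters. The value coefficient (value_c) is often set to 0.5 to match the MSE loss derivative: differentiating a squared error produces a factor of 2, which the 0.5 cancels. The entropy coefficient (entropy_c) should be low enough to only slightly nudge the policy toward the uniform distribution without interfering with it.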
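
To make the role of the two coefficients concrete, here is a minimal sketch of how an A2C-style loss is commonly assembled for a single transition. It is an illustration under stated assumptions, not the code from A2CAgent: the function name `a2c_loss`, its arguments, and the default of `1e-4` for `entropy_c` are hypothetical, and only the names `value_c` and `entropy_c` come from the question.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def a2c_loss(logits, action, ret, value, value_c=0.5, entropy_c=1e-4):
    """Combine policy, value, and entropy terms for one transition.

    Hypothetical helper for illustration; the defaults are assumptions.
    """
    probs = softmax(logits)

    # Policy-gradient term: negative log-probability of the taken action,
    # weighted by the advantage (treated as a constant in practice).
    advantage = ret - value
    policy_loss = -math.log(probs[action]) * advantage

    # Value term: squared error between the return and the critic's estimate.
    # With value_c = 0.5 its derivative w.r.t. `value` is simply -(ret - value).
    value_loss = (ret - value) ** 2

    # Entropy of the policy; it is subtracted so that minimizing the total
    # loss pushes the policy slightly toward the uniform distribution.
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)

    total = policy_loss + value_c * value_loss - entropy_c * entropy
    return total, policy_loss, value_loss, entropy
```

Raising `entropy_c` increases the pull toward uniform action probabilities, while `value_c` sets how strongly the critic's error counts relative to the policy term.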

Answer 2 · 2020-07-10T03:36:29.000Z

Thanks for the explanation; I have a rough understanding now. Is there any recommended documentation about them? It looks like they are not widely used, and this is the first time I have seen anyone mention them.

Answer 3 · 2020-07-10T13:28:52.000Z

I don't know of a resource where it's explicitly described. Hyperparameter choice is often more art than science. Usually people take the values others have used in the past as a baseline and then iterate on them with a sweep, or even just manual perturbations.
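
As a rough sketch of what such a sweep can look like, the helper below tries every combination of candidate coefficients and keeps the best-scoring one. Everything here is hypothetical: `sweep` is not part of any library, and `evaluate` stands for whatever routine you supply that trains an agent with the given coefficients and returns a score where higher is better.

```python
import itertools


def sweep(evaluate, value_cs, entropy_cs):
    """Grid-search two loss coefficients.

    `evaluate(value_c, entropy_c)` is a caller-supplied function (an
    assumption of this sketch) that returns a score; higher is better.
    Returns the best (score, value_c, entropy_c) tuple.
    """
    results = []
    for value_c, entropy_c in itertools.product(value_cs, entropy_cs):
        score = evaluate(value_c, entropy_c)
        results.append((score, value_c, entropy_c))
    # Pick the combination with the highest score.
    return max(results, key=lambda r: r[0])
```

A call might look like `sweep(train_and_score, [0.25, 0.5, 1.0], [1e-4, 1e-3, 1e-2])`, where `train_and_score` is your own training routine and the candidate values are only starting points around a baseline.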

Answer 4 · 2020-07-11T04:08:12.000Z

OK, thanks all the same!