Question regarding the MuJoCo environments
JasonMa2016 opened this issue · 1 comments
Dear authors,
Thank you for open-sourcing the code! I have a few questions regarding the MuJoCo environments. First, how are the `mean` and `std` arrays in the following class computed?

```python
class Hopper:
    def __init__(self):
        self.mean = np.array([1.41599384, -0.05478602, -0.25522216, -0.25404721,
                              0.27525085, 2.60889529, -0.0085352, 0.0068375,
                              -0.07123674, -0.05044839, -0.45569644])
        self.std = np.array([0.19805723, 0.07824488, 0.17120271, 0.32000514,
                             0.62401884, 0.82814161, 1.51915814, 1.17378372,
                             1.87761249, 3.63482761, 5.7164752])
```
I would also like to construct the equivalent function-evaluation classes for 2dWalker, HalfCheetah, Ant, and Humanoid, as reported in the paper. Since these are not provided in the repo, could you let me know how to do so? Do they also require hard-coded mean and std values to normalize the policy matrix M?
Thank you very much!
Hello Jason,
Please refer to ARS for a complete set of MuJoCo environments. You can extract M (i.e., the mean of states) from their released policies.
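For illustration, here is a minimal sketch of reading such a policy file. It assumes the policy is stored as an `.npz` archive whose single entry is an object array `[M, mean, std]` (weight matrix, state mean, state std). That layout, the file name, and the helper names `load_linear_policy` and `act` are assumptions made for this example, so please check them against the actual ARS files before relying on them. The script writes a stand-in file first so that it runs without MuJoCo.

```python
import os
import tempfile

import numpy as np


def load_linear_policy(path):
    """Return (M, mean, std) from an .npz holding one object array [M, mean, std].

    The [M, mean, std] layout is an assumption for this sketch.
    """
    with np.load(path, allow_pickle=True) as data:
        packed = data[data.files[0]]
    M = np.asarray(packed[0], dtype=float)
    mean = np.asarray(packed[1], dtype=float)
    std = np.asarray(packed[2], dtype=float)
    return M, mean, std


def act(M, mean, std, obs):
    """Linear policy applied to a normalized observation."""
    return M @ ((obs - mean) / std)


# Build a stand-in policy file with Hopper-like sizes (11 observations, 3 actions).
obs_dim, act_dim = 11, 3
rng = np.random.default_rng(0)
packed = np.empty(3, dtype=object)
packed[0] = rng.normal(size=(act_dim, obs_dim))  # hypothetical weights
packed[1] = np.zeros(obs_dim)                    # hypothetical state mean
packed[2] = np.ones(obs_dim)                     # hypothetical state std
path = os.path.join(tempfile.mkdtemp(), "lin_policy_plus.npz")
np.savez(path, packed)

M, mean, std = load_linear_policy(path)
action = act(M, mean, std, np.ones(obs_dim))
```

With the real file in place of the stand-in, `mean` and `std` would be the arrays to hard-code in a class like `Hopper`, and `M.shape` would give the policy dimensions for each environment.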
You can find ARS here:
https://github.com/modestyachts/ARS
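On the original question of how such arrays can be obtained: ARS normalizes states with a mean and standard deviation that are estimated online from the states seen during rollouts. The thread does not say whether the Hopper values above were produced exactly this way, so the following is only a sketch of the general idea using Welford's running update. The class name `RunningStat` and the synthetic `states` array are placeholders for this example, and the sample (n - 1) standard deviation is a choice made here, not something taken from the repo.

```python
import numpy as np


class RunningStat:
    """Running per-dimension mean and sample std (Welford's algorithm)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)  # running sum of squared deviations

    def push(self, x):
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # With fewer than two samples the std is undefined; fall back to ones.
        if self.n < 2:
            return np.ones_like(self.mean)
        return np.sqrt(self.m2 / (self.n - 1))


# Synthetic stand-in for observations collected from environment rollouts.
rng = np.random.default_rng(0)
states = rng.normal(loc=1.0, scale=2.0, size=(1000, 11))

stat = RunningStat(dim=states.shape[1])
for s in states:
    stat.push(s)
```

In practice each `push` would receive the observation returned by the environment at every step, and `stat.mean` and `stat.std` would be frozen once training ends.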