Farama-Foundation/Metaworld

Is the fixed env.reset() intended?

chwoong opened this issue · 2 comments

Hello,

I have a question about using single-goal environments, as described in "Accessing Single Goal Environments".
If I fix the seed of the environment as follows,

from metaworld.envs import ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE
import numpy as np

door_open_goal_observable_cls = ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE["door-open-v2-goal-observable"]
env = door_open_goal_observable_cls(seed=0)

# Reset environment
obs = env.reset()  
obs = env.reset()
...

obs is always the same observation, including the goal position.

My question is...
Is it possible to build an environment where the obs returned by env.reset() keeps the goal fixed while the rest (e.g., the robot arm's position) is randomized?

Or is it intended that the initial state $s_0$ is always the same even when evaluating?

Sorry for the basic question, but I'm confused...

Thank you.

Hi! Yes, if you use the environments individually, env.reset() returns exactly the same values every time you reset. If you want to randomize the arm position, you could sample X random actions and apply them in the environment before you start saving transitions.
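
The warm-up idea above could be sketched roughly as follows. This is only an illustration, not part of Metaworld: the helper name reset_with_random_warmup and its num_random_steps parameter are hypothetical, and the code assumes a Gym-style environment that exposes reset(), step(action), and action_space.sample(). It tolerates both the older step/reset return format and the newer one by only reading the first element of each result.

```python
def reset_with_random_warmup(env, num_random_steps=10):
    """Reset env, then apply random actions so the arm starts somewhere new.

    Hypothetical helper (not a Metaworld API). Assumes a Gym-style env with
    reset(), step(action) and action_space.sample().
    """
    out = env.reset()
    # Newer Gym/Gymnasium versions return (obs, info); older ones return obs.
    obs = out[0] if isinstance(out, tuple) else out

    for _ in range(num_random_steps):
        action = env.action_space.sample()
        # The observation is the first element of step()'s result in both
        # the 4-tuple and the 5-tuple return conventions.
        obs = env.step(action)[0]

    # Transitions taken during warm-up are discarded; start recording from obs.
    return obs


# Example usage (assumes `env` was created as in the snippet in the question):
# obs = reset_with_random_warmup(env, num_random_steps=10)
```

Two caveats with this approach: the warm-up steps probably count toward the episode's step limit, and random actions can occasionally disturb the object (here, the door) as well as the arm, so a small number of steps is likely the safer choice.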

Oh! Thank you for your reply, it was very helpful!!
I will close the issue.
Thank you :)