lucidrains/phasic-policy-gradient

Need some information

RajS999 opened this issue · 0 comments

Can these agent implementations be used in non-gaming environments, that is, where the state or observation is not a game frame but a plain list of values? I am looking for a simpler PPG implementation for such a list-based state space. The main change I expect is using a simple feed-forward neural network policy instead of a CNN-based policy.

For comparison, I believe stable-baselines3 uses MlpPolicy for such cases, whereas CnnPolicy is meant for image observations such as game frames. However, stable-baselines3 does not have a PPG implementation.
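
To make the question concrete, here is a rough sketch of the kind of feed-forward networks I have in mind for a list-based observation. This is only an illustration in PyTorch under my own assumptions, not code from this repository: the names `MlpActor` and `MlpCritic`, the hidden size, and the 4-value observation with 2 discrete actions are all hypothetical. It follows the PPG layout as I understand it, with a policy network carrying an auxiliary value head and a separate value network.

```python
import torch
from torch import nn
from torch.distributions import Categorical


class MlpActor(nn.Module):
    """Hypothetical feed-forward policy network with an auxiliary value head."""

    def __init__(self, obs_dim, num_actions, hidden_dim=64):
        super().__init__()
        # Shared trunk over the flat observation vector (no convolutions).
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.Tanh(),
        )
        self.policy_head = nn.Linear(hidden_dim, num_actions)
        self.aux_value_head = nn.Linear(hidden_dim, 1)

    def forward(self, obs):
        hidden = self.trunk(obs)
        logits = self.policy_head(hidden)
        aux_value = self.aux_value_head(hidden).squeeze(-1)
        return logits, aux_value


class MlpCritic(nn.Module):
    """Hypothetical separate feed-forward value network."""

    def __init__(self, obs_dim, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, obs):
        return self.net(obs).squeeze(-1)


# Assumed example: one observation given as a list of 4 floats, 2 discrete actions.
obs = torch.tensor([[0.1, -0.3, 0.7, 0.0]])
actor = MlpActor(obs_dim=4, num_actions=2)
critic = MlpCritic(obs_dim=4)

logits, aux_value = actor(obs)
action = Categorical(logits=logits).sample()
value = critic(obs)
```

If the implementations here can already take such vector inputs, or can be adapted along these lines, that would answer my question.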

PS: Sorry, I realize that raising an issue may not be the right way to ask for information.