Clean baseline implementation of PPO using an episodic TransformerXL memory
Primary language: Python
License: MIT