Policy-based optimization: single-step policy gradient seen as an evolution strategy
Primary language: Python. License: MIT.