/LlamaGym

Fine-tune LLM agents with online reinforcement learning

Primary language: Python
License: MIT
