/T5-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning from Human Feedback) and a GAN (Generative Adversarial Network) on top of the T5 architecture.

Primary language: Python. License: MIT.
