/nanoChatGPT

A crude RLHF layer on top of nanoGPT, using the Gumbel-Softmax trick
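For readers unfamiliar with the Gumbel-Softmax trick named in the description, here is a minimal, dependency-free sketch of the general technique. It is not taken from this repository: the function names `sample_gumbel` and `gumbel_softmax`, the temperature parameter `tau`, and the example logits are all illustrative assumptions. The idea is to add Gumbel(0, 1) noise to the logits and apply a temperature-scaled softmax, which gives a relaxed (soft) sample from the categorical distribution. In an autodiff framework this lets gradients flow through the sampling step.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample by inverse transform sampling."""
    # Clamp u away from 0 and 1 so both logarithms stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Return a relaxed one-hot sample over the categories in `logits`.

    Hypothetical helper for illustration only. Lower `tau` pushes the
    output toward a one-hot vector; higher `tau` pushes it toward uniform.
    """
    rng = rng or random.Random()
    # Perturb each logit with independent Gumbel noise, then scale by tau.
    perturbed = [(logit + sample_gumbel(rng)) / tau for logit in logits]
    # Numerically stable softmax: subtract the maximum before exponentiating.
    m = max(perturbed)
    exps = [math.exp(p - m) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [2.0, 1.0, 0.1]  # assumed example values
    print(gumbel_softmax(logits, tau=1.0, rng=rng))
    print(gumbel_softmax(logits, tau=0.1, rng=rng))
```

The argmax of the returned vector is distributed according to the plain softmax of the logits (the Gumbel-max property), independent of `tau`. The temperature only controls how sharp the soft sample is.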

Primary Language: Python
License: MIT
