/self-taught-critiquer

Reduces the time needed to create critique-writing models by 100-1000x on n-digit arithmetic problems by having the model learn from its own generated outputs.

Primary Language: Python
