/RLHF-FLAN-T5-Anthropic

Performs RLHF training of the FLAN-T5 transformer model using the Anthropic data.

Primary Language: Python
