/ELLA

Reward shaping approach for instruction-following settings, leveraging language at multiple levels of abstraction.

Primary Language: Python
