Stanford-ILIAD/ELLA
Reward shaping approach for instruction-following settings, leveraging language at multiple levels of abstraction.
Python