vimalabs/VIMABench

Is anyone interested in trying RL? How about a Eureka-style method for generating rewards?


Maybe we could use a Eureka-like method (https://eureka-research.github.io/) to generate the reward function? Thanks!
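To make the idea a bit more concrete, here is a rough sketch of the kind of loop I have in mind: propose several candidate reward functions, score each one, keep the best, and feed a summary back into the next round. Everything here is an assumption for illustration. `propose_rewards` is a hypothetical stand-in for the LLM call that would write reward code, `fitness` is a hypothetical stand-in for training an RL policy and measuring task success, and the toy reward takes a single distance-to-goal number. None of it uses the actual VIMABench API.

```python
from typing import Callable, List, Tuple

# Toy reward signature (assumption): distance-to-goal in, scalar reward out.
RewardFn = Callable[[float], float]


def propose_rewards(feedback: str, k: int) -> List[Tuple[str, RewardFn]]:
    """Hypothetical stand-in for the LLM that writes reward code.

    A real version would prompt a model with the environment source and
    the feedback string. This stub ignores the feedback and returns k
    fixed candidates that differ only in scale, so the sketch stays runnable.
    """
    scales = [0.5 * (i + 1) for i in range(k)]
    # s=s binds the current scale so each lambda keeps its own value.
    return [(f"-{s} * dist", lambda d, s=s: -s * d) for s in scales]


def fitness(reward_fn: RewardFn) -> float:
    """Hypothetical stand-in for 'train a policy, measure task success'.

    Toy criterion: prefer rewards whose value at dist=1.0 is close to -1.
    """
    return -abs(reward_fn(1.0) + 1.0)


def eureka_like_search(iterations: int = 3, k: int = 4) -> Tuple[str, float]:
    """Sample candidates, keep the best, pass a summary to the next round."""
    best_src, best_fit = "", float("-inf")
    feedback = ""
    for _ in range(iterations):
        for src, fn in propose_rewards(feedback, k):
            score = fitness(fn)
            if score > best_fit:
                best_src, best_fit = src, score
        # In a real setup this summary would go back into the LLM prompt.
        feedback = f"best so far: {best_src} (fitness {best_fit:.3f})"
    return best_src, best_fit


if __name__ == "__main__":
    print(eureka_like_search())
```

In a real setup the expensive part would be `fitness`, since each candidate reward needs its own RL training run, so the number of candidates per round would probably have to stay small.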