zchuning opened this issue 3 years ago · 0 comments
Hi,
Could you release the evaluation script for the benchmark? In particular, it would be very helpful to know which policy seeds/checkpoints were used for each evaluation metric.