mees/calvin_env

Major concern about evaluation

ezhang7423 opened this issue · 3 comments

Hi there!
I've found that replaying ground-truth trajectories from the dataset (the ones labelled by the language annotator) is not always judged successful by Tasks.get_task_info. This is quite concerning. Perhaps I've done something wrong on my end?
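For concreteness, the check described above can be sketched roughly as follows. This is a minimal illustration, not the code that was actually run: it assumes an environment object with `reset(robot_obs=..., scene_obs=...)`, `step(action)` and `get_info()`, and a task checker whose `get_task_info(start_info, end_info)` returns the set of task names completed between two states. Those signatures are assumptions modelled on the description here and are not verified against calvin_env; `replay_succeeds`, `ToyEnv` and `ToyOracle` are hypothetical names, with the two toy classes included only so the sketch runs on its own.

```python
def replay_succeeds(env, oracle, robot_obs, scene_obs, actions, task):
    """Replay recorded actions from a recorded start state and report
    whether the checker ever marks `task` as completed.

    `env` and `oracle` are duck-typed; the method names used here are
    assumed, not taken from the calvin_env source.
    """
    env.reset(robot_obs=robot_obs, scene_obs=scene_obs)
    start_info = env.get_info()  # snapshot of the state before the rollout
    for action in actions:
        env.step(action)
        # Compare the current state against the start-of-episode snapshot.
        if task in oracle.get_task_info(start_info, env.get_info()):
            return True
    return False


class ToyEnv:
    """Hypothetical stand-in: a single slider position on a line."""

    def __init__(self):
        self.pos = 0.0

    def reset(self, robot_obs, scene_obs):
        self.pos = float(scene_obs)

    def step(self, action):
        self.pos += float(action)

    def get_info(self):
        return {"pos": self.pos}


class ToyOracle:
    """Hypothetical stand-in: success once the slider has moved 0.5 right."""

    def get_task_info(self, start_info, end_info):
        moved = end_info["pos"] - start_info["pos"]
        return {"move_slider_right"} if moved >= 0.5 else set()


if __name__ == "__main__":
    ok = replay_succeeds(
        ToyEnv(), ToyOracle(), None, 0.0, [0.25, 0.25, 0.25], "move_slider_right"
    )
    print("annotated task detected:", ok)
```

Running a loop of this shape over every annotated episode and counting the `False` results would give the fraction of ground-truth trajectories that the checker rejects.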


Could you share which code you ran exactly?

And could I ask you to move (i.e. reopen) this issue to the calvin repo?

Sorry for the late reply! I've attached info to reproduce in this issue: mees/calvin#32