Questions on Table 3 (AlfWorld)

Question

guosyjlu opened this issue a year ago · 1 comments

Hi,
Thanks for your great work! I have a question about Table 3, where the results of Act and ReAct are reported as avg/best of 6. I am wondering where the 6 comes from, given that the decoding strategy is greedy.
Thank you!

Answer 1 · 2023-10-25T20:26:58.000Z

Hi @guosyjlu, as stated in the paper, "For robustness, we construct 6 prompts for each task type through each permutation of 2 annotated trajectories from the 3 we annotate." So the 6 trials come from different selections of prompting examples!
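
To make the count concrete: picking 2 of 3 trajectories where order matters gives 3 × 2 = 6 prompts. Below is a minimal sketch of that enumeration. The trajectory labels are hypothetical placeholders, not names from the paper or its code, and this is only an illustration of the counting, not the authors' prompt-construction script.

```python
from itertools import permutations

# Hypothetical labels standing in for the 3 annotated trajectories of one task type.
trajectories = ["traj_0", "traj_1", "traj_2"]

# Each prompt uses 2 of the 3 trajectories, and their order in the prompt matters,
# so we enumerate ordered pairs (permutations), not unordered combinations.
prompt_variants = list(permutations(trajectories, 2))

for first, second in prompt_variants:
    print(first, second)

print(len(prompt_variants))  # number of distinct prompts per task type
```

With greedy decoding, each of these prompts yields one deterministic run, which would be consistent with reporting avg/best over 6 trials.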