Questions about std in SVEA paper

Question

Hi, thanks for the great work!
In a previous issue (#4), I noticed the reply "Hi, we compute the standard deviation over the mean episode returns of each seed".
However, I'm still a bit confused. Could you please confirm if my understanding is correct?

(Fig.5 Top) Training performance: std of 5 seeds
(Fig.5 Bottom) Test performance: For each seed, run zero-shot evaluation 30 times (args.eval_episode) and calculate the mean of these 30 return values (resulting in 1 mean value per seed). Then compute the std over these 5 mean values.

Thank you!

Answer 1 · 2023-04-13T00:23:44.000Z

I'm glad that you're interested in our work! I assume that you are referring to Figure 5 in our SVEA paper (https://arxiv.org/abs/2107.00644). Your understanding is correct: we evaluate each seed for X episodes, compute the mean return for each seed, and then report the mean and std across seeds. This ensures that the std reflects variability between independent runs (seeds) rather than variability within the environment (e.g. initial conditions).
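
For anyone reproducing this aggregation, here is a minimal sketch of the two-step procedure described above (per-seed mean first, then mean and std across seeds). It is not taken from the SVEA codebase: the function name `aggregate_over_seeds`, the `sample` flag, and the example numbers are all hypothetical, and whether the paper uses the population or the sample standard deviation is not stated in this thread, so both are offered.

```python
from statistics import mean, pstdev, stdev


def aggregate_over_seeds(returns_per_seed, sample=False):
    """Aggregate episode returns as described in the answer above.

    returns_per_seed: one inner list of episode returns per seed.
    sample: assumption, not from the thread. False gives the population
            std (ddof=0), True gives the sample std (ddof=1).
    """
    # Step 1: collapse each seed's evaluation episodes into one mean return.
    seed_means = [mean(episode_returns) for episode_returns in returns_per_seed]
    # Step 2: mean and std are taken across seeds, not across episodes.
    spread = stdev(seed_means) if sample else pstdev(seed_means)
    return mean(seed_means), spread


# Hypothetical data: 3 seeds with 4 evaluation episodes each.
returns = [
    [10.0, 12.0, 11.0, 11.0],
    [20.0, 22.0, 21.0, 21.0],
    [30.0, 32.0, 31.0, 31.0],
]
overall_mean, overall_std = aggregate_over_seeds(returns)
print(overall_mean, overall_std)
```

Note that the within-seed spread (here about ±1 around each seed's mean) does not enter the reported std at all; only the differences between the per-seed means do.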

Answer 2 · 2023-04-13T01:43:33.000Z

Thank you so much for the quick reply.