tristandeleu/pytorch-maml-rl

What is the meaning of train_episodes and valid_episodes?

GeorgeDUT opened this issue · 4 comments

test.py writes a file containing "task", "train_episodes", and "valid_episodes". Are "train_episodes" and "valid_episodes" the total rewards of an episode?

Yes, train_returns and valid_returns are the (undiscounted) cumulative rewards for each task and each episode.
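To make "undiscounted cumulative reward" concrete, here is a minimal sketch. The reward values, the nested-list layout (rewards[task][episode]), and the helper name undiscounted_returns are made up for illustration and are not the repository's actual data structures or API.

```python
# Hypothetical per-step rewards, indexed as rewards[task][episode].
rewards = [
    [[1.0, 0.5, 0.25], [0.0, 2.0]],   # task 0, two episodes
    [[-1.0, -1.0], [3.0, 0.5, 0.5]],  # task 1, two episodes
]

def undiscounted_returns(rewards):
    """Sum the per-step rewards of each episode, with no discount factor."""
    return [[sum(episode) for episode in task] for task in rewards]

# One number per (task, episode) pair.
returns = undiscounted_returns(rewards)
print(returns)
```

Each entry is a plain sum of the rewards collected in one episode, so there is one value per task and per episode.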

What is the difference between "train_return" and "valid_return"?

train_returns corresponds to the returns for the trajectories sampled with the initial policy (before adaptation), and valid_returns corresponds to the returns for the trajectories sampled with the adapted policy. Using the notation of the paper (Algorithm 3), train_returns are the returns on D and valid_returns are the returns on D'.
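One way to use the two arrays is to compare them per task and see how much adaptation helped. The sketch below assumes both are indexed as [task][episode]. The numbers and the helper name mean_improvement are hypothetical, not output from test.py.

```python
from statistics import mean

# Hypothetical returns, indexed as [task][episode]; the values are made up.
train_returns = [[10.0, 12.0], [4.0, 6.0]]  # before adaptation (on D)
valid_returns = [[15.0, 17.0], [5.0, 9.0]]  # after adaptation (on D')

def mean_improvement(train_returns, valid_returns):
    """Per-task difference between mean post- and pre-adaptation returns."""
    return [
        mean(valid) - mean(train)
        for train, valid in zip(train_returns, valid_returns)
    ]

improvement = mean_improvement(train_returns, valid_returns)
print(improvement)
```

A positive entry means the adapted policy collected more reward on that task than the initial policy did.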

thanks so much, I get it