sweetice/Deep-reinforcement-learning-with-pytorch

Char 05 DDPG: step index and episode index


for i in range(args.num_iteration):
    state = env.reset()
    for t in range(args.max_episode):

From the code above, the argument names suggest that i is the step (iteration) index and t is the episode index.
However, the code also contains:
print('Episode {}, The memory size is {} '.format(i, len(agent.replay_buffer.storage)))
which uses i to count episodes.

So should args.max_episode and args.num_iteration be swapped?

Thanks for raising this issue.
Note that i counts episodes, and t counts the steps within one trajectory, i.e., the agent stops a rollout after at most args.max_episode steps.
However, the argument names may mislead readers; this has been fixed in the new version.
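
To illustrate the reply above, here is a minimal sketch of the same loop structure, with names that match what each loop counts. The names num_episodes, max_steps_per_episode, ToyEnv, and run are hypothetical and chosen only for illustration; they are not taken from the repository, and the random action is a placeholder for the agent's policy.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment (not from the repo)."""

    def __init__(self, horizon=5):
        self.horizon = horizon  # the environment ends an episode after this many steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0.0

    def step(self, action):
        self.steps += 1
        next_state = float(self.steps)
        reward = 1.0
        done = self.steps >= self.horizon
        return next_state, reward, done


def run(num_episodes, max_steps_per_episode, env):
    """Outer loop counts episodes; inner loop counts steps within one episode."""
    replay_buffer = []
    episode_lengths = []
    for episode in range(num_episodes):            # plays the role of i
        state = env.reset()
        length = 0
        for step in range(max_steps_per_episode):  # plays the role of t
            action = random.uniform(-1.0, 1.0)     # placeholder for the agent's policy
            next_state, reward, done = env.step(action)
            replay_buffer.append((state, next_state, action, reward, float(done)))
            state = next_state
            length += 1
            if done:
                break
        episode_lengths.append(length)
        print('Episode {}, The memory size is {}'.format(episode, len(replay_buffer)))
    return episode_lengths, replay_buffer
```

With this naming, the printed episode number and the outer loop variable plainly refer to the same thing, and the inner bound reads as a cap on trajectory length rather than a number of episodes.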