kzl/decision-transformer

Minor bug that removes best performing trajectory in gym experiments

Closed · 1 comment

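I believe this line should use <= rather than <. As written, the code cuts out the best performing trajectory even when pct_traj = 1. In that case the timestep budget should equal the total number of timesteps in the dataset, so the last trajectory brings the running sum exactly to num_timesteps and the strict comparison rejects it.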

while ind >= 0 and timesteps + traj_lens[sorted_inds[ind]] < num_timesteps:

To replicate, use a dataset with 2 trajectories and set pct_traj = 1. The resulting num_trajectories will be 1 rather than 2.
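A minimal sketch of the reproduction, in plain Python. Only the while condition is taken from the line quoted above. The surrounding loop, the function name count_selected, the inclusive flag, and the toy trajectory lengths and returns are assumptions made for illustration, not code copied from the repository.

```python
def count_selected(traj_lens, returns, pct_traj, inclusive):
    """Count how many trajectories fit within the timestep budget.

    Hypothetical reconstruction: only the while condition mirrors the
    line quoted in this issue; everything else is assumed.
    """
    # Budget: a fraction of all timesteps in the dataset, at least 1.
    num_timesteps = max(int(pct_traj * sum(traj_lens)), 1)

    # Indices ordered by return, lowest to highest.
    sorted_inds = sorted(range(len(returns)), key=lambda i: returns[i])

    # Start with the last index in sorted order, then walk backward.
    num_trajectories = 1
    timesteps = traj_lens[sorted_inds[-1]]
    ind = len(traj_lens) - 2

    def fits(total):
        # inclusive=True is the proposed fix (<=); False is the current code (<).
        return total <= num_timesteps if inclusive else total < num_timesteps

    while ind >= 0 and fits(timesteps + traj_lens[sorted_inds[ind]]):
        timesteps += traj_lens[sorted_inds[ind]]
        num_trajectories += 1
        ind -= 1
    return num_trajectories


# Toy dataset with 2 trajectories (made-up lengths and returns).
traj_lens = [10, 20]
returns = [1.0, 2.0]

strict_count = count_selected(traj_lens, returns, pct_traj=1.0, inclusive=False)
inclusive_count = count_selected(traj_lens, returns, pct_traj=1.0, inclusive=True)
print(strict_count, inclusive_count)
```

Under these assumptions the budget is 30 timesteps. The second trajectory brings the running total to exactly 30, so < rejects it and <= accepts it.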

kzl commented

Thanks for catching this! I think you're right.