kzl/decision-transformer

position embeddings do not vary between various time steps

udaymallappa opened this issue · 1 comment

The code for the gym seems to be in alignment with the paper. However, the Atari code seems to have some inconsistencies w.r.t. positional embeddings. `timesteps` seems to store just the start index of the block. For example, with a block_size of 30, the following line in model_atari.py indicates that the embeddings at time steps t1, t2, ..., t30 are the same. This differs from the gym code, where `timesteps` stores the entire sequence of indices via "timesteps.append(np.arange(s[-1].shape[1]).reshape(1, -1))"

position_embeddings = torch.gather(all_global_pos_emb, 1, torch.repeat_interleave(timesteps, self.config.n_embd, dim=-1)) + self.pos_emb[:, :token_embeddings.shape[1], :]
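A minimal sketch of what that gather does, using small made-up shapes and randomly initialized stand-ins for the model's learned embedding tables (`all_global_pos_emb`, `pos_emb`); this is an illustration of the indexing semantics, not the repo's actual code:

```python
import torch

torch.manual_seed(0)

n_embd, max_timestep, block_size = 4, 100, 30

# stand-ins for the learned parameters (shapes assumed for illustration)
all_global_pos_emb = torch.randn(1, max_timestep, n_embd)  # one row per absolute timestep
pos_emb = torch.randn(1, block_size, n_embd)               # one row per position within a block
token_embeddings = torch.zeros(1, block_size, n_embd)

# Atari-style: `timesteps` holds only the block's start index, shape (batch, 1, 1)
timesteps = torch.tensor([[[5]]])
global_part = torch.gather(
    all_global_pos_emb, 1,
    torch.repeat_interleave(timesteps, n_embd, dim=-1))    # -> (1, 1, n_embd)

# every token in the block receives the SAME global row (row 5 here);
# broadcasting over dim 1 then adds the local, position-dependent pos_emb term
position_embeddings = global_part + pos_emb[:, :token_embeddings.shape[1], :]

# gym-style indexing, by contrast, supplies one index per position,
# so each position would pick up its own global row
gym_timesteps = torch.arange(5, 5 + block_size).reshape(1, -1, 1)
gym_global = torch.gather(
    all_global_pos_emb, 1,
    torch.repeat_interleave(gym_timesteps, n_embd, dim=-1))  # -> (1, block_size, n_embd)
```

So under the Atari scheme, any variation across the 30 positions comes entirely from the local `pos_emb` term, while the global table contributes a single shared offset per block.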

kzl commented

It is a little different but shouldn't matter much.