MineDojo/MineCLIP

Training details about MineAgent

mansicer opened this issue · 9 comments

Hi. Thank you for releasing this valuable benchmark! I'm working on reimplementing the PPO agent reported in the paper, but I found some mismatches between the code and the paper.

Trimmed action space

As mentioned in #4, the line below does not match the 89 action dimensions described in Appendix G.2.

action_dim=[3, 3, 4, 25, 25, 8],
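To make the mismatch concrete, here is a quick size check. It assumes each entry of `action_dim` is the number of choices for one action head, which is my reading of the config and not something the code confirms.

```python
import math

# Per-head sizes copied from the config line above.
action_dim = [3, 3, 4, 25, 25, 8]

# Total number of logits if every head is an independent categorical.
total_logits = sum(action_dim)

# Number of joint actions if the heads were flattened into one Discrete space.
joint_actions = math.prod(action_dim)

print(total_logits, joint_actions)
```

Under that assumption, neither the summed size nor the joint size equals 89.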

About the compass observation

The paper states that the compass observation has shape (2,). However, the code uses an input of shape (4,).

"compass": torch.rand((B, 4), device=device),
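One possible explanation, which is purely my guess and not confirmed by the paper or the code, is that the two compass angles (yaw and pitch) are each encoded as a sine/cosine pair. A minimal sketch of that hypothetical encoding (the function names are mine):

```python
import math

def compass_raw(yaw_deg, pitch_deg):
    """Shape (2,): the two angles as-is, matching the paper's description."""
    return [yaw_deg, pitch_deg]

def compass_sincos(yaw_deg, pitch_deg):
    """Shape (4,): hypothetical sin/cos encoding of the same two angles."""
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    return [math.sin(yaw), math.cos(yaw), math.sin(pitch), math.cos(pitch)]

print(len(compass_raw(90.0, 0.0)), len(compass_sincos(90.0, 0.0)))
```

If this is what the code does, it would help to say so in the README, since the two forms are not interchangeable as network inputs.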

Training on MultiDiscrete action space

Is the 89-dimensional action space in the paper a MultiDiscrete action space, like the original MineDojo action space, or do you simply treat it as a single Discrete action space?
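To clarify what I mean by the two options: a MultiDiscrete action is a vector with one index per head, while a Discrete action is a single integer. A minimal sketch of converting between them, using the `action_dim` from the code only as an illustrative size vector (the helper names `flatten` and `unflatten` are hypothetical, not from the repository):

```python
# Illustrative per-head sizes; taken from the config line quoted earlier.
action_dim = [3, 3, 4, 25, 25, 8]

def flatten(action, nvec):
    """Map a MultiDiscrete action (one index per head) to one Discrete index."""
    index = 0
    for a, n in zip(action, nvec):
        if not 0 <= a < n:
            raise ValueError(f"component {a} out of range for size {n}")
        index = index * n + a  # mixed-radix encoding, first head most significant
    return index

def unflatten(index, nvec):
    """Inverse of flatten: recover the per-head indices from a Discrete index."""
    action = []
    for n in reversed(nvec):
        index, a = divmod(index, n)
        action.append(a)
    return action[::-1]

multi = [1, 0, 0, 12, 12, 0]
flat = flatten(multi, action_dim)
print(flat, unflatten(flat, action_dim))
```

With a MultiDiscrete policy the network would output one categorical distribution per head; with a Discrete policy it would output a single categorical over all flat indices. Which of the two the paper's 89 refers to is what I cannot tell.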

In addition, could you release the training code for the three task groups in the paper (or share it via my GitHub email)? It would be very helpful for baseline comparisons!