NVIDIA/Megatron-LM

[BUG] [MoE] Typo in Token Drop policy's default value

Closed · 3 comments

Describe the bug
In the new MoE Token Drop code, both the code and the documentation expect drop_policy to be either "prob" or "position", but the current default argument is "probs", which matches neither accepted value.

https://github.com/NVIDIA/Megatron-LM/blob/db3a3f79d1cda60ea4b3db0ceffcf20c5760e11d/megatron/core/transformer/moe/moe_utils.py#L272C5-L272C16
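A minimal sketch of how this kind of mismatch shows up, in case it helps reproduce the problem. This is not the Megatron-LM code: `apply_drop_policy`, its arguments, and the `ValueError` are hypothetical stand-ins, and it assumes the function rejects any value other than the two documented ones.

```python
# Hypothetical stand-in for illustration only; not the Megatron-LM implementation.
VALID_DROP_POLICIES = ("prob", "position")  # the two values named in this issue


def apply_drop_policy(scores, capacity, drop_policy="probs"):  # default has the typo
    """Return the indices of the tokens kept when only `capacity` tokens fit."""
    if drop_policy == "prob":
        # Keep the tokens with the highest scores.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return sorted(order[:capacity])
    if drop_policy == "position":
        # Keep the earliest tokens in the sequence.
        return list(range(min(capacity, len(scores))))
    # Assumed behaviour: unknown values are rejected.
    raise ValueError(
        f"Invalid drop_policy: {drop_policy!r}; expected one of {VALID_DROP_POLICIES}"
    )


if __name__ == "__main__":
    try:
        apply_drop_policy([0.1, 0.7, 0.2], capacity=2)  # relies on the default
    except ValueError as err:
        print(err)
    print(apply_drop_policy([0.1, 0.7, 0.2], capacity=2, drop_policy="prob"))
```

Under these assumptions, any caller that relies on the default falls through to the error branch, while passing "prob" or "position" explicitly works.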

@yanring

Proposed fix
#811

Thanks for the fix, @passaglia! We've already got an internal MR that's been reviewed to fix this issue, so it should be synced to GitHub soon. Thanks again!

Great, thank you @yanring! I'll close this issue once the GitHub repo is updated.

Fixed in 7968fd6