GPT-MoE support for expert parallelism
YJHMITWEB opened this issue · 0 comments
YJHMITWEB commented
Hi, I am wondering whether the GPT-with-MoE example described in https://github.com/NVIDIA/FasterTransformer/blob/main/docs/gpt_guide.md#gpt-with-moe supports expert parallelism. The provided examples use the nlp_gpt3_text-generation_0.35B_MoE-64 model, but only tensor-parallel and pipeline-parallel options are exposed.
Since the Swin-Transformer-Quantization folder shows that FasterTransformer builds on the Swin-MoE repo, which does support expert parallelism, I'd like to know how to enable this feature for GPT-MoE as well.
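
For reference, here is a minimal single-process sketch of what expert parallelism would mean for this checkpoint. This is purely illustrative and assumes nothing about the FasterTransformer API; names such as `expert_parallel_size` and `owner_of_expert` are made up for the example. The idea is that the expert pool is partitioned across ranks so each rank stores only a slice of the 64 experts, and routed tokens are exchanged (typically via an all-to-all) so they reach the rank that owns their expert, unlike tensor or pipeline parallelism, which split the weights or layers themselves.

```python
import torch

num_experts = 64            # matches the MoE-64 checkpoint mentioned above
expert_parallel_size = 8    # hypothetical number of expert-parallel ranks
experts_per_rank = num_experts // expert_parallel_size

# Static mapping from expert id to the rank that stores that expert.
owner_of_expert = torch.arange(num_experts) // experts_per_rank

# Fake activations and top-1 routing decisions for a batch of tokens.
num_tokens, hidden = 16, 1024
tokens = torch.randn(num_tokens, hidden)
router_logits = torch.randn(num_tokens, num_experts)
expert_ids = router_logits.argmax(dim=-1)

# Destination rank for every token; a real expert-parallel runtime would
# exchange these tokens with an all-to-all so each rank only evaluates
# the experts it stores locally.
dest_rank = owner_of_expert[expert_ids]
for r in range(expert_parallel_size):
    local_tokens = tokens[dest_rank == r]
    print(f"rank {r} would receive {local_tokens.shape[0]} tokens "
          f"for experts {r * experts_per_rank}..{(r + 1) * experts_per_rank - 1}")
```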