Issues
Runtime Error when testing on H100
#8 opened by yihaocs - 0
Error when using the Hugging Face Trainer
#9 opened by linyubupa - 2
Megatron-LM’s communication
#7 opened by wangpengfei1013 - 0
Does it support the mixtral_8x7b model?
#6 opened by strngelet - 3
How to use your code with other models?
#5 opened by zxgx - 1
About Megatron-LM sequence length
#3 opened by ZhongYingMatrix - 1
Same name paper
#1 opened by SeanLi-OI