TorchMoE/MoE-Infinity
PyTorch library for cost-effective, fast and easy serving of MoE models.
Python, Apache-2.0
Issues
Can the MoE-Infinity framework be used in conjunction with the vLLM framework?
#23 opened by alphabewitch - 3
Run on multiple GPUs
#15 opened by YLSnowy - 0
CPU memory problem when using gptq quantization
#28 opened by JustQJ - 2
RuntimeError: CUDA error: invalid device ordinal. When I run script.py, I get the error below.
#27 opened by Tingberer - 1
Output of Mixtral-8x7B is strange
#16 opened by JustQJ - 0
TODO for first release
#1 opened by drunkcoding - 2
How to Install it?
#10 opened by MSGitt - 2
Install from pip failed
#11 opened by future-xy - 0
Grok-1 Support
#8 opened by drunkcoding - 1
MoE-Infinity API Proposal
#2 opened by drunkcoding - 0
Support Constrained Server Memory
#5 opened by drunkcoding