/vllm-compress-comm

vllm-compress-comm uses an inverse FFT and a new training strategy for a new kind of diffusion model to compress the tensors transferred among GPUs, accelerating multi-GPU inference.
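The repository page does not describe the method in detail, but the general idea of frequency-domain tensor compression can be sketched as follows. This is an illustrative assumption, not the project's actual implementation: the function names, the `keep_ratio` parameter, and the NumPy-based top-k coefficient selection are all hypothetical. A tensor is transformed with an FFT, only the largest-magnitude frequency coefficients are transmitted, and the receiver reconstructs an approximation with an inverse FFT.

```python
import numpy as np

def compress_tensor(x: np.ndarray, keep_ratio: float = 0.25):
    """Keep only the largest-magnitude frequency coefficients (hypothetical sketch)."""
    coeffs = np.fft.rfft(x)
    k = max(1, int(len(coeffs) * keep_ratio))
    # Indices of the k largest-magnitude coefficients.
    idx = np.argpartition(np.abs(coeffs), -k)[-k:]
    # Transmit (indices, values, original length) instead of the full tensor.
    return idx, coeffs[idx], len(x)

def decompress_tensor(idx: np.ndarray, vals: np.ndarray, n: int) -> np.ndarray:
    """Reconstruct an approximation of the tensor via the inverse FFT."""
    coeffs = np.zeros(n // 2 + 1, dtype=complex)
    coeffs[idx] = vals
    return np.fft.irfft(coeffs, n=n)

# Example: a smooth (band-limited) signal compresses well in the frequency domain.
x = np.sin(np.linspace(0, 4 * np.pi, 64, endpoint=False))
idx, vals, n = compress_tensor(x, keep_ratio=0.25)
x_hat = decompress_tensor(idx, vals, n)
print("max reconstruction error:", float(np.max(np.abs(x - x_hat))))
```

For smooth activations the kept coefficients capture most of the energy, so the payload shrinks roughly by `keep_ratio` at modest reconstruction error; how the actual project selects coefficients and trains the model to tolerate this lossy channel is not documented here.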

Primary language: Python · License: Apache-2.0