turboderp/exllamav2

Pipeline mode support

Closed this issue · 2 comments

I found that PyTorch 2.4 officially supports pipeline parallelism; the package is torch.distributed.pipelining. It was migrated from the PiPPy project.
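
For reference, a minimal sketch of how that package is used (the toy model, split point, and sizes below are placeholders for illustration, not exllamav2 code):

```python
# Run with: torchrun --nproc_per_node=2 pp_sketch.py
import os
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.pipelining import pipeline, SplitPoint, ScheduleGPipe

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(*[nn.Linear(512, 512) for _ in range(8)])

    def forward(self, x):
        return self.layers(x)

def main():
    rank = int(os.environ["RANK"])
    dist.init_process_group("nccl")
    device = torch.device(f"cuda:{rank}")

    model = ToyModel()
    example_microbatch = torch.randn(8, 512)   # one microbatch for tracing
    full_batch = torch.randn(32, 512)          # split into 4 microbatches below

    # Trace the model and split it into two pipeline stages at layer 4.
    pipe = pipeline(
        model,
        mb_args=(example_microbatch,),
        split_spec={"layers.4": SplitPoint.BEGINNING},
    )

    # Each rank builds and runs its own stage with a GPipe schedule.
    stage = pipe.build_stage(rank, device)
    schedule = ScheduleGPipe(stage, n_microbatches=4)

    if rank == 0:
        schedule.step(full_batch.to(device))   # first stage feeds the input
    else:
        out = schedule.step()                  # last stage returns the output
        print(out.shape)

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```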

Will exllamav2 add support for Pipeline Parallelism sometime?

Possibly, but not before fully exploring tensor parallelism, which potentially makes PP totally redundant. I.e., why run two staggered batches at 1x speed when you can run a single batch at 2x speed?

TP has high communication requirements, so the performance improvement for PCIe devices might not be as significant as expected. On the other hand, PP has lower communication requirements, theoretically leading to a larger performance boost.

However, PP doesn't seem to be very useful for single-batch inference. 😂😂😂