microsoft/mscclpp

[Doc] What are the differences between this repo mscclpp and msccl?

YJHMITWEB opened this issue · 4 comments

Hi, I am reading the documentation for both msccl and mscclpp, but I am confused about the relationship between them. Does mscclpp have all the features of msccl? Or are they meant for different use cases?

As I understand it, msccl provides the same interfaces as nccl, so users can easily replace nccl with msccl. msccl also offers multiple algorithms, which can improve performance on different network topologies. You can think of msccl as a scheduler (multiple algorithms) + the NCCL executor.

mscclpp is a new type of executor. Because the nccl executor has some limitations, such as being CPU-driven and not allowing fine-grained control, we published mscclpp for advanced users. It sits at the same level as nccl.
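To make the layering concrete, here is a toy sketch of the scheduler/executor split described above. Every name in it (`Scheduler`, `Executor`, `allreduce_ring`, `allreduce_tree`, the size threshold) is hypothetical and chosen only for illustration; none of it is the API of msccl, mscclpp, or nccl, and the "algorithms" are plain Python stand-ins rather than real GPU collectives.

```python
from typing import Callable, Dict, List

Buffers = List[List[float]]  # one buffer per rank

def allreduce_ring(bufs: Buffers) -> Buffers:
    """Toy all-reduce: accumulate rank by rank, loosely like a ring."""
    total = list(bufs[0])
    for buf in bufs[1:]:
        total = [a + b for a, b in zip(total, buf)]
    return [list(total) for _ in bufs]

def allreduce_tree(bufs: Buffers) -> Buffers:
    """Toy all-reduce: combine pairs level by level, loosely like a tree."""
    level = [list(b) for b in bufs]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append([a + b for a, b in zip(level[i], level[i + 1])])
            else:
                nxt.append(level[i])  # odd one out moves up unchanged
        level = nxt
    return [list(level[0]) for _ in bufs]

class Scheduler:
    """Picks an algorithm by name; the threshold is an arbitrary assumption."""
    def __init__(self, small_msg_limit: int = 4) -> None:
        self.small_msg_limit = small_msg_limit

    def choose(self, msg_len: int) -> str:
        return "tree" if msg_len <= self.small_msg_limit else "ring"

class Executor:
    """Runs whichever algorithm it is handed; stands in for the nccl-level layer."""
    def __init__(self) -> None:
        self.algos: Dict[str, Callable[[Buffers], Buffers]] = {
            "ring": allreduce_ring,
            "tree": allreduce_tree,
        }

    def run(self, name: str, bufs: Buffers) -> Buffers:
        return self.algos[name](bufs)

def allreduce(bufs: Buffers, scheduler: Scheduler, executor: Executor) -> Buffers:
    # The scheduler decides *which* algorithm; the executor decides *how* it runs.
    name = scheduler.choose(len(bufs[0]))
    return executor.run(name, bufs)
```

In this picture, msccl corresponds to `Scheduler` plus an nccl-based `Executor`, while mscclpp would be a different `Executor` implementation sitting at the same level.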

Hi @Binyang2014 , thanks for the reply.

I see. As you mentioned, MSCCL is a scheduler (multi-algos) + NCCL executor, and MSCCLPP is at the same level as NCCL. Does that mean MSCCLPP itself does not have that high-level scheduler (multi-algos)? And if not, is it possible to use MSCCL + MSCCLPP, so that we have both the scheduler and a high-performance executor?

We don't have the scheduler yet. It is in our plans.

Got it, thanks! @Binyang2014