ray-project/ray

[core][experimental] Support nested and dynamically sized GPU tensors via NCCL

Closed · 2 comments

Description

#45092 introduces statically sized peer-to-peer (p2p) transfer of GPU tensors via NCCL. We should also support dynamically sized tensors, as well as GPU tensors that are nested inside other data held in host memory.
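To make the request concrete, here is a minimal sketch of one way nested and dynamically sized tensors could be handled. It is not Ray's implementation. The assumption is that the nested value is split into a host-side skeleton plus per-tensor metadata (shape and dtype), so the receiver can size its buffers before the device payloads arrive. `FakeTensor`, `TensorPlaceholder`, `split_for_send`, and `rebuild_on_recv` are hypothetical names used only for illustration, and plain Python lists stand in for GPU buffers and NCCL transfers.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a GPU tensor: shape, dtype, flat payload."""
    shape: Tuple[int, ...]
    dtype: str
    data: List[float]


@dataclass(frozen=True)
class TensorPlaceholder:
    """Marks where a tensor sat in the nested value (index into the tensor list)."""
    index: int


def _extract(value: Any, tensors: List[FakeTensor]) -> Any:
    # Walk dicts, lists, and tuples; swap each tensor for a placeholder.
    if isinstance(value, FakeTensor):
        tensors.append(value)
        return TensorPlaceholder(len(tensors) - 1)
    if isinstance(value, dict):
        return {k: _extract(v, tensors) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(_extract(v, tensors) for v in value)
    return value


def _restore(value: Any, tensors: List[FakeTensor]) -> Any:
    # Inverse of _extract: put the received tensors back in place.
    if isinstance(value, TensorPlaceholder):
        return tensors[value.index]
    if isinstance(value, dict):
        return {k: _restore(v, tensors) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(_restore(v, tensors) for v in value)
    return value


def split_for_send(value: Any):
    """Sender side: return (host_message, device_payloads).

    Assumption: host_message travels over a host-memory channel, while
    device_payloads would travel GPU-to-GPU (for example via NCCL).
    """
    tensors: List[FakeTensor] = []
    skeleton = _extract(value, tensors)
    metadata = [(t.shape, t.dtype) for t in tensors]
    return (skeleton, metadata), [t.data for t in tensors]


def rebuild_on_recv(host_message, device_payloads) -> Any:
    """Receiver side: use the metadata to size buffers, then rebuild the value."""
    skeleton, metadata = host_message
    tensors = [
        FakeTensor(shape, dtype, list(payload))
        for (shape, dtype), payload in zip(metadata, device_payloads)
    ]
    return _restore(skeleton, tensors)


# Usage: tensor shapes are only known at send time, and tensors sit inside a dict.
batch = {
    "request_ids": [7, 8],
    "kv": (
        FakeTensor((2,), "float32", [1.0, 2.0]),
        FakeTensor((3,), "float16", [0.5, 0.25, 0.125]),
    ),
}
host_message, device_payloads = split_for_send(batch)
rebuilt = rebuild_on_recv(host_message, device_payloads)
assert rebuilt == batch
```

Sending the metadata ahead of the payload is what would let the receiver handle sizes that are not known when the DAG is compiled. The statically sized case from #45092 can skip that step because both sides already agree on shape and dtype.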

@stephanie-wang Is this needed for the vLLM integration, or is it a usability item that makes it more natural for NCCL users to convert to ADAG?

It is both!

A lighter-weight integration with vLLM is possible without this, but supporting it would mean that GPU-GPU communication can also be expressed through DAGs (instead of in user code).