Replicated Heap for the HIP module

Question

Opened this issue a year ago · 6 comments

hipHostRegister does not guarantee that a registered buffer has the same virtual address on the host and on the GPU. However, the current ReplicatedHeap implementation in the CUDA module relies on UVA, so we cannot simply copy the implementation from the CUDA module into the HIP module. For now, this affects the MultiAffineAccessor.

Answer 1 · 2023-07-07T19:46:16.000Z

@eddy16112, please keep in mind that NVIDIA CUDA does not guarantee the same CPU and GPU address either; it just happens to be the case on the majority of Linux systems. See CU_DEVICE_ATTRIBUTE_CAN_USE_HOST_POINTER_FOR_REGISTERED_MEM for more information. For portability, we should not assume that the pointer passed to cuMemHostRegister is the same as the GPU one.

Answer 2 · 2023-07-07T19:53:15.000Z

Yeah, I think we need to check if the addresses are identical, and if not, we will need to raise an error.

Answer 3 · 2023-07-09T17:16:33.000Z

But neither you nor the user can control this, so you could get intermittent errors. Why do these need to be the same address? Why not just translate the address before using it on the GPU? You cannot assume the same address across non-local CPU processors, so can we not use the same logic between local GPU and CPU processors?
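A minimal sketch of the translation idea, assuming the heap is a single registered region whose host and device base addresses are both known. `RegisteredRegion`, `host_to_device`, and `has_unified_address` are hypothetical names for illustration, not part of Realm; in real code the device base would come from a call such as hipHostGetDevicePointer or cuMemHostGetDevicePointer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record of one registered host allocation. The device base is
// assumed to have been queried from the runtime after registration; here it
// is a plain integer so the sketch runs without a GPU.
struct RegisteredRegion {
  std::uintptr_t host_base;    // address of the buffer as seen by the CPU
  std::uintptr_t device_base;  // address of the same buffer as seen by the GPU
  std::size_t size;            // size of the registered buffer in bytes
};

// Translate a host address inside the region to the matching device address
// by preserving its offset from the base. Returns 0 if the address is not
// inside the region.
inline std::uintptr_t host_to_device(const RegisteredRegion &r,
                                     std::uintptr_t host_addr) {
  if (host_addr < r.host_base || host_addr - r.host_base >= r.size)
    return 0;
  return r.device_base + (host_addr - r.host_base);
}

// True when no translation is needed, i.e. both sides see the same address.
inline bool has_unified_address(const RegisteredRegion &r) {
  return r.host_base == r.device_base;
}
```

With this shape, the unified-address case is just the special case where the offset arithmetic is a no-op, so the same code path could serve both situations instead of raising an error when the addresses differ.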

Answer 4 · 2023-07-09T23:12:18.000Z

That will be a question for @streichler. AFAIK, without unified addressing, the replicated heap will be much more complicated to implement.

Answer 5 · 2024-03-12T21:11:37.000Z

See also my test results at #1646 (comment)

It appears that unified addresses are supported with ROCm 5.6.0 and above.

Answer 6 · 2024-03-13T16:49:10.000Z

Is this resolved or is there more to do here?