LLNL/Umpire

Zero-out Kernel

kab163 opened this issue · 1 comments

Is your feature request related to a problem? Please describe.

We need a fast way to zero out a large array of GPU (device) memory.

Describe the solution you'd like

We want to allocate a large array and zero it out in one step. For example, we could call something like malloc_zero_out_kernel(nbytes); instead of having to write our own kernel to zero it out. A built-in Umpire function for this would be ideal.
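To make the request concrete, here is a minimal host-only sketch of what a combined allocate-and-zero call could look like. Everything in it is an assumption for illustration: `allocate_zeroed`, `HostAllocator`, and `host_zero` are hypothetical names and not Umpire functions. The stand-in allocator only mirrors the `allocate`/`deallocate` shape of an Umpire allocator so that the example runs without a GPU. In a real version the zeroing step would launch a device kernel.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Stand-in for a device allocator so the sketch runs on the host.
// Hypothetical type; it only mimics an allocate/deallocate interface.
struct HostAllocator {
  void* allocate(std::size_t nbytes) { return std::malloc(nbytes); }
  void deallocate(void* ptr) { std::free(ptr); }
};

// Host stand-in for the zeroing step. On a GPU this would launch the
// zero-out kernel instead of calling std::memset.
inline void host_zero(void* ptr, std::size_t nbytes)
{
  std::memset(ptr, 0, nbytes);
}

// Hypothetical helper, not an Umpire function: allocate nbytes and clear
// them before handing the pointer back to the caller.
template <typename Allocator, typename ZeroFn>
void* allocate_zeroed(Allocator& allocator, std::size_t nbytes, ZeroFn zero)
{
  void* ptr = allocator.allocate(nbytes);
  if (ptr != nullptr && nbytes > 0) {
    zero(ptr, nbytes);  // skip the call when there is nothing to clear
  }
  return ptr;
}
```

Passing the zeroing step as a callable keeps the allocation path unchanged while letting the backend choose the fastest way to clear memory.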

Describe alternatives you've considered

Using the resource manager to do a memset takes too long. Allocating an array and then launching a kernel to zero out memory could work, but that adds more code.

Additional context

See the Teams conversation here.

Another idea is to support setting a range of memory to any value, not just zero (or at least -1 and NaN, and maybe others).