Improve the C++ usage by removing the need to allocate with new
NetroScript opened this issue · 1 comments
Currently, new and delete are needed to allocate the object on both the GPU and the CPU, as in the following:
auto* memAccessStorage = new CudaMemAccessStorage<unsigned int>(size*20);
// ....
// Free up the managed memory objects
delete memAccessStorage;
This should be replaced by an approach where C++ automatically manages the memory.
There are several ways to do this. For example, the CPU-side CudaMemAccessStorage constructor could internally also create a CudaMemAccessStorage instance on the GPU, and you would then pass that instance to the kernel instead of the original object.
This removes the need for new and delete (the allocation can then be done in the constructor), but slightly increases the code needed in the kernel call, for example from:
reduce<<<blocks, threads>>>(size, input, output);
to
reduce<<<blocks, threads>>>(size, input.GPU(), output.GPU());
This is made obsolete by using smart pointers. The new README will contain an example that shows how to use them, so no changes to the library are necessary.
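Until the README example is available, here is a rough sketch of what the smart-pointer approach might look like. It is not the library's code: StorageStandIn, use_storage and run_example are hypothetical names, and StorageStandIn is a CPU-only stand-in for CudaMemAccessStorage so that the snippet compiles without CUDA. It assumes C++14 for std::make_unique.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for CudaMemAccessStorage<T>; it only tracks how many
// instances are alive so the ownership behaviour can be observed on the CPU.
template <typename T>
class StorageStandIn {
public:
    explicit StorageStandIn(std::size_t capacity) : data_(capacity) { ++live_count(); }
    ~StorageStandIn() { --live_count(); }

    // Non-copyable: exactly one owner should release the storage.
    StorageStandIn(const StorageStandIn&) = delete;
    StorageStandIn& operator=(const StorageStandIn&) = delete;

    std::size_t capacity() const { return data_.size(); }

    // Number of instances currently alive (for demonstration only).
    static int& live_count() {
        static int count = 0;
        return count;
    }

private:
    std::vector<T> data_;
};

// Stand-in for a kernel launch, which would receive the raw pointer.
template <typename T>
std::size_t use_storage(const StorageStandIn<T>* storage) {
    return storage->capacity();
}

// Allocates the storage, uses it, and returns the capacity that was seen.
// No explicit delete: the unique_ptr frees the object when it goes out of scope.
inline std::size_t run_example(std::size_t size) {
    auto storage = std::make_unique<StorageStandIn<unsigned int>>(size * 20);
    assert(StorageStandIn<unsigned int>::live_count() == 1);
    return use_storage(storage.get());
}
```

With the real class, the raw pointer from storage.get() would presumably be what gets passed to the kernel, so the kernel call itself would not need to change. After run_example returns, no instance is left alive, even though the code never calls delete.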