eholk/harlan

Lazy memory transfers

Closed issue · 1 comment

It struck me this morning that this may not actually be hard to do at all, and it will hopefully improve our performance.

The idea is to not transfer data until we need it. We could do this dynamically by having a flag in the region header saying whether the region is on the CPU or GPU. Before each kernel, we send the necessary regions to the GPU, checking the flag first to make sure the region isn't already there. On exit from kernels, we wouldn't blindly copy all regions back. Instead, in get_region_pointer (the function that is called every time we want to access data in a region), we'd have it check the flag and transfer the region back to the CPU if necessary.
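To make the flag scheme concrete, here is a rough sketch of how it could look. This is not the actual runtime code: `Region`, `Location`, `make_region`, `send_to_gpu`, `launch_kernel`, and the `transfers` counter are hypothetical names, the signature of `get_region_pointer` is assumed, and a plain `std::vector` stands in for GPU memory so the example runs without a device.

```cpp
#include <cstddef>
#include <vector>

// Which side currently holds the up-to-date copy of a region.
enum class Location { CPU, GPU };

// Hypothetical region layout: the `where` field plays the role of the
// flag in the region header. `device` is a stand-in for a GPU buffer.
struct Region {
    Location where = Location::CPU;
    std::vector<unsigned char> host;
    std::vector<unsigned char> device;
    int transfers = 0;  // counts copies in either direction, for illustration
};

// Create a zero-filled region that starts out on the CPU.
Region make_region(std::size_t size) {
    Region r;
    r.host.assign(size, 0);
    return r;
}

// Called before each kernel: copy only if the region is not already there.
void send_to_gpu(Region &r) {
    if (r.where == Location::GPU) return;  // already on the GPU, skip the copy
    r.device = r.host;                     // stand-in for a host-to-device write
    r.where = Location::GPU;
    ++r.transfers;
}

// Called on every CPU access: pull the region back only if the GPU has it.
void *get_region_pointer(Region &r) {
    if (r.where == Location::GPU) {
        r.host = r.device;                 // stand-in for a device-to-host read
        r.where = Location::CPU;
        ++r.transfers;
    }
    return r.host.data();
}

// Kernel launch: send what the kernel needs, run it, and do NOT copy back.
template <typename Kernel>
void launch_kernel(Region &r, Kernel kernel) {
    send_to_gpu(r);
    kernel(r.device.data(), r.device.size());
}
```

With this shape, two back-to-back kernels on the same region cost one transfer to the GPU and none back; the copy back to the CPU happens only at the first `get_region_pointer` call afterwards. One thing the sketch glosses over is that any CPU pointer obtained before a kernel launch becomes stale once the region moves, so every access would have to go through `get_region_pointer` again.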

We should be able to do this with minimal changes to the compiler.

I have this mostly working in a branch, and it did improve our performance. It also caused test failures, so I'm going to wait a bit to pull it into master.