gtcasl/gpuocelot

Add support for CUDA 5 features: dynamic parallelism etc.

jwang323 opened this issue · 1 comments

From rtf...@gmail.com on May 17, 2012 09:59:33

Just a reminder of features added in CUDA 5.0 that would be good to have in gpuocelot:
*SM_30 and SM_35 PTX intrinsics support
*Dynamic parallelism
*Object linking? Not sure whether that makes sense here.

Original issue: http://code.google.com/p/gpuocelot/issues/detail?id=68

From SolusStu...@gmail.com on May 30, 2012 08:57:22

For object linking, currently NVCC has only announced support for CUBIN linking. They will also support PTX linking in the future. At that time, Ocelot should support it with only minor changes, so I plan to wait for that feature. In the meantime, it is possible to 'link' PTX files together by simply concatenating them.
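
A minimal sketch of the concatenation approach, in C++. The helper names `concatenatePtx` and `isHeaderDirective` are hypothetical and not part of Ocelot. One assumption goes beyond plain concatenation: the sketch keeps the module-level header directives (`.version`, `.target`, `.address_size`) from the first file only, as a precaution against repeated headers. Whether Ocelot's parser needs that is not stated here.

```cpp
#include <cassert>
#include <initializer_list>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of Ocelot: true if the line starts (after
// leading whitespace) with a module-level PTX header directive.
static bool isHeaderDirective(const std::string& line) {
    const std::size_t start = line.find_first_not_of(" \t");
    if (start == std::string::npos) return false;
    for (const char* directive : {".version", ".target", ".address_size"}) {
        const std::string d(directive);
        if (line.compare(start, d.size(), d) == 0) return true;
    }
    return false;
}

// Hypothetical helper: joins several PTX modules into one text blob.
// Header directives are kept from the first module only (an assumption made
// for this sketch). Name clashes between modules are NOT resolved.
std::string concatenatePtx(const std::vector<std::string>& modules) {
    std::ostringstream out;
    for (std::size_t i = 0; i < modules.size(); ++i) {
        std::istringstream in(modules[i]);
        std::string line;
        while (std::getline(in, line)) {
            if (i > 0 && isHeaderDirective(line)) continue;
            out << line << '\n';
        }
    }
    return out.str();
}
```

Unlike a real linker, this does nothing about symbols defined in more than one file, so it only works when the inputs do not clash.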

Dynamic parallelism should be supported by default on NVIDIA devices, since the device code contains the kernel launch and interacts directly with the GPU driver.

There is experimental support for asynchronous dynamic parallelism in the emulator and LLVM backend via a user-level library that simply calls cudaLaunch from a different user-level pthread. We plan to move this functionality into the CUDA runtime. We also need support for synchronous dynamic parallelism, which should be relatively easy but still needs some implementation work. Of course, both of these need unit tests.
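
To make the asynchronous versus synchronous distinction concrete, here is a rough host-side sketch. It is not the Ocelot library: `ChildLauncher`, `launchAsync`, and `synchronize` are hypothetical names, `std::thread` is used as a portable stand-in for a pthread, and the callable stands in for wherever cudaLaunch would be invoked.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a nested kernel launch; in the setup described
// above, this is where a call such as cudaLaunch would be issued.
using ChildLaunch = std::function<void()>;

// Hypothetical sketch class, not an Ocelot API.
class ChildLauncher {
public:
    // Asynchronous form: run the child on its own host thread and return
    // to the caller immediately.
    void launchAsync(ChildLaunch child) {
        workers_.emplace_back(std::move(child));
    }

    // Synchronous form: block until every outstanding child has finished.
    void synchronize() {
        for (std::thread& worker : workers_) worker.join();
        workers_.clear();
    }

    // Join any remaining children so no thread outlives the launcher.
    ~ChildLauncher() { synchronize(); }

private:
    std::vector<std::thread> workers_;
};
```

In this sketch a synchronous launch is just `launchAsync` followed by `synchronize`, which is one way to read the remark that the synchronous case should be relatively easy. On some toolchains the code needs to be linked with thread support (for example `-pthread`).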

I'm not sure what the status is on the AMD backend.

Labels: -Type-Defect -Priority-Medium Type-Enhancement Priority-High