SCIInstitute/SCI-Solver_FEM

Get to build and run with no errors and possibly reasonable output

Closed this issue · 2 comments

Master still has runtime CUDA errors. The develop branch might solve this.

Up to 63 registers are used per thread at a time. A common maximum number of registers per block is 32768, and 32768 / 63 gives a maximum of 520 threads per block (rounded down), a limit that is often seen exceeded across multiple runs. Clamping threads/block to no more than floor(max registers per block / 63) will need to be implemented.

Alternatively, the files using too many registers may need to be modified to use fewer.
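
A rough sketch of the clamping option, using only the arithmetic from the comment above. The helper name `clampThreadsPerBlock` and its parameters are hypothetical and not taken from the repository. Rounding the limit down to a whole number of warps (32 threads) is an extra assumption on top of the original suggestion. In a real build the two register counts would presumably be queried at runtime (for example from `cudaDeviceProp::regsPerBlock` and `cudaFuncAttributes::numRegs`) rather than hard-coded.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not from the repository): returns the largest block
// size that does not exceed `requested` and still fits within the per-block
// register budget.
int clampThreadsPerBlock(int requested, int regsPerBlock, int regsPerThread,
                         int warpSize = 32) {
    assert(regsPerThread > 0 && warpSize > 0);

    // Integer division rounds down: 32768 / 63 = 520.
    int limit = regsPerBlock / regsPerThread;

    // Assumption: keep the block a whole number of warps, so 520 becomes 512.
    limit = (limit / warpSize) * warpSize;

    return std::min(requested, limit);
}
```

With the numbers above, `clampThreadsPerBlock(1024, 32768, 63)` yields 512, while a request of 256 threads passes through unchanged. The hardware allocates registers in coarser units than one register per thread, so the true limit may be somewhat lower than this estimate.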

completed with f1d334d