Host-transfer multi-threading issue
ndryden opened this issue · 0 comments
When using multiple threads (even on #130), some threads on some ranks non-deterministically get incorrect results in testing. Anecdotally, the incorrect results are either a vector of all 0s (which is strange, since the code does not zero memory) or wrong values that still fall in a reasonable range (e.g., no NaNs or obvious garbage).
I do not observe this issue with only a single thread.
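
To pin down which of the two failure modes a given thread hits, a small harness along these lines might help. This is only a language-agnostic sketch in Python, not the project's test code: `classify`, `run_threads`, and the `work` callback are hypothetical names, and `work(tid)` stands in for whatever each thread actually runs and returns as its output buffer.

```python
import math
import threading


def classify(result, expected, tol=1e-6):
    """Label one thread's output buffer.

    Returns 'ok', 'non_finite', 'all_zero', or 'wrong_values'.
    """
    same_len = len(result) == len(expected)
    if same_len and all(
        math.isclose(r, e, rel_tol=tol, abs_tol=tol)
        for r, e in zip(result, expected)
    ):
        return "ok"
    # NaN or inf anywhere would point at clear garbage.
    if any(not math.isfinite(r) for r in result):
        return "non_finite"
    # The "vector of all 0s" case.
    if all(r == 0 for r in result):
        return "all_zero"
    # Finite, non-zero, but not what was expected.
    return "wrong_values"


def run_threads(work, expected, num_threads):
    """Run work(tid) on num_threads threads and classify each result."""
    labels = [None] * num_threads

    def body(tid):
        # Each thread writes only its own slot, so no lock is needed.
        labels[tid] = classify(work(tid), expected)

    threads = [
        threading.Thread(target=body, args=(tid,))
        for tid in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return labels
```

Logging the per-thread labels together with the rank over many runs would show whether the all-zero and wrong-value cases cluster on particular threads or ranks.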