use/neural-net

Shouldn't this be total number of threads (i.e. batch size) rather than threads per block?

Closed this issue · 1 comment

use commented

cudaMalloc(&d_nodeErrors, sizeof(float) * numLayers * maxLayerSize * numBlocks * threadsPerBlock);

use commented

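It is the total number of threads: numBlocks * threadsPerBlock is the number of blocks times the threads in each block, so the product already covers every thread in the grid, not just one block.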
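For anyone else reading this, here is a minimal sketch of how that sizing plays out. It assumes a 1-D launch and that each thread owns its own slice of numLayers * maxLayerSize floats; that layout is inferred from the allocation expression, not confirmed from the repo. The kernel name zeroNodeErrors and the concrete sizes are hypothetical, for illustration only.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the repo): each thread touches only its own
// slice of the buffer, so the buffer needs one slice per thread in the grid.
__global__ void zeroNodeErrors(float *nodeErrors, int numLayers, int maxLayerSize) {
    // Global thread index across the whole grid (1-D launch assumed).
    size_t tid = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    size_t perThread = (size_t)numLayers * maxLayerSize;
    float *mine = nodeErrors + tid * perThread;
    for (size_t i = 0; i < perThread; ++i) {
        mine[i] = 0.0f;
    }
}

int main() {
    // Assumed example sizes, chosen only to keep the sketch small.
    const int numLayers = 3, maxLayerSize = 16;
    const int numBlocks = 4, threadsPerBlock = 32;

    // Total threads in the grid = blocks * threads per block.
    const size_t totalThreads = (size_t)numBlocks * threadsPerBlock;
    const size_t numFloats = totalThreads * numLayers * maxLayerSize;

    float *d_nodeErrors = nullptr;
    cudaError_t err = cudaMalloc(&d_nodeErrors, sizeof(float) * numFloats);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Launch exactly totalThreads threads, so every slice has an owner.
    zeroNodeErrors<<<numBlocks, threadsPerBlock>>>(d_nodeErrors, numLayers, maxLayerSize);
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_nodeErrors);
        return 1;
    }

    printf("threads: %zu, floats allocated: %zu\n", totalThreads, numFloats);
    cudaFree(d_nodeErrors);
    return 0;
}
```

If the allocation used only threadsPerBlock, threads in every block after the first would index past the end of the buffer under this layout.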