dmlc/mshadow

potential random issue with DTypes

Closed this issue · 8 comments

Currently, random number generation only supports the float and double types through cuRAND. According to the cuRAND documentation (CUDA 7.5; I haven't found the link for CUDA 8.0), half-type random numbers are not yet supported. A candidate solution is to create one extra float tensor to generate the values and then convert them into DTypes other than float or double.

Currently, Random is used in the dropout layer to create the mask, which might be an issue if we want to support DType.
@tqchen

What we can do is generate the random numbers using real, and then run a cast on the result.

I know, but cuRAND fills a whole chunk of memory with random numbers given a pointer. So either we generate and cast them one by one to save space, or we generate and cast the entire chunk to save time. I'm not sure whether the device API could be applied in this case.
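To make the space/time trade-off above concrete, here is a minimal host-side sketch. It is not mshadow or cuRAND code: `FillUniformAs` is a hypothetical helper, and `std::mt19937` only stands in for a bulk generator. The `chunk` parameter controls the size of the float scratch buffer, so `chunk = 1` corresponds to the one-by-one option and `chunk = n` to casting the entire buffer at once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not part of mshadow): fill `out` with n uniform samples
// by generating floats into a scratch buffer of at most `chunk` elements and
// casting each chunk to DType. std::mt19937 is a stand-in for a bulk
// generator such as cuRAND (assumption).
template <typename DType>
void FillUniformAs(DType* out, std::size_t n, std::size_t chunk, std::uint32_t seed) {
  assert(chunk > 0);
  std::mt19937 engine(seed);
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  // Extra memory is bounded by the chunk size, not by n.
  std::vector<float> scratch(std::min(n, chunk));
  for (std::size_t begin = 0; begin < n; begin += chunk) {
    const std::size_t len = std::min(chunk, n - begin);
    // Generation step: with a bulk API this would be one call per chunk.
    for (std::size_t i = 0; i < len; ++i) scratch[i] = dist(engine);
    // Cast step: convert the float chunk into the target DType.
    for (std::size_t i = 0; i < len; ++i) {
      out[begin + i] = static_cast<DType>(scratch[i]);
    }
  }
}
```

In this sketch the chunk size changes only the scratch memory and the number of generation passes; the values produced for a given seed stay the same.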

The problem is bypassed.

How did you bypass the problem?

By using tcast in the dropout layer, just like the cast layer does. I checked the PDF files for cuRAND in CUDA 8.0, but I don't see any half-precision random generation. In the future, DType random generation might be needed.
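The idea can be sketched in plain C++ as follows. This is an illustration only, not the actual dropout layer: `MakeDropoutMask` is a hypothetical name, `std::mt19937` stands in for the real generator, and the final `static_cast` plays the role that tcast plays in the layer. The mask is computed entirely in float and only converted to DType at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical sketch (not the real layer code): build a dropout mask in
// float, then cast it to DType. Kept elements are scaled by 1 / pkeep so the
// expected value of the masked activation is unchanged.
template <typename DType>
std::vector<DType> MakeDropoutMask(std::size_t n, float pkeep, std::uint32_t seed) {
  assert(pkeep > 0.0f && pkeep <= 1.0f);
  std::mt19937 engine(seed);  // stand-in for the framework's generator
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  const float scale = 1.0f / pkeep;
  std::vector<DType> mask(n);
  for (std::size_t i = 0; i < n; ++i) {
    // Threshold the float sample, scale it, and cast only the final result.
    const float keep = dist(engine) < pkeep ? 1.0f : 0.0f;
    mask[i] = static_cast<DType>(keep * scale);
  }
  return mask;
}
```

Because the mask only ever holds two distinct values (0 and 1 / pkeep), generating it in float and casting loses nothing as long as 1 / pkeep is representable well enough in the target DType.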

The only place right now where that solution is unsatisfactory is Float64, where we limit the amount of randomness to 32 bits.
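A small sketch of why the cast workaround limits Float64: a double obtained by casting a float draw never carries more precision than the float it came from. The helper names below are hypothetical, and `std::mt19937` is again only a stand-in for the real generator.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// True when v survives a round trip through float, i.e. it carries no more
// precision than a 32-bit float can hold.
inline bool IsFloatRepresentable(double v) {
  return static_cast<double>(static_cast<float>(v)) == v;
}

// Float64 samples obtained by casting float draws (the cast workaround).
inline std::vector<double> UniformViaFloatCast(std::size_t n, std::uint32_t seed) {
  std::mt19937 engine(seed);
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  std::vector<double> out(n);
  for (double& v : out) v = static_cast<double>(dist(engine));
  return out;
}

// Float64 samples drawn directly at double precision, for comparison.
inline std::vector<double> UniformNative(std::size_t n, std::uint32_t seed) {
  std::mt19937 engine(seed);
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  std::vector<double> out(n);
  for (double& v : out) v = dist(engine);
  return out;
}

// Number of samples that fit exactly in a float.
inline std::size_t CountFloatRepresentable(const std::vector<double>& xs) {
  std::size_t count = 0;
  for (double v : xs) count += IsFloatRepresentable(v) ? 1 : 0;
  return count;
}
```

Every value from `UniformViaFloatCast` is exactly representable as a float, whereas natively drawn doubles generally are not. For a 0/1 dropout mask this makes no practical difference, but it would matter for any consumer that needs full double-precision samples.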

Let me check my dropout code. I think we can solve it the way cuDNN batch norm does, with some switch-case handling. Also, it seems that cuDNN dropout is not implemented in mxnet.

I just checked my dropout. I didn't use any 'switch case' stuff in mask generation. The default real_t is already good enough.