bytedeco/javacpp-presets

[PyTorch] torch.cuda.is_bf16_supported() is missing

haifengl opened this issue · 8 comments

I cannot find it anywhere.

That is a Python-only function.
From the presets, you can check the device compute capability, or just try to create a small BF16 tensor.

Thanks! How do I check the device compute capability? torch_cuda.getDeviceProperties() returns a plain Pointer.

Also, how do I get the CUDA runtime version, as with cudaRuntimeGetVersion()? torch.C10_CUDA_VERSION_MAJOR seems to be the compile-time version.

Thanks! How do I check the device compute capability? torch_cuda.getDeviceProperties() returns a plain Pointer.

Right. That's something I'm currently working on. The next version of the PyTorch presets will depend on the CUDA presets, and this kind of function will return the proper type.
In the meantime, you could directly use the CUDA presets.
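For instance, here is a minimal sketch that reads the compute capability of device 0 through the CUDA presets (assuming cudaGetDeviceProperties is exposed under that name; recent CUDA releases alias it to cudaGetDeviceProperties_v2):

```java
import org.bytedeco.cuda.cudart.cudaDeviceProp;
import static org.bytedeco.cuda.global.cudart.*;

public class ComputeCapability {
    public static void main(String[] args) {
        // Query the properties of CUDA device 0 via the CUDA presets.
        cudaDeviceProp prop = new cudaDeviceProp();
        int status = cudaGetDeviceProperties(prop, 0);
        if (status != cudaSuccess) {
            throw new RuntimeException("cudaGetDeviceProperties failed: " + status);
        }
        // BF16 matmul acceleration requires Ampere or newer,
        // i.e. compute capability >= 8.0.
        System.out.printf("Compute capability: %d.%d%n", prop.major(), prop.minor());
    }
}
```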

Also, how do I get the CUDA runtime version, as with cudaRuntimeGetVersion()? torch.C10_CUDA_VERSION_MAJOR seems to be the compile-time version.

I'm not sure. Maybe there is a way using the CUDA presets.
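The runtime API does have cudaRuntimeGetVersion(), and the CUDA presets should wrap it. A minimal sketch:

```java
import org.bytedeco.javacpp.IntPointer;
import static org.bytedeco.cuda.global.cudart.*;

public class RuntimeVersion {
    public static void main(String[] args) {
        // Unlike a compile-time constant, cudaRuntimeGetVersion() reports the
        // version of the CUDA runtime that is actually loaded at run time.
        IntPointer version = new IntPointer(1);
        if (cudaRuntimeGetVersion(version) != cudaSuccess) {
            throw new RuntimeException("cudaRuntimeGetVersion failed");
        }
        int v = version.get();  // encoded as 1000 * major + 10 * minor, e.g. 11080
        System.out.printf("CUDA runtime %d.%d%n", v / 1000, (v % 1000) / 10);
    }
}
```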

If your final objective is the one in your top post, I'd suggest creating a BF16 GPU tensor and catching the exception.
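Something along these lines, as a sketch only; the TensorOptions and factory signatures here are assumptions and may differ between preset versions:

```java
import org.bytedeco.pytorch.*;
import org.bytedeco.pytorch.global.torch;

public class Bf16Probe {
    public static void main(String[] args) {
        try {
            // Hypothetical probe: allocate a tiny BF16 tensor on the GPU and run
            // a matmul; libtorch throws if BF16 is rejected on this device.
            TensorOptions options = new TensorOptions(torch.ScalarType.BFloat16)
                    .device(new Device("cuda"));
            Tensor a = torch.ones(new long[]{4, 4}, options);
            Tensor b = torch.matmul(a, a);
            System.out.println("BF16 matmul succeeded");
        } catch (RuntimeException e) {
            // JavaCPP surfaces c10::Error as a RuntimeException.
            System.out.println("BF16 not usable here: " + e.getMessage());
        }
    }
}
```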

Thanks. BTW, torch.C10_CUDA_VERSION_MAJOR and torch.C10_CUDA_VERSION are always 0, which is not correct.

Creating a BF16 tensor is not a sufficient check on its own. On pre-Ampere hardware BF16 works, but it provides no speed-up over FP32 matmul operations, and some matmul operations fail outright. So I would like to check the CUDA version and the device compute capability.

Thanks!
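Putting the two checks together, one way to approximate torch.cuda.is_bf16_supported() from Java, assuming the CUDA preset calls sketched above (the thresholds mirror the Python helper at the time of writing: CUDA runtime >= 11 and compute capability >= 8.0):

```java
import org.bytedeco.cuda.cudart.cudaDeviceProp;
import org.bytedeco.javacpp.IntPointer;
import static org.bytedeco.cuda.global.cudart.*;

public class Bf16Support {
    // Approximates torch.cuda.is_bf16_supported(): CUDA runtime >= 11 and
    // device compute capability >= 8.0 (Ampere or newer). The thresholds may
    // change between PyTorch releases.
    static boolean isBf16Supported(int device) {
        IntPointer version = new IntPointer(1);
        if (cudaRuntimeGetVersion(version) != cudaSuccess || version.get() < 11000) {
            return false;
        }
        cudaDeviceProp prop = new cudaDeviceProp();
        return cudaGetDeviceProperties(prop, device) == cudaSuccess && prop.major() >= 8;
    }

    public static void main(String[] args) {
        System.out.println("BF16 supported on device 0: " + isBf16Supported(0));
    }
}
```

Unlike the allocation probe, this mirrors the version and capability test that the Python helper performs, so it avoids the pre-Ampere false positive described above.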

Although these methods work fine on a single-GPU box, they hang on a multi-GPU box.