openxla/xla

GPU: are f16 elementwise ops cast to fp32 for the calculation and then converted back?


For elementwise operations with fp16 input, is the data first converted to fp32 and then converted back after the GPU function is called? GPUs natively support fp16 and bf16, so why is this round trip needed?
bool cast_result_to_fp16 = false;
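
To make the pattern I am asking about concrete, here is a minimal sketch in plain Python. It is not XLA code and makes no claim about how XLA implements this. The helper names (`round_to_f16`, `round_to_f32`, `exp_via_f32`) are hypothetical, and `exp` is just an arbitrary example of an elementwise op. It emulates the upcast, compute, downcast sequence using the standard `struct` module's half-precision format.

```python
import math
import struct


def round_to_f16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 (fp16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary32 (fp32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def exp_via_f32(x_f16: float) -> float:
    """Hypothetical illustration: evaluate exp on an fp16 value through fp32."""
    # Step 1: upcast. Every fp16 value is exactly representable in fp32,
    # so this conversion loses nothing.
    x_f32 = round_to_f32(x_f16)
    # Step 2: compute in the wider type. Python evaluates in double precision,
    # so the result is rounded to fp32 to emulate an fp32 math routine.
    y_f32 = round_to_f32(math.exp(x_f32))
    # Step 3: downcast the result back to fp16.
    return round_to_f16(y_f32)


x = round_to_f16(0.1)   # the nearest fp16 value to 0.1
y = exp_via_f32(x)      # an fp16 value close to exp(x)
print(x, y)
```

In this sketch the only rounding to fp16 happens at the final downcast. My question is whether the extra conversions are still needed when the hardware can operate on fp16 directly.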

Also, does XLA support TF32 now?