A bfloat16 implementation for projects of BioVault (Biomedical Visual Analytics Unit, LUMC - TU Delft).

Note: this repository is no longer maintained at github.com/N-Dekker; its new location is https://github.com/biovault/biovault_bfloat16
Originally based upon dnnl::impl::bfloat16_t from the Deep Neural Network Library (DNNL) of Intel Corporation:
- https://github.com/intel/mkl-dnn/blob/v1.2/src/cpu/bfloat16.cpp
- https://github.com/intel/mkl-dnn/blob/v1.2/src/common/bfloat16.hpp
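For context, here is a minimal sketch of the kind of float-to-bfloat16 conversion such an implementation performs: keep the upper 16 bits of the IEEE 754 single-precision representation, round ties to even, quiet any NaN, and access the bits via std::memcpy rather than a union type-pun (see the DNNL issues listed further down). The function names are assumptions for illustration, not the actual biovault_bfloat16 API.

```cpp
#include <cstdint>
#include <cstring>

// Illustrative helpers only; not the actual biovault_bfloat16 API.

// Converts a float to the nearest bfloat16 bit pattern, rounding ties
// to even. std::memcpy is used instead of a union type-pun, which is
// undefined behavior in C++ (cf. the DNNL issue listed below).
inline std::uint16_t float_to_bfloat16_bits(const float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);

    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
    {
        // NaN input: truncate, but force the quiet bit so that the
        // result cannot be a signaling NaN.
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }
    // Add a rounding bias that depends on the lowest bit that will be
    // kept (bit 16), then drop the low 16 bits: round-to-nearest-even.
    const std::uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<std::uint16_t>((bits + rounding_bias) >> 16);
}

// Widens a bfloat16 bit pattern back to float: exact, because bfloat16
// is just the upper 16 bits of an IEEE 754 single-precision number.
inline float bfloat16_bits_to_float(const std::uint16_t b)
{
    const std::uint32_t bits = static_cast<std::uint32_t>(b) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Note that this sketch preserves subnormal inputs rather than flushing them to zero, matching the TensorFlow behaviour referenced below.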
See also:
- Intel, "BFLOAT16 – Hardware Numerics Definition", White Paper, Revision 1.0, November 2018, Document Number: 338302-001US, https://software.intel.com/en-us/download/bfloat16-hardware-numerics-definition (a small worked example of its round-to-nearest-even rule follows this list)
- Wikipedia, "bfloat16 floating-point format", revision of 19:51, 30 December 2019
- John D. Cook, "Comparing bfloat16 range and precision to other 16-bit numbers", 15 November 2018
- Shibo Wang and Pankaj Kanwar, "BFloat16: The secret to high performance on Cloud TPUs", 23 August 2019
- DNNL: Prevent constructing signaling NaNs and denormals (subnormal floats) by bfloat16_t
- DNNL: Avoid undefined behavior (UB) in bfloat16_t by removing type-pun via union
- TensorFlow: bfloat16 does not flush denormals (subnormal floats) to zero
- Visual C++: Signaling NaN (float, double) becomes quiet NaN when returned from function (both x86 and x64)
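As a worked example of the round-to-nearest-even rule described in the Intel white paper above, the following snippet (using the hypothetical helpers sketched earlier) shows a value exactly halfway between two bfloat16 neighbours rounding to the even one. Since bfloat16 keeps 8 significand bits (7 stored plus 1 implicit), representable values just above 1.0 are spaced 2^-7 apart.

```cpp
#include <cassert>

int main()
{
    // 1 + 2^-7 is exactly representable in bfloat16; 1 + 2^-8 lies
    // exactly halfway between 1.0 and 1 + 2^-7.
    const float representable = 1.0f + 0.0078125f;   // 1 + 2^-7
    const float halfway       = 1.0f + 0.00390625f;  // 1 + 2^-8

    // The tie rounds to the neighbour whose last significand bit is
    // even (zero), which is 1.0 here.
    assert(bfloat16_bits_to_float(float_to_bfloat16_bits(halfway)) == 1.0f);

    // An exactly representable value round-trips unchanged.
    assert(bfloat16_bits_to_float(float_to_bfloat16_bits(representable)) == representable);
    return 0;
}
```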