From WikiChip
Brain floating-point format (bfloat16)
Revision as of 10:38, 7 November 2018 by David

Brain floating-point format (bfloat16) is a number encoding format that occupies 16 bits and represents a floating-point number. It is equivalent to a standard single-precision floating-point value with a truncated mantissa field. Bfloat16 is designed for use in hardware that accelerates machine learning algorithms.

Overview

Bfloat16 follows the same layout as a standard IEEE 754 single-precision floating-point number but truncates the mantissa field from 23 bits to just 7 bits, leaving the sign bit and the 8 exponent bits unchanged. Preserving the exponent bits gives the format the same range as 32-bit single-precision floating point (~1e-38 to ~3e38), at the cost of reduced precision. This allows for relatively simple conversion between the two data types.
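
The conversion described above can be illustrated with a minimal Python sketch. It assumes the simplest possible scheme, plain truncation of the low 16 bits; actual hardware and libraries may round to nearest instead, and may treat NaN values specially. The function names float32_to_bfloat16_bits and bfloat16_bits_to_float32 are hypothetical and chosen only for this illustration.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern for x by truncation (assumed scheme)."""
    # Encode x as IEEE 754 single precision and reinterpret it as a 32-bit integer.
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    # Keep the upper 16 bits: 1 sign bit, 8 exponent bits, 7 mantissa bits.
    return bits32 >> 16


def bfloat16_bits_to_float32(bits16: int) -> float:
    """Widen a 16-bit bfloat16 pattern back to a single-precision value."""
    # Append 16 zero bits to restore the dropped low mantissa bits.
    bits32 = (bits16 & 0xFFFF) << 16
    (value,) = struct.unpack(">f", struct.pack(">I", bits32))
    return value


if __name__ == "__main__":
    bits = float32_to_bfloat16_bits(3.14159)
    print(hex(bits), bfloat16_bits_to_float32(bits))
```

Because only 7 mantissa bits survive, a value such as 3.14159 comes back as 3.140625 after a round trip, while values that fit in 7 mantissa bits, such as 1.0 or -2.0, are preserved exactly.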