{{title|Brain floating-point format (bfloat16)}}
'''Brain floating-point format''' ('''bfloat16''') is a [[number encoding format]] occupying 16 bits that represents a floating-point number. It is equivalent to a standard [[single-precision floating-point]] value with a truncated [[mantissa field]]. Bfloat16 is designed for use in hardware [[accelerating]] machine learning algorithms.
== Overview ==
Bfloat16 follows the same format as a standard IEEE 754 [[single-precision floating-point]] value but truncates the [[mantissa field]] from 23 bits to just 7 bits. Preserving all eight exponent bits gives the format the same dynamic range as 32-bit single precision (~1×10<sup>−38</sup> to ~3×10<sup>38</sup>). This allows for relatively simple conversion between the two data types.
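The conversion is simple enough to show directly. Below is a minimal sketch in C (not from this article; the function names <code>float_to_bfloat16</code> and <code>bfloat16_to_float</code> are illustrative) that converts between the two formats by truncating or zero-filling the low 16 bits of the single-precision bit pattern. Note that actual hardware typically rounds to nearest even rather than simply truncating.

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Convert an IEEE 754 single-precision float to bfloat16 by keeping the
 * sign bit, all 8 exponent bits, and the top 7 mantissa bits, i.e. by
 * dropping the low 16 bits of the 32-bit pattern (round toward zero). */
static uint16_t float_to_bfloat16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* reinterpret the float as raw bits */
    return (uint16_t)(bits >> 16);   /* truncate the low 16 mantissa bits */
}

/* Widen bfloat16 back to single precision by zero-filling the truncated
 * mantissa bits; this direction is exact. */
static float bfloat16_to_float(uint16_t b)
{
    uint32_t bits = (uint32_t)b << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

int main(void)
{
    float x = 3.14159f;
    uint16_t b = float_to_bfloat16(x);
    printf("fp32: %f -> bfloat16: 0x%04X -> fp32: %f\n",
           x, (unsigned)b, bfloat16_to_float(b));
    return 0;
}
</syntaxhighlight>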