http://fourier.eng.hmc.edu/e85/lectures/arithmetic_html/node11.html
Floating-Point Representation
Decimal Cases
In general, a floating-point number is expressed as
$$\pm M \times B^E$$
where $M$ is the fraction, mantissa, or significand, $E$ is the exponent, and $B$ is the base; in the decimal case $B = 10$.

Binary Cases
As an example, a 32-bit word is used in the MIPS computer to represent a floating-point number:
S (1 bit) | E (8 bits) | M (23 bits)
representing $(-1)^S \times M \times 2^E$, where $S$ is the sign bit.
The implied base is 2 (not explicitly shown in the representation).
The exponent can be represented in signed 2's complement (but also see biased notation later).
The implied binary point is between the exponent field E and the significand field M.
More bits in field E mean a larger range of representable values.
More bits in field M mean higher precision.
Zero is represented by all bits equal to 0.

Normalization
To use the bits available for the significand efficiently, it is shifted to the left until all leading 0's disappear
(as they make no contribution to the precision). The value is kept unchanged by adjusting the exponent accordingly. Moreover, as the MSB of a normalized significand is always 1, it does not need to be stored explicitly. The significand can therefore be shifted to the left by 1 more bit to gain one more bit of precision. The first bit 1 before the binary point is then implicit, and the actual value represented is
$$(-1)^S \times 1.M \times 2^E$$
However, to avoid possible confusion, in the following the default normalization does not assume this implicit 1 unless otherwise specified. Zero is represented by all 0's and is not (and cannot be) normalized. Example: A binary number can be represented in 14-bit floating-point form in the following ways (1 sign bit, a 4-bit exponent field and a 9-bit significand field):
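The shift-and-adjust procedure for the 14-bit format (1 sign bit, 4-bit exponent, 9-bit significand) can be sketched in a few lines of Python. This is only an illustration: the input value, the function names `normalize`, `hide_leading_one`, and `value` are hypothetical choices made here, and the sketch ignores the limited range of the 4-bit exponent field.

```python
WIDTH = 9  # significand bits in the 14-bit example format


def normalize(significand: int, exponent: int, width: int = WIDTH):
    """Shift left until the MSB is 1, decrementing the exponent per shift."""
    if significand == 0:
        return significand, exponent  # zero cannot be normalized
    msb = 1 << (width - 1)
    while not (significand & msb):
        significand <<= 1
        exponent -= 1
    return significand, exponent


def hide_leading_one(significand: int, exponent: int, width: int = WIDTH):
    """Shift a normalized significand one more bit, dropping the leading 1."""
    stored = (significand << 1) & ((1 << width) - 1)
    return stored, exponent - 1


def value(significand: int, exponent: int, width: int = WIDTH,
          implied_one: bool = False) -> float:
    """Value of 0.M * 2**E, or 1.M * 2**E when the leading 1 is implied."""
    fraction = significand / (1 << width)
    if implied_one:
        fraction += 1.0
    return fraction * 2.0 ** exponent


# Hypothetical input (not from the original notes): 0.000101101b * 2**5
m, e = 0b000101101, 5
nm, ne = normalize(m, e)          # 0.101101000b * 2**2
hm, he = hide_leading_one(nm, ne)  # 1.011010000b * 2**1

for label, (sig, exp) in [("raw", (m, e)), ("normalized", (nm, ne)),
                          ("implied 1", (hm, he))]:
    print(f"{label:>10}: M={sig:09b} E={exp}")
```

All three forms represent the same value (2.8125 for this input); the implied-1 form frees one more low-order bit of the field for precision.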
By normalization (and, further, with an implied 1.0), the highest precision can be achieved.

Biased Notation for Exponent
To simplify the hardware for comparing two exponents (so that simple unsigned-integer comparison can be used rather than subtraction), we may want to avoid 2's complement representation for the exponent. This can be done by adding 1 at the MSB of the exponent field, i.e., adding a bias of $2^{e-1}$ for an $e$-bit field; the resulting representation is called biased notation. Consider a 5-bit exponent field (range of exponents: $-16 \le E \le 15$):
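The correspondence between the two encodings for a 5-bit field can be tabulated with a short Python sketch. The helper names `twos_complement` and `biased` are hypothetical, and the bias is taken to be $2^{5-1} = 16$, i.e., a 1 added at the MSB as described above.

```python
BITS = 5
BIAS = 1 << (BITS - 1)  # 16: a 1 added at the MSB of the field


def twos_complement(x: int, bits: int = BITS) -> str:
    """Bit pattern of x in signed 2's complement."""
    return format(x & ((1 << bits) - 1), f"0{bits}b")


def biased(x: int, bits: int = BITS) -> str:
    """Bit pattern of x in biased notation (x + bias as an unsigned integer)."""
    return format(x + (1 << (bits - 1)), f"0{bits}b")


print("exponent  2's compl.  biased")
for x in range(-BIAS, BIAS):
    print(f"{x:8d}  {twos_complement(x)}       {biased(x)}")

# Biased codes sort like unsigned integers; 2's complement codes do not.
exponents = list(range(-BIAS, BIAS))
assert sorted(exponents, key=biased) == exponents
assert sorted(exponents, key=twos_complement) != exponents
```

The two patterns for any exponent differ only in the MSB, which is why comparing biased exponents needs nothing more than an unsigned comparison.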
The bias depends on the number of bits in the exponent field. If there are $e$ bits in this field, the bias is $2^{e-1}$, which lifts the representation (not the actual exponent) by half of the range to get rid of the negative codes of 2's complement. The range of actual exponents represented is still the same. With the biased exponent field $E$, the value represented by the notation is:
$$(-1)^S \times M \times 2^{E - 2^{e-1}}$$
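As a minimal sketch, a word in biased notation can be decoded as follows. It assumes the default convention of these notes (significand read as the fraction 0.M, no implied 1) and a bias of $2^{e-1}$; the function name `decode_biased` and the sample word are hypothetical.

```python
def decode_biased(word: int, e_bits: int, m_bits: int) -> float:
    """Decode sign | biased exponent | significand, with no implied 1."""
    bias = 1 << (e_bits - 1)
    m = word & ((1 << m_bits) - 1)
    e = (word >> m_bits) & ((1 << e_bits) - 1)
    s = (word >> (e_bits + m_bits)) & 1
    fraction = m / (1 << m_bits)  # the significand field read as 0.M
    return (-1) ** s * fraction * 2.0 ** (e - bias)


# Hypothetical 14-bit word: S=0, E=1010b (actual exponent 10 - 8 = 2),
# M=101101000b (0.703125)
word = (0 << 13) | (0b1010 << 9) | 0b101101000
print(decode_biased(word, e_bits=4, m_bits=9))
```

Setting the sign bit of the same word negates the result without touching the other two fields.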
Floating-Point Notation of IEEE 754
The IEEE 754 floating-point standard uses 32 bits to represent a single-precision floating-point number: 1 sign bit, 8 exponent bits, and 23 bits for the significand. As the implied base is 2, an implied 1 is used, i.e., the significand effectively has 24 bits, including 1 implied bit to the left of the binary point that is not explicitly stored. Note in particular that in IEEE 754 notation, the bias for the 8-bit exponent is $2^{8-1} - 1 = 127$ (instead of $2^{8-1} = 128$).

The 8-bit exponent field:
The range of exponents representable for normalized numbers is from -126 to 127 (exponent fields 1 to 254).
The largest exponent field, 255 (all 1's), is reserved: with an all-zero significand it represents infinity, which may occur when, e.g., a nonzero number is divided by zero; with a nonzero significand it represents not-a-number (NaN), which may result from, e.g., 0/0.
The smallest exponent field, 0, is reserved to represent zero and denormalized numbers (magnitudes smaller than $2^{-126}$, which cannot be normalized).
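The three cases of the exponent field can be made concrete with a small decoder for 32-bit patterns. This is a sketch rather than a reference implementation: `decode_ieee754` and `reference` are names chosen here, and `reference` simply cross-checks against Python's standard `struct` module.

```python
import math
import struct


def decode_ieee754(bits: int) -> float:
    """Decode a 32-bit IEEE 754 single-precision pattern by hand."""
    s = bits >> 31
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if e == 255:  # reserved: infinity or NaN
        return sign * math.inf if f == 0 else math.nan
    if e == 0:    # reserved: zero or denormalized (no implied 1)
        return sign * (f / 2**23) * 2.0 ** -126
    return sign * (1 + f / 2**23) * 2.0 ** (e - 127)  # normalized, implied 1


def reference(bits: int) -> float:
    """Same decoding done by the struct module, for comparison."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


for pattern in (0x3F800000, 0x7F800000, 0x00000001, 0x00000000):
    print(f"{pattern:#010x} -> {decode_ieee754(pattern)!r}")
```

The smallest positive denormalized value, pattern `0x00000001`, decodes to $2^{-23} \times 2^{-126} = 2^{-149}$.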
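Other Implied Bases
Given $e$ bits for the exponent field, the range of exponent values representable is $-2^{e-1} \le E \le 2^{e-1} - 1$, and the range of magnitudes representable is about
$$2^{-2^{e-1}} \le |x| \le 2^{2^{e-1}-1}$$
For example, if $e = 8$, the range is about $2^{-128} \approx 2.9 \times 10^{-39}$ to $2^{127} \approx 1.7 \times 10^{38}$.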
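This range can be extended by (a) increasing the number of bits for the exponent, or (b) increasing the implied base from 2 to 4, 8, 16, etc. In general, with implied base $B = 2^q$, the range of magnitudes representable is about $B^{-2^{e-1}} = 2^{-q\,2^{e-1}}$ to $B^{2^{e-1}-1} = 2^{q(2^{e-1}-1)}$. For example, when the implied base is $B = 2^4 = 16$, the same exponent field covers about $16^{-2^{e-1}}$ to $16^{2^{e-1}-1}$.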
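The effect of the two options on the representable range can be compared numerically. The sketch below assumes an $e$-bit exponent covering $-2^{e-1}$ to $2^{e-1}-1$ and an implied base $2^q$; the function names and the sample parameters are hypothetical, and ranges are reported as powers of ten to avoid overflow.

```python
import math


def exponent_range(e_bits: int):
    """Smallest and largest exponent of an e-bit exponent field."""
    half = 1 << (e_bits - 1)
    return -half, half - 1


def magnitude_range_log10(e_bits: int, q: int = 1):
    """log10 of the approximate magnitude limits for implied base 2**q."""
    lo, hi = exponent_range(e_bits)
    return q * lo * math.log10(2), q * hi * math.log10(2)


for e_bits, q in [(8, 1), (9, 1), (8, 2), (8, 4)]:
    lo, hi = magnitude_range_log10(e_bits, q)
    print(f"e={e_bits}, base={2**q:2d}: about 10^{lo:.1f} to 10^{hi:.1f}")
```

Raising the base widens the range without spending extra exponent bits, at the cost of coarser normalization steps, as discussed next.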
Normalization: If the implied base is $B = 2^q$, the significand must be shifted a multiple of $q$ bits at a time so that the exponent can be correspondingly adjusted to keep the value unchanged. If at least one of the first $q$ bits of the significand is 1, the representation is normalized. The implied 1 can no longer be used, since the leading bit of a normalized significand may be 0.
Example: normalize a number, noting that the base is 4 (instead of 2).
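Normalization with implied base $2^q$ can be sketched as follows. The input significand, its 10-bit width, and the function name `normalize_base` are hypothetical choices for illustration, not values from the original example.

```python
def normalize_base(significand: int, exponent: int, width: int, q: int):
    """Shift left q bits at a time until one of the top q bits is 1."""
    if significand == 0:
        return significand, exponent  # zero cannot be normalized
    top = ((1 << q) - 1) << (width - q)  # mask of the highest q bits
    while not (significand & top):
        significand <<= q  # multiplies the significand by 2**q ...
        exponent -= 1      # ... so the exponent of base 2**q drops by 1
    return significand, exponent


def value(significand: int, exponent: int, width: int, q: int) -> float:
    """Value of 0.M * (2**q)**E."""
    return significand / (1 << width) * (2.0 ** q) ** exponent


# Hypothetical input with implied base 4 (q = 2): 0.0000011011b * 4**3
m, e = 0b0000011011, 3
nm, ne = normalize_base(m, e, width=10, q=2)
print(f"{m:010b} E={e}  ->  {nm:010b} E={ne}")
assert value(m, e, 10, 2) == value(nm, ne, 10, 2)
```

The normalized significand here is `0110110000`: its MSB is 0 even though the representation is normalized, which illustrates why an implied 1 is not available when the base is larger than 2.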
Note that the significand has to be shifted to the left two bits at a time during normalization, because the smallest reduction of the exponent is 1, which corresponds to dividing the scale factor by 4 and must be compensated by multiplying the significand by 4 to keep the value represented unchanged. Similarly, if the implied base is $8 = 2^3$, the significand has to be shifted 3 bits at a time. In general, if $B = 2^q$, normalization means left-shifting the significand $q$ bits at a time until there is at least one 1 in the highest $q$ bits of the significand. Again, the implied 1 cannot be used.

Example: represent a number in biased notation, given the number of bits $e$ in the exponent field and an implied base of 2. The bias is $2^{e-1}$.
1.0
37.5
-78.25
As the most negative exponent representable is -126, this value is a denorm which cannot be normalized:
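Assuming the values listed above are meant to be encoded in the 32-bit IEEE 754 format, their bit fields can be obtained with Python's standard `struct` module. The helper `float_to_fields` is a name chosen here, and the denormalized value $2^{-130}$ is only an illustrative choice, since the original example value is not available.

```python
import struct


def float_to_fields(x: float) -> str:
    """Return 'S EEEEEEEE MMM...M' for x as a 32-bit IEEE 754 number."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    return f"{s:01b} {e:08b} {f:023b}"


# 2.0**-130 is a hypothetical denormalized value (below 2**-126)
for x in (1.0, 37.5, -78.25, 2.0 ** -130):
    print(f"{x!r:>24}: {float_to_fields(x)}")
```

For instance, $37.5 = 100101.1_2 = 1.001011_2 \times 2^5$, so the exponent field is $5 + 127 = 132 = 10000100_2$ and the stored significand begins with 001011. The denormalized value has an all-zero exponent field and no implied 1.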
Can you answer the following questions regarding 32-bit IEEE 754 floating-point representation and explain why?:
Ruye Wang 2003-10-24