16 bit floating point format

Use the 16 bit floating point format to perform the following:

a) Convert ED80 from fpx to decimal.
b) Convert 1.745 * 10^(-3) from decimal to fpx.
c) Add two fpx numbers (7B80 + 7300).
d) Subtract two fpx numbers (7700 - 7CF0).
e) Multiply two fpx numbers (7500 * A70A).

All of the numbers above are floating point (fp) values written in hexadecimal, except for the value in part (b), which is given in decimal.

Solution Preview

Since the question does not specify which 16 bit floating point format to use, this solution assumes the IEEE 754 half precision format described at http://en.wikipedia.org/wiki/Half_precision .

The bit layout of the 16 bit (half precision) floating point format is as follows.

(1 Sign bit, 5 Exponent bits, 10 Fraction/Significand bits)

Exponent bias (b) = 15
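
As an illustration of this layout, the three fields can be pulled out of a 16 bit word with shifts and masks. The following is a minimal Python sketch, not part of the original solution; the function name split_half is hypothetical and chosen only for this example.

```python
def split_half(hex_word):
    """Split a 16 bit half precision word, given as a hex string,
    into its (sign, exponent, fraction) fields."""
    bits = int(hex_word, 16)
    if not 0 <= bits <= 0xFFFF:
        raise ValueError("expected a 16 bit value")
    sign = bits >> 15               # top bit: 1 sign bit
    exponent = (bits >> 10) & 0x1F  # next 5 bits: biased exponent
    fraction = bits & 0x3FF         # low 10 bits: fraction
    return sign, exponent, fraction


if __name__ == "__main__":
    s, e, f = split_half("ED80")
    # Show the exponent and fraction both as integers and as bit strings.
    print(s, e, format(e, "05b"), format(f, "010b"))
```

For the word ED80 used in part (a), this yields sign 1, exponent 27 and fraction bits 0110000000.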

In the response below, suffixes H and B are used to indicate hexadecimal and binary values respectively.

a) Convert ED80 from fpx to decimal

(ED80)H = (1110 1101 1000 0000)B

Sign bit (S) = 1
Exponent bits (e) = (11011)B = 27
Fraction bits (f) = (01 1000 0000)B

Equivalent decimal number = (-1)^S * 2^(e-b) * (1.f)B
= (-1)^1 * 2^(27-15) * (1.0110000000)B
= - 2^12 * (1.0110000000)B
= - (1011000000000)B
= - (4096+1024+512)
= - 5632 or -5632.00
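
The hand calculation in part (a) can be cross-checked in code. The sketch below applies the same formula, (-1)^S * 2^(e-b) * (1.f)B, and compares the result with Python's struct module, whose "e" format code decodes IEEE 754 half precision. The function names half_to_decimal and half_via_struct are hypothetical, and the sketch assumes a normalized value (exponent field neither all zeros nor all ones), which covers ED80.

```python
import struct

BIAS = 15  # exponent bias b for half precision


def half_to_decimal(hex_word):
    """Decode a normalized half precision word using
    (-1)^S * 2^(e - b) * (1.f)."""
    bits = int(hex_word, 16)
    sign = bits >> 15
    exponent = (bits >> 10) & 0x1F
    fraction = bits & 0x3FF
    if exponent in (0, 0x1F):
        # Zero, subnormals, infinity and NaN need separate rules.
        raise ValueError("this sketch handles normalized values only")
    significand = 1 + fraction / 1024  # (1.f)B with 10 fraction bits
    return (-1) ** sign * 2.0 ** (exponent - BIAS) * significand


def half_via_struct(hex_word):
    """Decode the same word with the standard library, for comparison."""
    return struct.unpack(">e", bytes.fromhex(hex_word))[0]


if __name__ == "__main__":
    print(half_to_decimal("ED80"))  # -5632.0
    print(half_via_struct("ED80"))  # -5632.0
```

Both routes agree with the value -5632 obtained by hand above.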

b) Convert 1.745 * 10^(-3) from ...
