Floating point precision


A floating point. Well, a floating doughnut.

How do you compare two numbers for equality? Use the equality operator of your language-of-choice, right? Well, not exactly. For example, consider the following code (in Java):

int i = 2 * 5;
if (i == 10) {
    // ...do something
} else {
    // ...do something else
}

and this one:

float f = 2 * 5;
if (f == 10) {
    // ...do something
} else {
    // ...do something else
}

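What would be the end result? The condition in the first if block always evaluates to true. The second one happens to be true as well in this particular case, because 2 * 5 is computed with integers and 10 is exactly representable as a float, but in general we cannot be so sure about equality checks on floats, especially when the value comes from arithmetic on fractions. But why?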

The reason lies in the way the numbers are stored in memory. Integer numbers (for example int or long) are stored in an exact manner. That is, the number is stored in a 32-bit or 64-bit memory location representing the number with all digits intact. For example, the decimal integer number 84,765,434 is stored in 32 bits as:

00000101000011010110101011111010
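
To check the bit pattern yourself, a small sketch like the following can help. The class name IntBits and the helper toBits32 are made up for this illustration; Integer.toBinaryString is the standard library call, and since it drops leading zeros, the helper pads the result back to 32 characters.

```java
class IntBits {
    // Returns the 32-bit pattern of value, padded with leading zeros.
    // (Hypothetical helper, named for this example only.)
    static String toBits32(int value) {
        String bits = Integer.toBinaryString(value);
        return String.format("%32s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        // Prints 00000101000011010110101011111010
        System.out.println(toBits32(84_765_434));
    }
}
```

Every bit is significant here: parsing the string back as a base-2 number gives exactly 84,765,434 again.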

Floats, in contrast, are stored with an approximation, using a sign, an exponent and a mantissa part, assuming that any real number can be represented as

Significant digits × base^exponent

IEEE-754 Floating Point (from Wikipedia)
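
The three parts can be pulled out of a Java float with Float.floatToIntBits, which returns the raw 32 bits. The sketch below assumes the usual single-precision layout (1 sign bit, 8 exponent bits with a bias of 127, 23 mantissa bits) and only handles normal numbers, not zero, subnormals, infinities or NaN. The class and method names are invented for this example.

```java
class FloatParts {
    // Sign bit: 0 for positive, 1 for negative.
    static int sign(float x) {
        return Float.floatToIntBits(x) >>> 31;
    }

    // Unbiased exponent (stored exponent minus the bias of 127).
    static int exponent(float x) {
        return ((Float.floatToIntBits(x) >>> 23) & 0xFF) - 127;
    }

    // The 23 stored mantissa bits (the leading 1 is implicit for normal numbers).
    static int mantissa(float x) {
        return Float.floatToIntBits(x) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float x = 10.0f; // 10 = 1.25 * 2^3
        System.out.println("sign     = " + sign(x));
        System.out.println("exponent = " + exponent(x));
        System.out.println("mantissa = " + String.format("0x%06X", mantissa(x)));
    }
}
```

For 10.0f this gives sign 0, exponent 3 and mantissa 0x200000, that is, (1 + 0.25) × 2^3 = 10.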


As you may have already recognized, the number is only accurate to the number of significant digits. For example, the above example of 84,765,434 becomes 84,765 × 10^3, that is, 84,765,000, if only 5 significant digits are kept. Thankfully, in programming languages the significand is much longer (roughly 7 decimal digits for a float and 15 to 16 for a double), so normally no such rounding occurs for smaller numbers. However, because of the way the numbers are stored, small errors can be introduced by operations on floating point numbers, particularly when fractions such as 0.1, which have no exact binary representation, are involved. For example, a calculation whose exact result is 10 may give you something like (this is written purely as an example):

10.000000000000000001

because of the imperfections introduced along the way. Thus, when you compare it with the equality operator (== in Java)

if (f == 10) {

it will probably fail. What is more, to a lesser extent, the operators greater than or equal to (>=) and less than or equal to (<=) will also be affected. In the above example, this if block will also fail:

if (f <= 10) {

because f is greater than 10, even if only by a tiny bit.
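
To see the effect with real values rather than a made-up one, here is a minimal sketch. It uses two well-known cases: an int that needs more significant bits than a float can hold, and the double sum 0.1 + 0.2. The class and method names are hypothetical.

```java
class RoundingDemo {
    // Converts an int to float and back, exposing any digits lost on the way.
    static int throughFloat(int value) {
        return (int) (float) value;
    }

    // 0.1 and 0.2 have no exact binary representation, so the sum is slightly off.
    static double pointOnePlusPointTwo() {
        return 0.1 + 0.2;
    }

    public static void main(String[] args) {
        // A float keeps only 24 significant bits; this value needs 27.
        System.out.println(throughFloat(84_765_434));   // prints 84765432

        double sum = pointOnePlusPointTwo();
        System.out.println(sum);                        // prints 0.30000000000000004
        System.out.println(sum == 0.3);                 // prints false
        System.out.println(sum <= 0.3);                 // prints false
    }
}
```

Both == and <= fail against 0.3 here, because the computed sum lands just above it.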

There are several workarounds to this problem, each with its own problems. One is to allow for a tolerance (epsilon) when comparing floats. Such as:

double e = 1e-6;
if (Math.abs(f - 10) < e) {
    // ...do something assuming f is equal to 10
}

Here, we are ignoring any difference smaller than 10^-6 between f and 10. That is, any value of f between 9.999999 and 10.000001 will pass as being equal to 10. The problem here is that some values of f may lie within this interval and yet should not pass the test. You can read more on the mathematics of accuracy problems.
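
A fixed epsilon also does not scale: 10^-6 is huge next to values around 10^-9 and too tight for values around 10^12. One common variation, sketched below, is to combine an absolute tolerance with a relative one. This is only an illustration under that assumption; the class FloatCompare, the method nearlyEqual and the tolerance values are made up for this example, and suitable tolerances depend on your data.

```java
class FloatCompare {
    // Returns true if a and b differ by at most absTol, or by at most
    // relTol times the larger magnitude of the two.
    static boolean nearlyEqual(double a, double b, double absTol, double relTol) {
        if (a == b) {
            return true;  // exact match, including equal infinities
        }
        if (Double.isNaN(a) || Double.isNaN(b)
                || Double.isInfinite(a) || Double.isInfinite(b)) {
            return false; // NaN never compares equal; unequal infinities don't either
        }
        double diff = Math.abs(a - b);
        double scale = Math.max(Math.abs(a), Math.abs(b));
        return diff <= Math.max(absTol, relTol * scale);
    }

    public static void main(String[] args) {
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3, 1e-9, 1e-9));   // true
        System.out.println(nearlyEqual(10.0, 10.1, 1e-6, 1e-6));       // false
        System.out.println(nearlyEqual(1e12, 1e12 + 1, 1e-6, 1e-9));   // true
    }
}
```

The absolute tolerance covers comparisons near zero, where a relative tolerance shrinks to nothing, while the relative tolerance covers large values. Like the simple epsilon test, this can still accept values that should have been rejected.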
