ULP (unit of least precision)
Can anyone explain ULP (unit of least precision)? I have the following definition, but it is still not clear:

"The size of the error when representing fractions is proportional to the size of the number stored. The ULP or unit of least precision defines the maximum error you can get when storing a number. The bigger the number stored the bigger the ULP."

What does it mean, exactly? Thanks in advance.

Mattingly answered 14/5, 2017 at 14:57 Comment(0)

In floating-point formats, numbers are represented with a sign s, a significand (also called a fraction) f, and an exponent e. E.g., with binary floating-point, the value represented by s, f, and e is (−1)^s • f • 2^e.

f is restricted to a certain number of digits and, with a floating-point format using two for its base, is typically required to be at least one and less than two. The smallest change that can be made in the number (with certain exceptions discussed below) is to modify the last digit of f by 1. For example, if f is restricted to six binary digits, then it has values from 1.00000 to 1.11111, and the smallest change that can be made in it is 0.00001. Given the exponent e, a change of 0.00001 in f modifies the value represented by 0.00001 • 2^e. This is the unit of least precision (ULP).

Note that the ULP varies depending on the exponent.
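
To see this concretely, here is a small sketch in Python 3.9+, whose math.ulp function returns the ULP of an IEEE 754 binary64 (double) value; the 53-bit significand means the ULP of numbers in [1, 2) is 2^-52:

    import math

    # binary64 has a 53-bit significand, so for values in [1, 2)
    # the ULP is 2**-52; raising the exponent by one doubles the ULP.
    print(math.ulp(1.0))   # 2.220446049250313e-16 == 2**-52
    print(math.ulp(2.0))   # 4.440892098500626e-16 == 2**-51
    print(math.ulp(1e10))  # 1.9073486328125e-06 -- a much larger ULP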

The exceptions I mentioned occur at the very largest representable finite value (where the number can only be increased by producing infinity), at the very smallest (most negative) representable finite value, at zero and subnormal numbers (where special things happen with the fraction and the exponent), and at boundaries where the exponent changes. When you step downward across such a boundary, the exponent decreases, which means the value of the least significant digit of f decreases, so the step is actually ½ of the old ULP.
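
The halving at an exponent boundary can be observed with math.nextafter (also Python 3.9+); a quick sketch:

    import math

    # Stepping up from 1.0 stays in the exponent range of [1, 2), so the
    # step is ulp(1.0) = 2**-52. Stepping down crosses the boundary into
    # [0.5, 1), so the step is only 2**-53 -- half the old ULP.
    up = math.nextafter(1.0, 2.0) - 1.0    # 2**-52
    down = 1.0 - math.nextafter(1.0, 0.0)  # 2**-53
    print(up == 2 * down)                  # True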

When a single operation is limited only by which numbers the floating-point system can represent in its finite range (rather than by exceeding that range), the maximum error in a result is ½ of an ULP. This is because, if you were further than ½ of an ULP from the mathematically exact result, you could alter the calculated result by 1 ULP so that its error decreases in magnitude. (E.g., if the exact result is 3.75 and the ULP is 1, changing the computed result from 3 to 4 reduces the error from .75 to .25.)

Elementary arithmetic operations, such as addition, multiplication, and division, should return results rounded to the nearest representable value, so their errors are at most ½ of an ULP. Square root should also be implemented that way. It is a goal for math library functions (such as cosine and logarithm) to be correctly rounded, but correct rounding is hard to achieve, so commercial libraries generally do not guarantee it.
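
You can check the ½ ULP bound for one addition against exact rational arithmetic; Fraction(x) converts a float to its exact mathematical value, so no further rounding creeps into the bookkeeping:

    import math
    from fractions import Fraction

    a, b = 0.1, 0.2
    computed = a + b                    # one correctly rounded operation
    exact = Fraction(a) + Fraction(b)   # exact sum of the two doubles
    error = abs(Fraction(computed) - exact)
    print(error <= Fraction(math.ulp(computed)) / 2)  # True: within 1/2 ULP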

Conversions from decimal (e.g. in ASCII text) to an internal floating-point format ought to be correctly rounded, but not all software libraries or language implementations do this correctly.
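
CPython's decimal-to-float conversion is correctly rounded, so the same kind of check passes for a literal like 0.1 (a sketch, again using exact rationals):

    import math
    from fractions import Fraction

    x = 0.1                                    # decimal text -> binary64
    error = abs(Fraction(x) - Fraction(1, 10))
    print(error <= Fraction(math.ulp(x)) / 2)  # True: correctly rounded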

Compound operations, such as subroutines that perform many calculations to get a result, will have many rounding errors and generally will not return a result that is within ½ an ULP of the mathematically exact result.
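
The classic demonstration is repeated addition: each step rounds, and the accumulated error pushes the result beyond ½ ULP of the exact answer:

    total = 0.0
    for _ in range(10):
        total += 0.1     # each addition rounds; the errors accumulate
    print(total)         # 0.9999999999999999, not 1.0
    print(total == 1.0)  # False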

Note that it is not technically correct to say the size of the error when representing fractions is proportional to the size of the number stored. The bound on the error is roughly proportional: we can say ½ ULP is a bound on the error, and an ULP is roughly proportional to the number. It is only roughly proportional because it varies by a factor of two (when using binary) as the fraction ranges from one to two. E.g., 1 and 1.9375 have the same ULP because they use the same exponent, but the ULP is a larger proportion of 1 than it is of 1.9375.

And only the bound on the error is roughly proportional. The actual error depends on the numbers involved. E.g., if we add 1 and 1, we get 2 with no error.
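
Both points are easy to confirm with math.ulp (Python 3.9+):

    import math

    # Same exponent, same ULP -- so the ULP is a larger fraction of 1.0.
    print(math.ulp(1.0) == math.ulp(1.9375))  # True
    print(math.ulp(1.0) / 1.0)                # ~2.22e-16
    print(math.ulp(1.9375) / 1.9375)          # ~1.15e-16

    # Some operations are exact: 1 + 1 is representable, so there is no error.
    print(1.0 + 1.0 == 2.0)                   # True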

Tired answered 15/5, 2017 at 1:40 Comment(0)

Each floating-point number represents an interval of real numbers. How that interval is situated relative to its floating-point number (interpreted as a dyadic fraction) depends on the rounding mode. The error relates to the maximum distance from any real number in the interval to the floating-point number.

So the safest answer is the distance to the next floating-point number on either side. If the rounding mode is the usual rounding to the nearest floating-point number, the maximum error is half of that.
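
As a sketch of that picture in Python 3.9+ (math.nextafter locates the neighboring floats; exact rationals keep the midpoint arithmetic exact):

    import math
    from fractions import Fraction

    x = 0.1
    lo = Fraction(math.nextafter(x, -math.inf))  # neighbor below, exactly
    hi = Fraction(math.nextafter(x, math.inf))   # neighbor above, exactly
    fx = Fraction(x)

    # Under round-to-nearest, every real strictly between the midpoints
    # rounds to x, so the worst-case error is half the neighbor distance.
    max_err = max(fx - (fx + lo) / 2, (fx + hi) / 2 - fx)
    print(float(max_err), math.ulp(x) / 2)       # both ~6.94e-18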

Nimiety answered 14/5, 2017 at 16:1 Comment(3)
In IEEE 754, a floating-point datum represents exactly one real number. They do not represent intervals. The values represented are specified in clause 3.3. - Tired
I did not say that it implements interval arithmetic. But what do you call the process that assigns arctan(1) or sqrt(2) a floating-point number? - Nimiety
Rounding. Ideally, the value returned by arctan or other elementary functions is the result of rounding the exact mathematical result to the nearest representable number according to the rules and direction of the applicable rounding mode (usually to the nearest value, with ties toward an even low bit, but other modes include rounding toward +infinity, rounding toward -infinity, and rounding toward zero). sqrt can be readily implemented with this property. However, in practice, most math libraries do not implement arctan and other functions with this property. - Tired
