Most accurate way to compute asinhf() from log1pf()?

The inverse hyperbolic function asinh() is closely related to the natural logarithm. I am trying to determine the most accurate way to compute asinh() from the C99 standard math function log1p(). For ease of experimentation, I am limiting myself to IEEE-754 single-precision computation right now, that is, I am looking at asinhf() and log1pf(). I intend to re-use the exact same algorithm for double-precision computation, i.e. asinh() and log1p(), later.

My primary goal is to minimize ulp error; the secondary goal is to minimize the number of incorrectly rounded results, under the constraint that the improved code would be at most minimally slower than the versions posted below. Any incremental improvement to accuracy, say 0.2 ulp, would be welcome. Adding a couple of FMAs (fused multiply-adds) would be fine; on the other hand, I am hoping someone could identify a solution which employs a fast rsqrtf() (reciprocal square root).

The resulting C99 code should lend itself to vectorization, possibly by some minor straightforward transformations. All intermediate computation must occur at the precision of the function argument and result, as any switch to higher precision may have a severe negative performance impact. The code must work correctly both with IEEE-754 denormal support and in FTZ (flush to zero) mode.

So far, I have identified the following two candidate implementations. Note that the code may be easily transformed into a branchless vectorizable version with a single call to log1pf(), but I have not done so at this stage to avoid unnecessary obfuscation.

/* for a >= 0, asinh(a) = log (a + sqrt (a*a+1))
                        = log1p (a + (sqrt (a*a+1) - 1))
                        = log1p (a + sqrt1pm1 (a*a))
                        = log1p (a + (a*a / (1 + sqrt(a*a + 1))))
                        = log1p (a + a * (a / (1 + sqrt(a*a + 1))))
                        = log1p (fma (a / (1 + sqrt(a*a + 1)), a, a))
                        = log1p (fma (1 / (1/a + sqrt(1/(a*a) + 1)), a, a))
*/
float my_asinhf (float a)
{
    float fa, t;
    fa = fabsf (a);
#if !USE_RECIPROCAL
    if (fa >= 0x1.0p64f) { // prevent overflow in intermediate computation
        t = log1pf (fa) + 0x1.62e430p-1f; // log(2)
    } else {
        t = fmaf (fa / (1.0f + sqrtf (fmaf (fa, fa, 1.0f))), fa, fa);
        t = log1pf (t);
    }
#else // USE_RECIPROCAL
    if (fa > 0x1.0p126f) { // prevent underflow in intermediate computation
        t = log1pf (fa) + 0x1.62e430p-1f; // log(2)
    } else {
        t = 1.0f / fa;
        t = fmaf (1.0f / (t + sqrtf (fmaf (t, t, 1.0f))), fa, fa);
        t = log1pf (t);
    }
#endif // USE_RECIPROCAL
    return copysignf (t, a); // restore sign
}
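
For illustration, a branchless variant of the USE_RECIPROCAL = 0 path might be arranged roughly as follows (a minimal sketch, not part of the measured code; the function name is made up, and the selects are expected to map to vector blends during vectorization):

float my_asinhf_branchless (float a)
{
    float fa = fabsf (a);
    int big = fa >= 0x1.0p64f;                 // same overflow guard as above
    float u = big ? 0.0f : fa;                 // keep u*u from overflowing
    float t = fmaf (u / (1.0f + sqrtf (fmaf (u, u, 1.0f))), u, u);
    float r = log1pf (big ? fa : t) + (big ? 0x1.62e430p-1f : 0.0f); // add log(2) only for large fa
    return copysignf (r, a);                   // restore sign
}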

With a particular log1pf() implementation that is accurate to < 0.6 ulps, I am observing the following error statistics when testing exhaustively across all 2^32 possible IEEE-754 single-precision inputs. When USE_RECIPROCAL = 0, the maximum error is 1.49486 ulp, and there are 353,587,822 incorrectly rounded results. With USE_RECIPROCAL = 1, the maximum error is 1.50805 ulp, and there are only 77,569,390 incorrectly rounded results.
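
For reference, a minimal exhaustive-test harness along these lines might look as follows (a sketch under assumptions, not the harness actually used: it counts only incorrectly rounded results, using (float) asinh ((double) x) as the reference, which is itself subject to double rounding, so the counts it produces are approximate; measuring ulp error requires a higher-precision reference and is omitted here):

#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <math.h>

float my_asinhf (float a); /* function under test */

int main (void)
{
    uint64_t wrong = 0;
    uint32_t i = 0;
    do {
        float x;
        memcpy (&x, &i, sizeof x);                  // reinterpret the bit pattern as a float
        if (!isnan (x)) {
            float res = my_asinhf (x);
            float ref = (float) asinh ((double) x); // approximate correctly rounded reference
            if (res != ref) wrong++;
        }
        i++;
    } while (i != 0);                               // wraps to zero after covering all 2^32 patterns
    printf ("incorrectly rounded results: %llu\n", (unsigned long long) wrong);
    return 0;
}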

In terms of performance, the variant USE_RECIPROCAL = 0 will be faster if reciprocals and full divisions take roughly the same amount of time, but the variant USE_RECIPROCAL = 1 could be faster if very fast reciprocal support is available.

Answers can assume that all basic arithmetic, including FMA (fused multiply-add) is correctly rounded according to IEEE-754 round-to-nearest-or-even mode. In addition, faster, nearly correctly rounded, versions of reciprocal and rsqrtf() may be available, where "nearly correctly rounded" means the maximum ulp error will be limited to something like 0.53 ulps and the overwhelming majority of results, say > 95%, are correctly rounded. Basic arithmetic with directed roundings may be available at no additional cost to performance.

Kristine answered 30/12, 2015 at 18:42 Comment(19)
If you want accuracy, ditch float and move to double asap, or even long double if your compiler supports 80-bit real number representation.Hangout
That is not a realistic option for performance reasons, all intermediate computation must occur at the "native precision" of the input and result. I'll edit that requirement into the question.Kristine
double is more native than float, which is now only suitable for some very restricted systems like embedded.Hangout
Your question is "most accurate way" so I suggest you cut to the chase and not have to go through your tests all over again.Hangout
By "native" I meant "the same type as used in the function prototype". There are commonly used platforms (GPUs) where float operations can have significantly higher throughput than double operations, and on many platform including x86 with AVX there is no convenient way to use any type wider than double.Kristine
So you can use double for better accuracy.Hangout
@WeatherVane "Most accurate" under the constraints I outlined. I have clarified the question with regard to the use of higher precision in intermediate computations; what other clarifications do you think are needed?Kristine
Yes, all platforms I am considering do have double support, but it may be up to 32x slower than float, so it is not practical given the need for high performance. In addition, I am using the float version as an experimental platform for the double version, and there is no hardware support for any precision higher than double.Kristine
Please make your mind up. Do you want double or do you want float? Do you want speed or do you want accuracy? You said, in the question, before moving to double. My first comment stands.Hangout
I want the best accuracy under the constraints stated, that is, without totally screwing performance. The question as written is about the float case, but I am intending to re-use the same algorithm for double later, as I mentioned at the start of the question.Kristine
My assessment of your my_asinhf() using #if !USE_RECIPROCAL also shows a worst-case error of about 1 ULP at a == 7.517228127e-01 and an average error of 1/8 ULP. I do not think a more accurate answer is feasible with the approaches used. Perhaps a minimax polynomial?Flowerpot
A minimax polynomial is certainly a possibility I considered, and is my usual "plan B" for such issues. But before I go there, I wanted to check whether I am missing something regarding possible re-arrangements of the arithmetic. For example, is there an arrangement where either a sqrt or rsqrt (a contracting operation) is used late, rather than early in the computation? Can FMA be used more effectively than it currently is to improve accuracy? Is there a way to implement sqrt1pm1 that would help accuracy (I have no practical experience with that function)?Kristine
I'm unclear on the value of accuracy versus speed. May I surmise correctly that you want a maximum ULP error of about 0.5-0.6-ish and better speed, using your posted solution as the function to beat? Or do you want 2x+ speed while allowing errors of up to 2.0 ULP?Flowerpot
Ideally I would like to retain the performance of my posted variants, but achieve better accuracy. I will take any accuracy improvement I can get, e.g. 0.2 ulp better than the current code. If a more accurate solution would require, say, a couple more FMAs, I could live with the performance impact. Conversely I am hoping someone can find a way to exploit a fast rsqrt. Getting to a faithfully-rounded implementation (< 1 ulp) would be great. My current thinking is that shooting for any bound tighter than that would definitely mean negatively impacting performance, which I want to avoid.Kristine
@chux After spending considerable additional time researching this, the only solution I found that reduces the error further than my variant #if !USE_RECIPROCAL is to compute the input to log1pf() as a double-float arg_hi, arg_lo, then compute asinh() as fmaf (arg_lo, 1.0f/(1.0f+arg_hi), log1pf (arg_hi)). When using a correctly rounded log1pf() this reduces the error in asinhf() to 1 ulp. This is not practical from a performance perspective because it requires either higher precision or expensive double-float computation for the argument transformation.Kristine
Why the specific constraint that the implementation reduce computing asinhf to computing log1pf?Danieldaniela
That is a very common way of computing asinh(), and happens to be the one I am currently exploring in detail. The attractiveness of such a general approach is that one can build implementations for various math functions from just a few building blocks, with inverse hyperbolic functions in particular being mapped back to log1p(). Since the difficult arguments for asinh() are all small, one could use a minimax polynomial for those, but that would be yet another code branch (not advantageous if the goal is a vectorizable implementation), and worthy of a separate question.Kristine
Are you familiar with Table Maker's Dilemma?Flowerpot
I am very much familiar with TMD, how does this figure into my question, though?Kristine

After various additional experiments, I have convinced myself that a simple argument transformation that does not use higher precision than the argument and result cannot achieve a tighter error bound than the one achieved by the first variant in the code I posted.

Since my question is about minimizing the error of the argument transformation, which is incurred in addition to the error in log1pf() itself, the most straightforward approach for experimentation is to use a correctly rounded implementation of that logarithm function. Note that a correctly rounded implementation is highly unlikely to exist in the context of a high-performance environment. According to the works of J.-M. Muller et al., to produce accurate single-precision results, x86 extended-precision computation should be sufficient, for example

float accurate_log1pf (float a)
{
    float res;
    __asm fldln2;                  // push log(2), the scale factor expected by fyl2xp1
    __asm fld     dword ptr [a];   // push the argument a
    __asm fyl2xp1;                 // compute log(2) * log2(a + 1) = log(1 + a) in extended precision
    __asm fstp    dword ptr [res]; // round to single precision, store, and pop to leave the x87 stack clean
    return res;
}
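
Where x87 inline assembly is not available (for example, in a 64-bit MSVC build), a hypothetical portable stand-in for these experiments, not used for the measurements below, is to promote to double; because of the final narrowing conversion this is subject to double rounding and therefore only nearly correctly rounded:

float accurate_log1pf_portable (float a) // hypothetical portable substitute
{
    return (float) log1p ((double) a);   // nearly correctly rounded; occasional double-rounding error
}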

An implementation of asinhf() using the first variant from my question then looks as follows:

float my_asinhf (float a)
{
    float fa, t;
    fa = fabsf (a);
    if (fa >= 0x1.0p64f) { // prevent overflow in intermediate computation
        t = log1pf (fa) + 0x1.62e430p-1f; // log(2)
    } else {
        t = fmaf (fa / (1.0f + sqrtf (fmaf (fa, fa, 1.0f))), fa, fa);
        t = accurate_log1pf (t);
    }
    return copysignf (t, a); // restore sign
}

Testing with all 2^32 IEEE-754 single-precision operands shows that the maximum error of 1.49486070 ulp occurs at ±0x1.ff5022p-9 and there are 353,521,140 incorrectly rounded results. What happens if the entire argument transformation uses double-precision arithmetic? The code changes to

float my_asinhf (float a)
{
    float fa, t;
    fa = fabsf (a);
    if (fa >= 0x1.0p64f) { // prevent overflow in intermediate computation
        t = log1pf (fa) + 0x1.62e430p-1f; // log(2)
    } else {
        double tt = fa;
        tt = fma (tt / (1.0 + sqrt (fma (tt, tt, 1.0))), tt, tt);
        t = (float)tt;
        t = accurate_log1pf (t);
    }
    return copysignf (t, a); // restore sign
}

However, the error bound does not improve with this change! The maximum error of 1.49486070 ulp still occurs at ±0x1.ff5022p-9 and there are now 350,971,046 incorrectly rounded results, slightly fewer than before. The issue seems to be that a float operand cannot convey enough information to log1pf() to produce more accurate results. A similar problem occurs when computing sinf() and cosf(). If the reduced argument, represented as a correctly rounded float operand, is passed to the core polynomials, the resulting error in sinf() and cosf() is just a tad under 1.5 ulp, just as we are observing here with my_asinhf().

One solution is to compute the transformed argument to higher than single precision, for example as a double-float operand pair (a useful brief overview of double-float techniques can be found in this paper by Andrew Thall). In this case, we can use the additional information to perform linear interpolation on the result, based on the knowledge that the derivative of the logarithm is the reciprocal: log(1 + t_hi + t_lo) ≈ log(1 + t_hi) + t_lo / (1 + t_hi). This gives us:

float my_asinhf (float a)
{
    float fa, s, t;
    fa = fabsf (a);
    if (fa >= 0x1.0p64f) { // prevent overflow in intermediate computation
        t = log1pf (fa) + 0x1.62e430p-1f; // log(2)
    } else {
        double tt = fa;
        tt = fma (tt / (1.0 + sqrt (fma (tt, tt, 1.0))), tt, tt);
        t = (float)tt;                // "head" of double-float
        s = (float)(tt - (double)t);  // "tail" of double-float
        t = fmaf (s, 1.0f / (1.0f + t), accurate_log1pf (t)); // interpolate
    }
    return copysignf (t, a); // restore sign
}

Exhaustive testing of this version indicates that the maximum error has been reduced to 0.99999948 ulp; it occurs at ±0x1.deeea0p-22. There are 349,653,534 incorrectly rounded results. A faithfully rounded implementation of asinhf() has been achieved.

Unfortunately, the practical utility of this result is limited. Depending on the hardware platform, the throughput of double arithmetic may be only 1/2 to 1/32 of the throughput of float operations. The double-precision computation could be replaced with double-float computation, but this would also incur very significant cost. Lastly, my approach here was to use the single-precision implementation as a proving ground for subsequent double-precision work, and many hardware platforms (certainly all the ones I am interested in) do not offer hardware support for a numeric format with higher precision than IEEE-754 binary64 (double precision). Therefore any solution should not require higher-precision arithmetic in intermediate computation.

Since all the troublesome arguments in the case of asinhf() are small in magnitude, one could at least partially address the accuracy issue by using a polynomial minimax approximation for the region around the origin, as sketched below. As this would create another code branch, it would likely make vectorization more difficult.
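
Purely to illustrate the structure of such a core (this is not a fitted approximation): the coefficients below are simply the leading Taylor-series terms of asinh(x) about zero, x - x^3/6 + 3*x^5/40 - 15*x^7/336, used as placeholders; an actual implementation would substitute minimax coefficients generated for the chosen interval and would need more terms to reach acceptable accuracy across it.

float asinhf_small_poly (float x)      // placeholder name; only meaningful for small |x|
{
    float s = x * x;
    float p =        -4.46428571e-2f;  // ~ -15/336: x^7 Taylor coefficient (placeholder)
    p = fmaf (p, s,   7.50000000e-2f); // ~   3/40:  x^5 Taylor coefficient (placeholder)
    p = fmaf (p, s,  -1.66666667e-1f); // ~  -1/6:   x^3 Taylor coefficient (placeholder)
    return fmaf (p * s, x, x);         // x + s*x*(c3 + c5*s + c7*s^2); keeps the leading term x exact
}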

Kristine answered 1/1, 2016 at 6:24 Comment(0)

Firstly, you may want to look into the accuracy and speed of your log1pf function: these can vary a bit between libms (I've found the OS X math functions to be fast, the glibc ones to be slower but typically correctly rounded).

Openlibm, which is based on the BSD libm (in turn derived from Sun's fdlibm), uses several approaches depending on the argument range, but the main piece is the relation:

t = x*x;
w = log1pf(fabsf(x)+t/(one+sqrtf(one+t)));
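
Wrapped into a complete function, that core relation might look roughly like this (my own simplified sketch, not the actual openlibm code: the fdlibm-derived implementation dispatches on several argument ranges and so, unlike this sketch, does not overflow in x*x for huge arguments):

float asinhf_fdlibm_style (float x)   // illustrative name
{
    const float one = 1.0f;
    float ax = fabsf (x);
    float t  = ax * ax;                                     // note: overflows for very large ax
    float w  = log1pf (ax + t / (one + sqrtf (one + t)));   // the core relation quoted above
    return copysignf (w, x);                                // restore sign
}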

You may also want to try compiling with the -fno-math-errno option, which disables the old System V error codes for sqrt (IEEE-754 exceptions will still work).

Somehow answered 31/12, 2015 at 15:39 Comment(1)
My question is about minimizing the error incurred on top of the error due to log1pf(). With a correctly rounded log1pf(), my two asinhf() variants have maximum errors of 1.49486 and 1.50805 ulp, respectively. With a faithfully rounded log1pf() [max error of 0.88383 ulp], the error in asinhf() grows to 1.70243 / 1.72481 ulp. With your proposed variant used in conjunction with a correctly rounded log1pf(), the maximum error in asinhf() is 1.54368 ulp, so higher than with either of my two current variants.Kristine
