high bits of long multiplication in Java?

Asked 17/9, 2013 at 20:22 Answered 30/3, 2019 at 10:3

Is there any way to get the high half of the multiplication of two longs in Java? I.e. the part that vanishes due to overflow. (So the upper 64 bits of the 128-bit result)

I'm used to writing OpenCL code where the command mul_hi does exactly this: http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/mul_hi.html

Since OpenCL can do it efficiently on my CPU, Java should be able to do so as well, but I can't find how I should do this (or even mimic its behaviour efficiently) in Java. Is this possible in Java, and if so, how?

Sharice answered 17/9, 2013 at 20:22 Comment(1)

Do you mean the high 64 bits of the 128-bit result? – Factional 17/9, 2013 at 20:26

The accepted solution is wrong most of the time (66%), though the error is bounded (it can be smaller than the exact result by at most 2 and it can never be bigger). This comes from

ignoring the x_lo * y_lo product
first shifting and then adding x_hi * y_lo and x_lo * y_hi

My solution seems to always work for non-negative operands.

final long x_hi = x >>> 32;
final long y_hi = y >>> 32;
final long x_lo = x & 0xFFFFFFFFL;
final long y_lo = y & 0xFFFFFFFFL;
long result = x_lo * y_lo;
result >>>= 32;

result += x_hi * y_lo + x_lo * y_hi;
result >>>= 32;
result += x_hi * y_hi;

Tested on a billion random operands. There should be a special test for corner cases and some analysis.

Dealing with negative operands would be more complicated as it'd prohibit using the unsigned shift and force us to handle intermediate result overflow.

In case speed doesn't matter much (and it rarely does), I'd go for

 BigInteger.valueOf(x).multiply(BigInteger.valueOf(y))
     .shiftRight(64).longValue();

Bergen answered 25/7, 2015 at 17:35 Comment(1)

You could link to or otherwise indicate the accepted solution in the future to avoid what happened now: your solution became the accepted one, so it isn't clear which solution you are referring to that is wrong. – Carmarthenshire 24/6, 2019 at 17:41

Java 9 has Math.multiplyHigh, which according to the Javadocs "Returns as a long the most significant 64 bits of the 128-bit product of two 64-bit factors."

Lynsey answered 14/8, 2018 at 7:0 Comment(0)

The accepted solution is wrong most of the time (66%), though the error is bounded (it can be smaller than the exact result by at most 2 and it can never be bigger). This comes from

ignoring the x_lo * y_lo product
first shifting and then adding x_hi * y_lo and x_lo * y_hi

My solution seems to always work for non-negative operands.

final long x_hi = x >>> 32;
final long y_hi = y >>> 32;
final long x_lo = x & 0xFFFFFFFFL;
final long y_lo = y & 0xFFFFFFFFL;
long result = x_lo * y_lo;
result >>>= 32;

result += x_hi * y_lo + x_lo * y_hi;
result >>>= 32;
result += x_hi * y_hi;

Tested on a billion random operands. There should be a special test for corner cases and some analysis.

Dealing with negative operands would be more complicated as it'd prohibit using the unsigned shift and force us to handle intermediate result overflow.

In case speed doesn't matter much (and it rarely does), I'd go for

 BigInteger.valueOf(x).multiply(BigInteger.valueOf(y))
     .shiftRight(64).longValue();

Bergen answered 25/7, 2015 at 17:35 Comment(1)

Let's say you have two longs, x and y, and x = x_hi * 2^32 + x_lo, and y = y_hi * 2^32 + y_lo.

Then x * y == (x_hi * y_hi) * 2^64 + (x_hi * y_lo + x_lo * y_hi) * 2^32 + (x_lo * y_lo).

The high 64 bits of that product can, therefore, be computed as follows:

long x_hi = x >>> 32;
long y_hi = y >>> 32;
long x_lo = x & 0xFFFFFFFFL;
long y_lo = y & 0xFFFFFFFFL;
long prod_hi = (x_hi * y_hi) + ((x_ hi * y_lo) >>> 32) + ((x_lo * y_hi) >>> 32);

Factional answered 17/9, 2013 at 20:42 Comment(3)

What would that yield for 0x1FFFFFFFF times 0x1FFFFFFFF? By my reckoning, it would compute 1, but the correct answer should be 3. I think to get a precise result, one must compute all four partial products and then sum them carefully. – Sky 25/2, 2014 at 23:8

@maaartinus: While I can see this to mimic lib/kernel/mul_hi.cl, "truncate each partial product, scale and sum" is still wrong - "scale and sum partial products, truncate" is right. It does not even get minimal expected error from skipping the least significant partial product for lack of rounding/"truncation precompensation". (Try decimal 19*19 for an example, if 0x1FFFFFFFF times 0x1FFFFFFFF worked for you.) (2013/3/23: "Correct mul_hi()"? MA.) – Pirbhai 25/7, 2015 at 14:21

@Pirbhai I know, I only fixed an error making it plain wrong. It's still slightly wrong most of the time, but it's not my answer, my answer is this one. – Bergen 25/7, 2015 at 17:38

If either x or y can be negative, you should use Hacker's Delight function (Henry S. Warren, Hacker's Delight, Addison-Wesley, 2nd edition, Fig. 8.2):

long x_high = x >>> 32;
long x_low = x & 0xFFFFFFFFL;
long y_high = y >>> 32;
long y_low = y & 0xFFFFFFFFL;
long z2 = x_low * y_low;
long t = x_high * y_low + (z2 >>> 32);
long z1 = t & 0xFFFFFFFFL;
long z0 = t >>> 32;
z1 += x_low * y_high;
return x_high * y_high + z0 + (z1 >>> 32);

Lucaslucca answered 10/8, 2016 at 17:38 Comment(4)

What is y2? How about providing a reference (page#, edition) or a hyperlink for Hacker's Delight? – Pirbhai 14/8, 2016 at 8:48

I have added the fig. number and edition number, and I corrected my typo. As for the hyperlink, what do you have in mind? Amazon's page? – Lucaslucca 21/8, 2016 at 13:57

Authoritative links not objectionable to many practitioners and legislations, preferably. Funny S. Warren should advise to change parameters to 32-bit unsigned integers, and continue to present the unchanged ed.1 16/32 bit code. – Pirbhai 21/8, 2016 at 17:36

All ">>" should be replaced with ">>>" to use this method for unsigned multiplication. – Marketa 3/2, 2017 at 8:31

Here is a code snippet from Java's Math.multiplyHigh(long,long)

    public static long multiplyHigh(long x, long y) {
        if (x < 0 || y < 0) {
            // Use technique from section 8-2 of Henry S. Warren, Jr.,
            // Hacker's Delight (2nd ed.) (Addison Wesley, 2013), 173-174.
            long x1 = x >> 32;
            long x2 = x & 0xFFFFFFFFL;
            long y1 = y >> 32;
            long y2 = y & 0xFFFFFFFFL;
            long z2 = x2 * y2;
            long t = x1 * y2 + (z2 >>> 32);
            long z1 = t & 0xFFFFFFFFL;
            long z0 = t >> 32;
            z1 += x2 * y1;
            return x1 * y1 + z0 + (z1 >> 32);
        } else {
            // Use Karatsuba technique with two base 2^32 digits.
            long x1 = x >>> 32;
            long y1 = y >>> 32;
            long x2 = x & 0xFFFFFFFFL;
            long y2 = y & 0xFFFFFFFFL;
            long A = x1 * y1;
            long B = x2 * y2;
            long C = (x1 + x2) * (y1 + y2);
            long K = C - A - B;
            return (((B >>> 32) + K) >>> 32) + A;
        }
    }

As from Java 9, this is included in java.lang.Math and probably direct call to it should be made. Posting the source just to show what is going on "under the hood".

Materiel answered 16/1, 2019 at 10:30 Comment(1)

This likely isn't what is going on under the hood. Hotspot will treat multiplyHigh as an intrinsic function on some platforms. That means that direct calls to this method can be replaced (by the JIT compiler) with code that takes advantage of native instructions that perform the same function. You could think of this code as a fallback option, which the JVM could use to emulate the instruction on architectures that don't have it. – Mccully 3/5, 2019 at 22:48

Some of described cases above works wrong. First of all you have to ask yourself what types of operands do you treat (signed/unsigned).

There is a modified code from sample above that is fixed for carry flag (treat x & y as unsigned 64-bit values):

public static long productHi(long x, long y) {
    final long x_hi = x >>> 32;
    final long y_hi = y >>> 32;
    final long x_lo = x & 0xFFFFFFFFL;
    final long y_lo = y & 0xFFFFFFFFL;
    long result = (x_lo * y_lo) >>> 32;
    long a = x_hi * y_lo;
    long b = x_lo * y_hi;
    long sum = a + b + result;
    long carry = ((a & b) | (~sum & (a ^ b))) >>> 63;
    result = (sum >>> 32) + x_hi * y_hi + (carry << 32);
    return result;
}

Volt answered 30/3, 2019 at 10:3 Comment(0)

You should look at using a BigInteger.

Emigration answered 17/9, 2013 at 20:40 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags