Purpose of bitwise OR of an integer with its negative
Asked Answered
L

1

4

I was curious about the implementation and representation of NaN in both IEEE single- and double-precision floating points, and I found this implementation of an "is NaN" function. Namely:

int isnan(double x)
{
    int32_t hx,lx;

    // Move lower 32 bits of double to lx, higher 32 to hx.
    EXTRACT_WORDS(hx,lx,x);

    // Remove sign bit, since -NaN and NaN are both NaN.
    hx &= 0x7fffffff;

    // Equivalent to hx |= (lx != 0).
    hx |= (u_int32_t)(lx|(-lx))>>31;

    // Difference is negative iff (hx & 0x7ff00000) == 0x7ff00000 and (hx & 0x000fffff) != 0.
    hx = 0x7ff00000 - hx;
    return (int)((u_int32_t)(hx))>>31;
}

I didn't understand the purpose of (lx|(-lx)) >> 31, and after failing to reason it out in my head, I tested it on all integers, and found it results in 0 for lx = 0 and 1 otherwise.

The only reasons I could come up with are that perhaps using (lx != 0) instead was not possible due to some C standard not defining what integer value is assigned to true operations (e.g. not guaranteed to be 1 for true) or that perhaps != is slower than the negative-or-and-bit-shift. Otherwise, I'm stumped.

For reference, the code I used to try all integers, in case of errors.

#include <stdio.h>
#include <stdint.h>
int main(void) {
        int32_t i = 0;
        do {
                if (((uint32_t)(i | (-i)) >> 31) == 0)
                        printf("%d\n", i); // prints only 0
        } while (i++ != 0xFFFFFFFF); // overflows to -max_int and then climb to -1
        return 0;
}
Lighting answered 18/9, 2014 at 6:4 Comment(12)
Print out the bit patterns for lx and -lx respectively, and see what you find out from that.Turkestan
It removes a conditional jump.Makalu
Joachim, now that I see the result, I can see why it's only 1 for 0, because if it's non-zero, then the most significant bit (sign bit) for either lx or -lx will be 1 (and thus when you bit-shift 31 right, it'll be 1). My question is why not use (lx != 0) instead of the comparison.Lighting
Raymond, would hx |= (u_int32_t)(lx != 0) not work in this case (does not have a conditional jump as far as I can see)? Is a true expression not guaranteed to be converted to an integer value of 1 by some C standard?Lighting
Minor correction to my comment to Joachim: it's only 0 for 0, not only 1 for 0.Lighting
Aside from the specific expression you're asking about, I don't see how this code does anything like determine if a double is a NaN value. I don't see how it returns anything but 0. It appears to be very old code that has since been replaced - I suspect it's simply several lines of buggy code.Herat
Michael, from my understanding, a floating point is NaN if the exponent bits are all 1's and the significand/mantissa bits are all 0's. I hit a snag with trying to unravel the mystery of that expression, so I didn't look at the rest of the function too deeply, but my intuition says it should work. Since the lower 32 bits of a double are all part of the significand, I believe the idea is to put a 1 at the end of the higher bit int32 if there are any non-zero bits in the lower bits (thus, the comparison with 0) and then check for any 1s in the significand bits from the upper half.Lighting
It's a nasty bit of code and could be made loads neater, but I'm nonetheless intrigued by what appears to be unwieldy optimization soup.Lighting
what is EXTRACT_WORDS ?Paisley
@Jengerer: thank you - I'm not sure what I was doing wrong in my previous attempts to work through what the code was doing, but you've hit it on the head (except that NaN has a non-zero mantissa; if the manitssa is all zeros then that indicates infinity). This once again proves that I should always avoid problems dealing with floating point.Herat
@Michael: thanks for the correction. What lead to me looking up this code initially was learning that any floating point equality/inequality with one of operands being NaN results in false. Thus, NaN == NaN is false, and perhaps more scary, NaN > x does not give the same result for !(NaN <= x).Lighting
@Jengerer: yup - floating point is a minefield.Herat
Q
3

The expression (u_int32_t)(lx|(-lx))>>31 is equivalent to lx==0? 0:1.

However, with lx==0? 0:1, you are imposing a branch operation into the object code.

This might yield reduced performance in comparison with a couple of bit-wise operations.

It really depends on the underlying HW architecture as well as the designated compiler at hand.

But it will for sure lead to inconsistent performance, depending on branch-prediction heuristics.

The running time of (u_int32_t)(lx|(-lx))>>31 is guaranteed to be identical on every execution.

Quadrinomial answered 18/9, 2014 at 7:12 Comment(17)
abs(lx) is the absolute value of lx , it doesn't bear any resemblance to lx|-lxPaisley
@barakmanos: what does 3 | (-3) give on your machine? On mine it gives -1.Hypnotherapy
-abs(lx) is the absolute value of lx negated. For example if lx is 3, then -abs(lx) is -3, and lx|-lx is -1.Paisley
@MattMcNabb: Yep, miscalculated that in my head... answer revised... thanks for pointing that out.Quadrinomial
Onto another point now, surely one would expect one's compiler to select the fastest implementation (if optimizing for speed). Perhaps this code was written for a bad compiler.Paisley
@MattMcNabb: I doubt that very much. To be fair, a branch operation will pretty much always yield reduced performance in comparison with a couple of bit-wise operations. I used "might" because I did not want to make a conclusive statement, as there might be some HW architecture out there, with a unique instruction-set (or even just a single instruction) specifically for that purpose.Quadrinomial
Interesting. Do you know off the top of your head why simply converting lx != 0 to an integer would not suffice? Does that branch internally, or is there no guarantee that the results would be 1 and 0 for true and false, respectively?Lighting
@Jengerer: No, I believe that the result is guaranteed to be 1 or 0. But to the best of my knowledge, that's what every compiler designated for a general purpose processor does. Specifically designated processors (like DSPs for example) may provide an that supports this conversion without branching. As I explained in the answer simply converting lx != 0 to an integer is sufficient, but yields a branch operation (with its subsequent repercussions, as explained in the answer) into the compiled code.Quadrinomial
@Jengerer: Regardless of the above, you might be happy to know that your question has in fact provided an answer to a question that I had published here myself a while ago: https://mcmap.net/q/752795/-c-conversion-from-int-to-bool/1382251.Quadrinomial
@barakmanos: Thanks for the response. I did indeed see that in converting int x = (a != 0), there was an instruction in x86 for setting a byte to 0/1 if an operand was zero/non-zero, but I can believe that there are platforms that would use a branch instead.Lighting
@Jengerer: You're welcome. Did you mean "I believe that there are platforms that use something other than branch"? As I said, there might be specifically designated HW architectures, like DSP for example, that have a dedicated op-code for that purpose.Quadrinomial
@barakmanos: I took at look at x86 assembly implementation of something like return (x != 0), and it does use the instruction setne register to do it without a branch. I meant I suppose that, as you suggested, the lx | (-lx) is a solution that's guaranteed not to use a branch even for platforms that don't have an instruction like setne.Lighting
@Jengerer: setne sounds to me like an abbreviation of set (if) not equal, which in turn, sounds to me like a branch operation.Quadrinomial
@barakmanos: From some documentation I read, it sets the destination byte to 1 if the ZF flag set from the previous comparison is 0, and 0 otherwise, so I imagine it could be implemented by just assigning 0b0000000<negation ZF> to the destination register. I suppose it's not immediately apparent that this is the implementation that is used, however.Lighting
@Jengerer: Note the "if" that you quoted from the documentation. This is pretty straightforward branching, nothing too complicated here. The sne indeed performs an immediate operation according to the value of the ZF flag, but in order for that flag to attain a value of 0 or 1 according to the value of the input argument (lx in our example), the processor has to make a "branching decision" (something like, if lx == 0 then ZF = 1, otherwise ZF = 0).Quadrinomial
@barakmanos: Good point, I didn't think that the actual comparison might have to do some branching to set the ZF flag.Lighting
@Jengerer: The branching will not necessarily yield reduced performance comparing to that "OR trick" (although I believe that it will do, at least on most architectures). But since the value of the input argument is unknown, it will yield inconsistent performance, depending on that value, and subjected to the branch-prediction heuristics of the underlying HW architecture.Quadrinomial

© 2022 - 2024 — McMap. All rights reserved.