How do you efficiently count the trailing zero bits in a number?

C

4

6

I've written a function trailing_zeroes(int n) that returns the number of the trailing zeroes in the binary representation of a number.

Example: 4 in binary is 100, so the function in this case returns 2.

unsigned trailing_zeroes(int n) {
    unsigned bits;

    bits = 0;
    while (n >= 0 && !(n & 01)) {
        ++bits;
        if (n != 0)
            n >>= 1;
        else
            break;
    }
    return bits;
}

The reason of the if statement is because in case n equals to 0, there will be a loop.

I think it's pretty ugly this code written like this; is there a better way?

I want to avoid the break statement inside the while, because a lot of people told me that using that statement inside while/for sometimes could be "informal". I thought to rewrite the function like this, but I don't think it's the best way to do it:

unsigned bits;
if (n == 0)
    return bits = 1;

bits = 0;
while (!(n & 01)) {
    ++bits;
    n >>= 1;
}

Castellany answered 20/7, 2017 at 17:52 Comment(8)

Try __builtin_ctz. – Eckard 20/7, 2017 at 17:53

Simplest way might be converting to a std::string and inspect that one. – Fretwell 20/7, 2017 at 17:54

What is your specific question? We are not a code review service (as a sidenote: the code has implementation defined behaviour for certain values of n). – Blen 20/7, 2017 at 17:55

@Olaf Excuse me I don't understand. What could happen if n has those certain values ? (i'm really sorry for my bad english) – Castellany 20/7, 2017 at 18:15

Tip: Avoid using C and C++. Although the question may have relevance to both languages, the best answer may be language specific. – Whippoorwill 20/7, 2017 at 18:23

Why does 0 have 1 trailing zero? Shouldn't it have as many trailing zeros as there are bits in the integer type? Or should it have zero trailing bits because there is no 1-bit to be followed by zeros? (And the assignment in return bits = 1; is a bit bizarre; simply return 1; would be more sensible, though the compiler will almost certainly convert the code as if you wrote that anyway.) – Cheeky 21/7, 2017 at 0:55

@JonathanLeffler I was wrong. Now i think that it should return as many trailing zeros as there are bits in the integer type (as you said). But there are some built-in function that in case x is 0 the result is undefined (why ?) – Castellany 21/7, 2017 at 11:38

I don't know why the __builtin_ctz in GCC gives an undefined result for zero, but (guesswork) it is likely because the native implementations on different platforms yields different results — and it is likely that the two results are zero and the number of bits in the integer type. But rather than make the result platform-specific, GCC makes it undefined, so portable code doesn't call the function with an argument that's zero. – Cheeky 21/7, 2017 at 13:25

C

6

Your function is incorrect: it still has an infinite loop for 0. The test should be:

while (n > 0 && !(n & 1))

Note that you cannot handle negative numbers with this approach, hence your function should probably take an unsigned number argument, or you could convert the argument to unsigned.

Your function should special case 0 and use a simpler loop:

unsigned trailing_zeroes(int n) {
    unsigned bits = 0, x = n;

    if (x) {
        while ((x & 1) == 0) {
            ++bits;
            x >>= 1;
        }
    }
    return bits;
}

The above function is very simple and easy to understand. It is quite fast if the result is small. The value returned for 0 is 0 as in your function, which is questionable as 0 really has as many trailing zeroes as value bits in the unsigned type.

There is a more efficient approach with a constant number of steps:

unsigned trailing_zeroes(int n) {
    unsigned bits = 0, x = n;

    if (x) {
        /* assuming `x` has 32 bits: lets count the low order 0 bits in batches */
        /* mask the 16 low order bits, add 16 and shift them out if they are all 0 */
        if (!(x & 0x0000FFFF)) { bits += 16; x >>= 16; }
        /* mask the 8 low order bits, add 8 and shift them out if they are all 0 */
        if (!(x & 0x000000FF)) { bits +=  8; x >>=  8; }
        /* mask the 4 low order bits, add 4 and shift them out if they are all 0 */
        if (!(x & 0x0000000F)) { bits +=  4; x >>=  4; }
        /* mask the 2 low order bits, add 2 and shift them out if they are all 0 */
        if (!(x & 0x00000003)) { bits +=  2; x >>=  2; }
        /* mask the low order bit and add 1 if it is 0 */
        bits += (x & 1) ^ 1;
    }
    return bits;
}

Note that we could handle any larger int size by changing the first step to

while (!(x & 0x0000FFFF)) { bits += 16; x >>= 16; }

Some compilers have a built-in function __builtin_ctz() to count the number of trailing zeroes using very efficient assembly code. It is not a C Standard function but at the cost of reduced portability, you might want to use it if it is available. Check your compiler's documentation.

Here is the abstract from GCC documentation:

Built-in Function: int __builtin_ctz (unsigned int x)

Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.

Colton answered 20/7, 2017 at 21:0 Comment(8)

How about unsigned bits = sizeof n * CHAR_BIT; ... if (x) { bits = 0; ... } to handle the 0 case? (I guess it all depends on how you define trailing) I could see it being 0 or sizeof n * CHAR_BIT depending on the definition. – Commonable 20/7, 2017 at 21:26

@DavidC.Rankin: yes, I chose to stick with the OP's semantics, which are simpler to implement. – Colton 20/7, 2017 at 21:43

Yes, I agree with that. This is one that just struck me -- that I don't know the answer to. Zero would have all trailing zeros, but trailing behind nothing -- so it is one of those things that will just have to be as desired. – Commonable 20/7, 2017 at 21:46

@DavidC.Rankin: as you can see from the gcc doc, 0 must be special cased for the built-in function as well. – Colton 20/7, 2017 at 21:54

About the second approach: how do you know what costants number you have to use ? I mean, it's not very intuitive, that's pretty complex for me :P. Anyway, why do you convert the number to unsigned ? I guess because, for example, if in case n equals to ** -4 ** converting it to unsigned it becames 4 ? So why don't just change the sign in unsigned trailing_zeroes(unsigned n) (anyway thank you, you've been very very helpful :3) – Castellany 20/7, 2017 at 21:54

For negative operands, << has undefined behavior and the result of >> is implementation-defined. They are well defined for unsigned. – Commonable 20/7, 2017 at 21:57

@ClaudioPisa: the number is converted to unsigned to ensure defined behavior for right shifting x. Right shift a signed int with a negative value has implementation defined behavior, whereas the conversion is always defined and the function counts the trailing zeroes of negative correctly (-2 -> 111..110 -> 1). – Colton 20/7, 2017 at 21:58

@ClaudioPisa: for the constant values, it is easy: I shall add comments in the source code. – Colton 20/7, 2017 at 22:0

H

7

As already mentioned, there is a builtin that can do this, and as it may use hardware, it may be very fast. However, the doc for GCC does say that the result is undefined if the input is 0. Since this is an extension it may not be available for your compiler.

Otherwise, whenever anyone says 'but manipulation' or 'bit counting', you need to reach for your copy of "Hacker's Delight". A book so good that I bought both editions. There are about 4 pages (1st ed.) dedicated to this, 'ntz' (number of trailing zeroes). If you already have a 'nlz' (number of leading zeroes) or a 'popcnt' function, then you can get the ntz directly. Otherwise the book gives several implementations, some using popcnt, one with a loop, the others using binary search.

For instance,

int ntz3(unsigned x) {
   int n;

   if (x == 0) return(32);
   n = 1;
   if ((x & 0x0000FFFF) == 0) {n = n +16; x = x >>16;}
   if ((x & 0x000000FF) == 0) {n = n + 8; x = x >> 8;}
   if ((x & 0x0000000F) == 0) {n = n + 4; x = x >> 4;}
   if ((x & 0x00000003) == 0) {n = n + 2; x = x >> 2;}
   return n - (x & 1);
}

Heavyladen answered 20/7, 2017 at 18:22 Comment(1)

Please provide a reference to the standard defining this built-in (or any built-in function). – Blen 20/7, 2017 at 18:26

E

7

C++20 introduced the <bit> header, which provides countr_zero. The linked cppreference page provides the example:

#include <bit>
#include <bitset>
#include <cstdint>
#include <iostream>
 
int main()
{
    for (const std::uint8_t i : { 0, 0b11111111, 0b00011100, 0b00011101 }) {
        std::cout << "countr_zero( " << std::bitset<8>(i) << " ) = "
                  << std::countr_zero(i) << '\n';
    }
}

Output:

countr_zero( 00000000 ) = 8
countr_zero( 11111111 ) = 0
countr_zero( 00011100 ) = 2
countr_zero( 00011101 ) = 0

If you need a value of 0 for a 0 input, you can do something like:

(n ? countr_zero(n) : 0)

Exert answered 3/1, 2023 at 1:54 Comment(0)

C

6