What is the fastest way to count set bits in UInt32
Asked Answered
K

6

20

What is the fastest way to count the number of set bits (i.e. count the number of 1s) in an UInt32 without the use of a look up table? Is there a way to count in O(1)?

Kukri answered 29/8, 2012 at 5:46 Comment(1)
Look at the answer of this post.Flowerless
P
28

The bit-twiddling hacks page has a number of options.

Of course, you could argue that iterating over all 32 possible bits is O(N) in that it's the same cost every time :)

For simplicity, I'd consider the lookup-table-per-byte approach, or Brian Kernighan's neat idea which iterates as many times as there are bits set, which I'd write as:

public static int CountBits(uint value)
{
    int count = 0;
    while (value != 0)
    {
        count++;
        value &= value - 1;
    }
    return count;
}

If you don't like the idea of populating a 256-entry lookup table, a lookup-per-nybble would still be pretty fast. Mind you, it's possible that 8 array lookups might be slower than 32 simple bit operations.

Of course, it's worth testing your app's real performance before going to particularly esoteric approaches... is this really a bottleneck for you?

Peltate answered 29/8, 2012 at 5:56 Comment(4)
This looks more readable, but I still wonder what is the usage of counting bits like this. If I had to guess, I'd say it's probably used in cryptography.Mottle
I was intrigued by the LUT approach. It seems to definitely be slower for population counts, particularly if the intrinsic is available, but for getting the indices of the set bits the LUT appears faster if some care is taken. I've explored this on GitHub if you want to take a look.Krystenkrystin
@AlissonReinaldoSilva in my case, I have to calculate how many IP addresses there are within a given subnet mask: var power = IPAddress.Parse("255.255.240.0").GetAddressBytes().Select(b => b.InverseBits()).CountSetBits(); var addressesInNetwork = Math.Pow(2, power);Groovy
@AlissonReinaldoSilva : One esoteric case I use this for is to check that a number is a positive power of 2; i.e. var isPowerOf2 = (CountBits(input) == 1 && input>0);Preterition
K
27

Is a duplicate of: how-to-implement-bitcount-using-only-bitwise-operators or best-algorithm-to-count-the-number-of-set-bits-in-a-32-bit-integer

And there are many solutions for that problem. The one I use is:

    int NumberOfSetBits(int i)
    {
        i = i - ((i >> 1) & 0x55555555);
        i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
        return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
    }
Konopka answered 29/8, 2012 at 10:28 Comment(4)
I tested the performance of this answer, and the one from Jon Skeet. While that approach has a better readibility, this one is the fastest.Turbine
@raoul: Though this has the same performance for every case (12 arithmetic operations including one multiplication), Jon's version (actually Brian Kerninghan's) can be much faster if only a few bits are set (which is a typical case for enums).Bollard
Which one has better performance will depend on the rest of your application code. Whenever you're comparing two relatively fast alternatives, real performance will really be dominated by other factors outside of the method body, such as how stressed is the CPU instruction / memory cache typically at the time this method is called, and how much branch mispredictions and cache-misses do you typically get as a result. Purely contextual. ...Anyway the point I was getting to is that less branching == more predictable performance - not always the fastest - but predictability is very valuable too.Volvox
(...cont. The classic opposite example that everyone is familiar with is when you get that sudden dip in performance because a full garbage collection kicked in.)Volvox
K
22

In .NET Core 3.0 the x86 popcnt intrinsic has been exposed, allowing you to perform a single-instruction population count calculation on a uint or uint64.

int setBits = System.Runtime.Intrinsics.X86.Popcnt.PopCount(value);

There is also a 64-bit version System.Runtime.Intrinsics.X86.Popcnt.X64.PopCount() that can be used on a ulong when running on a 64-bit CPU.

Krystenkrystin answered 15/9, 2019 at 0:30 Comment(6)
thanks! Do you know if there is an intrinsic or defacto algorithm for returning the indices of the set bits, not just the counts?Gilgilba
@AlexNorcliffe That isn't possible with an intrinsic. The simplest approach is just to loop through all bits in the value and put the indices of the set bits into a list. You can speed this up further by computing the popcount ahead of time so you can use a preallocated array rather than a List object. The faster way is to do what John Skeet mentioned above and split the number into chunks so you can use a lookup table. I've investigated this here and shown that it is generally faster than computing the positions each time.Krystenkrystin
this is great - thanks! I must admit I'm mystified why HashSet is performing fastest for me, even when using a the very latest reference source for BitArray in .NET Core from Feb 2019 that has optimisations using Span and aggressive inlining. In my test I have 5 million values, and 2.5 million secondary values. My goal is to essentially end up with a list of which values in the first are not in the second. They are contiguous integers so I expected a BitArray to win - can't quite wrap my head around how calling HasSet<>.Remove could be faster! Tho it does use more allocations obvsGilgilba
@AlexNorcliffe I looked at the reference source for BitArray and each new instance creates a heap allocation (array of Int32 values large enough to hold the bits), extending the array (i.e. setting the Length property to a larger value) causes a new allocation and copy, shrinking the array causes the unused array elements and final-element bits to be cleared. Foreach loops over a BitArray will also allocate a new enumerator object on the heap, but the Get function should be fast: one aligned memory access, one division, one modulo, one variable shift left, one AND, one integer comparison.Krystenkrystin
@AlexNorcliffe Another potential option is to use BitVector32, which is effectively an encapsulation of bit extraction maths with some convenience methods, and since it's a struct it gets put on the stack rather than the heap. But the fastest performance is almost certainly always going to be doing the calculation manually in your own code, all inline, because you can avoid doing calls and branches (most calls to framework classes have parameter validation!) and keep things in a very tight loop for better performance.Krystenkrystin
@Krystenkrystin There is an even better solution since .NET Core 3.0: BitOperations.PopCount, which is available across all platforms and (hopefully) uses the intrinsic operation you mentioned.Clotheshorse
T
13

A platform independent BitOperations.PopCount is available in Core 3.0 and higher.

Hardware intrinsics are used when available, otherwise it defaults to a software fallback. Currently, both X86/64 and ARM processors are supported.

Source

Full credit to @Mark who mentioned this in a comment to another answer.

Tasso answered 12/10, 2022 at 17:48 Comment(1)
Also note Jeffs answer where this is now even better in .Net 7 :)Volvox
O
4

As of .NET 7 Int32 (and other integer types) have a static PopCount method so you can just call int.PopCount(int bitMask) without having to write your own algorithm or rely on System.Numerics and unsigned types with BitOperations.

https://learn.microsoft.com/en-us/dotnet/api/system.int32.popcount

Overarm answered 1/8, 2023 at 17:41 Comment(0)
D
-3

Here is the solution in java to get the set bits of a given number.

import java.util.*;

public class HelloWorld {

static int setBits(int n) {
    int count = 0;
    while(n != 0) {
        count+= ((n & 1) == 1) ? 1 : 0;
        n >>= 1;

    }
    return count;
}

 public static void main(String []args){
     Scanner sc = new Scanner(System.in);
     int n = sc.nextInt();
     System.out.println("Results: " + HelloWorld.setBits(n)); 
 }
}
Denotative answered 24/7, 2018 at 12:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.