How can I check Hamming Weight without converting to binary?
How can I get the number of "1"s in the binary representation of a number without actually converting and counting ?

e.g.

  def number_of_ones(n):
      # do something
      # I want to MAKE this FASTER (computationally less complex).
      c = 0
      while n:
          c += n % 2
          n //= 2
      return c


>>> number_of_ones(5)
2
>>> number_of_ones(4)
1
Ladida answered 9/5, 2009 at 18:54 Comment(4)
Is this a duplicate of #844237Hyphen
@ChrisW- python and C are two different languagesImmunogenetics
It's not an exact duplicate: bitwise operations in Python are much more expensive than in C.Romy
Python integers already are binary, that's why it makes sense to talk about their hamming weight / popcount. Count the number of set bits in a 32-bit integer notes that Python since 3.10 has int.bit_count()Twirp

IMO, a good approach would be to use a look-up table: build a table that maps each byte value (0-255) to its number of 1s (you can use the code you posted to generate it; it only needs to run once), and then use something like this:

def number_of_ones(n):
    sum = 0
    while n != 0:
        sum += lookup_table[n & 0xff]
        n >>= 8
    return sum

I believe this is a fairly good trade-off between space and running-time.
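A sketch of how that table could be built (the helper name is mine; the answer leaves the construction open):

```python
# Precompute the popcount of every byte value once, using simple
# bit-by-bit counting; lookups are then O(1) per byte of input.
def popcount_byte(b):
    c = 0
    while b:
        c += b & 1
        b >>= 1
    return c

lookup_table = [popcount_byte(i) for i in range(256)]

def number_of_ones(n):
    total = 0
    while n != 0:
        total += lookup_table[n & 0xff]
        n >>= 8
    return total
```

Using a 256-entry list rather than a dictionary keeps the lookup a plain index operation.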

Rhinal answered 9/5, 2009 at 19:46 Comment(0)

I'm not a Python programmer, but hopefully this will be enough for you to follow.

c = 0
while n:
    c += 1
    n &= n - 1

return c

While a little obscure, its primary advantage is speed and simplicity. The while loop iterates once for every bit set to 1 in n, rather than once per bit position.
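Wrapped as a function (a sketch; the name is mine), with the key identity spelled out:

```python
def count_set_bits(n):
    # n & (n - 1) clears the lowest set bit of n, so the loop
    # body runs exactly once per 1-bit in n.
    c = 0
    while n:
        n &= n - 1
        c += 1
    return c
```

For example, count_set_bits(0b101100) returns 3, after three iterations rather than six.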

Haemagglutinate answered 9/5, 2009 at 19:5 Comment(0)

You cannot make this computationally less complex. It will be O(n) in the number of bits, or, as the answer with the & trick shows, O(k) in the number of bits set to 1; but unless all of the numbers you are using are a special case, k is on average n/2, so both bounds come to the same thing.

And the lookup-table trick, of course, does nothing for the computational complexity; it just trades space for time without changing the underlying economics, which are that you must examine each bit once somehow, and there is no way around that. You cannot, logically, answer a question about the bits of a number without inspecting each of them.

Now, I suppose I'm being a bit sloppy, since many of these examples actually behave like O(n^2): in Python you operate on the whole number at a time, so with a long integer of, say, 100 bytes, a +, &, or / operation looks at each byte at least once, and that happens over and over until the number is reduced to zero (in the schemes outlined above), so these, again, are really O(n^2) operations. I am not sure Python allows a true O(n) solution here.

Anyway: if you were really asking about computational complexity, which specifically means big-O analysis, that's your answer. :-)

Para answered 9/5, 2009 at 20:12 Comment(0)

If you want to do it in a single line, you could use:

sum(x & (1 << i) > 0 for i in range(32))
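The range(32) caps this at 32-bit inputs; a variant using bit_length() (my adaptation, not from the answer) handles any non-negative int:

```python
def ones_by_shift(x):
    # bit_length() gives the number of bits needed to represent x,
    # so arbitrarily large non-negative ints are covered
    # (0 has bit_length 0, giving an empty sum).
    return sum((x >> i) & 1 for i in range(x.bit_length()))
```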
Gradatim answered 9/5, 2009 at 20:3 Comment(0)

See https://en.wikipedia.org/wiki/Hamming_weight

A solution in O(log n), where n is the bit length:

m1  = 0x5555555555555555 #binary: 0101...
m2  = 0x3333333333333333 #binary: 00110011..
m4  = 0x0f0f0f0f0f0f0f0f #binary:  4 zeros,  4 ones ...
m8  = 0x00ff00ff00ff00ff #binary:  8 zeros,  8 ones ...
m16 = 0x0000ffff0000ffff #binary: 16 zeros, 16 ones ...
m32 = 0x00000000ffffffff #binary: 32 zeros, 32 ones
h01 = 0x0101010101010101 #the sum of 256 to the power of 0,1,2,3...

def hammingWeight(x: int) -> int:
    x -= (x>>1)&m1
    x = (x&m2) + ((x>>2)&m2)
    x = (x+(x>>4))&m4
    return ((x*h01)>>56)&0x7f
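As a sanity check (restating the routine so the snippet stands alone), the SWAR result can be compared against bin(x).count("1") for values that fit in 64 bits:

```python
m1 = 0x5555555555555555   # binary: 0101...
m2 = 0x3333333333333333   # binary: 00110011...
m4 = 0x0f0f0f0f0f0f0f0f   # binary: 4 zeros, 4 ones, ...
h01 = 0x0101010101010101  # one 1 in each byte

def hammingWeight(x: int) -> int:
    x -= (x >> 1) & m1                 # count per 2-bit field
    x = (x & m2) + ((x >> 2) & m2)     # count per 4-bit field
    x = (x + (x >> 4)) & m4            # count per byte
    return ((x * h01) >> 56) & 0x7f    # sum the 8 byte counts

# Reference comparison on 64-bit-or-smaller inputs.
for v in (0, 1, 0xdeadbeef, (1 << 64) - 1):
    assert hammingWeight(v) == bin(v).count("1")
```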

And since Python 3.10, the int type has a bit_count() method.

Sarcomatosis answered 1/2, 2021 at 8:31 Comment(1)
Python integers are arbitrary precision; this will popcount the low 64 bits. Which might be fine for some use-cases. Obviously bit_count() is much better for both clarity and performance.Twirp

If you're actually concerned about speed, code it up in C (see this question for how), and interface with the C implementation using something like ctypes.

Zlatoust answered 9/5, 2009 at 19:12 Comment(1)
I am concerned with computational complexity not with actual speed or language.Ladida
p = lambda n: n and 1 + p(n & (n - 1))

This uses short-circuiting and recursion: when n is greater than 0, it evaluates 1 + p(n & (n - 1)), recursing on n with its lowest set bit cleared; when n is 0, the expression short-circuits and returns 0. The complexity is O(k), where k is the number of ones in the binary representation, since each n & (n - 1) step removes one set bit.

Example: p(9) = 1 + p(8) = 1 + 1 + p(0) = 1 + 1 + 0 = 2

Stoll answered 8/4, 2016 at 15:3 Comment(1)
If this is a valid answer, it would be more valuable with some explanation.Ehrenburg

I came here today to check the state of the art.

This is a subject that comes up from time to time; modern CPUs even have a dedicated population-count instruction. Popcount is useful for computing the bit error rate in certain communications applications. It is the Hamming weight, related to the Hamming distance; scipy has an implementation, but it works on arrays, not on the binary representation of numbers. For fixed-size computation, e.g. numpy arrays, the fastest approach I know is the method in this answer, which I am calling the divide-and-conquer method. In the general case divide and conquer has complexity O(log(n)) instead of O(n). The accepted answer is really good for most cases, and even better in languages such as C, where you could simply read ((char *)&bits)[i] instead of shifting the number.

Here I give a general implementation for the divide and conquer approach where the masks are computed dynamically depending on the size of the input number.

def dq_number_of_ones(n):
    # Find nb, the smallest power of two >= the bit length of n.
    nb = 1
    while n >> nb:
        nb *= 2
    # Build the masks for this width, widest fields first
    # (e.g. for nb=8: 0xf0, 0xcc, 0xaa).
    t = (1 << nb) - 1
    masks = []
    while nb > 1:
        nb //= 2
        t ^= t >> nb
        masks.append(t)

    # Fold neighbouring fields together, narrowest mask first,
    # doubling the field width at each step.
    t = n
    s = 1
    for tm in reversed(masks):
        t = ((tm & t) >> s) + ((tm >> s) & t)
        s *= 2
    return t

And for completeness, here are the OP's method and the lookup-table method:

def number_of_ones(n):
    # do something
    # I want to MAKE this FASTER (computationally less complex).
    c = 0
    while n:
        c += n % 2
        n //= 2
    return c
lookup_table = [number_of_ones(i) for i in range(256)]
def lt_number_of_ones(n):
    sum = 0
    while n != 0:
        sum += lookup_table[n & 0xff]
        n >>= 8
    return sum

A practical comparison of the three:

import time
import numpy as np
import matplotlib.pyplot as plt
x = []
t1 = []
t2 = []
t3 = []
for i in range(3,8000,20):
  y = i**i
  if i < 1000:
    time.sleep(0.0001)
    t0 = time.time()
    w1 = number_of_ones(y)
    t1.append(time.time() - t0)
  else:
    t1.append(np.nan)
  time.sleep(0.0001)
  t0 = time.time()
  w2 = dq_number_of_ones(y)
  t2.append(time.time() - t0)
  time.sleep(0.0001)
  t0 = time.time()
  _ = lt_number_of_ones(y)
  t3.append(time.time() - t0)
  time.sleep(0.0001)
  
  x.append(i)

plt.figure(figsize=(12, 4))
plt.subplot(121)
plt.plot(x, t1)
plt.plot(x, t2)
plt.plot(x, t3)
plt.xlim([10,None])
plt.title('Linear scale')
plt.ylabel('time to count bits in $x^x$')
plt.subplot(122)
plt.loglog(x, t1)
plt.loglog(x, t2)
plt.loglog(x, t3)
plt.xlim([10,None])
plt.title('Log scale')
plt.legend(['direct counting', 'divide and conquer', 'lookup table'])

(Plot: time to count bits in x**x for the three methods, linear and log scale.)

Dutra answered 22/5, 2022 at 9:59 Comment(6)
The strategy with fixed-width masks like 0x5555555555555555 would best be described as an SWAR bithack. (As explained in Count the number of set bits in a 32-bit integer). I guess divide and conquer does kind of fit, though, as you divide into 2-bit accumulators that you widen. That version only works for 64-bit integers and narrower, though. One could check for the size being larger in a program that sometimes uses huge integers. I assume these are all horribly slow compared to the built-in bit_count() which hopefully makes these all obsolete.Twirp
Counting iterations of n &= n - 1 (usually attributed to Kernighan) is another algorithm worth looking at that's good for numbers with only a few set bits.Twirp
This Kernighan?Dutra
Yes; an answer on Count the number of set bits in a 32-bit integer quotes graphics.stanford.edu/~seander/… which attributes it to C Programming Language 2nd Ed. in exercise 2-9 (by K & R). But also that it "was first published by Peter Wegner in CACM 3 (1960), 322". Still, as I said, it's often called Kernighan's method, perhaps because of Sean Anderson's attribution of it. When I commented yesterday, I hadn't checked on the history.Twirp
Nice info :) you must be a very annoying person!!! heheDutra
I posted the actual cPython code in my answer https://mcmap.net/q/303344/-how-can-i-check-hamming-weight-without-converting-to-binary Since it uses the compiler built-in, I assume it is much faster than any pure python function.Commix

I got interested in Siong's 'bit depletion' version of bitCount. I started playing around with it to see if I could come up with a left-to-right variation of the same code.

What I wound up with was clearly more computationally expensive than the original. On the other hand, it captures the index of each set bit, so I collect the indexes into a list.

Original (bomb-proofed):


def bitCount(int_no):

    if not isinstance( int_no, int) or int_no < 0:
        return None

    count = 0
    while(int_no):
        int_no &= (int_no - 1)
        count += 1
        
    return count

Left-right version:

from math import floor, log

def bitsSet(int_no):

    if not isinstance( int_no, int) or int_no < 0:
        return None
    
    bit_indexes = []
    while(int_no):
        index = floor(log(int_no, 2))
        int_no &= (pow(2, index) - 1)
        bit_indexes.insert(0, index)
    
    return bit_indexes

If you take len(bitsSet(n)), it works out to the same answer. Take the long way home ...
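A quick check of that equivalence, restating bitsSet compactly (minus the input guard) so the snippet runs standalone:

```python
from math import floor, log

def bitsSet(int_no):
    # Peel off the highest set bit each iteration and record its index,
    # building the index list in ascending order.
    bit_indexes = []
    while int_no:
        index = floor(log(int_no, 2))
        int_no &= pow(2, index) - 1
        bit_indexes.insert(0, index)
    return bit_indexes

assert bitsSet(0b100101) == [0, 2, 5]
assert len(bitsSet(441)) == bin(441).count("1")
```

Note that math.log can lose precision for very large integers; int_no.bit_length() - 1 would give the top index exactly.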

Medicate answered 10/10, 2023 at 13:54 Comment(0)

Python 3.10+: int.bit_count()

Just for fun, how is it implemented? In Objects/longobject.c it breaks the integer into words PyLong_SHIFT bits long and eventually calls _Py_popcount32 in pycore_bitutils.h. That uses __builtin_popcount if available, since the compiler built-in is obviously the best; otherwise it falls back to the 32-bit SWAR (SIMD Within A Register) popcount below, rather than the n &= n - 1 method ("Kernighan's way", which according to Knuth was first published by Peter Wegner in 1960).

// Population count: count the number of 1's in 'x'
// (number of bits set to 1), also known as the hamming weight.
//
// Implementation note. CPUID is not used, to test if x86 POPCNT instruction
// can be used, to keep the implementation simple. For example, Visual Studio
// __popcnt() is not used for this reason. The clang and GCC builtin function can
// use the x86 POPCNT instruction if the target architecture has SSE4a or
// newer.
static inline int
_Py_popcount32(uint32_t x)
{
#if (defined(__clang__) || defined(__GNUC__))

#if SIZEOF_INT >= 4
    Py_BUILD_ASSERT(sizeof(x) <= sizeof(unsigned int));
    return __builtin_popcount(x);
#else
    // The C standard guarantees that unsigned long will always be big enough
    // to hold a uint32_t value without losing information.
    Py_BUILD_ASSERT(sizeof(x) <= sizeof(unsigned long));
    return __builtin_popcountl(x);
#endif

#else
    // 32-bit SWAR (SIMD Within A Register) popcount

    // Binary: 0 1 0 1 ...
    const uint32_t M1 = 0x55555555;
    // Binary: 00 11 00 11. ..
    const uint32_t M2 = 0x33333333;
    // Binary: 0000 1111 0000 1111 ...
    const uint32_t M4 = 0x0F0F0F0F;

    // Put count of each 2 bits into those 2 bits
    x = x - ((x >> 1) & M1);
    // Put count of each 4 bits into those 4 bits
    x = (x & M2) + ((x >> 2) & M2);
    // Put count of each 8 bits into those 8 bits
    x = (x + (x >> 4)) & M4;
    // Sum of the 4 byte counts.
    // Take care when considering changes to the next line. Portability and
    // correctness are delicate here, thanks to C's "integer promotions" (C99
    // §6.3.1.1p2). On machines where the `int` type has width greater than 32
    // bits, `x` will be promoted to an `int`, and following C's "usual
    // arithmetic conversions" (C99 §6.3.1.8), the multiplication will be
    // performed as a multiplication of two `unsigned int` operands. In this
    // case it's critical that we cast back to `uint32_t` in order to keep only
    // the least significant 32 bits. On machines where the `int` type has
    // width no greater than 32, the multiplication is of two 32-bit unsigned
    // integer types, and the (uint32_t) cast is a no-op. In both cases, we
    // avoid the risk of undefined behaviour due to overflow of a
    // multiplication of signed integer types.
    return (uint32_t)(x * 0x01010101U) >> 24;
#endif
}


Commix answered 1/7 at 0:35 Comment(0)
