is there a non-iterative way of finding the index of the Nth set bit? [duplicate] - McMap

About

is there a non-iterative way of finding the index of the Nth set bit? [duplicate]

Asked 13/11, 2023 at 18:11 Answered 13/11, 2023 at 19:5

c++c bit-manipulation

C

1

3

#include <cassert>
#include <cstdint>

uint64_t pos_of_nth_bit(uint64_t X, uint64_t bit) {
  while (X) {
    if (!bit--)
      return __builtin_ctzll(X);
    X = X & (X - 1);
  }
  assert(false && "no such bit");
}

Is there a faster way to do the logic from pos_of_nth_bit ?

for example:

pos_of_nth_bit(0b101001000, 0) = 3
pos_of_nth_bit(0b101001000, 1) = 6
pos_of_nth_bit(0b101001000, 2) = 8

Cristinacristine answered 13/11, 2023 at 18:11 Comment(18)

@500-InternalServerError: How would log2 help? – Tarmac 13/11, 2023 at 18:17

I don't understand the question. The index of the Nth bit ... is just bit, isn't it? – Tetrad 13/11, 2023 at 18:17

@TedLyngmo I added examples to clarify what I mean. – Cristinacristine 13/11, 2023 at 18:20

@TedLyngmo: They want the index (of the position) of the nth set bit. So in 10010001001, the set bits are those at 1, 1000, 10000000, and 10000000000. The third of those would have index 7 (counting from 0 on the right; __builtin_ctzll indexes them differently). – Tarmac 13/11, 2023 at 18:21

@EricPostpischil Aha, "the nth set bit" ... Tyker: May I suggest a clearer function name? :-) – Tetrad 13/11, 2023 at 18:22

You could use a lookup table. Though for uint64_t that would be very large (probably a waste of space). But if you do it for a byte. Then you can loop over bytes and not single bits to calcuate the value. – Struma 13/11, 2023 at 18:22

This question does not deserve votes down. The code makes it clear what is being asked, as X & (X-1) turns off the lowest set bit, so the loop is counting set bits. While not everybody may recognize that X & (X - 1) turns off the lowest set bit, the question is tagged bit-manipulation, and one ought to exercise restraint on down voting bit manipulation questions if they are not familiar with this elementary bit manipulation idiom. – Tarmac 13/11, 2023 at 18:24

I'd say downvoting or voting to close when the question does not provide clarity (readers shouldn't have to evaluate code in detail to determine the author's intent, especially as that may mislead) is fair game. However, those who do so should pay close attention to the question and reverse those actions if the issue is rectified. – Pebbly 13/11, 2023 at 18:27

I'd use std::countl_zero(X) for C++20, and return std::numeric_limits<std::uint64_t>::max(); instead of assert. This'd make the code portable and constexprable. – Pelion 13/11, 2023 at 18:53

From the pos_of_nth_bit(0b101001000, 0) = 3 example we can infer that pos_of_nth_bit(0b1, 0) = 0 - what is pos_of_nth_bit(0b0, 0)? – Argentiferous 13/11, 2023 at 19:55

This is the select part of Rank-Select, so a relatively well-studied problem. For example: A Fast x86 Implementation of Select – Hardigg 13/11, 2023 at 20:20

When you say "non-iterative" do you mean "without repeatedly clearing the least significant set bit"? Because I don't think you actually want to forgo "iteration" entirely. I would think that the 6-or-so iterations of the O(log N) solution would be acceptable to you, wouldn't it? Your title has XY wording of your true objective. – Guardado 13/11, 2023 at 21:4

@harold after some more experiementing I tested something that should match what the paper proposed tzcnt + pdep + shift. but it is 42% slower than the naive baseline and 3.6x slower than the fastest implementation I found. so it isnt really intresting. – Cristinacristine 13/11, 2023 at 21:11

@Cristinacristine are you testing it on AMD Zen 2? That processor family has very slow pext/pdep. It's fast on Intel and Zen 3 though. – Hardigg 13/11, 2023 at 21:18

@harold it was on a AMD Zen2 – Cristinacristine 13/11, 2023 at 21:30

I don't see how this question is a duplicate of the flagged one. That one requests a value with a single bit set, this one requests the index of the bit. If there's an answer on the other question that illustrates that, please point it out. – Craigie 13/11, 2023 at 23:40

@MarkRansom: OP here assumes __builtin_ctzll is available, so finding the index from the value is trivial (and so is the converse). – Tarmac 13/11, 2023 at 23:53

@EricPostpischil you got me there, thanks. – Craigie 14/11, 2023 at 3:55

J

0

At least if we restrict the answer to normal hardware and reasonably practical solutions, you're almost certainly going to be stuck with some iteration.

The obvious O(1) solution that's not practical is a lookup table. The reason it's not practical is that for a M-bit input, the lookup table for the position of the N'th set bit will require a lookup table with N * 2^M entries (each if which would need to be log₂(M) bits in size). For 64-bit inputs, I'm pretty sure that's more memory than has been produced in all of history.

It's pretty easy, however, to reduce the maximum number of iterations quite a lot, while keeping the lookup table size quite reasonable. For example, we can look up the number of bits set in a byte with only a 256-entry table. With this, we do a maximum of 8 table lookups to isolate one byte that contains the i'th set bit, and then 8 iterations within that byte to find the correct bit. This reduces the worst case from 64 iterations to 16 iterations (and an expected average from 32 to around 8). The lookup table is small enough to fit in L1 cache of almost any CPU that has a cache at all, so access to it will generally be quite fast.

Julissa answered 13/11, 2023 at 19:5 Comment(1)

If you're using a lookup table just to count bits, you may as well use std::popcount – Hardigg 13/11, 2023 at 20:25

Recommended topics

#Godot #Unity #Godot 4.X #Mongodb

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

© 2022 - 2024 — McMap. All rights reserved.