Generate all binary strings of length n with k bits set
Asked Answered
E

14

69

What's the best algorithm to find all binary strings of length n that contain k bits set? For example, if n=4 and k=3, there are...

0111
1011
1101
1110

I need a good way to generate these given any n and any k so I'd prefer it to be done with strings.

Enterectomy answered 5/12, 2009 at 4:23 Comment(5)
For research. Doing some analysis on the matching preclusion number of certain graphs and I need some way to test all possible edge deletions of k edges.Enterectomy
If you're concerned about performance (i.e. large n and k), you probably want to consider a dynamic-programming approach.Promulgate
... particularly if it's feasible to perform, and cache (memoise) the results of, a partial edge-deletion on a subset of your graph, rather than first generating all strings and then doing stuff with them. This would boost your performance considerably.Promulgate
possible duplicate of Creating multiple numbers with certain number of bits setGonococcus
en.wikipedia.org/wiki/Combinatorial_number_systemCaplan
F
45

This method will generate all integers with exactly N '1' bits.

From https://graphics.stanford.edu/~seander/bithacks.html#NextBitPermutation

Compute the lexicographically next bit permutation

Suppose we have a pattern of N bits set to 1 in an integer and we want the next permutation of N 1 bits in a lexicographical sense. For example, if N is 3 and the bit pattern is 00010011, the next patterns would be 00010101, 00010110, 00011001, 00011010, 00011100, 00100011, and so forth. The following is a fast way to compute the next permutation.

unsigned int v; // current permutation of bits
unsigned int w; // next permutation of bits

unsigned int t = v | (v - 1); // t gets v's least significant 0 bits set to 1
// Next set to 1 the most significant bit to change,
// set to 0 the least significant ones, and add the necessary 1 bits.
w = (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));

The __builtin_ctz(v) GNU C compiler intrinsic for x86 CPUs returns the number of trailing zeros. If you are using Microsoft compilers for x86, the intrinsic is _BitScanForward. These both emit a bsf instruction, but equivalents may be available for other architectures. If not, then consider using one of the methods for counting the consecutive zero bits mentioned earlier. Here is another version that tends to be slower because of its division operator, but it does not require counting the trailing zeros.

unsigned int t = (v | (v - 1)) + 1;
w = t | ((((t & -t) / (v & -v)) >> 1) - 1);

Thanks to Dario Sneidermanis of Argentina, who provided this on November 28, 2009.

Freestanding answered 16/1, 2010 at 1:37 Comment(1)
w = v==0 ? 0 : t | ((((t & -t) / (v & -v)) >> 1) - 1); prevents a divide by zero exception!Farthermost
M
23

Python

import itertools

def kbits(n, k):
    result = []
    for bits in itertools.combinations(range(n), k):
        s = ['0'] * n
        for bit in bits:
            s[bit] = '1'
        result.append(''.join(s))
    return result
    
print kbits(4, 3)

Output: ['1110', '1101', '1011', '0111']

Explanation:

Essentially we need to choose the positions of the 1-bits. There are n choose k ways of choosing k bits among n total bits. itertools is a nice module that does this for us. itertools.combinations(range(n), k) will choose k bits from [0, 1, 2 ... n-1] and then it's just a matter of building the string given those bit indexes.

Since you aren't using Python, look at the pseudo-code for itertools.combinations here:

http://docs.python.org/library/itertools.html#itertools.combinations

Should be easy to implement in any language.

Mothy answered 5/12, 2009 at 4:29 Comment(4)
Do you know of a language independent solution? This depends on python's itertools but my program is not written in python.Enterectomy
See my edit. The docs show how itertools.combinations is implemented. You can easily port it to whatever language you're using.Mothy
I added a link to a Java combination generator.Mothy
Java combination generator link is invalidFuselage
R
8

Forget about implementation ("be it done with strings" is obviously an implementation issue!) -- think about the algorithm, for Pete's sake... just as in, your very first TAG, man!

What you're looking for is all combinations of K items out of a set of N (the indices, 0 to N-1 , of the set bits). That's obviously simplest to express recursively, e.g., pseudocode:

combinations(K, setN):
  if k > length(setN): return "no combinations possible"
  if k == 0: return "empty combination"
  # combinations including the first item:
  return ((first-item-of setN) combined combinations(K-1, all-but-first-of setN))
   union combinations(K, all-but-first-of setN)

i.e., the first item is either present or absent: if present, you have K-1 left to go (from the tail aka all-but-firs), if absent, still K left to go.

Pattern-matching functional languages like SML or Haskell may be best to express this pseudocode (procedural ones, like my big love Python, may actually mask the problem too deeply by including too-rich functionality, such as itertools.combinations, which does all the hard work for you and therefore HIDES it from you!).

What are you most familiar with, for this purpose -- Scheme, SML, Haskell, ...? I'll be happy to translate the above pseudocode for you. I can do it in languages such as Python too, of course -- but since the point is getting you to understand the mechanics for this homework assignment, I won't use too-rich functionality such as itertools.combinations, but rather recursion (and recursion-elimination, if needed) on more obvious primitives (such as head, tail, and concatenation). But please DO let us know what pseudocode-like language you're most familiar with! (You DO understand that the problem you state is identically equipotent to "get all combinations of K items out or range(N)", right?).

Rove answered 5/12, 2009 at 4:39 Comment(4)
@Chip, "vehemence"?! You ain't seen nuttin' yet -- remember, I started out designing (digital) chips, so this kind of problems really stirs my Italian blood!-)Rove
You love itertools and you know it.Mothy
Uh, first of all, this isn't a homework assignment. Second, I'm using Java but that really shouldn't matter. While itertools.combinations is a python specific solution, I suppose I can implement it in Java, but that's another potential source of redundancy in a program that already runs slower than I had intended. The execution time for the program is already in the range of days but I'm able to find the computing power to brute force it as this is an NP-complete problem. I just don't need it to be any longer.Enterectomy
The problem I'm referring to as NP-complete isn't this binary string problem, rather the matching preclusion problem that I'm trying to solve that requires this algorithm. Sorry.Enterectomy
L
4

This C# method returns an enumerator that creates all combinations. As it creates the combinations as you enumerate them it only uses stack space, so it's not limited by memory space in the number of combinations that it can create.

This is the first version that I came up with. It's limited by the stack space to a length of about 2700:

static IEnumerable<string> BinStrings(int length, int bits) {
  if (length == 1) {
    yield return bits.ToString();
  } else {
    if (length > bits) {
      foreach (string s in BinStrings(length - 1, bits)) {
        yield return "0" + s;
      }
    }
    if (bits > 0) {
      foreach (string s in BinStrings(length - 1, bits - 1)) {
        yield return "1" + s;
      }
    }
  }
}

This is the second version, that uses a binary split rather than splitting off the first character, so it uses the stack much more efficiently. It's only limited by the memory space for the string that it creates in each iteration, and I have tested it up to a length of 10000000:

static IEnumerable<string> BinStrings(int length, int bits) {
  if (length == 1) {
    yield return bits.ToString();
  } else {
    int first = length / 2;
    int last = length - first;
    int low = Math.Max(0, bits - last);
    int high = Math.Min(bits, first);
    for (int i = low; i <= high; i++) {
      foreach (string f in BinStrings(first, i)) {
        foreach (string l in BinStrings(last, bits - i)) {
          yield return f + l;
        }
      }
    }
  }
}
Lilliamlillian answered 5/12, 2009 at 5:24 Comment(0)
C
4

One problem with many of the standard solutions to this problem is that the entire set of strings is generated and then those are iterated through, which may exhaust the stack. It quickly becomes unwieldy for any but the smallest sets. In addition, in many instances, only a partial sampling is needed, but the standard (recursive) solutions generally chop the problem into pieces that are heavily biased to one direction (eg. consider all the solutions with a zero starting bit, and then all the solutions with a one starting bit).

In many cases, it would be more desireable to be able to pass a bit string (specifying element selection) to a function and have it return the next bit string in such a way as to have a minimal change (this is known as a Gray Code) and to have a representation of all the elements.

Donald Knuth covers a whole host of algorithms for this in his Fascicle 3A, section 7.2.1.3: Generating all Combinations.

There is an approach for tackling the iterative Gray Code algorithm for all ways of choosing k elements from n at http://answers.yahoo.com/question/index?qid=20081208224633AA0gdMl with a link to final PHP code listed in the comment (click to expand it) at the bottom of the page.

Confection answered 9/3, 2010 at 13:13 Comment(0)
M
2

One possible 1.5-liner:

$ python -c 'import itertools; \
             print set([ n for n in itertools.permutations("0111", 4)])'

set([('1', '1', '1', '0'), ('0', '1', '1', '1'), ..., ('1', '0', '1', '1')])

.. where k is the number of 1s in "0111".

The itertools module explains equivalents for its methods; see the equivalent for the permutation method.

Molnar answered 5/12, 2009 at 4:32 Comment(2)
Nice, but won't scale as well, especially when n is large and k is small-ish.Mothy
It is extremely inefficient, consider using itertools.combinations insteadAdust
S
1

One algorithm that should work:

generate-strings(prefix, len, numBits) -> String:
    if (len == 0):
        print prefix
        return
    if (len == numBits):
        print prefix + (len x "1")
    generate-strings(prefix + "0", len-1, numBits)
    generate-strings(prefix + "1", len-1, numBits)

Good luck!

Scowl answered 5/12, 2009 at 4:31 Comment(4)
Ah, with a little modification, this algorithm works. Thanks! I'll post the modification in the original question.Enterectomy
Though, after consideration, this does produce a lot of dead branches in the tree. I'll have to test this with larger n values.Enterectomy
Ah, yeah it looks like the runtime for this algorithm will take too long for the data sets I need tested. I'm looking at n=32 and k=7 specifically, but I need the flexibility of scale for future tests.Enterectomy
FWIW, my algorithm runs in about 5 seconds for (32, 7)... 3.3 million combinations. And that's Python, Java will be faster.Mothy
G
1

In a more generic way, the below function will give you all possible index combinations for an N choose K problem which you can then apply to a string or whatever else:

def generate_index_combinations(n, k):

    possible_combinations = []

    def walk(current_index, indexes_so_far=None):
        indexes_so_far = indexes_so_far or []
        if len(indexes_so_far) == k:
            indexes_so_far = tuple(indexes_so_far)
            possible_combinations.append(indexes_so_far)
            return
        if current_index == n:
            return
        walk(current_index + 1, indexes_so_far + [current_index])
        walk(current_index + 1, indexes_so_far)

    if k == 0:
        return []
    walk(0)
    return possible_combinations
Glasses answered 28/12, 2014 at 6:17 Comment(0)
M
0

I would try recursion.

There are n digits with k of them 1s. Another way to view this is sequence of k+1 slots with n-k 0s distributed among them. That is, (a run of 0s followed by a 1) k times, then followed by another run of 0s. Any of these runs can be of length zero, but the total length needs to be n-k.

Represent this as an array of k+1 integers. Convert to a string at the bottom of the recursion.

Recursively call to depth n-k, a method that increments one element of the array before a recursive call and then decrements it, k+1 times.

At the depth of n-k, output the string.

int[] run = new int[k+1];

void recur(int depth) {
    if(depth == 0){
        output();
        return;
    }

    for(int i = 0; i < k + 1; ++i){
        ++run[i];
        recur(depth - 1);
        --run[i];
    }

public static void main(string[] arrrgghhs) {
    recur(n - k);
}

It's been a while since I have done Java, so there are probably some errors in this code, but the idea should work.

Mcwhirter answered 5/12, 2009 at 5:24 Comment(0)
P
0

Are strings faster than an array of ints? All the solutions prepending to strings probably result in a copy of the string at each iteration.

So probably the most efficient way would be an array of int or char that you append to. Java has efficient growable containers, right? Use that, if it's faster than string. Or if BigInteger is efficient, it's certainly compact, since each bit only takes a bit, not a whole byte or int. But then to iterate over the bits you need to & mask a bit, and bitshift the mask to the next bit position. So probably slower, unless JIT compilers are good at that these days.

I would post this a a comment on the original question, but my karma isn't high enough. Sorry.

Pinkeye answered 5/12, 2009 at 6:21 Comment(0)
A
0

Python (functional style)

Using python's itertools.combinations you can generate all choices of k our of n and map those choices to a binary array with reduce

from itertools import combinations
from functools import reduce # not necessary in python 2.x

def k_bits_on(k,n):
       one_at = lambda v,i:v[:i]+[1]+v[i+1:]
       return [tuple(reduce(one_at,c,[0]*n)) for c in combinations(range(n),k)]

Example usage:

In [4]: k_bits_on(2,5)
Out[4]:
[(0, 0, 0, 1, 1),
 (0, 0, 1, 0, 1),
 (0, 0, 1, 1, 0),
 (0, 1, 0, 0, 1),
 (0, 1, 0, 1, 0),
 (0, 1, 1, 0, 0),
 (1, 0, 0, 0, 1),
 (1, 0, 0, 1, 0),
 (1, 0, 1, 0, 0),
 (1, 1, 0, 0, 0)]
Adust answered 22/11, 2016 at 22:26 Comment(1)
is there a numpy equivalent?Latricelatricia
N
0

Well for this question (where you need to iterate over all the submasks in increasing order of their number of set bits), which has been marked as a duplicate of this.

We can simply iterate over all the submasks add them to a vector and sort it according to the number of set bits.

vector<int> v;
for(ll i=mask;i>0;i=(i-1)&mask)
    v.push_back(i);
auto cmp = [](const auto &a, const auto &b){
    return __builtin_popcountll(a) < __builtin_popcountll(b);
}
v.sort(v.begin(), v.end(), cmp);

Another way would be to iterate over all the submasks N times and add a number to the vector if the number of set bits is equal to i in the ith iteration.

Both ways have complexity of O(n*2^n)

Nummulite answered 18/6, 2019 at 10:23 Comment(0)
A
0

Best and Easy Solution

This is an easy problem. We just need to use Dynamic Programming. I can give my solution which stores integeres. After that you can convert integers to bitwise strings.

List<Long> dp[]=new List[m+1];
for(int i=0;i<=m;i++) dp[i]=new ArrayList<>();
// dp[i] stores all possible bit masks of n length and i bits set 

dp[0].add(0l);
for(int i=1;i<=m;i++){
  // transitions
  for(int j=0;j<dp[i-1].size();j++){
    long num=dp[i-1].get(j);
    for(int p=0;p<n;p++){
      if((num&(1l<<p))==0) dp[i].add(num|(1l<<p));
    }
  }
}

// dp[m] contains all possible numbers having m bits set of len n

But dp[m] contains duplicates because adding 1 to 10 or 01 gives 11 two times. To handle that we can use HashSet

Set<Long> set=new HashSet<>();
for(int i=0;i<dp[m].size();i++) set.add(dp[m].get(i));
Anthropolatry answered 25/10, 2022 at 14:57 Comment(0)
L
0

if you want to solve this problem recursively, you can do this by a D&C algorithm :

def binlist(n,k,s):
    if n==0:
        if s.count('1')==k:
            print(s)
    else:
        binlist(n-1,k,s+'1')
        binlist(n-1,k,s+'0')

binlist(5,3,'')

the output will be :

11100
11010
11001
10110
10101
10011
01110
01101
01011
00111
Lindeberg answered 13/1, 2023 at 16:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.