Generate subsets of length n
Asked Answered
S

1

0

Given a Set, generate all subsets of length n.

  • Sample input:

    set = new Set(['a', 'b', 'c']), length = 2
    
  • Sample output:

    Set {'a', 'b'}, Set {'a', 'c'}, Set {'b', 'c'}
    

How to do this efficiently, without converting the Set to an Array?

I already have a good solution for arrays as input:

function* subsets(array, length, start = 0) {
  if (start >= array.length || length < 1) {
    yield new Set();
  } else {
    while (start <= array.length - length) {
      let first = array[start];
      for (subset of subsets(array, length - 1, start + 1)) {
        subset.add(first);
        yield subset;
      }
      ++start;
    }
  }
}

let array = ['a', 'b', 'c', 'd'];

for (subset of subsets(array, 2)) {
  console.log(...subset);
}

However, I couldn't come up with an efficient implementation for sets without converting the set to an array - wich I would like to avoid in favour of e.g. purely using iterators.

Sherillsherilyn answered 18/6, 2016 at 2:37 Comment(8)
Whoever voted to close this question because it is too broad: I couldn't find any solution. I would be happy if there were at least one. If you think it is too broad, would you mind providing at least one answers?Sherillsherilyn
If your concern is efficiency wouldn't it be better to use the array version? Here's an answer to which is more efficient (Object vs Array) #8423993Topical
Take a look at this: linkSignpost
@IsmailRBOUH "Implements functions to calculate combinations of elements in JS Arrays." well...Sherillsherilyn
@Topical If I want subsets of a large Set, wouldn't it be nice if we could just work on the given Set instead of copying all its elements to an Array first? I suppose iterating through sets is fast or will be fast in the future, faster than first converting everything to an array and iterating that instead. And it will always be more space efficient.Sherillsherilyn
Set iterators can't go back. If you don't want an array, you will need to keep creating new iterators and skip iterations until you reach the desired position. That's a waste. Just convert to array, which allows random access.Repair
@Repair That is exactly the part were I got stuck: cloning or reversing iterators is not possible. But - there might be a solution by exploiting the fact that Set.values() keeps the insertion order, removing from the beginning and inserting to the end might allow 'backtracking' by simply incrementing the iterator. I haven't explored that yet and wouldnt be surprised if it is much slower than array conversion. I was hoping there is another way. But your comment strongly suggests there is none. Thanks!Sherillsherilyn
@Sherillsherilyn I assumed you didn't want to alter the order of the set. Your idea might work indeed.Repair
R
3

I implemented your idea of deleting and reinserting values with ordered multi subsets:

function orderedMultiSubsets(set, n) {
  if(!Number.isInteger(n) || n < 0) return function*(){}();
  var subset = new Array(n),
      iterator = set.values();
  return (function* backtrack(index) {
    if(index === n) {
      yield subset.slice();
    } else {
      for(var i=0; i<set.size; ++i) {
        subset[index] = iterator.next().value; /* Get first item */
        set.delete(subset[index]); /* Remove it */
        set.add(subset[index]); /* Insert it at the end */
        yield* backtrack(index+1);
      }
    }
  })(0);
}
for(var subset of orderedMultiSubsets(new Set([1,2,3]), 2)) {
  console.log(subset.join());
}

And then I think I managed to prune as early as possible for the unordered subsets case:

function subsets(set, n) {
  if(!Number.isInteger(n) || n < 0 || n > set.size) return function*(){}();
  var subset = new Array(n),
      iterator = set.values();
  return (function* backtrack(index, remaining) {
    if(index === n) {
      yield subset.slice();
    } else {
      for(var i=0; i<set.size; ++i) {
        subset[index] = iterator.next().value; /* Get first item */
        set.delete(subset[index]); /* Remove it */
        set.add(subset[index]); /* Insert it at the end */
        if(i <= remaining) {
          yield* backtrack(index+1, remaining-i);
        }
      }
    }
  })(0, set.size-n);
}
for(var subset of subsets(new Set([1,2,3,4]), 2)) {
  console.log(subset.join());
}

I still use an array for the subsets, but its size is only n. So in case the size of the set is much bigger than n, this approach might use less memory than duplicating the set into an array, which I guess is the point of your question. But note that the deletions and insertions are probably more expensive than array lookups.

Bonus: at the end, the order of the set is the same as before calling the function.

Repair answered 18/6, 2016 at 18:53 Comment(3)
BTW: Do you have a performance comparison? Your last edit indicates you do :)Sherillsherilyn
@Sherillsherilyn I just used var t = performance.now(); for(var subset of subsets(new Set([1,2,3,4,5,6,7,8,9,10,11,12,13,14]), 7)) { if(subset.length !== 7) throw Error('wrong'); } performance.now() - t;. It improved from 4 seconds to 40 milliseconds.Repair
how would one change the ordered function so that it only used each item in the set 1 time (e.g. "2,2" not allowed)?Gobi

© 2022 - 2024 — McMap. All rights reserved.