Library function for Permutation and Combination in C++
Asked Answered
R

5

37

What's the most widely used existing library in C++ to give all the combination and permutation of k elements out of n elements?

I am not asking the algorithm but the existing library or methods.

Thanks.

Remus answered 6/2, 2010 at 3:37 Comment(1)
Duplicate: #1206244Mishap
J
20

Combinations: from Mark Nelson's article on the same topic we have next_combination Permutations: From STL we have std::next_permutation

   template <typename Iterator>
   inline bool next_combination(const Iterator first, Iterator k, const Iterator last)
   {
      if ((first == last) || (first == k) || (last == k))
         return false;
      Iterator itr1 = first;
      Iterator itr2 = last;
      ++itr1;
      if (last == itr1)
         return false;
      itr1 = last;
      --itr1;
      itr1 = k;
      --itr2;
      while (first != itr1)
      {
         if (*--itr1 < *itr2)
         {
            Iterator j = k;
            while (!(*itr1 < *j)) ++j;
            std::iter_swap(itr1,j);
            ++itr1;
            ++j;
            itr2 = k;
            std::rotate(itr1,j,last);
            while (last != j)
            {
               ++j;
               ++itr2;
            }
            std::rotate(k,itr2,last);
            return true;
         }
      }
      std::rotate(first,k,last);
      return false;
   }
Jawbreaker answered 6/2, 2010 at 4:30 Comment(4)
If you are using large sets of data this approach fairly quickly becomes inefficient as the calls to std::rotate get expensive. If you really need fast combination generation you don't want to have to rotate all the data. For an example of a way to do this look up gray codes: en.wikipedia.org/wiki/Gray_code.Reentry
@shuttle87: first, you don't have to rotate heavy data, it's often sufficient to rotate indices or pointers to the data. Second, it's unlikely that you'd have to enumerate permutations of more than 20 elements. Even 20! iterations is barely manageable, and 30! is completely infeasible. Therefore you won't have to rotate more than approx 20 elements.Pinion
The line itr1 = k; likely contains an error since it invalidates the itr1--; above.Inning
@Reentry It is possible to optimize it to use fewer calls to std::rotate: gist.github.com/viliml/eedbf77554ab7c748822c824289529fdDunson
B
41

I decided to test the solutions by dman and Charles Bailey here. I'll call them solutions A and B respectively. My test is visiting each combination of of a vector<int> size = 100, 5 at a time. Here's the test code:

Test Code

struct F
{
    unsigned long long count_;

    F() : count_(0) {}

    bool operator()(std::vector<int>::iterator, std::vector<int>::iterator)
    {++count_; return false;}
};

int main()
{
    typedef std::chrono::high_resolution_clock Clock;
    typedef std::chrono::duration<double> sec;
    typedef std::chrono::duration<double, std::nano> ns;
    int n = 100;
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    std::vector<int>::iterator r = v.begin() + 5;
    F f;
    Clock::time_point t0 = Clock::now();
    do
    {
        f(v.begin(), r);
    } while (next_combination(v.begin(), r, v.end()));
    Clock::time_point t1 = Clock::now();
    sec s0 = t1 - t0;
    ns pvt0 = s0 / f.count_;
    std::cout << "N = " << v.size() << ", r = " << r-v.begin()
              << ", visits = " << f.count_ << '\n'
              << "\tnext_combination total = " << s0.count() << " seconds\n"
              << "\tnext_combination per visit = " << pvt0.count() << " ns";
}

All code was compiled using clang++ -O3 on a 2.8 GHz Intel Core i5.

Solution A

Solution A results in an infinite loop. Even when I make n very small, this program never completes. Subsequently downvoted for this reason.

Solution B

This is an edit. Solution B changed in the course of writing this answer. At first it gave incorrect answers and due to very prompt updating it now gives correct answers. It prints out:

N = 100, r = 5, visits = 75287520
    next_combination total = 4519.84 seconds
    next_combination per visit = 60034.3 ns

Solution C

Next I tried the solution from N2639 which looks very similar to solution A, but works correctly. I'll call this solution C and it prints out:

N = 100, r = 5, visits = 75287520
    next_combination total = 6.42602 seconds
    next_combination per visit = 85.3531 ns

Solution C is 703 times faster than solution B.

Solution D

Finally there is a solution D found here. This solution has a different signature / style and is called for_each_combination, and is used much like std::for_each. The driver code above changes between the timer calls like so:

Clock::time_point t0 = Clock::now();
f = for_each_combination(v.begin(), r, v.end(), f);
Clock::time_point t1 = Clock::now();

Solution D prints out:

N = 100, r = 5, visits = 75287520
    for_each_combination = 0.498979 seconds
    for_each_combination per visit = 6.62765 ns

Solution D is 12.9 times faster than solution C, and over 9000 times faster than solution B.

I consider this a relatively small problem: only 75 million visits. As the number of visits increases into the billions, the discrepancy in the performance between these algorithms continues to grow. Solution B is already unwieldy. Solution C eventually becomes unwieldy. Solution D is the highest performing algorithm to visit all combinations I'm aware of.

The link showing solution D also contains several other algorithms for enumerating and visiting permutations with various properties (circular, reversible, etc.). Each of these algorithms was designed with performance as one of the goals. And note that none of these algorithms requires the initial sequence to be in sorted order. The elements need not even be LessThanComparable.

Beverleybeverlie answered 13/3, 2011 at 0:8 Comment(5)
Thank you for spotting the mistake in my answer. Please note that I don't get notified just because you mention my name in another answer so in order to distinguish your downvote from the random unexplained downvotes that old questions often attract it would have been helpful if you had left a comment on my incorrect answer so that I could learn of my mistake.Hoecake
Noted, sorry for my lack of etiquette. I'm a newbie here and will modify my behavior accordingly, thanks.Beverleybeverlie
Now that I've read your link in some detail, +1 from me. My answer was aiming for minimum implementation effort (and C++03 only). This answer gives a solution that would actually be acceptable for non-trivial input lengths.Hoecake
Just FYI, I ran your test case on my fixed code and got visits = 75287520, which is better, but next_combination total = 3418.53 seconds which is obviously worse. I think my machine is a generation or two before i5, though, and gcc, not clang.Hoecake
I ran a sanity check for correctness on g++-4.2 and got the correct answers with your updated code. I didn't run a performance test there though.Beverleybeverlie
J
20

Combinations: from Mark Nelson's article on the same topic we have next_combination Permutations: From STL we have std::next_permutation

   template <typename Iterator>
   inline bool next_combination(const Iterator first, Iterator k, const Iterator last)
   {
      if ((first == last) || (first == k) || (last == k))
         return false;
      Iterator itr1 = first;
      Iterator itr2 = last;
      ++itr1;
      if (last == itr1)
         return false;
      itr1 = last;
      --itr1;
      itr1 = k;
      --itr2;
      while (first != itr1)
      {
         if (*--itr1 < *itr2)
         {
            Iterator j = k;
            while (!(*itr1 < *j)) ++j;
            std::iter_swap(itr1,j);
            ++itr1;
            ++j;
            itr2 = k;
            std::rotate(itr1,j,last);
            while (last != j)
            {
               ++j;
               ++itr2;
            }
            std::rotate(k,itr2,last);
            return true;
         }
      }
      std::rotate(first,k,last);
      return false;
   }
Jawbreaker answered 6/2, 2010 at 4:30 Comment(4)
If you are using large sets of data this approach fairly quickly becomes inefficient as the calls to std::rotate get expensive. If you really need fast combination generation you don't want to have to rotate all the data. For an example of a way to do this look up gray codes: en.wikipedia.org/wiki/Gray_code.Reentry
@shuttle87: first, you don't have to rotate heavy data, it's often sufficient to rotate indices or pointers to the data. Second, it's unlikely that you'd have to enumerate permutations of more than 20 elements. Even 20! iterations is barely manageable, and 30! is completely infeasible. Therefore you won't have to rotate more than approx 20 elements.Pinion
The line itr1 = k; likely contains an error since it invalidates the itr1--; above.Inning
@Reentry It is possible to optimize it to use fewer calls to std::rotate: gist.github.com/viliml/eedbf77554ab7c748822c824289529fdDunson
H
18

This answer provides a minimal implementation effort solution. It may not have acceptable performance if you want to retrieve combinations for large input ranges.

The standard library has std::next_permutation and you can trivially build a next_k_permutation from it and a next_combination from that.

template<class RandIt, class Compare>
bool next_k_permutation(RandIt first, RandIt mid, RandIt last, Compare comp)
{
    std::sort(mid, last, std::tr1::bind(comp, std::tr1::placeholders::_2
                                            , std::tr1::placeholders::_1));
    return std::next_permutation(first, last, comp);
}

If you don't have tr1::bind or boost::bind you would need to build a function object that swaps the arguments to a given comparison. Of course, if you're only interested in a std::less variant of next_combination then you can use std::greater directly:

template<class RandIt>
bool next_k_permutation(RandIt first, RandIt mid, RandIt last)
{
    typedef typename std::iterator_traits<RandIt>::value_type value_type;

    std::sort(mid, last, std::greater< value_type >());
    return std::next_permutation(first, last);
}

This is a relatively safe version of next_combination. If you can guarantee that the range [mid, last) is in order as they would be after a call to next_combination then you can use the simpler:

template<class BiDiIt, class Compare>
bool next_k_permutation(BiDiIt first, BiDiIt mid, BiDiIt last, Compare comp)
{
    std::reverse(mid, last);
    return std::next_permutation(first, last, comp);
}

This also works with bi-directional iterators as well as random access iterators.

To output combinations instead of k-permutations, we have to ensure that we output each combination only once, so we'll return a combination it only if it is a k-permutation in order.

template<class BiDiIt, class Compare>
bool next_combination(BiDiIt first, BiDiIt mid, BiDiIt last, Compare comp)
{
    bool result;
    do
    {
        result = next_k_permutation(first, mid, last, comp);
    } while (std::adjacent_find( first, mid,
                             std::tr1::bind(comp, std::tr1::placeholders::_2
                                                , std::tr1::placeholders::_1) )
                                                                        != mid );
    return result;
}

Alternatives would be to use a reverse iterator instead of the parameter swapping bind call or to use std::greater explicitly if std::less is the comparison being used.

Hoecake answered 11/4, 2010 at 11:25 Comment(3)
In your next_combination algorithm, do you mean: result = next_k_permutation(first, mid, last, comp); ?Beverleybeverlie
@HowardHinnant: Yes, I do. Thanks. At least now it should give correct results even if it has rubbish performance compared to your solution.Hoecake
As far as minimal effort in implementation and for small examples this is the best optionSauternes
F
2

@ Charles Bailey above:

I could be wrong, but I think the first two algorithms above does not remove duplicates between first and mid? Maybe I am not sure how to use it.

4 choose 2 example:
12 34
12 43 (after sort)
13 24 (after next_permutation)
13 42 (after sort)
14 23 (after next_permutation)
14 32 (after sort)
21 34 (after next_permutation)

So I added a check to see if the value in italics is in order before returning, but definitely wouldn't have thought of the part you wrote though (very elegant! thanks!).

Not fully tested, just cursory tests..


template
bool next_combination(RandIt first, RandIt mid, RandIt last)
{
    typedef typename std::iterator_traits< RandIt >::value_type value_type;
    std::sort(mid, last, std::greater< value_type >() );
    while(std::next_permutation(first, last)){
        if(std::adjacent_find(first, mid, std::greater< value_type >() ) == mid){
            return true;
        }
        std::sort(mid, last, std::greater< value_type >() );
    return false;
}
Fatty answered 6/1, 2011 at 1:58 Comment(1)
Thanks for spotting that I was using the wrong definition of combination, I obviously wasn't thinking carefully enough. Note, though, that I don't get notified unless you actually leave a comment on my answer.Hoecake
F
0

Maybe it's already stated within the previous answers, but here I cannot find a full generic way for this with respect to the parameter types and I also didn't find it within existing library routines besides Boost. This is a generic way I needed during test case construction for scenarios with a wide spread of various parameter variations. Maybe it's helpful to you too, at least for similar scenarios. (Usable for permutation and combination with minor changes in doubt)

#include <vector>
#include <memory>

class SingleParameterToVaryBase
{
public:

  virtual bool varyNext() = 0;
  virtual void reset() = 0;
};

template <typename _DataType, typename _ParamVariationContType>
class SingleParameterToVary : public SingleParameterToVaryBase
{
public:

  SingleParameterToVary(
    _DataType& param,
    const _ParamVariationContType& valuesToVary) :
      mParameter(param)
    , mVariations(valuesToVary)
  {
    if (mVariations.empty())
      throw std::logic_error("Empty variation container for parameter");
    reset();
  }

  // Step to next parameter value, return false if end of value vector is reached
  virtual bool varyNext() override
  {
    ++mCurrentIt;

    const bool finished = mCurrentIt == mVariations.cend();

    if (finished)
    {
      return false;
    }
    else
    {
      mParameter = *mCurrentIt;
      return true;
    }
  }

  virtual void reset() override
  {
    mCurrentIt = mVariations.cbegin();
    mParameter = *mCurrentIt;
  }

private:

  typedef typename _ParamVariationContType::const_iterator ConstIteratorType;

  // Iterator to the actual values this parameter can yield
  ConstIteratorType mCurrentIt;

  _ParamVariationContType mVariations;

  // Reference to the parameter itself
  _DataType& mParameter;
};

class GenericParameterVariator
{
public:

  GenericParameterVariator() : mFinished(false)
  {
    reset();
  }

  template <typename _ParameterType, typename _ParameterVariationsType>
  void registerParameterToVary(
    _ParameterType& param,
    const _ParameterVariationsType& paramVariations)
  {
    mParametersToVary.push_back(
      std::make_unique<SingleParameterToVary<_ParameterType, _ParameterVariationsType>>(
        param, paramVariations));
  }

  const bool isFinished() const { return mFinished; }

  void reset()
  {
    mFinished = false;
    mNumTotalCombinationsVisited = 0;

    for (const auto& upParameter : mParametersToVary)
      upParameter->reset();
  }

  // Step into next state if possible
  bool createNextParameterPermutation()
  {
    if (mFinished || mParametersToVary.empty())
      return false;

    auto itPToVary = mParametersToVary.begin();
    while (itPToVary != mParametersToVary.end())
    {
      const auto& upParameter = *itPToVary;

      // If we are the very first configuration at all, do not vary.
      const bool variedSomething = mNumTotalCombinationsVisited == 0 ? true : upParameter->varyNext();
      ++mNumTotalCombinationsVisited;

      if (!variedSomething)
      {
        // If we were not able to vary the last parameter in our list, we are finished.
        if (std::next(itPToVary) == mParametersToVary.end())
        {
          mFinished = true;
          return false;
        }

        ++itPToVary;
        continue;
      }
      else
      {
        if (itPToVary != mParametersToVary.begin())
        {
          // Reset all parameters before this one
          auto itBackwd = itPToVary;
          do
          {
            --itBackwd;
            (*itBackwd)->reset();
          } while (itBackwd != mParametersToVary.begin());
        }

        return true;
      }
    }

    return true;
  }

private:

  // Linearized parameter set
  std::vector<std::unique_ptr<SingleParameterToVaryBase>> mParametersToVary;

  bool mFinished;
  size_t mNumTotalCombinationsVisited;

};

Possible usage:

GenericParameterVariator paramVariator;

  size_t param1;
  int param2;
  char param3;

  paramVariator.registerParameterToVary(param1, std::vector<size_t>{ 1, 2 });
  paramVariator.registerParameterToVary(param2, std::vector<int>{ -1, -2 });
  paramVariator.registerParameterToVary(param3, std::vector<char>{ 'a', 'b' });

  std::vector<std::tuple<size_t, int, char>> visitedCombinations;
  while (paramVariator.createNextParameterPermutation())
    visitedCombinations.push_back(std::make_tuple(param1, param2, param3));

Generates:

(1, -1, 'a')
(2, -1, 'a')
(1, -2, 'a')
(2, -2, 'a')
(1, -1, 'b')
(2, -1, 'b')
(1, -2, 'b')
(2, -2, 'b')

For sure, this can be further optimized/specialized. For instance you can simply add a hashing scheme and/or an avoid functor if you want to avoid effective repetitions. Also, since the parameters are held as references, one might consider to protect the generator from possible error-prone usage via deleting copy/assignement constructors and operators.

Time complexity is within the theoretical permutation complexity range.

Functionary answered 14/11, 2020 at 11:3 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.