The amortized complexity of std::next_permutation?

I just read this other question about the complexity of next_permutation and while I'm satisfied with the response (O(n)), it seems like the algorithm might have a nice amortized analysis that shows a lower complexity. Does anyone know of such an analysis?

Microsurgery answered 11/2, 2011 at 19:16 Comment(3)
A lower complexity would require you to return elements in amortized time o(1) each. For a large dataset that implies that you must be able to return some elements in 0 clock cycles. That's impossible.Revocable
@btilly- Not necessarily. The O(n) bound is the complexity of a single call to next_permutation, not of a sequence of calls. Hence my suspicion that this could be done in amortized o(n).Microsurgery
Commenting just for the algorithm junkie. I have a reformulation of next_permutation here (home.roadrunner.com/~hinnant/combinations.html) called for_each_permutation which generally runs 25% to 50% faster because it doesn't have to do the comparisons to find the next iteration. This implementation also allows you to permute your N items r at a time.Arlenarlena

So it looks like I'm going to be answering my own question in the affirmative - yes, next_permutation runs in O(1) amortized time.

Before I go into a formal proof of this, here's a quick refresher on how the algorithm works. First, it scans backwards from the end of the range toward the beginning, identifying the longest contiguous decreasing subsequence in the range that ends at the last element. For example, in 0 3 4 2 1, the algorithm would identify 4 2 1 as this subsequence. Next, it looks at the element right before this subsequence (in the above example, 3) and finds the smallest element in the subsequence larger than it (in the above example, 4). It then exchanges the positions of those two elements and reverses the identified subsequence. So, if we started with 0 3 4 2 1, we'd swap the 3 and 4 to yield 0 4 3 2 1, and would then reverse the last three elements to yield 0 4 1 2 3.
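To make the steps concrete, here is a minimal sketch of those three steps in C++. This is an illustration of the description above, not the actual standard-library source, and the name next_perm_sketch is my own:

    #include <algorithm>  // std::reverse, std::swap
    #include <vector>

    // Sketch of the next-permutation step described above, assuming
    // distinct elements. Returns false (and leaves the range sorted
    // ascending) when the input was already the last permutation,
    // mirroring std::next_permutation's contract.
    bool next_perm_sketch(std::vector<int>& a) {
        if (a.size() < 2) return false;

        // 1. Scan backwards for the longest decreasing suffix;
        //    a[i] is the element just before that suffix.
        int i = static_cast<int>(a.size()) - 2;
        while (i >= 0 && a[i] >= a[i + 1]) --i;
        if (i < 0) {                           // whole range is decreasing:
            std::reverse(a.begin(), a.end());  // wrap around to the first permutation
            return false;
        }

        // 2. Find the smallest element of the suffix larger than a[i].
        //    The suffix is decreasing, so the rightmost element > a[i] works.
        int j = static_cast<int>(a.size()) - 1;
        while (a[j] <= a[i]) --j;

        // 3. Swap them, then reverse the suffix so it becomes ascending.
        std::swap(a[i], a[j]);
        std::reverse(a.begin() + i + 1, a.end());
        return true;
    }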

To show that this algorithm runs in amortized O(1), we'll use the potential method. Define Φ to be three times the length of the longest decreasing subsequence at the end of the range, and assume throughout that all the elements are distinct. Now consider the actual cost of one call. Suppose the backward scan finds that the last m elements form the decreasing suffix; this requires m + 1 comparisons. Next, we find the smallest element of that suffix larger than the element preceding it, which takes at most another m comparisons with a linear scan. Swapping the two elements costs, say, 1 credit's worth of time, and reversing the suffix then requires at most m more operations. The real cost of the call is therefore roughly 3m + 1.

However, we also have to account for the change in potential. Reversing the decreasing suffix leaves the last m elements sorted in ascending order, so the longest decreasing suffix afterward has length 1. The potential therefore changes from Φ = 3m to Φ' = 3 · 1 = 3, a change of 3 − 3m, and the amortized cost is 3m + 1 + (3 − 3m) = 4 = O(1).
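As an empirical sanity check on this bound (my own addition, not part of the proof), one can pass a counting comparator to std::next_permutation and average the comparisons over a full cycle of n! calls. Comparisons are a fair proxy for total work here, since the swap and reversal costs are proportional to the same suffix length. If the bound holds, the average stays bounded as n grows:

    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <numeric>
    #include <vector>

    int main() {
        for (int n = 2; n <= 9; ++n) {
            std::vector<int> v(n);
            std::iota(v.begin(), v.end(), 0);  // 0, 1, ..., n-1: distinct elements

            std::uint64_t comparisons = 0, calls = 0;
            auto counting_less = [&](int a, int b) { ++comparisons; return a < b; };

            // Cycle through all n! permutations; the final call returns
            // false and restores sorted order.
            do { ++calls; } while (std::next_permutation(v.begin(), v.end(), counting_less));

            std::cout << "n = " << n << "  avg comparisons per call = "
                      << static_cast<double>(comparisons) / calls << '\n';
        }
    }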

In the preceding analysis I made the simplifying assumption that all the values are unique. To the best of my knowledge, this assumption is necessary in order for this proof to work. I'm going to think this over and see if the proof can be modified to work in the case where the elements can contain duplicates, and I'll post an edit to this answer once I've worked through the details.

Microsurgery answered 11/2, 2011 at 20:9 Comment(4)
Weird, you posted just a few seconds before I did. I confirm the O(1)! Of course, the literature claims there are better algorithms, so I am not sure if that is right. Then again, they might be talking about the constants, which do matter if n! is involved.Silin
@Moron- The STL also guarantees that the permutations come back in lexicographic order. If you break this constraint you can probably do a lot better.Microsurgery
Yes you can, in fact, I seem to recollect there was an algorithm which just did one swap at each call. Perhaps it is the same as that wiki page is talking about.Silin
Try this algorithm in the case where the vector contains a single 0 and n-1 1s; it will clearly take O(n) swaps per iteration. On the other hand, if the vector contains n/2 of each symbol, it takes O(1) swaps (the constant is 4 instead of e, which is the constant if all elements are distinct). I'd love to find a categorization of vectors which can be permuted in constant time per permutation using this algorithm, but so far no luck. Experimentation suggests that if the vector contains k each of m symbols, the number of swaps converges to e, but not even a clue how to prove it.Patterman

I am not really sure of the exact implementation of std::next_permutation, but if it is the same as Narayana Pandita's algorithm as described in the wiki here: http://en.wikipedia.org/wiki/Permutation#Systematic_generation_of_all_permutations, then, assuming the elements are distinct, it looks like it is O(1) amortized! (Of course, there might be errors in the below.)

Let us count the total number of swaps done.

We get the recurrence relation

T(n+1) = (n+1)·T(n) + Θ(n²)

The (n+1)·T(n) term comes from fixing each of the n+1 possible first elements in turn and doing the swaps for the remaining n elements.

The Θ(n²) term comes from changing the first element. Each time the first element changes, the algorithm does Θ(n) swaps; that happens n times, giving Θ(n²).

Now let X(n) = T(n)/n!

Then we get

X(n+1) = X(n) + Θ(n²)/(n+1)!

i.e. there is some constant C such that

X(n+1) <= X(n) + C·n²/(n+1)!

Writing down n such inequalities gives us

X(n+1) - X(n) <= C·n²/(n+1)!

X(n) - X(n-1) <= C·(n-1)²/n!

X(n-1) - X(n-2) <= C·(n-2)²/(n-1)!

...

X(2) - X(1) <= C·1²/2!

Adding these up, the left-hand sides telescope, giving us X(n+1) - X(1) <= C · Σ_{j=1}^{n} j²/(j+1)!.

Since the infinite series Σ_{j=1}^{∞} j²/(j+1)! converges to some constant C', say, we get X(n+1) - X(1) <= C·C'.

Remember that X(n) = T(n)/n! counts the average number of swaps per permutation.

Thus the average number of swaps is O(1).

Since the work of finding the elements to swap is proportional to the number of swaps performed, the algorithm is O(1) amortized even when you take the other operations into consideration.
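To see how quickly the bounding series settles down, here is a quick numerical check (again just an illustration): the partial sums of Σ j²/(j+1)! converge to e − 1 ≈ 1.71828 within about ten terms, which is why the telescoping bound above is a small constant.

    #include <iostream>

    int main() {
        // Partial sums of the series sum_{j >= 1} j^2 / (j+1)! that bounds
        // X(n+1) - X(1) above. The terms shrink factorially, so the sum
        // is essentially converged (to e - 1 ~ 1.71828) by j = 10 or so.
        double factorial = 2.0;  // (j+1)! for j = 1, i.e. 2!
        double sum = 0.0;
        for (int j = 1; j <= 15; ++j) {
            sum += static_cast<double>(j) * j / factorial;
            std::cout << "j = " << j << "  partial sum = " << sum << '\n';
            factorial *= j + 2;  // advance (j+1)! to (j+2)!
        }
    }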

Silin answered 11/2, 2011 at 20:10 Comment(4)
I'm not the downvoter, but for the 2nd equation the nearest I can get is X(n+1) = X(n)/(n+1) + Θ(n^2)/(n+1)!, i.e. the 1st term on the RHS should be divided by (n+1) methinks. I got this by replacing T(n) with X(n)n!, T(n+1) with X(n+1)(n+1)!, and dividing all terms through by (n+1)!.Basrelief
@j_r: we get X(n+1)(n+1)! = (n+1)n!X(n) + theta(n^2) and so X(n+1) = X(n) + theta(n^2)/(n+1)!. You seem to have forgotten that (n+1) in the (n+1)T(n) term. In any case, the recurrence you got also implies X(n) = O(1).Silin
But I'm still confused by the next step -- could you spell it out? And separately, it seems to me that showing X(n) = O(1) only shows that T(n) = O(1)*n! -- clearly that's ridiculous, what am I missing?Basrelief
@j_r: I have added more clarification. Also, X(n) counts the average number of swaps, which is what we are interested in (at least in this case) when we consider amortized complexity. There was a typo earlier, where X(n) was said to be total, instead of average. I have corrected that too.Silin

Here n stands for the count of elements in the container, not the total count of possible permutations. The algorithm must iterate over the elements in order on each call; it takes a pair of bidirectional iterators, which implies that to get to one element the algorithm must first visit the one before it (unless it's the first or last element). Since a bidirectional iterator allows iterating backwards, the algorithm can (must, in fact) perform half as many swaps as there are elements. I believe the standard could offer an overload for forward iterators, which would support dumber iterators at the cost of n swaps rather than n/2 swaps. But alas, it didn't.

Of course, with respect to the number of possible permutations the algorithm operates in O(1): the cost of one call does not depend on how many permutations there are in total.
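For reference, this is the usual way to drive std::next_permutation through every permutation — the sequence of calls that the amortized analyses above are charging against:

    #include <algorithm>  // std::next_permutation, std::sort
    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> v{0, 1, 2};
        std::sort(v.begin(), v.end());  // start from the lexicographically smallest permutation

        // Visits all 3! = 6 permutations in lexicographic order; the final
        // call returns false and leaves the range sorted again.
        do {
            for (int x : v) std::cout << x << ' ';
            std::cout << '\n';
        } while (std::next_permutation(v.begin(), v.end()));
    }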

Brigitta answered 11/2, 2011 at 19:22 Comment(3)
Perhaps I'm missing something, but how does this work in O(1) for n possible permutations? Also, does this answer the original question about amortized complexity?Microsurgery
Computing the next permutation is independent of the permutation after it, or of any other permutation. Hence, it is constant time.Brigitta
Yes, I do answer your original question. No, there is no amortized time better than O(n) here.Brigitta
