Heap's algorithm permutation generator
Asked Answered
C

3

16

I need to iterate over permutations of a tuple of integers. The order has to be generated by swapping a pair of elements at each step.

I found the Wikipedia article (http://en.wikipedia.org/wiki/Heap%27s_algorithm) for Heap's algorithm, which is supposed to do this. The pseudocode is:

procedure generate(n : integer, A : array of any):
    if n = 1 then
        output(A)
    else
        for i := 1; i ≤ n; i += 1 do
            generate(n - 1, A)
            if n is odd then
                j ← 1
            else
                j ← i
            swap(A[j], A[n])

I tried to write a generator for this in python:

def heap_perm(A):
    n = len(A)
    Alist = [el for el in A]
    for hp in _heap_perm_(n, Alist):
        yield hp


def _heap_perm_(n, A):
    if n == 1:
        yield A
    else:
        for i in range(1,n+1):
            for hp in _heap_perm_(n-1, A):
                yield hp
            if (n % 2) == 1:
                j = 1
            else:
                j = i
            swap = A[j-1]
            A[j-1] = A[n-1]
            A[n-1] = swap

This produces permutations in the following order (for input of [0,1,2,3]):

0,1,2,3
1,0,2,3
2,0,1,3
0,2,1,3
1,2,0,3
2,1,0,3
3,1,2,0

and so on

This seems fine until that last step, which isn't a swap of one pair.

What am I doing wrong?

Copartner answered 13/3, 2015 at 22:10 Comment(10)
Also, swap(A[j], A[n]) in Python can simply be implemented as A[n],A[j] = A[j],A[n], it's much clearer than a multi-step with temporary.Maddux
Finally, why does itertools.permutations() not fit the bill? For your input example, itertools.permutations([0,1,2,3]) produces the correct output sample. It will be much faster for large input sequences as it is implemented in C.Maddux
@aruisdante: itertools.permutations does not swap one pair of elements in each step.Salangia
@MartijnPieters I did this to make it as much like the pseudocode as possible for troubleshooting. Have I been inconsistent?Copartner
@Maddux Thanks, the swap advice is noted. As for itertools.permutations, that doesn't produce them in the order of pair-swapping. That is crucial for my needs.Copartner
@MartijnPieters Could you possibly explain a bit more? I agree that the indexing I have used is unusual, but I can't find the inconsistency.Copartner
I noted a similar problem on Rosetta Code that referenced the Steinhaus–Johnson–Trotter algorithm, but was not necessarily tied to that, so added a reference to rici's answer below. Python versions of this alternative algorithm are given.Caparison
@Paddy3118: The SJT algorithm has many advantages, one being that it is circular so you can return to the starting point with a single swap. (And it has centuries of history, since it was actually discovered by change ringers, an almost lost branch of musical mathematics.) But Heap's algo is simpler,, and the non-looping version is much simpler.Aerodontia
@Copartner I've deleted my comments; as Rici points out, the psuedocode you found is at the wrong here! Sorry for the confusion.Detrimental
@rici: First, thanks for the depth of your full answer below; now your comment: Heaps algo probably won't give me the smiles that watching a number traverse successive members of the SJT method gave me when I did the Python version :-) I had funCaparison
A
40

Historical prologue

The Wikipedia article on Heap's algorithm has been corrected, defaced and corrected again several times since this answer was written. The version referred to by the question and original answer was incorrect; you can see it in Wikipedia's change history. The current version may or may not be correct; at the time of this edit (March 2022), the page contained both correct and incorrect versions.

There's nothing wrong with your code (algorithmically), if you intended to implement the Wikipedia pseudocode. You have successfully implemented the algorithm presented.

However, the algorithm presented is not Heap's algorithm, and it does not guarantee that successive permutations will be the result of a single interchange. As you can see in the Wikipedia page, there are times when multiple swaps occur between generated permutations. See for example the lines

CBAD
DBCA

which have three swaps between them (one of the swaps is a no-op).

This is precisely the output from your code, which is not surprising since your code is an accurate implementation of the algorithm presented.

The correct implementation of Heap's algorithm, and the source of the error

Interestingly, the pseudocode is basically derived from the slides from a talk Sedgewick gave in 2002 (reference 3 on the Wikipedia page, also currently available on Sedgewick's home page). The incorrect pseudocode is on slide 13, and it is clearly wrong since it does not correspond to the useful graphic showing the working of the algorithm on the immediately preceding slide. I did some sleuthing around, and found many copies of this incorrect pseudocode, enough to start to doubt my analysis.

Fortunately, I also looked at Heap's short paper (reference 1 on the Wikipedia page), which is reasonably clear. What he says is: (emphasis added)

The program uses the same general method … i.e. for n objects, first permute the first (n—1) objects and then interchange the contents of the nth cell with those of a specified cell. In this method this specified cell is always the first cell if n is odd, but if n is even, it is numbered consecutively from 1 to (n−1).

The problem is that the recursive function as presented always does a swap before returning (unless N is 1). So it actually does swaps consecutively from 1 to n, not (n−1) as Heap specifies. Consequently, when (for example) the function is called with N==3, there will be two swaps at the end of the call before the next yield: one from the end of the N==2 call, and the other one from the loop with i==N. If if N==4, there will be three swaps, and so on. (Some of these will be no-ops, though.)

The last swap is incorrect: The algorithm should do swaps between recursive calls, not after them; it should never swap with i==N.

So this should work:

def _heap_perm_(n, A):
    if n == 1: yield A
    else:
        for i in range(n-1):
            for hp in _heap_perm_(n-1, A): yield hp
            j = 0 if (n % 2) == 1 else i
            A[j],A[n-1] = A[n-1],A[j]
        for hp in _heap_perm_(n-1, A): yield hp

Note: The above function was written to mimic the function of the same name in the original question, which performs the permutation in-place. Doing the permutation in place saves the cost of copying every permutation returned, making the full generation O(n!) (or O(1) per permutation generated) instead of O(n·n!) (or O(n) per permutation generated). That's fine if you're going to process each permutation immediately, but confusing if your plan is to keep them around, since the next call to the generator will modify the previously generated list. In that case, you might want to call the function as

for perm in map(tuple, heap_perm(n, A)):
    # Now each perm is independent of the previous one

Or collect them all into a gigantic list (NOT recommended!!; the lists are huge) with something like perms = list(map(tuple, heap_perm(n, A))).

Sedgewick's original paper

When I found Sedgewick's 1977 paper [see note 1], I was delighted to find that the algorithm he presents there is correct. However, it uses a looping control structure (credited to Donald Knuth) which might seem foreign to Python or C programmers: a mid-loop test:

Algorithm 1:

  procedure permutations (N);
      begin c:=1;
          loop:
              if N>2 then permutations(N-1)
              endif;
          while c<N:
              # Sedgewick uses :=: as the swap operator
              # Also see note below
              if N odd then P[N]:=:P[1] else P[N]:=:P[c] endif;
              c:=c+l
          repeat
       end;

Note: The swap line is taken from page 141, where Sedgewick explains how to modify the original version of Algorithm 1 (on page 140) to match Heap's algorithm. Aside from that line, the rest of the Algorithm is unchanged. Several variants are presented.


Notes:

  1. When I wrote this answer, the only public link to the paper was a paywalled URL in a reference on the Wikipedia page. It is now available on Sedgewick's home page.
Aerodontia answered 14/3, 2015 at 2:58 Comment(9)
This seems to work. Many thanks for doing the research.Copartner
The Wikipedia article is fixed now, finally. Since it's been requested, will one of you please do me a favour and review the new version?Soliloquize
@bluss: The new version looks fine to me. Since I'm not a wikiresident, I only have a vague idea of what is considered "original research". The program presented on that page isn't citable afaik; the actual program presented by Sedgwick (1977) appears in my answer, and (as I said) involves a mid-loop test which might be considered hard to understand by modern readers. (You could replace it with a conditional break, though.) Can you cite Sedgewick (undated) with a note that a typographical error has been fixed?Aerodontia
Thank you! Yes Sedgewick's code is quite cryptic to me too :-) I'll ponder what to do, but your review definitely helps!Soliloquize
@bluss: I updated the answer with references to both the historic and actual Wikipedia page, in the hopes that the answer continues to make sense. Thanks for reporting the Wikipedia change.Aerodontia
c:=c+l looks like an OCR error. His talk slides contain some typos too.African
Hi @Aerodontia ... nice answer! I just have one question, how do you call the function? I tried for example with _heap_perm_(3, list("abc")) but return a list with 6 copies of the same term. Did I missed anything?Topfull
@cards: as written, the function permutes in place so it always produces the same list. I did that because it was the way OP wrote their version, and it's necessary if you want each permutation step to take O(1) time. But if you actually want to keep the permutation for later, you need to make a copy, such as [list(p) for p in heap_perm("abc")]. (heap_perm is in the OP; I just changed the recursive helper.)Aerodontia
@Aerodontia now I get it, thanks!Topfull
D
-1

I just interpreted to Python the non-recursive pseudocode from the Wikipedia article about Heap's algorithm:

A = ['a', 'b', 'c', 'd', 'e', 'f', 'g'] #the initial list to permute
n = len(A) #length of A
print(A) # output the first permutation (i.e. the original list A itself)
number_of_permutations = 1 #a variable to count a number of permutations produced by the script
total = [] #a list contaning all generated permutations
t = tuple(A) #tuple containing each permutation
total.append(t) #adding each permutation to total
c = [0] * n #c is the list used to track swapping of positions in each iteration of the Heap`s alrorythm.
i = 0 #index of positions in A and c. Serves as a pointer for swapping of positions in A.

while i < n:
    if c[i] < i: # test whether ith position was swapped more than i - 1 times
        if i % 2 == 0: #swapping the poitions in A
            A[0], A[i] = A[i], A[0]
        else:
            A[c[i]], A[i] = A[i], A[c[i]]
        print(A) #output of each permutation
        t = tuple(A) #convert each permutations into unmutable object (tuple)
        total.append(t) # appending each permutation as tuple in the list of all generated permutations
        number_of_permutations += 1 #increment number of performed permutations
        c[i] += 1 # incrementing c[i] is used to indicate that ith position was swapped
        i = 0 #returns the pointer to the 0th position (imitates recursion)
    else:
        c[i] = 0 #when ith position in A was swapped i - 1 times c[i] resets to zero
        i += 1 #pointer moves to the next position

print('Number of permutations generated by the script = ', number_of_permutations)

#Examining the correctness of permutions. Total number of permutations should be n! and there should not be any repeates
def factorial(n):
    """ Factorial function"""
    result = 1
    for number in range(1, n + 1):
        result *= number
    return result

print('Theoretical number of permutations (calculated as n!) = ', factorial(n))

for seq in total: #testing repeates
    count = total.count(seq)
    if count > 1:
        x = False
    else:
        x = True
print('The script does not generates repeats' if x == True else 'The script generates repeats')

I also introduced there a test for the correctness of the result (a number of all permutations without repeats should be equal to n!). And looks like the script works well. I still do not completely understand how it works. But I put comments into the code to according to my understanding of it.

Default answered 7/11, 2021 at 19:52 Comment(1)
Python has a factorial function in the built-in math module. To check whether all objects in a list are unique, you can use len(total) == len(set(total)), which is a lot faster than looping over all elements. (O(n) vs. (O(n²)). Also, your check doesn't work. Once failure has been detected, you should leave the loop; if you continue, you might reset x to True. Finally, the correctness of the implementation also requires that each step involve a single swap, which you don't check for.Aerodontia
S
-4

The easiest way to get the permutations of a list would be the permutations function, in the itertools module. So, if the algorithm is not binding, I would go with this:

from itertools import permutations

a = [1,2,3,4]
for item in permutations(a):
    print item
Septicidal answered 14/3, 2015 at 6:47 Comment(1)
There is a swapping restraint that you missed.Caparison

© 2022 - 2024 — McMap. All rights reserved.