Heap's algorithm in Clojure (can it be implemented efficiently?)
Asked Answered
C

1

6

Heap's algorithm enumerates the permutations of an array. Wikipedia's article on the algorithm says that Robert Sedgewick concluded the algorithm was ``at that time the most effective algorithm for generating permutations by computer,'' so naturally it would be fun to try to implement.

The algorithm is all about making a succession of swaps within a mutable array, so I was looking at implementing this in Clojure, whose sequences are immutable. I put the following together, avoiding mutability completely:

(defn swap [a i j]
  (assoc a j (a i) i (a j)))

(defn generate-permutations [v n]
  (if (zero? n)
    ();(println (apply str a));Comment out to time just the code, not the print
    (loop [i 0 a v]
      (if (<= i n)
        (do
          (generate-permutations a (dec n))
          (recur (inc i) (swap a (if (even? n) i 0) n)))))))

(if (not= (count *command-line-args*) 1)
  (do (println "Exactly one argument is required") (System/exit 1))
  (let [word (-> *command-line-args* first vec)]
    (time (generate-permutations word (dec (count word))))))

For an 11-character input string, the algorithm runs (on my computer) in 7.3 seconds (averaged over 10 runs).

The equivalent Java program, using character arrays, runs in 0.24 seconds.

So I would like to make the Clojure code faster. I used a Java array with type hinting. This is what I tried:

(defn generate-permutations [^chars a n]
  (if (zero? n)
    ();(println (apply str a))
    (doseq [i (range 0 (inc n))]
      (generate-permutations a (dec n))
      (let [j (if (even? n) i 0) oldn (aget a n) oldj (aget a j)]
        (aset-char a n oldj) (aset-char a j oldn)))))

(if (not= (count *command-line-args*) 1)
  (do
    (println "Exactly one argument is required")
    (System/exit 1))
  (let [word (-> *command-line-args* first vec char-array)]
    (time (generate-permutations word (dec (count word))))))

Well, it's slower. Now it averages 9.1 seconds for the 11-character array (even with the type hint).

I understand mutable arrays are not the Clojure way, but is there any way to approach the performance of Java for this algorithm?

Chickweed answered 1/11, 2015 at 18:50 Comment(5)
Have you accounted for JVM warmup/JIT at all in your comparisons? For me, running your initial code through criterium/bench gives an execution time around 80-82 µs for an 11-character string. The arrays version shaves that down to 59-60 µs.Fancher
That's great advice of course and thank you, though there's more warmup here than I would have expected, since the difference between doing the simple clock-time measurements with Java's System.currentTimeMillis and Clojure's time macro still very so much more widely than I would have expected. Great tip about criterium, I'll look into it.Chickweed
My eyes burn when you useCamelBack when you should-be-doing-this.Irreconcilable
OMG I AM SO SORRY. Edited. I hope your eyes recover. And yes, thanks for the note BTW. I often get pedantic about language conventions myself.Chickweed
this is a very interesting topic. i tried your code, and tried to tune it, but no much improvement. when i tried to optimize it, i actually tried to optimize the algorithm, so that says it is reasonable to classify algorithms into 'functional-implementation-oriented' ones and 'imperative-implementation-oriented' ones.Unfasten
T
2

It's not so much that Clojure is entirely about avoiding mutable state. It's that Clojure has very strong opinions on when it should be used.

In this case, I'd highly recommend finding a way to rewrite your algorithm using transients, as they're specifically designed to save time by avoiding the reallocation of memory and allowing a collection to mutable locally so long as the reference to the collection never leaves the function in which it was created. I recently managed to cut a heavily memory intensive operation's time by nearly 10x by using them.

This article explains transients fairly well!

http://hypirion.com/musings/understanding-clojure-transients

Also, you may want to look into rewriting your loop structure in a way that allows you to use recur to recursively call generatePermutations rather than using the whole name. You'll likely get a performance boost, and it'd tax the stack a lot less.

I hope this helps.

Transposition answered 7/1, 2016 at 12:57 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.