Idiomatic Clojure for progress reporting?
Q

4

35

How should I monitor the progress of a mapped function in Clojure?

When processing records in an imperative language I often print a message every so often to indicate how far things have gone, e.g. reporting every 1000 records. Essentially this is counting loop repetitions.

I was wondering what approaches I could take to this in Clojure, where I am mapping a function over my sequence of records. In this case, printing the message (and even keeping count of the progress) seems to be essentially a side effect.

What I have come up with so far looks like:

(defn report
  [report-every val cnt]
  (when (zero? (mod cnt report-every))
    (println "Done" cnt))
  val)

(defn report-progress
  [report-every aseq]
  (map (fn [val cnt] 
          (report report-every val cnt)) 
       aseq 
       (iterate inc 1)))

For example:

user> (doall (report-progress 2 (range 10)))
Done 2
Done 4
Done 6
Done 8
Done 10
(0 1 2 3 4 5 6 7 8 9)

Are there other (better) ways of achieving this effect?

Are there any pitfalls in what I am doing? (I think I am preserving laziness and not holding the head for example.)
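
One way to check the laziness part of this is to count how often the reporting function runs when only a prefix of the sequence is consumed. This is a minimal sketch, assuming the `report` and `report-progress` definitions above; the `calls` atom and the `calls-for-prefix` helper are hypothetical names added only for this check:

```clojure
;; Hypothetical instrumentation: count how often report is invoked.
(def calls (atom 0))

(defn report
  [report-every val cnt]
  (swap! calls inc)
  (when (zero? (mod cnt report-every))
    (println "Done" cnt))
  val)

(defn report-progress
  [report-every aseq]
  (map (fn [val cnt] (report report-every val cnt))
       aseq
       (iterate inc 1)))

(defn calls-for-prefix
  "Returns how many times report ran while consuming the first n
  elements of an infinite input."
  [n]
  (reset! calls 0)
  (dorun (take n (report-progress 2 (range))))
  @calls)

(calls-for-prefix 4)
;; prints "Done 2" and "Done 4", returns 4
```

Only the consumed elements trigger the reporting function, so the wrapper does not force the input. This says nothing about head retention, which depends on whether the caller keeps a reference to the start of the sequence.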

Quintero answered 7/1, 2010 at 19:9 Comment(0)
W
33

The great thing about Clojure is that you can attach the reporting to the data itself instead of to the code that does the computing. This lets you separate these logically distinct parts. Here is a chunk from my misc.clj that I find I use in just about every project:

(defn seq-counter 
  "calls callback after every n'th entry in sequence is evaluated. 
  Optionally takes another callback to call once the seq is fully evaluated."
  ([sequence n callback]
     (map #(do (if (= (rem %1 n) 0) (callback)) %2) (iterate inc 1) sequence))
  ([sequence n callback finished-callback]
     (drop-last (lazy-cat (seq-counter sequence n callback) 
                  (lazy-seq (cons (finished-callback) ())))))) 

Then wrap the reporter around your data and pass the result to the processing function:

(map process-data (seq-counter input 1000 inc-progress)) ; e.g. call inc-progress after every 1000th element
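
A minimal usage sketch, assuming the `seq-counter` definition above (repeated here so the snippet runs on its own). `progress`, `inc-progress`, and `process-data` are hypothetical stand-ins; the callback just bumps an atom so its effect is easy to inspect:

```clojure
;; seq-counter as defined in the answer.
(defn seq-counter
  ([sequence n callback]
   (map #(do (if (= (rem %1 n) 0) (callback)) %2) (iterate inc 1) sequence))
  ([sequence n callback finished-callback]
   (drop-last (lazy-cat (seq-counter sequence n callback)
                        (lazy-seq (cons (finished-callback) ()))))))

;; Hypothetical progress state: how many times the callback has fired.
(def progress (atom 0))
(defn inc-progress [] (swap! progress inc))

;; Hypothetical processing step.
(defn process-data [x] (* x x))

;; The callback fires after every 3rd input element.
;; doall forces the lazy sequence so the callbacks actually run.
(def results
  (doall (map process-data (seq-counter (range 10) 3 inc-progress))))

;; results   => (0 1 4 9 16 25 36 49 64 81)
;; @progress => 3
```

Because the wrapper sits on the input sequence, `process-data` needs no knowledge of the reporting.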
Wavy answered 7/1, 2010 at 20:17 Comment(5)
I think I am doing something crudely similar above, attaching the reporting to a seq with which anything could be done. I was envisioning attaching it to a sequence of results but it could equally well be the sequence of inputs. Your code is much nicer though. I hadn't progressed (pardon the pun) to using a callback for the reporting message (or more general function) and I was calling the reporting function for every value.Quintero
Is there anywhere that you share for misc.clj? I would certainly benefit from seeing other ideas and implementations of useful stuff like seq-counterQuintero
Yes, it is really the same as your initial example. I was a little quick with the "ohh, that's in misc.clj" without properly understanding the question. code.google.com/p/cryptovide/source/browse/src/com/cryptovide/….Wavy
It seems that the usage example has a typo. I'm expanding the example to check my understanding. To take advantage of the code formatting available in answers, I have added my fix to the answer above. In doing so I also ran into some unexpected output. Please review it and let me know what I misunderstood. Thanks,Amphibian
I generalized the concept from the solution above into a library to make it easy to drop it into a project. Hopefully this helps someone. https://github.com/tmountain/seq-peek.Solvent
H
6

I would probably perform the reporting in an agent. Something like this:

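;; Agent action: s is the number of items sent for processing so far.
(defn report [s]
  (println "Done" s)
  (inc s))

(let [reports (agent 0)]
  ;; map is lazy, so nothing runs until the result is consumed
  (map #(do (send reports report)
            (process-data %))
       data-to-process))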
Harkins answered 8/1, 2010 at 11:6 Comment(3)
That's an interesting approach. Curiously the reporting doesn't show up in my repl if I use slime-mode in emacs but does print in a normal repl.Quintero
On further reflection I could just increment things in the function sent to agent. The printing of the progress could be a regular function at the repl that accesses the agent's state.Quintero
Good point actually. In fact, if you are updating a GUI, you probably have to do it in the main thread (or defer it to the main thread, dispatchLater etc) anyway.Harkins
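
The variant suggested in the comments above, where the agent only counts and printing is a separate function called on demand, could be sketched as follows. This is only an illustration: `processed`, `process-all`, `print-progress`, and the body of `process-data` are hypothetical names and stand-ins, not part of the original answer:

```clojure
;; Agent holding the number of items processed so far.
(def processed (agent 0))

;; Hypothetical stand-in for the real per-record work.
(defn process-data [x] (* 2 x))

(defn process-all
  "Processes every item eagerly, bumping the agent after each one."
  [items]
  (doall (map (fn [item]
                (let [result (process-data item)]
                  (send processed inc)
                  result))
              items)))

(defn print-progress
  "Call from the REPL at any time to see the current count."
  []
  (println "Done" @processed))

(def results (process-all (range 5)))
(await processed) ; wait for queued sends so the count is up to date
(print-progress)
;; results => (0 2 4 6 8)
```

In a long-running job, `process-all` would typically run on another thread while `print-progress` is called from the REPL. When this is run as a script rather than at the REPL, calling `(shutdown-agents)` at the end lets the JVM exit.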
T
4

I don't know of any existing way of doing that; it might be worth browsing the clojure.contrib documentation to see whether there's already something. In the meantime, I've looked at your example and cleaned it up a little bit.

(defn report [cnt]
  (when (even? cnt)
    (println "Done" cnt)))

(defn report-progress []
  (let [aseq (range 10)]
    (doall (map report (take (count aseq) (iterate inc 1))))
    aseq))

You're heading in the right direction, even though this example is too simple. This gave me an idea about a more generalized version of your report-progress function. This function would take a map-like function, the function to be mapped, a report function and a set of collections (or a seed value and a collection for testing reduce).

(defn report-progress [m f r & colls]
  (let [result (apply m
                 (fn [& args]
                   (let [v (apply f args)]
                     (apply r v args) v))
                 colls)]
    (if (seq? result)
      (doall result)
      result)))

The seq? check is there only for use with reduce, which doesn't necessarily return a sequence. With this function, we can rewrite your example like this:

user> 
(report-progress
  map
  (fn [_ v] v)
  (fn [result cnt _]
    (when (even? cnt)
      (println "Done" cnt)))
  (iterate inc 1)
  (range 10))

Done 2
Done 4
Done 6
Done 8
Done 10
(0 1 2 3 4 5 6 7 8 9)

Test the filter function:

user> 
(report-progress
  filter
  odd?
  (fn [result cnt]
    (when (even? cnt)
      (println "Done" cnt)))
  (range 10))

Done 0
Done 2
Done 4
Done 6
Done 8
(1 3 5 7 9)

And even the reduce function:

user> 
(report-progress
  reduce
  +
  (fn [result s v]
    (when (even? s)
      (println "Done" s)))
  2
  (repeat 10 1))

Done 2
Done 4
Done 6
Done 8
Done 10
12
Tracytrade answered 7/1, 2010 at 20:59 Comment(1)
I don't think you understood what I was trying to do with 'doall' (sorry for the lousy and unclear code). I was just using doall to test reporting at the repl to force reporting over processing the whole sequence (otherwise it would be lazily evaluated). doall wasn't part of my attempted reporting function or intended sequence processing.Quintero
S
-1

I have had this problem with some slow-running apps (e.g. database ETL). I solved it by adding the function (tupelo.misc/dot ...) to the tupelo library. Sample:

(ns xxx.core 
  (:require [tupelo.misc :as tm]))

(tm/dots-config! {:decimation 10} )
(tm/with-dots
  (doseq [ii (range 2345)]
    (tm/dot)
    (Thread/sleep 5)))

Output:

     0 ....................................................................................................
  1000 ....................................................................................................
  2000 ...................................
  2345 total

API docs for the tupelo.misc namespace can be found here.

Suzette answered 5/10, 2016 at 2:32 Comment(0)
