How do we do both left and right folds in Clojure?
Asked Answered
N

2

49

Reduce works fine but it is more like fold-left. Is there any other form of reduce that lets me fold to right ?

Nescience answered 28/5, 2013 at 11:20 Comment(0)
T
13

Let's look at a possible definition of each:

(defn foldl [f val coll]
  (if (empty? coll) val
    (foldl f (f val (first coll)) (rest coll))))

(defn foldr [f val coll]
  (if (empty? coll) val
    (f (foldr f val (rest coll)) (first coll))))

Notice that only foldl is in tail position, and the recursive call can be replaced by recur. So with recur, foldl will not take up stack space, while foldr will. That's why reduce is like foldl. Now let's try them out:

(foldl + 0 [1 2 3]) ;6
(foldl - 0 [1 2 3]) ;-6
(foldl conj  [] [1 2 3]) ;[1 2 3]
(foldl conj '() [1 2 3]) ;(3 2 1)

(foldr + 0 [1 2 3]) ;6
(foldr - 0 [1 2 3]) ;-6
(foldr conj  [] [1 2 3]) ;[3 2 1]
(foldr conj '() [1 2 3]) ;(1 2 3)

Is there some reason you want to fold right? I think the most common usage of foldr is to put together a list from front to back. In Clojure we don't need that because we can just use a vector instead. Another choice to avoid stack overflow is to use a lazy sequence:

(defn make-list [coll]
  (lazy-seq
    (cons (first coll) (rest coll))))

So, if you want to fold right, some efficient alternatives are

  1. Use a vector instead.
  2. Use a lazy sequence.
  3. Use reduced to short-circuit reduce.
  4. If you really want to dive down a rabbit hole, use a transducer.
Trott answered 21/1, 2017 at 2:2 Comment(0)
I
94

The reason that the clojure standard library only has fold-left (reduce) is actually very subtle and it is because clojure isn't lazy enough to get the main benefit of fold-right.

The main benefit of fold-right in languages like haskell is that it can actually short-circuit. If we do foldr (&&) True [False, True, True, True, True] the way that it actually gets evaluated is very enlightening. The only thing it needs to evaluate is the function and with 1 argument (the first False). Once it gets there it knows the answer and does not need to evaluate ANY of the Trues.

If you look very closely at the picture:

enter image description here

you will see that although conceptually fold-right starts and the end of the list and moves towards the front, in actuality, it starts evaluating at the FRONT of the list.

This is an example of where lazy/curried functions and tail recursion can give benefits that clojure can't.

Bonus Section (for those interested)

Based on a recommendation from vemv, I would like to mention that Clojure added a new function to the core namespace to get around this limitation that Clojure can't have the lazy right fold. There is a function called reduced in the core namespace which allows you to make Clojure's reduce lazier. It can be used to short-circuit reduce by telling it not to look at the rest of the list. For instance, if you wanted to multiply lists of numbers but had reason to suspect that the list would occasionally contain zero and wanted to handle that as a special case by not looking at the remainder of the list once you encountered a zero, you could write the following multiply-all function (note the use of reduced to indicate that the final answer is 0 regardless of what the rest of the list is).

(defn multiply-all [coll]
  (reduce
   (fn [accumulator next-value]
     (if (zero? next-value)
       (reduced 0)
       (* accumulator next-value)))
   1
   coll))

And then to prove that it short-circuits you could multiply an infinite list of numbers which happens to contain a zero and see that it does indeed terminate with the answer of 0

(multiply-all
 (cycle [1 2 3 4 0]))
Immobilize answered 28/5, 2013 at 13:39 Comment(12)
Very interesting question and answer (+1 for both).Adriell
Is tail recursion really relevant here? I didn't think foldr was tail recursive; if it is, it's some kind of strange thunk-related tail recursion that I don't understand yet and would love to learn about.Filomenafiloplume
what I did not get here is, fold right or we fold left. in case of &&, a short circuit should happen in both the cases as soon as we encounter a false.Nescience
Can you explain this part a little bit in detail - "although conceptually fold-right starts and the end of the list and moves towards the front, in actuality, it starts evaluating at the FRONT of the list."Nescience
Would it be worth to mention the new clojure.core/reduced function in your answer?Afoot
Also, I don't know why but I'm not able to reproduce Clojure's eagerness: (reduce #(and (eval %1) (eval %2)) false [`(println 42)]) doesn't cause 42 to be printed - only it will changing false to true (or and to or)Afoot
Amogh, what I mean by conceptually foldr-right starts at the end is that usually when I've heard people describe foldr, they describe it as starting at the end of the list and moving to the right. What I mean by in actuality, it starts at the front, is that if you look at the diagram, the ultimate goal of the expression is to determine the value of the topmost f. To do that, it looks at the 1 (first item in list) and then only if it needs to will it try to evaluate it's second argument (which is the 2nd f) which will in tern evaluate the 2 (2nd item in list) and then only if it needs toImmobilize
will it evaluate the 3rd f, which would evaluate 3 and so on and so forth down the list. But the main point to remember is that it only needs to do enough work to figure out the answer to the topmost f. (Note: if the topmost f produces something like a sequence, it only needs to evaluate enough of the input as it needs to to produce the part of the output sequence which actually gets used)Immobilize
vemv, in the example you give, it is actually the and which is short circuiting which causes 42 not to be printed, not the reduce which is short circuiting. You can see the difference if you try some other examples. For example (reduce #(and %1 %2) (cons false (repeat true))) will not terminate, however the equivalent right-fold in a lazier language would be able to avoid trying to evaluate the entire sequence.Immobilize
How does the haskell know when to go in forward and when to go in backward direction ? or if it always go from back to front and evaluates from start : then does it keep all of that in memory.. ? I guess it should not.Enchiridion
Laziness has everything to do with short-circuiting. In Clojure, sequences and macros can be lazy, but function application is not. In Haskell, function application itself is lazy (which is tied to the fact they are curried). In Clojure, it would be impossible to define a function multiply which when 0 was the left argument, didn't evaluate the right-hand argument, because in Clojure the arguments get evaluated before being passed to the function. That is why things like if, and, or, if-not etc are not functions. In Haskell, all these things can be regular functions.Immobilize
In Haskell: foldr (\x y -> if x >= 100 then x else x + y) 0 [0..] and in Clojure: (reduce (fn[x y] (if (> y 100) (reduced x) (+ x y))) (range)) achieve the same short-circuiting, but Haskell is using foldr with lazy evaluation of the second argument, since we don't use y on the condition it short-circuits. In Clojure it uses foldl, but detects the reduced return and stops looping.Geometrid
T
13

Let's look at a possible definition of each:

(defn foldl [f val coll]
  (if (empty? coll) val
    (foldl f (f val (first coll)) (rest coll))))

(defn foldr [f val coll]
  (if (empty? coll) val
    (f (foldr f val (rest coll)) (first coll))))

Notice that only foldl is in tail position, and the recursive call can be replaced by recur. So with recur, foldl will not take up stack space, while foldr will. That's why reduce is like foldl. Now let's try them out:

(foldl + 0 [1 2 3]) ;6
(foldl - 0 [1 2 3]) ;-6
(foldl conj  [] [1 2 3]) ;[1 2 3]
(foldl conj '() [1 2 3]) ;(3 2 1)

(foldr + 0 [1 2 3]) ;6
(foldr - 0 [1 2 3]) ;-6
(foldr conj  [] [1 2 3]) ;[3 2 1]
(foldr conj '() [1 2 3]) ;(1 2 3)

Is there some reason you want to fold right? I think the most common usage of foldr is to put together a list from front to back. In Clojure we don't need that because we can just use a vector instead. Another choice to avoid stack overflow is to use a lazy sequence:

(defn make-list [coll]
  (lazy-seq
    (cons (first coll) (rest coll))))

So, if you want to fold right, some efficient alternatives are

  1. Use a vector instead.
  2. Use a lazy sequence.
  3. Use reduced to short-circuit reduce.
  4. If you really want to dive down a rabbit hole, use a transducer.
Trott answered 21/1, 2017 at 2:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.