Process n items from a list at a time in Lisp
Asked Answered
C

4

7

Given a list, how do I process N items at a time? Ruby has each_slice method on the Enumerable that does this; what would be the Lisp equivalent?

Clem answered 19/6, 2013 at 18:30 Comment(2)
CL doesn't have anything built-in to do this, but you can easily construct it using SUBSEQ in a loop.Dnepropetrovsk
Using subseq in a loop is probably going to allocate a lot of intermediate storage. The loop macro supports this natively with (loop for ... on ... by ...).Isidore
I
8

Common Lisp's loop can be used for this very nicely, as in the following two examples. The first example loops for (x y z) in a list. However, the default step is cdr (rest), so if the list is (1 2 3 4 5), you get (1 2 3), (2 3 4), etc., for (x y z).

CL-USER> (loop for (x y z) on '(1 2 3 4 5 6 7 8 9 10 11 12)
              do (print (list z y x)))

(3 2 1) 
(4 3 2) 
(5 4 3) 
(6 5 4) 
(7 6 5) 
(8 7 6) 
(9 8 7) 
(10 9 8) 
(11 10 9) 
(12 11 10) 
(NIL 12 11) 
(NIL NIL 12) 
NIL

If you do not want the overlap between iterations, specify the stepping function to be something that moves farther down the list. For instance, if you're pulling three elements at a time, use cdddr:

CL-USER> (loop for (x y z) on '(1 2 3 4 5 6 7 8 9 10 11 12) by 'cdddr
              do (print (list z y x)))
(3 2 1) 
(6 5 4) 
(9 8 7) 
(12 11 10) 
NIL

Implementing partition with this technique

Another answer implemented the counterpart to each_slice using an auxiliary function. However, notice that partition (in that sense) is just each_slice with the identity function. This suggests that we should be able to implement it using the idiom above. The anonymous function

(lambda (list)
  (nthcdr n list))

is the step function that we need. Since we do not know how many elements the cells have until run time, we can't bind each element like we did above with (x y z). We do have to match each tail of the list as we step down and extract the subsequence n elements. Here's a loop based implementation of partition.

CL-USER> (defun partition (list cell-size)
           (loop for cell on list by #'(lambda (list)
                                         (nthcdr cell-size list))
              collecting (subseq cell 0 cell-size)))
PARTITION

CL-USER> (partition '(1 2 3 4 5 6) 2)
((1 2) (3 4) (5 6))
Isidore answered 19/6, 2013 at 19:17 Comment(2)
Thanks Joshua! Used your second code snippet; needed 2 elts so used cddr. Loving Lisp!Clem
@rebnoob Glad to hear it! Common Lisp has a whole lot of functionality packed in the standard, so it's not a bad idea to skim the table of contents and sub of the subsections, not to learn everything in detail, but to have an idea what's there, and to know where to look when you want something. For instance, I had to check the (loop ... on ... by ...) syntax to make sure it was right, but I had an idea where to look.Isidore
S
3
(defun partition-helper (lst acc x)
  (if (< (length lst) x)
    acc
    (partition-helper (subseq lst x) (cons (subseq lst 0 x) acc) x)))

(defun partition (lst x)
  (reverse (partition-helper lst '() x)))

Then you can:

[25]> (PARTITION '(1 2 3 4 5 6) 2)
((1 2) (3 4) (5 6))

or:

[26]> (PARTITION '(1 2 3 4 5 6) 3)
((1 2 3) (4 5 6))

and then just mapcar over the list to process it 2 or 3 elements at a time.

Slug answered 19/6, 2013 at 19:7 Comment(0)
N
3

If you wanted to split your list on a predicate (as opposed to a fixed length sublists), I would have recommended nsplit-list.

For fixed length sublists, you may want to use loop:

(defun by-N (list n fun) 
  (loop for tail on list by (lambda (l) (nthcdr n l)) 
    do (funcall fun (subseq tail 0 (min (length tail) n)))))
(by-n (loop for i from 0 to 20 collect i) 5 #'print)

(0 1 2 3 4) 
(5 6 7 8 9) 
(10 11 12 13 14) 
(15 16 17 18 19) 
(20) 

Note that this is not very efficient (it scans the list more than necessary and allocates a fresh sublist for passing to fun).

The efficient version is more complicated:

(defun batch-map (list batch-size function)
  "Call FUNCTION on sublists of LIST of size BATCH-SIZE.
Returns the list of return values of FUNCTION."
  (do ((tail list (cdr end)) end ret (bs1 (1- batch-size)))
      ((endp tail) (nreverse ret))
    (setq end (nthcdr bs1 tail))
    (if (consp end)
        (let ((next (cdr end)))
          (setf (cdr end) nil)
          (unwind-protect (push (funcall function tail) ret)
            (setf (cdr end) next)))
        (push (funcall function tail) ret))))
Nonconformance answered 19/6, 2013 at 19:30 Comment(0)
D
0

All the answers are practical and can be used, but I believe none replicates exactly the Ruby's behavior:

> 1.upto(7).each_slice(3) { |x, y, z| p [x, y, z] }
[1, 2, 3]
[4, 5, 6]
[7, nil, nil]

To emulate Ruby, I believe the proper code is something similar to:

CL-USER> (defun each-slice (n list thunk)
  (apply thunk (loop for i below n collect (nth i list)))

  (if (> (length list) n)
      (each-slice n (subseq list n) thunk)))

Generates a similar response when called:

CL-USER> (each-slice 3 '(1 2 3 4 5 6 7) (lambda (x y z) (print (list x y z))))

(1 2 3) 
(4 5 6) 
(7 NIL NIL) 
NIL
Demerit answered 15/4, 2020 at 10:14 Comment(2)
the accepted answer's approach already does this: (loop for (x y z) on '(1 2 3 4) by 'cdddr do (print (list x y z))) prints (1 2 3) (4 NIL NIL) and returns NIL.Ivanovo
Yes, you are right, it works adding the extra NIL values but requires you to adjust the "by" parameter either using cd*dr or nthcdr inside an extra lambda. There are also another behavior from Ruby's each_slice which the loop version emulates better... in Ruby if you call each_slice without given a block it will returns an Enumerator instance which can be chained and this can be better emulated using loop and collecting the result.Demerit

© 2022 - 2024 — McMap. All rights reserved.