Understanding function `tailp` in Common Lisp
Asked Answered
A

4

6

While looking through the "Common Lisp Quick Reference" by Bert Burgemeister, I stumbled over tailp.

First, I misunderstood the definitions of this function. And I tried:

(tailp '(3 4 5) '(1 2 3 4 5))

But it returned

NIL

CLTL2 says, tailp is true iff the first argument is any (nthcdr n list) with existing n.

(nthcdr 2 '(1 2 3 4 5))
;; (3 4 5)

I further tried:

(tailp '(3 4 5) '(1 2 3 4 5))
;; NIL - and I would expect: T following the definition above.

(tailp '() '(1 2 3 4 5))
;; T
(tailp '5  '(1 2 3 4 . 5))
;; T

Until I tried (and then understood tailp looks for cdr of l which share even the same address):

(defparameter l '(1 2 3 4 5 6))
(tailp (nthcdr 3 l) l)
;; T

But then I had my next question:

For what such a function is useful at all?

Wouldn't be a function more useful which looks whether a sublist is part of a list? (Or looks like a part of a list, instead that it has to share the same address?)

Remark:

Ah okay slowly I begin to understand, that maybe that this is kind of a eq for cdr parts of a list ... Kind of ... "Any cdr-derivative of given list eq to the first argument?".

But maybe someone can explain me in which situations such test is very useful?

Remark:

In a long discussion with @Lassi here, we found out:

Never Use tailp On Circular Lists!

Because the behavior is undefined (already in SBCL problematic). So tailp is for usage on non-circular lists.

Arc answered 26/5, 2018 at 11:51 Comment(2)
When the hyperspec says that some things are "same", but doesn't mention under what test, EQL is the default. TAILP doesn't seem like an especially useful test. A quick grep of my installed quicklisp dists shows cl-yacc as the only user. From a quick glance over the code, it seems like it gets two cons-cells (OP1-TAIL and OP2-TAIL) from the same list (the grammar precedence), and uses TAILP to check if OP2-TAIL is a tail of the cdr of OP1-TAIL, which would mean that OP1 came first in the precedence list.Khiva
Thank you, @jkliski! Ah, I should have come to the idea to grep for it in the lisp source codes which I have! - Thank you for the hint - same == EQL in the lisp explanations and definitions. Okay ... TAILP could then serve as a precedence/successor test within lists ... since it looks directly into he loci, it might then be more performant than other constructed tests, isn't it?Arc
B
6

The basic purpose of tailp is to check whether there is list structure shared. This means whether the cons cells are the same (which means EQL as a predicate) - not just the content of the cons cells.

One can also check if an item is in the last cdr:

CL-USER 87 > (tailp t '(1 2 3 4 . t))
T

CL-USER 88 > (tailp nil '(1 2 3 4 . nil))
T

CL-USER 89 > (tailp nil '(1 2 3 4))
T

CL-USER 90 > (tailp #1="e" '(1 2 3 4 . #1#))
T

This is one of the rarely used functions in Common Lisp.

Babushka answered 26/5, 2018 at 20:14 Comment(6)
True, I first also tried around with numbers and symbols. But it is a little bit misleading. Because it works with your examples only because nil, t and all numbers and all symbols have by definition only one physical representation/address. e.g. this (tailp "e" '("a" "b" "c" "d" . "e")) ` gives NIL.Arc
@Gwang-Jin Kim: no, there can be multiple objects of the same number value. The test EQ can be false while EQL can be true for numbers. The default sameness predicate is EQL which treats numbers and characters special. Your example works if the "e" is actually the same object.Babushka
yes, true ... I made once for myself a list for equality ;; same symbols? (share the same adress?) eq ;; same type & same case character/number? eql ;; same number (mathematical sense, type insensitive) = ;; same case strings, list of equal objects (vec/obj eq!) equal ;; same objects (also vectors), case-insensitive string equalp - strings begin to be "same" when using equal.Arc
Oh what is this kind of notation. I saw it when creating circular lists. Kind of address reference, isn't it?Arc
@Gwang-JinKim: That's used for reading and printing s-expressions, where objects are shared in the structure. See: *print-circle*.Babushka
Thank you! - I think now we understood tailp so far ;) . Thank you for your contributions!Arc
H
4

Here is a case where tailp is useful:

(defun circular-list-p (l)
  (and (consp l)
       (tailp l (rest l))))

A couple of notes.

This terminates in all cases: tailp is allowed not to terminate on circular lists if the first argument is not a tail of the second (ie there's no requirement for it to check circularity), but it must terminate if the first argument is a tail of the second. But, if the list is circular, that's exactly what we check for here, so it terminates. (I was confused about this for a while).

The consp check is so (circular-list-p nil) is false: I think this is pragmatically the useful choice although you a pedant might argue that nil is circular.

Hushaby answered 29/5, 2018 at 21:50 Comment(8)
This won't work if l has a circular sublist after one or more initial elements. E.g. (circular-list-p (list* 1 2 3 (let ((sub (list 4 5 6))) (setf (cdr (last sub)) sub))))Thorman
Detailed exploration of tailp for circular lists: https://mcmap.net/q/1632889/-check-for-proper-list-in-common-lispThorman
@Lassi: yes, I'd not call such a list circular though: I'd say it has a circular tail or something like that. However I don't know whether there is standard terminology around this and mine probably is not standard if there is.Hushaby
Circular tail is a nice term. Common Lisp considers any list with a circular tail to be circular itself: "circular list n. a chain of conses that has no termination because some cons in the chain is the cdr of a later cons." I don't know whether this is the standard definition more widely, but it is probably the most useful one. Is there an application that would only require detecting a particular cycle but not others?Thorman
According to that definition, the empty list () is not a circular list, since it has no conses. Though it would make for a fun mathematical problem :) There's probably a standard definition in graph theory. According to the Wikipedia and MathWorld cyclic graph pages, they don't consider an empty graph cyclic since it doesn't have any vertices through which a cycle could be made, though this is not stated clearly. However, it would seem that an empty graph is considered an acyclic graph.Thorman
@Lassi: I don't know enough graph theory to know the right terms but I'd certainly like to be able to distinguish between a structure where each cons was the cdr of exactly one cons (which I'd call 'circular'), one which contains one or more cycles but which may be less simple (this is what the spec calls 'circular', which I think I'd call 'cyclic'), and one which doesn't but where one or more conses are the cdrs of more than one other cons which I think I'd call a DAG which is not a tree.Hushaby
You could cover the first two cases with a function (length-cycle xs) => length-of-non-repeating-part; length-of-repeating-part. That might be good for a standard library. You can only detect one cycle algorithmically in a list, since the algorithm will get stuck in that cycle during traversal. Your third case requires traversing the cars of some conses, so it's not clear how to do it in a standard library function.Thorman
@Lassi: I need to think about this some more (and SE comments are the wrong place to talk about it as they'll get pruned), but I think I've convinced myself that, because the cdr of a cons can point at at most one object, a list, considered as a chain of cdrs (so ignoring cars) can only have one cycle at most: as you walk down the conses, each cdr either points at something not a cons, at a new cons, or at a cons in its own history.Hushaby
A
3

I'm pretty sure the answer to (tailp '(3 4 5) '(1 2 3 4 5)) can be both t and nil as a smart compiler might do (tailp '#1=#(3 4 5) '(1 2 . #1#)) to reduce memory footprint. Quoted stuff are immutable literals so why not use the same memory twice?

Here is how tailp is working:

(defparameter *tail* (list 3 4 5))
(defparameter *larger* (list* 1 2 *tail*))
(defparameter *replica* (copy-list *larger*))

(tailp *tail* *replica*) ; ==> nil
(tailp *tail* *larger*)  ; ==> t

Since copy-list creates new cons for each element in a list it will share nothing but the empty list with any other list. It is unique.

*larger* has been made with a common tail with *tail* and thus (tailp *tail* *larger*) will be t.

It's important that it compares the arguments as same objects. Since the tail does not need to be a list it compares with eql. When comparing if stuff look the same you use equal so tailp is more specific than that. It has to be pointer equal (eq) or eql atomic value.

So where do you use this? I'm thinking a functional data structure where you typically reuse shared structure. tailp might be used to identify a subtree of a parent. It's basically a more performant version of this:

(defun my-tailp (needle haystack)
  (cond ((eql needle haystack) t)
        ((atom haystack) nil)
        (t (my-tailp needle (cdr haystack)))))
Aloisia answered 28/5, 2018 at 18:57 Comment(1)
Thank you, @Sylwester! Longer answer to your answer added as new answer! (below or above ...) ;)Arc
A
2

@Sylwester:

Thank you, @Sylwester!

I recently read in Edi Weitz's book about tail wagging a list:

(defparameter *list* (list 'a 'b 'c 'd))
(defparameter *tail* (cdddr *list*))

This names the car of the last cons cell of the list as *tail* - and now, one can add to it a new element and rename the new last car of the last cons cell of the list *tail*.

(setf (cdr *tail*) (cons 'e 'nil) 
      *tail*       (cdr *tail*)) 
;; (E)

Now the list is:

*list* 
;; (A B C D E) 

and one can add via setf further things to *tail* without having to traverse the list again. (So improves performance. But with a warning of course, because this is a destructive action).

Perhaps, if one would name a list like you did:

(defparameter *my-new-tail* '(F G H))

and tail wagg it to the end of the new list

(setf (cdr *tail*) *my-new-tail*
      *tail* (cddr *my-new-tail*))

Ah or alternatively:

(defparameter *tail-of-my-new-tail* (cddr *my-new-tail*))
;; and then
(setf (cdr *tail*) *my-new-tail*
      *tail*       *tail-of-my-new-tail*)

Then

(tailp *my-new-tail* *list*)
  • would be the test, whether this procedure was done correctly
  • would also be a test, whether I added a new tail in addition to *my-new-tail* or not or if *my-new-tail* was the last tail which has been tail wagged to *list* or not ...
  • therefore tailp would be a quite useful test in the context of tail wagging ...
  • or like you said, if an implementation uses then tail wagging for performance reasons, (and perhaps keeping track of the tails of the lists constantly) in this context, the implementation could use tailp as a test whether a given list contributes to another list as the recently added tail ...

This just came into my thoughts, when reading your answer, @Sylwester! (I didn't realize this while reading about tail wagging in the book - (which is by the way a super useful one!) Thank you that you answered!

Arc answered 28/5, 2018 at 20:55 Comment(2)
Queues can implemented like that, see e.g. the small Queues section in norvig.com/paip/auxfns.lispUkase
Thank you for the nice link @coredump! I also now saw in Edi Weitz's book that ĥe mentions tailp for exactly such purposes ... and he recommends ldiff for extracting the part of the list (ldiff *list* *tail*) which is the non-common part between list and tail. I should have searched in his and Norvig's book first :D .Arc

© 2022 - 2024 — McMap. All rights reserved.