Avoiding the pitfall of using anaphoric macro unwittingly

How do I know whether I'm calling an anaphoric macro? If I do so without knowing it, some seemingly unbound symbols might behave quite differently from what one would expect.

Example

Collecting all even numbers from a list is easy:

> (loop for i in '(1 2 3 4)        ;correct
       when (evenp i) collect i)
(2 4)

However, if someone gets the great idea of giving the iterating variable the name it (because "it" seems like a good abbreviation of "item"; also the C++ folks routinely iterate with an iterator called it), the result is suddenly quite different:

> (loop for it in '(1 2 3 4)         ;wrong
       when (evenp it) collect it)
(T T)

This might sound contrived but such an awkward bug happened recently to, ehm, to someone I knew.

So how can I avoid this type of bug?

Altruism answered 31/7, 2018 at 19:47 Comment(1)
(T T) -- Lexical Scope Fairy crying. - Degression

This is a major reason anaphoric macros are not unanimously liked and people generally try to use them sparingly. Explicit bindings like if-let seem more accepted in practice.

The LOOP macro is, however, the only construct in the specification as far as I know that offers implicit bindings, with the exception maybe of NIL blocks, if you consider that to be the same. Besides, it is quite extensively documented and not going to change soon. The example as given thus feels a little bit artificial. At the same time, there is no denying that this kind of bug may happen.
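For readers who have not met it before, here is a minimal sketch of the implicit binding used on purpose. Inside a conditional clause, a bare it right after collect (or return) stands for the value of the test. The function name lookup-all and the sample data are made up for illustration.

```lisp
;; Hypothetical helper: collect the values found for KEYS in ALIST.
;; IT stands for the value of the WHEN test, here (cdr (assoc key alist)).
(defun lookup-all (keys alist)
  (loop for key in keys
        when (cdr (assoc key alist)) collect it))

(lookup-all '(a b c) '((a . 1) (c . 3)))  ; => (1 3)
```

Used this way it saves repeating the test form, which is the whole point of the feature.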

So how to avoid this type of bug?

Maybe you don't need to do anything. Mistakes happen, but this one is not likely to occur often.

But if you wanted to, you could decide to restrict the language to forbid the use of it in LOOP (because you fear that you or someone else will introduce the same bug):

(defpackage mycl (:use :cl) (:shadows #:loop))
(in-package mycl)

The above defines a custom dialect of CL which shadows the loop symbol. The loop symbol which is accessible (resolved when no package prefix is given) from package MYCL is the one from MYCL, not CL:LOOP. Then, you can add your own checks:

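(defmacro loop (&body body)
  ;; Compare symbols only: calling STRING= on a list or a number found
  ;; in BODY would signal a type error for otherwise valid loops.
  (when (find-if (lambda (x) (and (symbolp x) (string= x "IT"))) body)
    (error "Forbidden IT keyword"))
  `(cl:loop ,@body))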

That definition should be enough for most uses, although it only inspects the top-level elements of the loop body and might therefore miss some cases. Then, you choose to use this package instead of CL in your project, and thus, the following fails with an error:

(defun test ()
  (loop
    for it in '(1 2 3 4)
    when (evenp it) collect it))

...
  error: 
    during macroexpansion of
    (LOOP
      FOR
      IT
      ...).
    Use *BREAK-ON-SIGNALS* to intercept.

     Forbidden IT keyword

Compilation failed.

Another approach for the check could be as follows (it is stricter by trying to look in all the trees rooted under LOOP, and might thus error even for otherwise valid cases):

(defmacro loop (&body body)
  (unless (tree-equal body (subst nil
                                  "IT"
                                  body
                                  :test #'string=
                                  :key (lambda (u)
                                         (typecase u
                                           ((or symbol string) (string u))
                                           (t "_")))))
    (error "Forbidden IT keyword"))
  `(cl:loop ,@body))

You can apply the same approach to other constructs you find problematic, but note that anaphoric macros are typically brought in by depending on an external system, which is done on purpose and should thus not come as a surprise. Even if you don't know that some of your macros are anaphoric, their documentation and even their naming conventions should be enough to prevent mistakes (the anaphora system introduces symbols which start with a, like aif and awhen, or with s, like scase). Showing the documentation attached to a function or a macro is easily done if you work in an interactive environment (e.g. Emacs/Slime, but other ones too).
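To make the naming and documentation point concrete, here is a rough sketch of what such a macro tends to look like. my-aif is a made-up name for this illustration, not the definition from the anaphora system; it only shows the usual shape, with a docstring that names the implicit variable.

```lisp
;; Hypothetical anaphoric IF, written only for this example.
(defmacro my-aif (test then &optional else)
  "Anaphoric IF: binds IT to the value of TEST within THEN and ELSE."
  `(let ((it ,test))
     (if it ,then ,else)))

(my-aif (+ 1 2) (* it 2))          ; => 6
(documentation 'my-aif 'function)  ; typically returns the docstring above
```

Note that it here is an ordinary symbol interned in whatever package the macro was defined in, so the caller has to use that same symbol. That is different from LOOP, which matches its keywords by name.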

Sextuplet answered 31/7, 2018 at 22:37 Comment(2)
Thank you for the suggestions. Actually, hearing from an experienced user that they don't know of any further example of this sort was the kind of answer I was expecting. Since I'm unlikely to forget about "it" any time soon, I'll probably go with the option 1 (i.e., don't do anything special). But if you notice any further example, please, let me know. - Reversioner
The problem isn't with anaphoric macros, because when you use one, you use it deliberately to exploit its it symbol. The problem is that loop isn't an anaphoric macro as such, yet it has it as an obscure feature, and there are no required diagnostics about surprising shadowing situations involving it. - Degression

That's one reason why I'm not too big a fan of anaphoric macros.

The LOOP macro makes it slightly worse, since the identifiers are not used as symbols with packages - but by name only. Example:

One might be in a user package which does not have direct access to the symbol cl::it:

(cl:loop for it in '(1 2 3 4)
         when (cl:evenp it) collect it)  ; this is still problematic

Thus the local symbol it will still be affected, since the anaphoric variable it is still shadowing the iteration variable it. It thus does not help to use your own package for the symbol it.
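A small sketch of the name-only matching, under the assumption of a conforming LOOP: symbols from the keyword package are accepted as loop keywords too, including :it. The second function shows that the keyword is only recognized when the bare symbol stands in place of the form, so wrapping the variable in a call reaches the iteration variable again. Both function names are made up for this example.

```lisp
;; Loop keywords are recognized by symbol name, not by package,
;; so :FOR, :WHEN, :COLLECT and also :IT work as loop keywords.
(defun evens-by-name (numbers)
  (loop :for n :in numbers
        :when (evenp n) :collect :it))        ; :IT = value of (EVENP N)

;; Here the form after COLLECT is (IDENTITY IT), not the bare symbol IT,
;; so IT is read as the iteration variable.
(defun evens-wrapped (numbers)
  (loop for it in numbers
        when (evenp it) collect (identity it)))

(evens-by-name '(1 2 3 4))   ; same truth values as the buggy loop, not (2 4)
(evens-wrapped '(1 2 3 4))   ; => (2 4)
```

The wrapped version is only meant to show where the keyword is recognized. Renaming the variable is still the clearer fix.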

I don't have an answer for what a user can do - other than carefully reading the docs, where the anaphoric variable will surely be prominently mentioned (?!?):

CLHS 6.1.9 Notes about Loop

Use caution when using a variable named IT (in any package) in connection with loop, since it is a loop keyword that can be used in place of a form in certain contexts.

The developer of the macro might want to check whether the user defines a variable with the name of an anaphoric variable - in the same macro form - and issue a warning. Still the variable might also be defined outside of the macro - which still can be a source of problems.
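Such a check could look roughly like the sketch below. It is not how any implementation does it. All names (loop-word-p, binds-it-p, checked-loop) are invented for the example, and it only looks for a plain variable named IT directly after FOR, AS or WITH, so destructuring patterns, AND-chained bindings and variables bound outside the form go unnoticed.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun loop-word-p (x name)
    "True if X is a symbol named NAME, whatever its package."
    (and (symbolp x) (string= x name)))

  (defun binds-it-p (body)
    "True if BODY names a variable IT right after FOR, AS or WITH."
    (loop for (word var) on body
          thereis (and (some (lambda (name) (loop-word-p word name))
                             '("FOR" "AS" "WITH"))
                       (loop-word-p var "IT")))))

;; Hypothetical wrapper: warns at macroexpansion time, then defers to LOOP.
(defmacro checked-loop (&body body)
  (when (binds-it-p body)
    (warn "A LOOP variable named IT can be confused with the IT keyword."))
  `(loop ,@body))
```

With this, (checked-loop for it in '(1 2 3 4) when (evenp it) collect it) still expands to the same loop but signals a warning while being expanded.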

Functions

Similar things might happen with functions:

(defmethod bar (a) (print (list :foo a)))

(defmethod bar :around (a)
  (flet ((call-next-method ()
           (print a)))
    (call-next-method)))

Here we would need to know that the DEFMETHOD makes a local function CALL-NEXT-METHOD available. If we accidentally define a local function with the same name, then we would call our version - and not use the CLOS version...

Fecund answered 31/7, 2018 at 22:4 Comment(5)
Thank you for the link, I should have checked that. The only reason I did not accept your answer was that I found coredump's even better. - Reversioner
@DominikMokriš: this stuff is hard to find, and one cannot be expected to remember it... - Fecund
For a decent diagnostic, the loop macro has to check whether a variable called it is bound inside loop or exists in the surrounding environment. If so, it has to code-walk all of the interior forms to determine whether a reference to it emanates from any of them. If so, warn about the possible ambiguity. - Degression
I think that rebinding cl:call-next-method is forbidden, but the point still stands with user-defined functions that are made accidentally accessible (btw, I don't think there is a cl::it symbol in the standard). - Sextuplet
@coredump: the effects of rebinding cl:call-next-method are undefined in the standard. Means: 'The consequences may range from harmless to fatal'. Yes, cl::it does not exist. - Fecund
