Is it possible to implement Common Lisp's macro system in scheme?
Asked Answered
U

3

23

Hopefully this is not a redundant question.

As a newcomer to scheme I am aware that syntax-case macros are more powerful than the syntax-rules alternative, at the cost of unwanted complexity.

Is it possible, however, to implement Common Lisp's macro system in scheme, which is more powerful than syntax-rules, using syntax-case?

Upward answered 29/10, 2013 at 17:18 Comment(6)
The short answer is "yes". A quick Google search turns up Scheme (and Scheme-like language) implementations that support Common Lisp style macros. E.g., Racket's legacy macro support has an implementation of defmacro.Polliwog
That said, this doesn't appear to be a specific programming problem; answers to it seem like they'll be enumerations of implementations, and those kind of list questions are typically off-topic for Stack Overflow.Polliwog
You can't really call it 'unwanted complexity' if you need defmacro implemented - then it is 'required functionality'Flagship
Since when does one "need" defmacro in Scheme? syntax-case is strictly more powerful than defmacro.Individuality
@Joshua-taylor: His question is if it's possible to make defmacro by using syntax-case. I think it's a fair question.Dupre
@Dupre I missed the "using syntax-case" at the end. Thanks for pointing that out. It still probably isn't specific enough (e.g., "what has Bracket tried so far?"), but the bit about "using syntax-case" is good; I retract my close vote.Polliwog
P
40

I'll try to be brief -- which is hard since this is generally a very deep issue, more than the average SO level of Q&A... so this is still going to be pretty long. I'll also try to be unbiased; even though I come from a Racket perspective, I have been using Common Lisp in the past, and I always liked to use macros in both worlds (and in others, actually). It won't look like a direct answer to your question (at an extreme end, that would be just "yes"), but it is comparing the two systems and hopefully this will help in clarifying the issue for people -- especially people outside of lisp (all flavors) who would wonder why is this such a big deal. I'll also describe how defmacro can be implemented in "syntax-case" systems, but only generally to keep things clear (and since you can just find such implementations, some given in the comments and in other answers).

First, your question is very much not redundant -- it is very justified, and (as I implied), one of the things that Lisp newcomers to Scheme and Scheme newcomers to Lisp encounter.

Second, the very shallow, very short answer is what people have told you: yes, it's possible to implement CL's defmacro in a Scheme that supports syntax-case, and as expected you've got multiple pointers to such implementations. Going the other way and implementing syntax-case using simple defmacro is a trickier subject that I won't talk about too much; I'll just say it has been done, only at a very high cost of re-implementing lambda and other binding constructs, which means that it's basically a re-implementation of a new language which you should commit to if you want to use that implementation.

And another clarification: people, especially CLers, very often collapse two things related to Scheme macros: hygiene, and syntax-rules. The thing is that in R5RS all you have is syntax-rules, which is a very limited pattern-based rewrite system. As with other rewrite systems, you can use it naively, or get all the way up to using rewrites to define a small language which you can then use to write macros in. See this text for a known explanation on how this is done. While it's possible to do that, the bottom line is that it's hard, you're using some weird small language thing that is not directly related to your actual language, and this makes it very far from Scheme programming -- probably in an even worse way that using the hygienic macro implementation in CL is not really using plain CL. In short, it's possible to use only syntax-rules, but this is mostly in a theoretical sense, not something that you'll want to use in "real" code. The main point here is that hygiene does not imply being limited to syntax-rules.

However, syntax-rules was not intended as "the" Scheme macro system -- the idea was always that you have some "low-level" macro implementation which is used to implement syntax-rules but can also implement hygiene-breaking macros -- it's just that there was no agreement on the particular low-level implementation. R6RS fixes that by standardizing the "syntax-case" macro system (note that I'm using "syntax-case" as the name of the system, different from syntax-case which is the form that is it's main highlight). As if to make a point that the discussion is still alive, R7RS took a step back and excluded it, reverting back to having syntax-rules with no commitment on a low-level system, at least as far as the "small language" goes.

Now, to actually understand the difference between the two systems, the best thing to clarify is the difference between the types that they're dealing with. With defmacro, a transformer is basically a function that takes in an S-expression(s) and returns an S-expression. An S-expression here is a type that is made of a bunch of literal types (numbers, strings, booleans), symbols, and list-nested structures of these. (The actual types used are a little more than that, but that's enough to make the point.) The thing is that this is a very simple world: you're getting in something that is very concrete -- you can actually print the input/output values and that's all you have. Note that this system uses symbols to denote identifiers -- and a symbol is something very concrete in just that sense: an x is a piece of code that has just that name, x.

However, this simplicity comes at a cost: you cannot use it for hygienic macros since you have no way to distinguish two different identifiers that are both called x. The usual CL-based defmacro does have some addtional bits that compensate for some of this. One such bit is gensym -- a tool to create "fresh" symbols that are uninterned and therefore are guaranteed to be different from any other symbol, including ones that have the same name. Another such bit is the &environment argument to defmacro transformers, which holds some representation of the lexical environment of the location where the macro is used.

It's obvious that these things complicate the defmacro world since it's no longer dealing with plain printable values, and since you need to be aware of some representation of an environment -- which makes it even more clear that a macro is actually a piece of code that is a compiler hook (since this environment is essentially some datatype that the compiler usually deals with, and one that is more complicated than just S-expressions). But as it turns out, they're not enough to implement hygiene. Using gensym you can get one easy aspect of hygiene knocked off (avoiding macro-side captures of user code), but the other aspect (avoiding user code capturing macro code) is still left open. Some people settle for this with arguing that the kind of capture that you can avoid is sufficient -- but when you're dealing with a modular system where a macro's environment often has different bindings than the ones used in its implementation, the other side becomes much more important.

Switch over to syntax-case macro systems (and happily skipping over syntax-rules, which is trivially implemented using syntax-case). In this system, the idea is that if plain symbolic S-expressions are not expressive enough to represent the complete lexical knowledge (ie, the difference between two different bindings, both called x), then we're going to "enrich" them and use a datatype that does. (Note that there are other low-level macro systems that take different approaches to providing the extra information, like explicit renaming and syntactic closures.)

The way that this is done is by making macro transformers be functions that consume and return "syntax objects", which are exactly that kind of a representation. More precisely, these syntax objects are usually built on top of the plain symbolic representation, only wrapped in structures that have additional information that represent the lexical scope. In some systems (notably in Racket), everything gets wrapped in syntax objects -- symbols as well as other literals and lists. Given this, it's not surprising that it's easy to get S-expressions out of syntax objects: you just pull out the symbolic contents, and if it's a list, then continue doing that recursively. In syntax-case systems, this is done by syntax-e which implements the accessor for the symbolic content of a syntax object, and syntax->datum that implements the versions that goes down the result recursively to produce a full S-expression. As a side-note, this is a rough explanation why in Scheme people don't talk about bindings being represented as symbols, but as identifiers.

On the other side, the question is how do you start from a given symbolic name and construct such a syntax object. The way that this is done is with the datum->syntax function -- but instead of making the api specify how is the lexical scope information represented, the function takes in a syntax object as a first argument and a symbolic S-expression as the second argument, and it creates a syntax object by properly wrapping the S-expression with the lexical scope information taken from the first. This means that to break hygiene what you usually do is start with a user-supplied syntax object (eg, a body form of a macro), and use its lexical information to create some new identifier like this that is visible in the same scope.

This quick description is enough to see how the macros that you were shown work. The macro that @ChrisJester-Young showed simply takes in the syntax object(s), strips it to a raw S-expression with syntax->datum, sends that to the defmacro transformer and gets back an S-expression, then it uses syntax->datum to convert the result back to a syntax object using the user code's lexical context. Racket's defmacro implementation is a bit fancier: during the stripping stage it keeps a hash table that maps the resulting S-expressions to their original syntax objects, and during the reconstruction step it consults this table to get the same context as the bits of code originally had. This makes it a more robust implementation for some more complex macros, but it's also more useful in Racket since syntax objects have much more information in them, like source location, properties, etc, and this careful reconstruction will usually result in output values (syntax objects) that keep the information they had on their way into the macro.

For a slightly more technical introduction for defmacro programmers to the syntax-case system, see my writing syntax-case macros blog post. If you're coming from the Scheme side it won't be as useful, but it will still be useful in clarifying the whole issue.

To get this closer to a conclusion, I should note that dealing with unhygienic macros can still be tricky. More specifically, there are various ways to achieve such bindings, but they are different in various subtle ways, and can usually come back and bite you leaving slightly different teeth marks in each case. In a "true" defmacro system like CL, you learn to live with a specific set of teeth marks, ones that are relatively well known, and therefore there are things that you just don't do. Most notably here is the kind of modular composition of languages with different bindings for the same names that Racket uses so frequently. In syntax-case systems a better approach is fluid-let-syntax that is used to "adjust" the meaning of a lexically scoped name -- and more recently, that has evolved into "syntax parameters". There is a good overview of the problems of hygiene-breaking macros, which includes a description of how you can attempt to solve it with just hygienic syntax-rules, with basic syntax-case, with CL-style defmacro, and finally with syntax parameters. This text gets a bit more technical, but its relatively easy to read the first few pages, and if you understand this, then you will have a very good picture of the whole debate. (There is also an older blog post that is covered better in the paper.)

I should also mention that this is far from being the only "hot" issue around macros. The debate within Scheme circles about which low-level macro system is better can get pretty hot at times. And there are other issues around macros like the question of how to make them work in a module system where a library can provide macros as well as values and functions, or whether to separate macro-expansion time and runtime into separate phases, and more.

Hopefully this presents a more complete picture of the issue, to the point of being aware of the tradeoffs and being able to decide for yourself what works best for you. I also hope that this clarifies some of the sources for the usual flames: hygienic macros are certainly not useless, but since the new type is more than just simple S-expressions, there is more functionality around them -- and all too often shallow-reading bypassers jump to a conclusion that "it's too complex". Even worse are flames in the spirit of "in the Scheme world people know close to nothing about metaprogramming": being very painfully aware of the added cost and also of the desired benefits, people in the Scheme world has spent orders of magnitude more collective efforts on the subject. It's a fine choice to stick with defmacro if the extra wrappers around S-expressions are too complicated for your taste, but you should be aware of the cost involved in learning that vs what you pay by dumping hygiene (and vs what you get by embracing it).

Unfortunately, macros of any flavor are overall a pretty hard subject for newbies (perhaps excluding the extremely limited syntax-rules), so people tend to find themselves in the middle of such flames without having enough experience to know your left from your right. Ultimately, nothing beats having good experience in both worlds to clarify the tradeoffs. (And that's from very concrete personal experience: had PLT Scheme not switch to syntax case N years ago, I would have probably never bother with it... Once they did switch, it took me a long while to convert my code -- and only then did I realize just how great it is to have a robust system where no names get "confused" by mistake (which would result in weird bugs, and obfuscated %%__names__).)

(Still, it's very likely that comment flames will happen...)

Pretty answered 29/10, 2013 at 23:6 Comment(2)
In Guile, some of us are in the process of converting legacy define-macro macros to syntax-rules and syntax-case, and yes, it's absolutely great. For example, hygienic macros could call package-private (unexported) functions the simple, obvious way---which would have been much more burdensome to do in a defmacro-style macro. Oh, and by the way, I took on board your advice to use syntax parameters: git.savannah.gnu.org/cgit/guile.git/commit/…Individuality
By the way, this whole defmacro purge started when ijp made the observation that hygienic macros and defmacros don't mix, and stis was like, what, we still have defmacros in Guile?! Granted, we started "N years" later than Racket, but I'm glad it's actually being done now.Individuality
I
9

Here's Guile's implementation of define-macro. Note that it's implemented entirely with syntax-case:

(define-syntax define-macro
  (lambda (x)
    "Define a defmacro."
    (syntax-case x ()
      ((_ (macro . args) doc body1 body ...)
       (string? (syntax->datum #'doc))
       #'(define-macro macro doc (lambda args body1 body ...)))
      ((_ (macro . args) body ...)
       #'(define-macro macro #f (lambda args body ...)))
      ((_ macro transformer)
       #'(define-macro macro #f transformer))
      ((_ macro doc transformer)
       (or (string? (syntax->datum #'doc))
           (not (syntax->datum #'doc)))
       #'(define-syntax macro
           (lambda (y)
             doc
             #((macro-type . defmacro)
               (defmacro-args args))
             (syntax-case y ()
               ((_ . args)
                (let ((v (syntax->datum #'args)))
                  (datum->syntax y (apply transformer v)))))))))))

Guile has special support for Common Lisp-style docstrings, so if your Scheme implementation doesn't use docstrings, your define-macro implementation could be even simpler:

(define-syntax define-macro
  (lambda (x)
    (syntax-case x ()
      ((_ (macro . args) body ...)
       #'(define-macro macro (lambda args body ...)))
      ((_ macro transformer)
       #'(define-syntax macro
           (lambda (y)
             (syntax-case y ()
               ((_ . args)
                (let ((v (syntax->datum #'args)))
                  (datum->syntax y (apply transformer v)))))))))))
Individuality answered 29/10, 2013 at 18:17 Comment(0)
Y
5

Here is the implementation of define-macro from my Standard Prelude, along with the examples from Paul Graham's book:

(define-syntax (define-macro x)
  (syntax-case x ()
    ((_ (name . args) . body)
      (syntax (define-macro name (lambda args . body))))
    ((_ name transformer)
      (syntax
       (define-syntax (name y)
         (syntax-case y ()
           ((_ . args)
             (datum->syntax-object
               (syntax _)
               (apply transformer
                 (syntax-object->datum (syntax args)))))))))))

(define-macro (when test . body) `(cond (,test . ,body)))

(define-macro (aif test-form then-else-forms)
  `(let ((it ,test-form))
     (if it ,then-else-forms)))

(define-macro (awhen pred? . body)
  `(aif ,pred? (begin ,@body)))
Yordan answered 29/10, 2013 at 18:22 Comment(12)
Why on earth would a standard prelude provide unhygienic macro support? That seems to go against what Scheme is about.Individuality
Because it is sometimes useful.Yordan
@ChrisJester-Young, hygienic macros are nearly useless - you can't do anything besides trivial syntax sugar with them. Proper, powerful metaprogramming is only possible with non-hygienic macros.Tapioca
@Tapioca Perhaps, but in the Scheme world, we use syntax-case (or perhaps explicit renaming) for macros with controlled unhygiene, not defmacro, which is uncontrolled unhygiene. I frequently use syntax-case, but I would absolutely 100% avoid defmacro.Individuality
@Yordan As I just said to SK-logic, syntax-case is sometimes useful. define-macro is only good for compatibility with CL-style macros. But it's seriously deprecated for new Scheme code.Individuality
@ChrisJester-Young, in the Scheme world people know close to nothing about metaprogramming, so no surprise you're avoiding the most powerful and flexible tool.Tapioca
@Tapioca Given that you can implement defmacro using syntax-case, clearly it's powerful enough to break hygiene wherever you want to. I am saying that you do not need 100% uncontrolled unhygiene in order to have metaprogramming.Individuality
@ChrisJester-Young, likewise, you can use defmacro to define syntax-case. But, given that in practice hygiene can be useful only in the trivial, marginal cases, there is little point in bothering implementing a hygienic macro expander. defmacro-style macros should be used most of the time, therefore, the simpler macro system should be fundamental, and hygienic must be an add-on on top, not the other way around.Tapioca
@ChrisJester-Young: You have made my point: define-macro is good for maintaining compatibility with CL-style macros. Therefore, as I said, it is sometimes useful. And thus I keep it in my Standard Prelude. I also use define-macro because such definitions are sometimes easier to read than the equivalent hygienic macro. And my Standard Prelude also has gensym, because that is sometimes useful in conjunction with define-macro. They purity of hygienic macros is also sometimes useful, and I use them more often than define-macro. But not exclusively. It's good to have more than one way to do things.Yordan
defmacro has both benefits and disadvantages. So has syntax-rules and syntax-case too. Whats better is a matter of opinion so the discussion, sadly, becomes a peeing contest.Dupre
Thanks! This worked for me! Very useful, coming from a Clojure background.Patel
Hi, this doesn't seem to work with r6rs libraries. In the namespace where the macro is defined, if I import some symbols from a library with a prefix, the macro still seems to only recognize them if the symbols are not prefixed. Also, it doesn't seem to recognize other symbols defined in the context where it is being expanded.Patel

© 2022 - 2024 — McMap. All rights reserved.