Can someone explain the concept of 'hygiene' to me (I'm a scheme programmer)?
C

6

30

So... I'm new to R6RS Scheme and am learning macros. Can somebody explain to me what is meant by 'hygiene'?

Thanks in advance.

Ceylon answered 9/6, 2010 at 22:36 Comment(7)
Oh, you know, bathing, brushing your teeth, picking the food out of your beard. Things that most programmers have trouble with. - Glidebomb
@JS: I've been sitting here for days - haven't moved away from my computer for any reason whatsoever. That's how hard I'm trying to learn hygiene, but I still can't grasp the concept. In fact, the longer I sit here, the worse it seems to be getting :( - Ceylon
I'm curious, what are you using Scheme for? - Hubblebubble
I suggest some good sleep. That does wonders for the understanding. I am never so productive and clear-headed as when I have my 8.5 hours per night. :) - Relate
@gnucom: For fun. Also I've heard (and am seeing) that it's quick to use for prototyping algorithms/ideas, so once I'm better with it I'll use it for that. - Ceylon
@incrediman: Heh, that's funny, that's exactly what I use it for (fun), too. - Hubblebubble
I'm not sure what you mean in your title: Scheme programmers are typically more hygienic than, say, C programmers. They're certainly more hygienic than most Perl programmers I have met. I guess you mean that as a Scheme programmer you just never think about it. - Phrixus
V
26

Hygiene is most often discussed in the context of macros. A hygienic macro does not introduce variable names that can interfere with the code it is expanded into. Here is an example. Let's say we want to define the or special form with a macro. Intuitively,

(or a b c ... d) would expand to something like (let ((tmp a)) (if tmp tmp (or b c ... d))). (I am omitting the empty (or) case for simplicity.)

Now, if the name tmp were literally inserted into the code as in the expansion sketched above, the macro would not be hygienic, and that is bad because tmp might interfere with another variable of the same name. Say we wanted to evaluate

(let ((tmp 1)) (or #f tmp))

Using our intuitive expansion, this would become

(let ((tmp 1)) (let ((tmp #f)) (if tmp tmp (or tmp))))

The tmp introduced by the macro shadows the outer tmp, so the result is #f instead of 1.

Now, if the macro is hygienic (and in Scheme that is automatically the case when using syntax-rules), then instead of the name tmp the expansion uses an identifier that is guaranteed not to clash with anything else in the code. In Common Lisp, whose macros are not hygienic, you get a similar effect by hand by generating a fresh symbol with gensym.
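The or example can be tried out directly. The following is only a minimal sketch, assuming an R6RS implementation; my-or, naive-result and hygienic-result are names made up for this illustration, and the first definition is the naive expansion written out by hand rather than something a real expander would produce.

```scheme
(import (rnrs))

;; The naive expansion from above, written out by hand:
;; the inner tmp shadows the user's tmp.
(define naive-result
  (let ((tmp 1))
    (let ((tmp #f))
      (if tmp tmp (or tmp)))))   ; => #f (the user's tmp was captured)

;; The same idea as a syntax-rules macro (hypothetical name my-or).
;; The tmp in the template is renamed by the expander,
;; so it cannot capture a tmp from the use site.
(define-syntax my-or
  (syntax-rules ()
    ((_) #f)
    ((_ e) e)
    ((_ e1 e2 ...)
     (let ((tmp e1))
       (if tmp tmp (my-or e2 ...))))))

(define hygienic-result
  (let ((tmp 1))
    (my-or #f tmp)))             ; => 1
```

The two definitions differ only in who wrote the inner let: typed by hand, its tmp captures the user's variable; introduced by the macro template, it does not.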

Paul Graham's On Lisp has advanced material on macros.

Violaviolable answered 9/6, 2010 at 22:56 Comment(0)
S
9

If you imagine that a macro is simply expanded into the place where it is used, then you can also imagine that if you use a variable a in your macro, there might already be a variable a defined at the place where that macro is used.

This is not the a that you want!

A macro system in which something like this cannot happen is called hygienic.

There are several ways to deal with this problem. One way is simply to use very long, very cryptic, very unpredictable variable names in your macros.

A slightly more refined version of this is the gensym approach used by some other macro systems: instead of you, the programmer, coming up with a very long, very cryptic, very unpredictable variable name, you call the gensym function, which generates a very long, very cryptic, very unpredictable and unique variable name for you.

And like I said, in a hygienic macro system, such collisions cannot happen in the first place. How to make a macro system hygienic is an interesting question in itself, and the Scheme community has spent several decades on this question, and they keep coming up with better and better ways to do it.

Swatter answered 9/6, 2010 at 22:53 Comment(2)
(waves hand) These are not the a's you are looking for. - Letaletch
This has several wrong parts: (a) using long, cryptic names in macros is not a solution, it's only a way to delay the macro -- specifically, it can fail in glaring ways when the macro is used inside itself. (b) you're mentioning only one side of the problem, the one that gensyms solve; the other side is that bindings in the macro definition cannot be shadowed by bindings in its use, for example: (let ((if "bleh")) (my-macro)). This cannot be addressed with just gensyms. - Stopped
H
3

I'm so glad to know that this language is still being used! Hygienic code is code that when injected (via a macro) does not cause conflicts with existing variables.

There is lots of good information on Wikipedia about this: http://en.wikipedia.org/wiki/Hygienic_macro

Hubblebubble answered 9/6, 2010 at 22:55 Comment(0)
M
2

Here's what I found. Explaining what it means is another matter altogether!

http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-1.html#node_toc_node_sec_12.1

Mirisola answered 9/6, 2010 at 22:53 Comment(2)
It is quite plain, if you know the jargon! It means, "hygienic macros don't pollute the symbol table". - Relate
It's very obscure and gnarly jargon, and it basically took me a graduate course to get what is going on (I think other people are smarter). - Relate
A
2

Macros transform code: they take one bit of code and transform it into something else. As part of that transformation, they may surround that code with more code. If the original code references a variable a, and the code that's added around it defines a new version of a, then the original code won't work as expected because it will be accessing the wrong a: if

(myfunc a)

is the original code, which expects a to be an integer, and the macro takes X and transforms it to

(let ((a nil)) X)

then the macro will work fine for

(myfunc b)

but (myfunc a) will get transformed to

(let ((a nil)) (myfunc a))

which won't work because myfunc will be applied to nil rather than the integer it is expecting.

A hygienic macro avoids this problem of the wrong variable getting accessed (and a similar problem the other way round) by ensuring that the names it introduces cannot clash with names in the code it is applied to.
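For comparison, here is the same wrapping macro as a rough sketch in R6RS Scheme with syntax-rules. The name with-a and the values are assumptions made for this illustration, and #f stands in for nil, which Scheme does not have.

```scheme
(import (rnrs))

;; Hypothetical macro that wraps its argument in a binding of a,
;; like the (let ((a nil)) X) transformation described above.
(define-syntax with-a
  (syntax-rules ()
    ((_ body) (let ((a #f)) body))))

;; Stand-in for myfunc: it expects an integer.
(define (myfunc n) (+ n 1))

(define wrapped-result
  (let ((a 41))
    ;; The a in (myfunc a) still refers to the 41 bound here,
    ;; not to the a introduced by the macro template.
    (with-a (myfunc a))))        ; => 42
```

With a non-hygienic expansion, myfunc would have received #f and the addition would have failed.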

Wikipedia has a good explanation of hygienic macros.

Ade answered 9/6, 2010 at 22:59 Comment(0)
F
2

Apart from all the things mentioned, there is one other important aspect of Scheme's hygienic macros, which follows from lexical scope.

Say we have:

(syntax-rules () ((_ a b) (+ a b)))

As part of a macro, this will of course insert a +. It does so even where a different + is already bound at the use site, but the inserted identifier keeps the meaning that + had where the syntax-rules form was written, not where the macro is applied; we are lexically scoped, after all. In practice the expander will most likely insert a fresh, renamed identifier that is bound to the same meaning + has at the place the macro is defined. This is most handy when the macro is used inside a construct like:

(let ((+ *))
  ; piece of code that is transformed
)

Neither the writer nor the user of the macro thus needs to worry about such clashes.
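This side of hygiene can be checked with a short experiment. The sketch below assumes an R6RS implementation; add and when-true are hypothetical macro names chosen for the illustration.

```scheme
(import (rnrs))

;; Hypothetical macro whose template mentions +.
(define-syntax add
  (syntax-rules ()
    ((_ a b) (+ a b))))

;; Hypothetical macro whose template mentions if.
(define-syntax when-true
  (syntax-rules ()
    ((_ test value) (if test value #f))))

;; The user rebinds + locally. The + inside the macro template
;; still means addition; the + written by the user means *.
(define add-result
  (let ((+ *))
    (list (add 2 3) (+ 2 3))))   ; => (5 6)

;; Even rebinding a keyword such as if at the use site
;; does not break the macro.
(define if-result
  (let ((if list))
    (when-true #t 'yes)))        ; => yes
```

Renaming the macro's own variables with gensym would not help here, because + and if are free identifiers in the template, not bindings the macro creates.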

Figure answered 10/6, 2010 at 22:33 Comment(2)
+1, that's interesting; this way it follows the expected behavior for both the macro writer and user.Ceylon
Yeah, hygiene is just a way to ensure correctness regardless of what symbols either the writer or the user has used. I really broke my head for a long time trying to figure out the logic of why (let ((if +)) ...) for instance didn't fail drastically, until I realized it was because of lexical scope. - Figure
