Why aren't SAS Macro Variables Local-Scope by Default?

I found this very helpful SO page while trying to resolve an issue related to macro variable scope: "Why doesn't %let create a local macro variable?"

So to summarize, writing %let x = []; or %do x = [] %to []; in a macro will:

  • create a local-scope macro variable x if there is no "x" already in the global symbol table (or in a calling macro's local symbol table), or
  • update the existing macro variable "x" if an "x" is found in the global symbol table (or in a calling macro's local symbol table)
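
A minimal sketch of the two cases, assuming a fresh session in which no macro variable named x or y exists yet. The macro name %setvars and both variable names are made up for illustration.

```sas
%macro setvars;
  %let x = set inside macro;   /* x already exists globally, so the global x is updated */
  %let y = set inside macro;   /* no y exists in any scope, so y is created local to %setvars */
  %put NOTE: inside macro, x is global? %symglobl(x), y is local? %symlocal(y);
%mend setvars;

%let x = set in open code;     /* creates the global x */
%setvars
%put NOTE: after call &=x;                    /* the global x was overwritten by the macro */
%put NOTE: after call, y exists? %symexist(y); /* 0: the local y vanished when the macro ended */
```

%symglobl, %symlocal and %symexist return 1 or 0, so the log shows directly which symbol table each variable ended up in.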

This strikes me as very non-intuitive. I would be willing to bet there are tons of bugs out in the SAS wilderness due to this design choice. I rarely see %local statements in macros, even above loop statements using common variable names like "i" or "counter." For example, I just pulled up the first paper with the word "macro" in the title from this list of SUGI and SAS Global Forum papers http://www.lexjansen.com/cgi-bin/xsl_transform.php?x=sgf2015&c=sugi

And indeed, I found this code in the first SAS conference paper I opened:

%macro flag;
data CLAIMS;
 set CLAIMS;
 %do j= 1 %to 3;
 if icd9px&j in (&codelist)
 then _prostate=1;
 %end;
run;
%mend;
%flag;

http://support.sas.com/resources/papers/proceedings15/1340-2015.pdf

Woe unto anyone who calls %flag and also has their own &j variable. They could easily end up with no log errors but bogus results because their &j is 4 everywhere after they call %flag, which will be (from experience) a bug that is no fun to track down. Or worse, they may never recognize their results are bogus.
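
To make the collision concrete, here is a stripped-down sketch. The macro names %loop_unsafe and %loop_safe are invented for illustration; only the %do j = 1 %to 3 loop mirrors the code from the paper.

```sas
%macro loop_unsafe;
  %do j = 1 %to 3;               /* no %local: reuses any j found in an outer scope */
    %put NOTE: unsafe iteration &j;
  %end;
%mend loop_unsafe;

%macro loop_safe;
  %local j;                      /* j now belongs to this macro only */
  %do j = 1 %to 3;
    %put NOTE: safe iteration &j;
  %end;
%mend loop_safe;

%let j = 10;                     /* the caller's own j */
%loop_safe
%put NOTE: after loop_safe &=j;    /* J=10, untouched */
%loop_unsafe
%put NOTE: after loop_unsafe &=j;  /* J=4, overwritten by the loop index */
```

The only difference between the two macros is the single %local statement.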

So my question is, why was the decision made not to have all macro variables be local scope by default? Are there good reasons why SAS macro variable scope works the way it does?

Manaker answered 21/2, 2016 at 7:2 Comment(4)
This is an interesting question but may be off-topic for SO, as there may not be a right answer; it's close to opinion. Suggest asking at communities.sas.com or SAS-L, both have more discussions. That said, I agree with you that the scoping rules are not intuitive and have a high likelihood of causing bugs in the wild. - Tumor
Thanks for the suggestion Quentin. Based on the description of the "language-design" tag, and the types of questions I see here for that tag, I didn't think this question would be off topic on SO. Maybe I'll post this on one of the sites you mentioned as well though. - Manaker
Good point regarding the language-design tag, I would have assumed many of those questions were off-topic as well. (Even when I thought it was OT, I was still going to write an answer. : ) - Tumor
As a consultant who lives in the "wild" every day, I can confirm this is indeed a problem. It allows people to create bad code that is really hard to debug. I recently remarked to a colleague that I spend 80% of my coding time working with others' code trying to find where macro variables are set. - Quarterage

Largely because SAS is a 50-year-old language that existed before lexical scoping was clearly preferred.

SAS has a mixture of the two scoping concepts, but is mostly dynamically scoped unless you intentionally change it. This means that just by reading a function's definition, you can't tell what variables will be available to it at run-time; and assignment statements apply to whichever version of a variable is currently available at run-time (rather than being forced into the most local scope available).
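
A small sketch of what that means in practice, assuming no macro variable named tmp exists in the global symbol table. All macro and variable names here are hypothetical.

```sas
%macro inner;
  %let tmp = set by inner;   /* which tmp this touches depends on who called %inner */
%mend inner;

%macro caller_with_tmp;
  %local tmp;                /* this caller owns a tmp ... */
  %inner                     /* ... so %inner writes to the caller's local tmp */
  %put NOTE: caller_with_tmp sees &=tmp;
%mend caller_with_tmp;

%macro caller_without_tmp;
  %inner                     /* no tmp anywhere, so %inner creates its own local tmp */
  %put NOTE: caller_without_tmp, tmp exists? %symexist(tmp);
%mend caller_without_tmp;

%caller_with_tmp
%caller_without_tmp
```

The definition of %inner is identical in both calls, yet the first caller sees the assigned value while the second finds no tmp at all once %inner has returned. Nothing in the text of %inner tells you which will happen.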

That means the macro compiler can't tell whether a particular assignment statement is meant to assign a local macro variable or a higher-scope macro variable that may exist at run-time. SAS could enforce the local macro variable as you suggest, but that would turn SAS into a lexically scoped language, which is undesirable both for consistency with the past (backwards compatibility) and for functionality: SAS offers a way to enforce lexical scoping (use %local) but offers no way to intentionally alter a variable in a higher scope (some form of parent?) other than %global.

Note that dynamic scoping was very common back in the 60s and 70s. S-Plus, Lisp, etc. all had dynamic scoping. SAS tends to prefer backwards compatibility as far back as possible. SAS is also commonly used by analysts, rather than programmers, and so needs to avoid complexity whenever possible. They offer %local for those of us who do want the advantages of lexical scoping.

Armenia answered 22/2, 2016 at 16:14 Comment(3)
Thanks for the insight Joe. Can I ask you to clarify what you mean that SAS "doesn't offer the ability to intentionally alter a variable in a higher scope...other than %global?" In the code in my question, if there were a "%let j = 5" statement above the macro definition, then the %flag macro would change the value of that higher scope &j without a %global statement. Presumably sometimes that's intentional. - Manaker
The CALL SYMPUTX() routine allows you to override the default behavior of writing to the most locally defined macro variable, but only by writing to the GLOBAL macro scope. You cannot use it to write to some arbitrary intermediate scoping level. - Bartholomew
@Max, that's exactly because SAS is dynamically scoped - but that's not something you can 'turn on' when you want it. You can force a variable to be local scope (effectively lexically scoped) but you can't force a variable to be less locally scoped but not global - so you can't force it to be in an intermediate scope. - Armenia
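
A hedged sketch of the CALL SYMPUTX behavior mentioned in the comments above, assuming no global macro variable named v exists beforehand. The macro name %symputx_demo and the variable names v and w are made up; the third argument 'G' is the documented way to target the global symbol table.

```sas
%macro symputx_demo;
  %local v;                                  /* v exists in this macro's scope */
  data _null_;
    call symputx('v', 'stays local');        /* default: updates the most local existing v */
    call symputx('w', 'forced global', 'G'); /* 'G' writes to the global symbol table */
  run;
  %put NOTE: inside macro &=v &=w;
%mend symputx_demo;

%symputx_demo
%put NOTE: after call, w is global? %symglobl(w);
%put NOTE: after call, v exists? %symexist(v);
```

After the call, w survives as a global variable while v is gone with the macro's local symbol table. There is no corresponding argument for targeting the scope of a specific calling macro.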

Answering WHY the scoping rules were defined this way is hard for me, without knowing the history of the macro language.

When I learned the macro language (on 6.12), I was lucky to be taught from early on that macros should always declare their variables to be %LOCAL, unless they had a really good reason not to. Sometimes if a macro var was not declared to be %local or %global I would even put a /* Not Local: MyMacVar */ comment in them to document that I did not intend to declare the scope (this is unusual but sometimes useful). It pains me to see UG papers, SO answers, etc, that do not declare variables as %LOCAL.

I'm going to guess (this is just a guess) that there was some early version of SAS which had (global) macro variables for text generation in code, but did not have macros. So in such a version, people would have gotten used to having lots of global macro variables, and the associated problems (e.g. collisions). Then, when SAS designed macros, the question would have come up, "Can I reference my macro vars from inside a macro?" And the designer chose to answer "Yes, not only can you reference them, you can also assign values to them, and I'll make it easy by allowing you to do that by default. But also, a macro will create its own scope that can hold local macro variables. If you reference or assign a macro var with the same name as a macro var that exists in a global scope (or any outer scope), I'll assume you are referencing the global macro variable (like you are used to already), unless you have explicitly declared the macro var to be %LOCAL."

From the perspective of the current macro language / macro developer, most folks think most global macro vars should be avoided. And one of the benefits of the macro language is that it provides macros that allow for modularization/encapsulation/information-hiding. When viewed from this perspective, %local variables are more useful, and macro variables that are not declared to be %local are a threat to encapsulation (i.e. collision threat). So I would tend to agree that if I were redesigning the macro language, I would make macro variables %local by default. But of course at this point, it's too late for a change.

Tumor answered 21/2, 2016 at 11:36 Comment(1)
This account of SAS history isn't quite detailed enough to confirm - but it does seem like macro language came to be in stages that could have led to what you describe. - Cecillececily

If macro variables were local by default, then we couldn't do this, or at least not without a new declarative statement.

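%let c=C is global;          /* global c */
%macro b(arg);
   %let &arg=Set by B;       /* assigns to whichever variable &arg names */
%mend b;
%macro a(arg);
   %local c;                 /* c local to %a, hiding the global c */
   %b(c);                    /* %b updates the local c of %a, not the global c */
   %put NOTE: &=c;
%mend a;
%a();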
NOTE: C=Set by B
Darb answered 22/2, 2016 at 10:34 Comment(0)
