Oh, the wonderful smell of globals...
All of the answers in this post gave R examples, and the OP wanted some Stata examples, as well. So let me chime in with these.
Unlike R, Stata does take care of the locality of its local macros (the ones you create with the local command), so the question "Is this a global z or a local z that is being returned?" never comes up. (Gosh... how can you R guys write any code at all if locality is not enforced???) Stata has a different quirk, though: a non-existent local or global macro evaluates to an empty string, which may or may not be desirable.
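A minimal illustration of that quirk (the macro names here are made up for the example):
* an undefined macro silently expands to an empty string
display "start|$never_defined|end"      // prints start||end
local z 42
display "start|`z'|end"                 // prints start|42|end
display "start|`z_typo'|end"            // a misspelled local also expands to nothing: start||end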
I have seen globals used for several main reasons:
Globals are often used as shortcuts for variable lists, as in
sysuse auto, clear
regress price $myvars
I suspect that the main usage of such a construct is for someone who switches between interactive typing and storing the code in a do-file as they try multiple specifications. Say they try regression with homoskedastic standard errors, heteroskedastic standard errors, and median regression:
regress price mpg foreign
regress price mpg foreign, robust
qreg price mpg foreign
And then they run these regressions with another set of variables, then with yet another one, and finally they give up and set this up as a do-file, myreg.do, with
regress price $myvars
regress price $myvars, robust
qreg price $myvars
exit
to be accompanied by an appropriate setting of the global macro. So far so good; the snippet
global myvars mpg foreign
do myreg
produces the desired results. Now let's say they email their famous do-file, which claims to produce very good regression results, to their collaborators, and instruct them to type
do myreg
What will their collaborators see? In the best case, the mean and the median of price, if they started a new instance of Stata (failed coupling: myreg.do did not really know you meant to run it with a non-empty variable list). But if the collaborators had something in the works, and also had a global myvars defined (name collision)... man, would that be a disaster.
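In that first scenario, with $myvars empty, myreg.do silently degenerates to the following (a sketch of the commands after macro substitution):
regress price            // price on a constant only, i.e., the mean of price
regress price, robust
qreg price               // the median of price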
You can take it a half step further in obscurity. Let's say that the global macro myvars is defined as global myvars mpg foreign, robust (nobody enforces what goes into the macro, right?). Then the first reg $myvars will produce the regression with HCE standard errors; the second reg $myvars, robust is going to complain that the variable robust isn't found; and qreg $myvars will complain about the option robust not being supported.
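Spelled out, after macro substitution the three lines of myreg.do effectively become (a sketch of what Stata sees):
regress price mpg foreign, robust            // quietly turns into the robust regression
regress price mpg foreign, robust, robust    // this one errors out, as described above
qreg price mpg foreign, robust               // and this one trips over the robust option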
Globals are used for directory or file names, as in:
use $mydir\data1, clear
God only knows what will be loaded. In large projects, though, it does come in handy. You would want to define global mydir somewhere in your master do-file, maybe even as
global mydir `c(pwd)'
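Fleshed out a little, such a master do-file might look roughly like this (the step file names below are hypothetical):
* master.do (a sketch)
global mydir "`c(pwd)'"              // project root = the current working directory
do "$mydir/01_prepare.do"            // every step refers to $mydir for its inputs and outputs
do "$mydir/02_analyze.do"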
Globals can be used to store unpredictable crap, like a whole command:
capture $RunThis
God only knows what will be executed; let's just hope it is not ! format c:\. This is the worst case of implicit strong coupling, but since I am not even sure that RunThis will contain anything meaningful, I put a capture in front of it, and will be prepared to treat a non-zero return code _rc. (See, however, my example below.)
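A slightly more defensive version of that pattern might look like this (what ends up in $RunThis is, by construction, anyone's guess):
capture noisily $RunThis
if _rc {
    display as error "whatever was in RunThis failed with return code " _rc
    exit _rc
}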
Globals as behavior switches (page 16 of https://hwpi.harvard.edu/files/sdp/files/sdp-toolkit-coding-style-guide.pdf). Don't. This just means you need to break your code into separate do-files and run each as needed (see the sketch below). Even if the switch is preceded by extensive data manipulation that takes computing time... that only means the said computation should write its results to disk, and the next step, whatever sits behind that switch, should start with use that_data, clear.
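For concreteness, the discouraged pattern and its replacement might look roughly like this (the global, file, and data set names are all hypothetical):
* discouraged: one monolithic do-file driven by a behavior switch
global dopart2 1
* ... expensive data preparation ...
if $dopart2 {
    display "running part 2"         // stands in for the analysis step
}

* preferred: two do-files that talk through a saved data set
* part1.do ends with:     save that_data, replace
* part2.do starts with:   use that_data, clear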
Stata's own use of globals is for God settings, like the type I error probability/confidence level: the global $S_level is always defined (and you must be a total idiot to redefine this global, although of course it is technically doable). This is, however, mostly a legacy issue with code of version 5 and below (roughly), as the same information can be obtained from a less fragile system constant:
set level 90
display $S_level
display c(level)
Thankfully, globals are quite explicit in Stata, and hence are easy to debug and remove. In some of the above situations, and certainly in the first one, you'd want to pass parameters to do-files, which are seen as the local `0' inside the do-file. Instead of using globals in the myreg.do file, I would probably code it as
unab varlist : `0'
regress price `varlist'
regress price `varlist', robust
qreg price `varlist'
exit
The unab thing will serve as an element of protection: if the input is not a legal varlist, the program will stop with an error message.
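The variable list is then passed on the do command line rather than through a global, for example
do myreg mpg foreign
which arrives inside myreg.do as the local `0'.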
In the worst cases I've seen, the global was used only once after having been defined.
There are occasions when you do want to use globals, because otherwise you'd have to pass the bloody thing to every other do-file or program. One example where I found globals pretty much unavoidable was coding a maximum likelihood estimator where I did not know in advance how many equations and parameters I would have. Stata insists that the (user-supplied) likelihood evaluator have specific equations. So I had to accumulate my equations in globals:
global my_parameters
forvalues k = 1/`number_of_equations' {
    local this_equation : piece `k' of syntax
    // maybe do more parsing of the equation as needed
    global my_parameters ${my_parameters} (eq`k': parsed_specification)
}
... and then call my evaluator with the globals in the descriptions of the syntax that Stata would need to parse:
args lf ${my_parameters}
where lf was the objective function (the log-likelihood). I encountered this at least twice, in the normal mixture package (denormix) and the confirmatory factor analysis package (confa); you can findit both of them, of course.