How can I handle R CMD check "no visible binding for global variable" notes when my ggplot2 syntax is sensible?
S

8

218

EDIT: Hadley Wickham points out that I misspoke. R CMD check is throwing NOTES, not Warnings. I'm terribly sorry for the confusion. It was my oversight.

The short version

R CMD check throws this note every time I use sensible plot-creation syntax in ggplot2:

no visible binding for global variable [variable name]

I understand why R CMD check does that, but it seems to be criminalizing an entire vein of otherwise sensible syntax. I'm not sure what steps to take to get my package to pass R CMD check and get admitted to CRAN.

The background

Sascha Epskamp previously posted on essentially the same issue. The difference, I think, is that subset()'s manpage says it's designed for interactive use.

In my case, the issue is not over subset() but over a core feature of ggplot2: the data = argument.

An example of code I write that generates these notes

Here's a sub-function in my package that adds points to a plot:

JitteredResponsesByContrast <- function (data) {
  return(
    geom_point(
             aes(
               x = x.values, 
               y = y.values
             ),
             data     = data,
             position = position_jitter(height = 0, width = GetDegreeOfJitter(jj))
    )
  )
}

R CMD check, on parsing this code, will say

granovagg.contr : JitteredResponsesByContrast: no visible binding for
  global variable 'x.values'
granovagg.contr : JitteredResponsesByContrast: no visible binding for
  global variable 'y.values'

Why R CMD check is right

The check is technically correct. x.values and y.values

  • Aren't defined locally in the function JitteredResponsesByContrast()
  • Aren't pre-defined in the form x.values <- [something] either globally or in the caller.

Instead, they're variables within a dataframe that gets defined earlier and passed into the function JitteredResponsesByContrast().
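To see what the static check is reacting to, here is a minimal sketch using codetools (the package R CMD check relies on for this analysis). The helper keep_positive() is hypothetical and uses subset() rather than ggplot2 so that it runs without extra packages. The point is only that a column name used unquoted looks like a free variable to the code walker:

```r
# Hypothetical helper: x.values is meant to be a column of `data`,
# but syntactically it is a free variable inside the function.
keep_positive <- function(data) {
  subset(data, x.values > 0)
}

# List the free (non-local) variables that static analysis sees.
free_vars <- codetools::findGlobals(keep_positive, merge = FALSE)$variables
print(free_vars)
```

The result should include x.values, which is the same kind of finding that surfaces as the "no visible binding" note.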

Why ggplot2 makes it difficult to appease R CMD check

ggplot2 seems to encourage the use of a data argument. The data argument, presumably, is why this code will execute

library(ggplot2)
p <- ggplot(aes(x = hwy, y = cty), data = mpg)
p + geom_point()

but this code will produce an object-not-found error:

library(ggplot2)
hwy # a variable in the mpg dataset

Two work-arounds, and why I'm happy with neither

The NULLing out strategy

Matthew Dowle recommends setting the problematic variables to NULL first, which in my case would look like this:

JitteredResponsesByContrast <- function (data) {
  x.values <- y.values <- NULL # Setting the variables to NULL first
  return(
    geom_point(
             aes(
               x = x.values, 
               y = y.values
             ),
             data     = data,
             position = position_jitter(height = 0, width = GetDegreeOfJitter(jj))
    )
  )
}

I appreciate this solution, but I dislike it for three reasons.

  1. It serves no additional purpose beyond appeasing R CMD check.
  2. It doesn't reflect intent. It raises the expectation that the aes() call will see our now-NULL variables (it won't), while obscuring the real purpose (making R CMD check aware of variables it apparently wouldn't otherwise know were bound).
  3. The problems of 1 and 2 multiply because every time you write a function that returns a plot element, you have to add a confusing NULLing statement.
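For what it's worth, the reason NULLing silences the note can be sketched with codetools. The function below is a hypothetical stand-in (again using subset() instead of ggplot2 so it runs on its own). Once the name has a local assignment, the code walker no longer treats it as a global:

```r
# Hypothetical stand-in for a plotting helper.
keep_positive_nulled <- function(data) {
  x.values <- NULL  # local binding that exists only to satisfy the static check
  subset(data, x.values > 0)
}

# x.values is now seen as a local variable, so it is not reported as free.
free_vars_nulled <- codetools::findGlobals(keep_positive_nulled, merge = FALSE)$variables
print(free_vars_nulled)

# At run time the column in `data` still wins over the local NULL.
kept <- keep_positive_nulled(data.frame(x.values = c(-1, 2, 3)))
print(kept)
```

This also illustrates complaint 2: the NULL is never actually used, because the data frame column is looked up first.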

The with() strategy

You can use with() to explicitly signal that the variables in question can be found inside some larger environment. In my case, using with() looks like this:

JitteredResponsesByContrast <- function (data) {
  with(data, {
      geom_point(
               aes(
                 x = x.values, 
                 y = y.values
               ),
               data     = data,
               position = position_jitter(height = 0, width = GetDegreeOfJitter(jj))
      )
    }
  )
}

This solution works, but I don't like it because it doesn't work the way I would expect. If with() were really solving the problem of pointing the interpreter to where the variables are, then I shouldn't even need the data = argument. But with() doesn't work that way:

library(ggplot2)
p <- ggplot()
p <- p + with(mpg, geom_point(aes(x = hwy, y = cty)))
p # will generate an error saying `hwy` is not found

So, again, I think this solution has similar flaws to the NULLing strategy:

  1. I still have to go through every plot element function and wrap the logic in a with() call
  2. The with() call is misleading. I still need to supply a data = argument; all with() is doing is appeasing R CMD check.

Conclusion

The way I see it, there are three options I could take:

  1. Lobby CRAN to ignore the notes by arguing that they're "spurious" (pursuant to CRAN policy), and do that every time I submit a package
  2. Fix my code with one of two undesirable strategies (NULLing or with() blocks)
  3. Hum really loudly and hope the problem goes away

None of the three makes me happy, and I'm wondering what people suggest I (and other package developers wanting to tap into ggplot2) should do.

Spinks answered 24/2, 2012 at 23:0 Comment(8)
I like #1 and #3.Reassure
@BenBolker those are my go-to techniques too.Punctuate
There is a 4th option: modify 'R CMD check' and submit a patch to r-devel for consideration. I suspect you'll find it's quite difficult (and possibly impossible) to detect which are spurious and which aren't. If anyone came up with a piece of code to do that, then ...Weiland
Another strategy is to use aes_stringPunctuate
This seems to be a problem with transform and subset too (not 100% sure, but it makes sense).Bordure
See #23475809 for dealing with this problem with reference classes (it was incorrectly marked as a dup)Geophagy
This is also a problem with the base graphics::curve function that takes a placeholder "x" variable.Lao
It's a problem with defining column names with base::data.frame() as well...Hedger
I
52

Have you tried with aes_string instead of aes? This should work, although I haven't tried it:

aes_string(x = 'x.values', y = 'y.values')
Inflame answered 24/2, 2012 at 23:10 Comment(9)
Just a warning: aes() defines positional parameters x and y, while aes_string() doesn't.Saddlery
Just another warning. aes_string does not allow you to use functions to manipulate the x and y values. Say that you would like to log transform y in which case aes_string(x = 'x.values', y='log(y.values)') of course doesn't work. I use these kind of transformations a lot myself so aes_string is not always an option for me.Mecham
Perhaps this answer (and the one with the most votes) should be updated since the documentation of aes_string says: "All these functions are soft-deprecated. Please use tidy evaluation idioms instead (see the quasiquotation section in aes() documentation)." (ggplot2 version 3.2.1). That probably makes rlang::.data the best candidate to silence these notes.Ionogen
@Ionogen Indeed, this seems to now be the preferred way to bypass these CRAN notes, however, this comes at the cost of importing yet another package...Rola
@Ionogen Although, after a bit more reading, it seems that the import of rlang can be avoided by explicitly declaring the .data pronoun as utils::globalVariables(".data") (e.g., as discussed here).Rola
Just here to note that aes_string has been deprecated since ggplot2 v3.0Experimentalize
Update to the hint from @alanocallaghan : using aes_string now actually generates a warning, making it unuseful for CRAN appeasement purposes.Mag
It's irrelevant now, but per @Dr.Mike 's comment, you could actually do transforms in aes_string, eg library(ggplot2); ggplot(mtcars, aes_string(x="mpg", y="log(cyl)")) + geom_point(). It's a real shame they deprecated it imoExperimentalize
@Rola Very helpful, but it looks like the link may have changed: dplyr.tidyverse.org/articles/in-packages.htmlGilchrist
P
104

You have two solutions:

  • Rewrite your code to avoid non-standard evaluation. For ggplot2, this means using aes_string() instead of aes() (as described by Harlan)

  • Add a call to globalVariables(c("x.values", "y.values")) somewhere in the top-level of your package.

You should strive for 0 NOTES in your package when submitting to CRAN, even if you have to do something slightly hacky. This makes life easier for CRAN, and easier for you.
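A minimal sketch of the second option, assuming it lives in some top-level file of the package such as R/globals.R (the file name is just a convention, not a requirement). The version guard follows the advice in the comments, since utils::globalVariables() only exists from R 2.15.1 onward:

```r
# Top-level package code, outside any function.
# Declares the data frame columns that are used unquoted inside aes().
if (getRversion() >= "2.15.1") {
  registered <- utils::globalVariables(c("x.values", "y.values"))
}
```

The declaration covers the whole package, so each column name only needs to be listed once, however many functions use it.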

(Updated 2014-12-31 to reflect my latest thoughts on this)

Punctuate answered 14/9, 2012 at 17:35 Comment(12)
there's now utils::globalVariables to prevent those notes (but don't forget if (getRversion() >= "2.15.1") or the package will really fail)Crites
globalVariables is a hideous hack and I will never use it.Punctuate
For what is worth, my package submission was rejected because of these notes and was told to use the utils::globalVariables function. Since I am not in a position to argue, that is what I did.Apophyllite
My package was also rejected with that argument. We really need to get the gate keepers over at CRAN to remove this restriction. It's wasting everybody's time.Lydell
I agree that it would be best to ignore them, but my code uses lots of ggplot and data.table, and thus has tons of these warnings, which have kept me from noticing other more important warnings that really were problems I needed to fix.Apostil
@Punctuate you shouldn't say you'll never use things when only two years later you think it's finePunctuate
new year resolution? I'll keep my eyes open for ggplot::scale_dualAxis.sqrt and 3D pie charts with fill patterns.Crites
@Punctuate For the record I agree more with the 2012 you than the present you. I don't know what made you change your mind but to me globalVariables indeed feels like a "hideous hack"..Mecham
@Dr.Mike it's still a hideous hack, but you just have to suck it up if you want your package on CRANPunctuate
@Dr.Mike, @hadley, perhaps suppressForeignCheck can solve this issue, see my answer below. https://mcmap.net/q/125341/-how-can-i-handle-r-cmd-check-quot-no-visible-binding-for-global-variable-quot-notes-when-my-ggplot2-syntax-is-sensibleBurck
Is there a way to automate getting the correct call to globalVariables with only variables in dplyr pipelines, and not the actual mistakes?Nephralgia
Note that aes_string() is deprecated and so you should remove it... Remains the globalVariables solution.Excerpta
S
34

This question was asked and answered a while ago, but just for your information: since ggplot2 version 2.1.0 there is another way to get around the notes: aes_(x=~x.values,y=~y.values).

Subinfeudation answered 19/9, 2016 at 10:16 Comment(1)
Just a note that aes_() was deprecated in ggplot2 3.0.0.Thrifty
R
34

In 2019, the best way to get around this is to use the .data pronoun from the rlang package, which ggplot2 also re-exports. This tells R to treat x.values and y.values as columns in a data.frame (so it won't complain about undefined variables).

Note: This works best if you have predefined column names that you know will exist in your data input.

#' @importFrom ggplot2 .data
my_func <- function(data) {
    ggplot(data, aes(x = .data$x, y = .data$y))
}

EDIT: Updated to export .data from ggplot2 instead of rlang based off @Noah comment
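If the column names are not fixed in advance, the .data pronoun can also be subset with a string. Below is a small sketch, assuming ggplot2 is installed. The helper scatter_by_name() and its arguments are made up for illustration:

```r
library(ggplot2)

# Hypothetical helper: column names arrive as strings, so nothing
# inside aes() looks like an undefined global variable.
scatter_by_name <- function(data, x_col, y_col) {
  ggplot(data, aes(x = .data[[x_col]], y = .data[[y_col]])) +
    geom_point()
}

p <- scatter_by_name(mtcars, "mpg", "wt")
```

Inside a package you would still pull in the pronoun with the @importFrom ggplot2 .data tag rather than calling library().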

Radionuclide answered 14/8, 2019 at 14:24 Comment(4)
Note that .data is exported from ggplot2 so you don't need to add rlang as a separate dependency.Centrepiece
.data is explained nicely at Programming with dplyrPeneus
Now deprecated, love the sheer frequency at which tidyverse will ditch stuff like this making me rewrite packages tidyverse.org/blog/2022/10/tidyselect-1-2-0Experimentalize
Interestingly @alanocallaghan, that article only references .data use in dplyr not ggplot2. When I try to use their fix of just quoting the column names in the ggplot() call, I don't get what I want.Radionuclide
B
14

If

getRversion() >= "3.1.0"

You can add a call at the top level of the package:

utils::suppressForeignCheck(c("x.values", "y.values"))

from:

help("suppressForeignCheck")
Burck answered 4/5, 2015 at 11:12 Comment(8)
That's a fair solution. Thanks! I'd considered this, but the problem is that I have a great many variables like x.values and y.values, so I'd have to register ALL of them.Spinks
Yes, I agree, it's not ideal.Burck
Agreed. But again, thanks for your answer submission!Spinks
That is not what suppressForeignCheck is used forPunctuate
Where is actually the top level? In which file do I add this command?Savate
@bquast ... any reply for @drmariod? im also unsure about this.Crustal
By custom, this is put in a zzz.R file in ./R/. For example, github.com/HughParsonage/grattan/blob/master/R/zzz.REades
@hadley, what is it used for? help("suppressForeignCheck") seems to imply it is for a " run-time calculated native symbol", but what the heck is that?Craver
D
8

Add this line of code to the file in which you provide package-level documentation:

if(getRversion() >= "2.15.1")  utils::globalVariables(c("."))

Example here

Duff answered 2/9, 2019 at 10:32 Comment(2)
The comment in the code you linked to says that this is just to suppress notes for places where . is used in pipelines, not for other variables or the columns in the OP's data frame. Would the same code apply to column names?Scherer
@Scherer I had a pilfer through some source code, and I can't spot anything, so based on that, I don't think anything else is necessary (although I will update if I find something). Perhaps the best way to know for sure is to check with something like R CMD check --as-cran and see if it produces notesDuff
S
2

The manual for ?aes_string says:

All these functions are soft-deprecated. Please use tidy evaluation idioms instead (see the quasiquotation section in aes() documentation).

So I read that page, and came up with this pattern:

ggplot2::aes(x = !!quote(x.values),
             y = !!quote(y.values))

It is about as fugly as an IIFE, and mixes base expressions with tidy-bang-bangs. But it does not require the global variables workaround, and doesn't use anything that is deprecated afaict. It seems like it also works with calculations in aesthetics and the derived variables like ..count..

Selenaselenate answered 15/11, 2021 at 22:2 Comment(0)
S
1

How about using get()?

geom_point(
         aes(
           x = get('x.values'), 
           y = get('y.values')
         ),
         data     = data,
         position = position_jitter(height = 0, width = GetDegreeOfJitter(jj))
)
Sampler answered 26/2, 2021 at 9:58 Comment(0)
