I was trying to make Python 3-style assignment unpacking possible in R (e.g., a, *b, c = [1,2,3], "C"), and although I got close (you can check out my code here), I ultimately ran into a few (weird) problems.
My code is meant to work like this:
a %,*% b %,% c <- c(1,2,3,4,5)
and will assign a = 1, b = c(2,3,4) and c = 5 (my code actually does do this, but with one small snag I will get to later).
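To make the intended semantics concrete, here is a minimal sketch of the splitting step on its own, written as an ordinary function rather than as infix operators. The helper name `unpack_starred` and its arguments are hypothetical and are not part of my actual code.

```r
# Hypothetical helper (not the real implementation): split `values` into
# `n_before` leading items, a starred middle part, and `n_after` trailing items.
unpack_starred <- function(values, n_before, n_after) {
  n <- length(values)
  stopifnot(n >= n_before + n_after)
  n_starred <- n - n_before - n_after
  list(
    before  = values[seq_len(n_before)],
    starred = values[seq_len(n_starred) + n_before],
    after   = values[seq_len(n_after) + (n - n_after)]
  )
}

# Mirrors a %,*% b %,% c <- c(1,2,3,4,5)
parts <- unpack_starred(c(1, 2, 3, 4, 5), n_before = 1, n_after = 1)
parts$before   # would become a
parts$starred  # would become b
parts$after    # would become c
```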
In order for this to do anything, I have to define:
`%,%` <- function(lhs, rhs) {
...
}
and
`%,%<-` <- function(lhs, rhs, value) {
...
}
(as well as %,*% and %,*%<-, which are slight variants of the previous functions).
First issue: why R substitutes *tmp* for the lhs argument
As far as I can tell, R evaluates this code from left to right at first (i.e., going from a to c) until it reaches the last %,%, whereupon it goes back from right to left, assigning values along the way. But the first weird thing I noticed is that when I do match.call() or substitute(lhs) in something like x %infix% y <- z, it says that the input into the lhs argument in %infix% is *tmp*, instead of, say, a or x.
This is bizarre to me, and I couldn't find any mention of it in the R manual or docs. I actually make use of this weird convention in my code (i.e., it doesn't show this behavior on the righthand side of the assignment, so I can use the presence of the *tmp* input to make %,% behave differently on this side of the assignment), but I don't know why it does this.
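A small reproduction of the *tmp* behaviour, in case it helps. The `%infix%<-` function and the `record` environment are hypothetical names made up for this demo. As far as I can tell, the call that R builds for the replacement function is roughly `` x <- `%infix%<-`(`*tmp*`, y, value = z) ``, which would explain what substitute() reports.

```r
# Hypothetical recorder, used only to inspect what the replacement function sees.
record <- new.env()

# Hypothetical replacement function for a made-up %infix% operator.
`%infix%<-` <- function(lhs, rhs, value) {
  # Capture the unevaluated expression that was passed as `lhs`
  record$lhs_expr <- as.character(substitute(lhs))
  lhs[rhs] <- value  # an arbitrary replacement, just so the call does something
  lhs
}

x <- c(10, 20, 30)
x %infix% 2 <- 99

record$lhs_expr  # the name R passed in place of x
x                # x after the replacement
```

On my understanding, `record$lhs_expr` holds the string "*tmp*" rather than "x", matching what match.call() shows.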
Second issue: why R checks for object existence before anything else
My second problem is what makes my code ultimately not work. I noticed that if you start with a variable name on the lefthand side of any assignment, R doesn't seem to even start evaluating the expression; it returns the error object '<variable name>' not found. I.e., if x is not defined, x %infix% y <- z won't evaluate, even if %infix% doesn't actually use or evaluate x.
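Here is a minimal reproduction of the second issue. Everything in it (`%infix%<-`, `calls`, `not_yet_defined`) is a hypothetical name for the demo. The replacement function ignores its `lhs` argument entirely, yet the assignment still fails when the leftmost variable does not exist, and the counter suggests the function is never even called.

```r
# Hypothetical counter to check whether the replacement function ever runs.
calls <- new.env()
calls$n <- 0

# Hypothetical replacement function that never touches `lhs`.
`%infix%<-` <- function(lhs, rhs, value) {
  calls$n <- calls$n + 1
  value
}

# Make sure the target really is undefined.
if (exists("not_yet_defined", inherits = FALSE)) rm(not_yet_defined)

outcome <- tryCatch(
  {
    not_yet_defined %infix% 1 <- 2
    "assigned"
  },
  error = function(e) "failed"
)
calls_after_failure <- calls$n

# Once the variable exists (even as NULL), the same line goes through.
not_yet_defined <- NULL
not_yet_defined %infix% 1 <- 2
```

After the first attempt `outcome` is "failed" and `calls_after_failure` is still 0; after pre-defining the variable, the assignment works.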
Why does R behave like this, and can I change it or get around it? If I could run the code in %,% before R checks to see if x exists, I could probably hack it so that it wouldn't be a problem, and my Python unpacking code would be useful enough to actually share. But as it is now, the first variable needs to already exist, which is just too limiting in my opinion. I know that I could probably do something by changing the <- to a custom infix operator like %<-%, but then my code would be so similar to the zeallot package that I wouldn't consider it worth it. (It's already very close in what it does, but I like my style better.)
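For comparison, this is roughly why a custom infix operator would sidestep the existence check: an ordinary function can capture its left operand with substitute() without ever evaluating it. The sketch below is a hypothetical single-name version written for illustration; it is not zeallot's implementation and does no unpacking.

```r
# Hypothetical operator: assigns `value` to the bare name given on the left.
`%<-%` <- function(lhs, value) {
  target <- substitute(lhs)  # capture the name without evaluating it
  if (!is.symbol(target)) stop("this sketch only handles a single bare name")
  assign(as.character(target), value, envir = parent.frame())
  invisible(value)
}

# The target does not need to exist beforehand.
if (exists("fresh_name", inherits = FALSE)) rm(fresh_name)
fresh_name %<-% c(1, 2, 3)
fresh_name
```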
Edit:
Following Ben Bolker's excellent advice, I was able to find a way around the problem... by overwriting <-.
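`<-` <- function(x, value) {
  # Bind the base `=` locally so it is used inside this function
  base::`<-`(`=`, base::`=`)
  # Pre-create the leftmost variable in the caller's frame if it is missing
  find_and_assign(match.call(), parent.frame())
  # Then perform the real assignment in the caller's frame
  do.call(base::`<-`, list(x = substitute(x), value = substitute(value)),
          quote = FALSE, envir = parent.frame())
}
find_and_assign <- function(expr, envir) {
  # Use the base `<-` and `=` in here so the override is not re-entered
  base::`<-`(`<-`, base::`<-`)
  base::`<-`(`=`, base::`=`)
  # Walk down the first argument of each call to reach the leftmost target
  while (is.call(expr)) expr <- expr[[2]]
  if (!rlang::is_symbol(expr)) return()
  var <- rlang::as_string(expr) # A little safer than `as.character()`
  if (!exists(var, envir = envir)) {
    assign(var, NULL, envir = envir)
  }
}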
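To show what the `while (is.call(expr)) expr <- expr[[2]]` walk does in isolation, here is a base-R-only sketch that leaves `<-` alone. The helper name `leftmost_symbol` is hypothetical; it just returns the name that find_and_assign() would go on to pre-create.

```r
# Hypothetical helper: follow the first argument of each nested call
# until something that is not a call is reached.
leftmost_symbol <- function(expr) {
  while (is.call(expr)) expr <- expr[[2]]
  if (is.symbol(expr)) as.character(expr) else NULL
}

# a %,*% b %,% c parses as `%,%`(`%,*%`(a, b), c), so the walk ends at a
leftmost_symbol(quote(a %,*% b %,% c))

# Works the same way for ordinary nested replacement targets
leftmost_symbol(quote(names(x)[2]))

# A non-symbol leftmost element (here the constant 1) yields NULL
leftmost_symbol(quote(1 + y))
```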
I'm pretty sure that this would be a mortal sin though, right? I can't exactly see how it would mess anything up, but the tingling of my programmer senses tells me this would not be appropriate to share in something like a package... How bad would this be?
… %,%<-? – Boydboyden
… base functions though (tidyverse already does a lot of this, although not for anything as fundamental as <-; on the other hand, unlike your code, it intends to change the behaviour in a non-compatible way, e.g. filter()) – Boydboyden