Chaining data.tables with magrittr
I have a method I'm using, with magrittr, using the `.` object with `[`:
library(magrittr)
library(data.table)
bar <- foo %>%
  .[etcetera] %>%
  .[etcetera] %>%
  .[etcetera]
Working example:
out <- data.table(expand.grid(x = 1:10,y = 1:10))
out %>%
  .[, z := x * y] %>%
  .[, w := x * z] %>%
  .[, v := w * z]
print(out)
Additional examples
Edit: it's also not just syntactic sugar, since it allows you to refer to the table from the previous step as `.`, which means that you can do a self join, or you can use `%T>%` for some logging in between steps (using futile.logger or the like):
out %>%
  .[etcetera] %>%
  .[etcetera] %T>%
  .[loggingstep] %>%
  .[etcetera] %>%
  .[., on = SOMEVARS, allow.cartesian = TRUE]
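To make the placeholders above concrete, here is a minimal sketch of the tee-pipe logging step and the self join. The table `dt`, its columns `id` and `val`, and the use of base `message()` in place of a real logger such as futile.logger are all assumptions made for illustration, not part of the original pipeline.

```r
library(magrittr)
library(data.table)

# Hypothetical toy table: two rows for id 1, one row for id 2
dt <- data.table(id = c(1L, 1L, 2L), val = c(10, 20, 30))

res <- dt %>%
  # add a per-id total by reference
  .[, total := sum(val), by = id] %T>%
  # %T>% runs this block for its side effect only and passes the table through
  { message("rows before self join: ", nrow(.)) } %>%
  # `.` appears as both x and i, so the previous step is joined to itself
  .[., on = "id", allow.cartesian = TRUE]

print(res)
```

Because `.` is already an argument of the last call, magrittr does not insert the left-hand side a second time, so the step is equivalent to `x[x, on = "id", allow.cartesian = TRUE]` with `x` being the result of the previous step.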
EDIT:
This is much later, and I still use this regularly. But I have the following caveat:
magrittr adds overhead
I really like doing this at the top level of a script. It has a very clear and readable flow, and there are a number of neat tricks you can do with it.
But I've had to remove it before when optimizing, in cases where it's part of a function that gets called lots of times.
You're better off chaining data.tables the old-fashioned way in that case.
EDIT 2: Well, I'm back here to say that it doesn't add much overhead. I just tried benchmarking it on a few tests and can't really find any major differences:
library(magrittr)
library(data.table)
toplevel <- data.table::CJ(group = 1:100, sim = 1:100, letter = letters)
toplevel[, data := runif(.N)]
processing_method1 <- function(dt) {
  dt %>%
    .[, mean(data), by = .(letter)] %>%
    .[, median(V1)]
}
processing_method2 <- function(dt) {
  dt[, mean(data), by = .(letter)][, median(V1)]
}
microbenchmark::microbenchmark(
  with_pipe = toplevel[, processing_method1(.SD), by = group],
  without_pipe = toplevel[, processing_method2(.SD), by = group]
)
Unit: milliseconds
         expr      min       lq      mean   median       uq      max neval
    with_pipe 87.18837 91.91548 101.96456 100.7990 106.2750 230.5221   100
 without_pipe 86.81728 90.74838  98.43311  99.2259 104.6146 129.8175   100
Almost no overhead here