I am working with an extremely large dataset in R. I have been using data frames and have decided to switch to data.tables to speed up operations. I am having trouble understanding the j operations; in particular, I'm trying to generate dummy variables, but I can't figure out how to code conditional operations within a data.table's [].
MWE:
test <- data.table("index"=rep(letters[1:10],100),"var1"=rnorm(1000,0,1))
What I would like to do is to add columns a through j as dummy variables such that column a would have a value 1 when index == "a" and 0 otherwise. In the data.frame environment it would look something like:
test$a <- 0
test$a[test$index=='a'] <- 1
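For comparison, here is a minimal sketch of how the same thing might be written in data.table, first for the single column a and then for all ten columns at once. It assumes the data.table package is loaded and relies on its := operator to add columns by reference; the seed and the helper name lvls are illustrative additions, not part of the original example, and this has not been timed on the real dataset.

```r
library(data.table)

set.seed(1)  # assumption: added only so the sketch is reproducible
test <- data.table(index = rep(letters[1:10], 100), var1 = rnorm(1000, 0, 1))

# Single column: start at 0, then use the i argument as the condition
# and := in j to assign 1 only to the matching rows.
test[, a := 0L]
test[index == "a", a := 1L]

# All columns: one integer dummy per distinct index value.
# lvls is a hypothetical helper holding the new column names;
# wrapping it in parentheses tells := to use its contents as names.
lvls <- sort(unique(test$index))
test[, (lvls) := lapply(lvls, function(l) as.integer(index == l))]
```

Because := modifies the table in place, no copy of test is made when the columns are added, which may help when memory is the limiting factor.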
model.matrix(~var1+index-1, test) too slow? – Scrawl
model.matrix: I run out of memory. – Jacquejacquelin