I am trying to do data analysis in R on a group of medium-sized datasets. One of the analyses I need to do requires a full outer join across roughly 24-48 files, each of which has about 60 columns and up to 450,000 rows. So I've been running into memory issues a lot.
I thought ffbase or sqldf would help, but apparently a full outer join is not possible with either of them.
Is there a workaround? A package I haven't found yet?
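One common workaround is to avoid building all the joins at once and instead fold the files together pairwise with `data.table`, which does keyed merges efficiently and lets you free each input as soon as it has been absorbed. A minimal sketch, assuming the files are CSVs in a `data/` directory and share a key column `id` (both names are placeholders for your actual layout):

```r
library(data.table)

# Hypothetical inputs; adjust path, pattern, and key column to your data.
files    <- list.files("data", pattern = "\\.csv$", full.names = TRUE)
key_cols <- c("id")

result <- NULL
for (f in files) {
  dt <- fread(f)           # fast on-disk CSV reader from data.table
  setkeyv(dt, key_cols)    # key the table so merges are fast
  if (is.null(result)) {
    result <- dt
  } else {
    # all = TRUE makes merge() a full outer join
    result <- merge(result, dt, by = key_cols, all = TRUE)
  }
  rm(dt)
  gc()                     # release the just-merged table's memory
}
```

Note that column names other than the key will collide across files; `merge()` disambiguates them with `.x`/`.y`-style suffixes (controllable via the `suffixes` argument), so you may want to rename columns per file before merging. If even the growing `result` exceeds RAM, the usual next step is pushing the join into an on-disk database rather than doing it in R.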
`data.table`. How much RAM do you have? – Blockbusting
`sqldf(..., dbname = tempfile())`. – Unaccustomed