I hope that what I am about to write makes some sense. If you look at
How to deal with a 50GB large csv file in r language?
it is explained how to query a CSV file from R à la SQL. In my case, I have a vast amount of data stored as flat files that are as large as, or larger than, my RAM.
I would like to store one of these files, for instance, as an SQLite database without loading it into memory entirely. Imagine reading a chunk of the file small enough to fit in RAM, writing it to the SQL database, freeing up that memory, processing the next chunk, and so on until the whole file is in the database. Is this doable in R? If the table could be accessed as a tibble, that would be even better, but it is not crucial. Any suggestion is appreciated. Thanks!
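In case it helps, this is roughly what I have in mind, written as an untested sketch using readr and RSQLite; the file, database and table names are made up:

```r
library(DBI)
library(readr)

# On-disk SQLite database; only one chunk of the CSV is ever held in R at a time.
con <- dbConnect(RSQLite::SQLite(), "big_data.sqlite")

write_chunk <- function(chunk, pos) {
  if (!dbExistsTable(con, "big_table")) {
    dbWriteTable(con, "big_table", chunk)                 # first chunk creates the table
  } else {
    dbWriteTable(con, "big_table", chunk, append = TRUE)  # later chunks are appended
  }
}

read_csv_chunked(
  "big_file.csv",
  callback = SideEffectChunkCallback$new(write_chunk),
  chunk_size = 100000   # rows per chunk; tune to the RAM you have available
)

dbDisconnect(con)
```

The chunk_size above is just a placeholder; the idea is that each chunk is written out and discarded before the next one is read.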
sqldf: while it does provide a SQL-style way of accessing large amounts of data, it presumes that the data is already resident in memory. What you're talking about here is loading only a portion of the data into memory at a time, which suggests DBI and RSQLite as the packages you need. You should probably figure out how to get the 50GB of data into the SQLite file, whether through R or a direct import. – Correy
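As a follow-up to that suggestion, here is a minimal sketch of querying the resulting database lazily as a tibble, assuming dplyr with the dbplyr backend and a hypothetical table big_table in the SQLite file:

```r
library(DBI)
library(dplyr)   # dbplyr must also be installed for the database backend

con <- dbConnect(RSQLite::SQLite(), "big_data.sqlite")

big_tbl <- tbl(con, "big_table")   # lazy reference; no rows are loaded yet

# Verbs are translated to SQL and run inside SQLite;
# collect() pulls only the (small) result back into memory as a tibble.
big_tbl %>%
  count(some_group_column) %>%     # hypothetical column name
  collect()

dbDisconnect(con)
```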