How do I "flush" data to my RSQLite disk database?

I'm creating a SQLite database from R with dbplyr and RSQLite, but the database file is zero bytes on disk even though I write a table and read it back. Here is my script:

library("RSQLite")
library("dbplyr")
library("dplyr")

data(mtcars)

con <- DBI::dbConnect(RSQLite::SQLite(), dbname = "./mtcars.db")
copy_to(con, mtcars, "mtcars")

print(tbl(con, "mtcars"))

As the ls -l at the end of the output below shows, the database file size is 0, even though the script read mtcars back from the database (so the table exists somewhere). I want to use the database file to share data with another program, so how do I periodically "flush" the data to disk?

tbrowne@calculon:~/scratch$ R -f dplysqlite.r 

R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 
> library("RSQLite")
> library("dbplyr")
Warning messages:
1: replacing previous import by ‘rlang::enquo’ when loading ‘dbplyr’ 
2: replacing previous import by ‘rlang::quo’ when loading ‘dbplyr’ 
3: replacing previous import by ‘rlang::quos’ when loading ‘dbplyr’ 
4: replacing previous import by ‘rlang::quo_name’ when loading ‘dbplyr’ 
> library("dplyr")

Attaching package: ‘dplyr’

The following objects are masked from ‘package:dbplyr’:

    ident, sql

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

> 
> data(mtcars)
> 
> con <- DBI::dbConnect(RSQLite::SQLite(), dbname = "./mtcars.db")
> copy_to(con, mtcars, "mtcars")
> 
> print(tbl(con, "mtcars"))
# Source:   table<mtcars> [?? x 11]
# Database: sqlite 3.19.3 [/home/tbrowne/scratch/mtcars.db]
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1  21.0     6 160.0   110  3.90 2.620 16.46     0     1     4     4
 2  21.0     6 160.0   110  3.90 2.875 17.02     0     1     4     4
 3  22.8     4 108.0    93  3.85 2.320 18.61     1     1     4     1
 4  21.4     6 258.0   110  3.08 3.215 19.44     1     0     3     1
 5  18.7     8 360.0   175  3.15 3.440 17.02     0     0     3     2
 6  18.1     6 225.0   105  2.76 3.460 20.22     1     0     3     1
 7  14.3     8 360.0   245  3.21 3.570 15.84     0     0     3     4
 8  24.4     4 146.7    62  3.69 3.190 20.00     1     0     4     2
 9  22.8     4 140.8    95  3.92 3.150 22.90     1     0     4     2
10  19.2     6 167.6   123  3.92 3.440 18.30     1     0     4     4
# ... with more rows
> 
> 
tbrowne@calculon:~/scratch$ ls -l
total 4
-rw-rw-r-- 1 tbrowne tbrowne 194 Oct 31 11:04 dplysqlite.r
-rw-r--r-- 1 tbrowne tbrowne   0 Oct 31 11:04 mtcars.db
Interoceptor asked 31/10, 2017 at 11:13

You're not using the pattern suggested by the RSQLite documentation. That documentation uses dbWriteTable to copy a data frame into a SQLite table:

dbWriteTable(con, "mtcars", mtcars)

According to this documentation, your full code would look something like this:

library(DBI)
con <- dbConnect(RSQLite::SQLite(), "./mtcars.db")
data(mtcars)
# dbWriteTable() creates a permanent table by default
dbWriteTable(con, "mtcars", mtcars)
dbListTables(con)
# Fetch all query results into a data frame:
dbGetQuery(con, "SELECT * FROM mtcars")
dbDisconnect(con)
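
To check that a write like this actually reached the file, one option is to look at the file size and read the table back through a second connection, which stands in for the other program. This is a minimal sketch, not part of the RSQLite documentation; the names db_path, reader, size_after_write and n_rows are hypothetical, and a temporary file is used so the example does not touch ./mtcars.db.

```r
library(DBI)

# Hypothetical scratch path so the example leaves ./mtcars.db alone
db_path <- tempfile(fileext = ".db")

con <- dbConnect(RSQLite::SQLite(), dbname = db_path)
dbWriteTable(con, "mtcars", mtcars)

# Size of the database file right after the write, with con still open
size_after_write <- file.size(db_path)

# A second, independent connection plays the role of "another program"
reader <- dbConnect(RSQLite::SQLite(), dbname = db_path)
n_rows <- dbGetQuery(reader, "SELECT COUNT(*) AS n FROM mtcars")$n

dbDisconnect(reader)
dbDisconnect(con)
```

If size_after_write is greater than zero and n_rows matches nrow(mtcars), the table is in the file and visible to other readers without any explicit flush.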
Leipzig answered 31/10, 2017 at 11:25 Comment(2)
Funnily enough, I am actually using the suggested dplyr way of writing to tables. I guess I should go straight to the driver, as you are doing, rather than relying on dplyr. – Interoceptor
@ThomasBrowne I suspected this too, which is why I think you may be right that there is some sort of flush... but that would also be coming from dplyr, since the SQLite documentation mentions nothing of this. – Leipzig

The copy_to() method for dbplyr sources (dbplyr:::copy_to.src_sql()) has a temporary argument, which is TRUE by default. This means that the new table is visible only to your active connection and disappears when you close that connection. SQLite keeps temporary tables in a separate temporary database rather than in the main database file, which is why mtcars.db stays at zero bytes. The following should work as expected:

copy_to(con, mtcars, "mtcars", temporary = FALSE)

Alternatively, use dbWriteTable() as Tim suggests.
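
The difference between the two settings can be seen by writing one table each way and listing tables from a second connection. This is a minimal sketch under some assumptions: the table names mtcars_temp and mtcars_perm, the variable names, and the temporary file path are all hypothetical, and dbplyr must be installed for copy_to() to work on a database connection.

```r
library(DBI)
library(dplyr)  # provides copy_to(); dbplyr must be installed for DB backends

# Hypothetical scratch path so the example leaves ./mtcars.db alone
db_path <- tempfile(fileext = ".db")
con <- dbConnect(RSQLite::SQLite(), dbname = db_path)

# Default behaviour (temporary = TRUE): table is private to this connection
copy_to(con, mtcars, "mtcars_temp")

# Persistent table: written to the main database file
copy_to(con, mtcars, "mtcars_perm", temporary = FALSE)

# A second connection stands in for "another program"
con2 <- dbConnect(RSQLite::SQLite(), dbname = db_path)
visible <- dbListTables(con2)
print(visible)

dbDisconnect(con2)
dbDisconnect(con)

# Size of the file after both writes
print(file.size(db_path))
```

The second connection should list mtcars_perm but not mtcars_temp. The listing may also contain SQLite's own statistics tables, since copy_to() analyzes the new table by default.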

Marston answered 1/11, 2017 at 9:44 Comment(0)
