Bind variables in R DBI
Asked Answered
K

3

10

In R's DBI package, I'm not finding a facility for using bound variables. I did find a document (the original vignette from 2002) that says about bound variables, "Perhaps the DBI could at some point in the future implement this feature", but it looks like so far that's left undone.

What do people in R use for a substitute? Just concatenate strings right into the SQL? That's got some obvious problems for safety & performance.

EDIT:

Here's an example of how placeholders could work:

query <- "SELECT numlegs FROM animals WHERE color=?"
result <- dbGetQuery(caseinfo, query, bind="green")

That's not a very well-thought-out interface, but the idea is that you can use a value for bind and the driver handles the details of escaping (if the underlying API doesn't handle bound variables natively) without the caller having to reimplement it [badly].

Kp answered 2/2, 2010 at 17:5 Comment(7)
can you provide some example code that would operate like you would like it to operate? what behaviour would you expect bound variables to have?Pegues
Do you mean something like this? #2182837Therefor
I smell a new DBI with better features coming along. We will all be happy beta-testers, Ken....Welltimed
That would be great if there's a new DBI in the works. Is there a code repository for DBI somewhere that I could hack on, in the vanishingly-small chance that I get some round tuits?Kp
No I was half-joking. DBI goes back to David James who no longer works where he once did. I mentored a Google Summer of Code student to create Postgresql bindings using DBI as a given; that project is now on CRAN as RPostgreSQL (and could do with a new maintainer for further extension as the student has left). A new DBI would be a worthy rewrite, or 'DBI 2.0', but also quite an undertaking. Are you up for it? You seem to be the one with the itch to scratch so ...Welltimed
Oh, you thought the smell of a new DBI was coming from me! I don't smell that. =) But I can take a look and see at least how difficult the technical aspects of this would be.Kp
Possible duplicate of Proper way to pass parameters to query in R DBICosmonaut
R
18

For anyone coming to this question like I just did after googling for rsqlite and dbgetpreparedquery, it seems that in the latest version of rsqlite you can run a SELECT query with bind variables. I just ran the following:

query <- "SELECT probe_type,next_base,color_channel FROM probes WHERE probeid=?"
probe.types.df <- dbGetPreparedQuery(con,que,bind.data=data.frame(probeids=ids))

This was relatively fast (selecting 2,000 rows out of a 450,000 row table) and is incredibly useful.

FYI.

Rebecarebecca answered 11/7, 2011 at 17:8 Comment(2)
I'm unclear what the difference is between dbgetpreparedquery and dbsendpreparedqueryWornout
As I understand it, the GetQuery functions return the results as a dataframe, whereas the SendQuery functions return a cursor from which you can request results in batches using the fetch method. Check out the documentation for RSQLITE::query cran.r-project.org/web/packages/RSQLite/RSQLite.pdfArched
M
3

Below is a summary of what's currently supported in RSQLite for bound parameters. You are right that there is not currently support for SELECT, but there is no good reason for this and I would like to add support for it.

If you feel like hacking, you can get a read-only checkout of all of the DBI related packages here:

use --user=readonly --password=readonly

https://hedgehog.fhcrc.org/compbio/r-dbi/trunk
https://hedgehog.fhcrc.org/compbio/r-dbi/trunk/DBI
https://hedgehog.fhcrc.org/compbio/r-dbi/trunk/SQLite/RSQLite

I like to receive patches, especially if they include tests and documentation. Unified diff, please. I actually do all my development using git and so best case is to create a git clone of say RSQLite and then send me diffs as git format-patch -n git-svn..

Anyhow, here are some examples:

library("RSQLite")

make_data <- function(n)
{
    alpha <- c(letters, as.character(0:9))
    make_key <- function(n)
    {
        paste(sample(alpha, n, replace = TRUE), collapse = "")
    }
    keys <- sapply(sample(1:5, replace=TRUE), function(x) make_key(x))
    counts <- sample(seq_len(1e4), n, replace = TRUE)
    data.frame(key = keys, count = counts, stringsAsFactors = FALSE)
}

key_counts <- make_data(100)


db <- dbConnect(SQLite(), dbname = ":memory:")

sql <- "
create table keys (key text, count integer)
"

dbGetQuery(db, sql)

bulk_insert <- function(sql, key_counts)
{
    dbBeginTransaction(db)
    dbGetPreparedQuery(db, sql, bind.data = key_counts)
    dbCommit(db)
    dbGetQuery(db, "select count(*) from keys")[[1]]
}

##  for all styles, you can have up to 999 parameters

## anonymous
sql <- "insert into keys values (?, ?)"
bulk_insert(sql, key_counts)


## named w/ :, $, @
## names are matched against column names of bind.data

sql <- "insert into keys values (:key, :count)"
bulk_insert(sql, key_counts[ , 2:1])

sql <- "insert into keys values ($key, $count)"
bulk_insert(sql, key_counts)

sql <- "insert into keys values (@key, @count)"
bulk_insert(sql, key_counts)

## indexed (NOT CURRENTLY SUPPORTED)
## sql <- "insert into keys values (?1, ?2)"
## bulk_insert(sql)
Marlite answered 11/2, 2010 at 17:56 Comment(0)
K
1

Hey hey - I just discovered that RSQLite, which is what I'm using in this case, does indeed have bound-variable support:

http://cran.r-project.org/web/packages/RSQLite/NEWS

See the entry about dbSendPreparedQuery() and dbGetPreparedQuery().

So in theory, that turns this nastiness:

df <- data.frame()
for (x in data$guid) {
  query <- paste("SELECT uuid, cites, score FROM mytab WHERE uuid='",
                 x, "'", sep="")
  df <- rbind(df, dbGetQuery(con, query))
}

into this:

df <- dbGetPreparedQuery(
     con, "SELECT uuid, cites, score FROM mytab WHERE uuid=:guid", data)

Unfortunately, when I actually try it, it seems that it's only for INSERT statements and the like, not for SELECT statements, because I get an error: RS-DBI driver: (cannot have bound parameters on a SELECT statement).

Providing that capability would be fantastic.

The next step would be to hoist this up into DBI itself so that all DBs can take advantage of it, and provide a default implementation that just pastes it into the string like we're all doing ourselves now.

Kp answered 2/2, 2010 at 23:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.