Failed to connect the database when using sqldf in r
Asked Answered
M

2

10

I loaded a csv file to my R, and when I Tried to use sqldf to select some column, it always went to

Error in .local(drv, ...) : 
  Failed to connect to database: Error: Access denied for user 
  'User'@'localhost' (using password: NO)
Error in !dbPreExists : invalid argument type

I don't know how to fix it.

Here is my script:

library("RMySQL")
library(sqldf)
acs<-read.csv("getdata_data_ss06pid.csv",head = T)
sqldf("select pwgtp1 from acs where AGEP < 50")
Misogyny answered 17/7, 2016 at 0:1 Comment(2)
can you add output of head(acs) to your question. Why do you need "RMySQL" ?Otherwise
See ?sqldf and read the description of the dbname argument, the drv argument and also the part two paragraphs above the References section that begins with "On MySQL..."Subkingdom
F
17

It doesn't seem like you need to load the RMySQL library when using sqldf, since you have already read the data into memory, which seems to be the problem here:

library(RMySQL)
library(sqldf)
sqldf("select * from df limit 6")

Error in .local(drv, ...) : Failed to connect to database: Error: Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2) Error in !dbPreExists : invalid argument type

However if RMySQL library is detached, sqldf works:

detach("package:RMySQL", unload=TRUE)
sqldf("select * from df limit 6")
#   time  type
# 1    1 type1
# 2    2 type1
# 3    3 type2
# 4    4 type1
# 5    5 type2
# 6    6 type1
Fatally answered 17/7, 2016 at 0:55 Comment(0)
C
15

TL;DR

Instead of unlaoding the RMySql package, explicitly set the sqldf default driver option to SQLite before calling the sqldf function:

options(sqldf.driver = "SQLite")
sqldf("select * from df limit 6")

Explanation

If not explicitly defined, the sqldf package decides which DB driver to use as follows:

If not specified then the "dbDriver" option is checked and if that is not set then sqldf checks whether RPostgreSQL, RMySQL or RH2 is loaded in that order and the driver corresponding to the first one found is used. If none are loaded then "SQLite" is used. dbname=NULL causes the default to be used.

In your case, RMySql has already been loaded and sqldf will try to use the MySQL DB and write into a schema called test. Detaching and unloading the RMySQL package is one option, but not necessary. As mentioned by @GaborGrothendieck in his comment, the easiest fix is to simply tell sqldf which DB driver to use explicitly, i.e.

sqldf("select * from df limit 6", drv="SQLite")

To not always having to add drv="SQLite", you can permanently set the default driver for the session to SQLite:

options(sqldf.driver = "SQLite")
Coax answered 22/3, 2018 at 14:35 Comment(1)
This seems to be the best option, especially for larger or more complex scripts, where you also might need RMySQL or other DB drivers.Ragen

© 2022 - 2024 — McMap. All rights reserved.