MS-SQL Bulk Insert with RODBC
6

8

Is it possible to perform a bulk insert into an MS-SQL Server (2000, 2005, 2008) using the RODBC package?

I know that I can do this using freebcp, but I'm curious if the RODBC package implements this portion of the Microsoft SQL API and if not, how difficult it would be to implement it.

Czarra answered 9/9, 2009 at 20:48 Comment(0)
6

Check out the newer odbc and DBI packages. DBI::dbWriteTable() writes around 20,000 records per second, which is much, much faster than the row-by-row inserts from RODBC::sqlSave().
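A minimal sketch of this route, assuming a working ODBC driver; the driver name, server, database, and table are illustrative placeholders, not details from this answer:

```r
library(DBI)

# Assumed connection details; adjust driver/server/database to your setup.
con <- dbConnect(odbc::odbc(),
                 Driver   = "ODBC Driver 17 for SQL Server",
                 Server   = "myserver",
                 Database = "mydb",
                 Trusted_Connection = "Yes")

dat <- data.frame(id = 1:100000, x = rnorm(100000))

# One bulk, parameter-bound write instead of row-at-a-time INSERTs
dbWriteTable(con, "my_table", dat, overwrite = TRUE)

dbDisconnect(con)
```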

Subfamily answered 27/2, 2017 at 15:7 Comment(2)
I cannot install odbc; it fails with: ERROR: lazy loading failed for package 'blob'Ravid
That sounds like it requires a different questionSubfamily
2

You're probably looking for ?sqlSave, which uses a parametrized INSERT INTO query (taking place in one operation) when you set fast = TRUE.
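For reference, a minimal sketch of that call; the DSN and table name are placeholders:

```r
library(RODBC)

ch  <- odbcConnect("myDSN")          # placeholder DSN
dat <- data.frame(id = 1:1000, x = rnorm(1000))

# fast = TRUE (the default) sends a parametrized INSERT INTO that
# writes all the data in one operation; fast = FALSE writes row by row.
sqlSave(ch, dat, tablename = "my_table", rownames = FALSE, fast = TRUE)

odbcClose(ch)
```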

Rafi answered 30/9, 2009 at 17:32 Comment(5)
Nah, sqlSave will do multiple INSERTs. I want a BULK INSERT, which is a single transaction.Czarra
does fast=true not do it as a single transaction?Rafi
from the rodbc docs: "logical. If false, write data a row at a time. If true, use a parametrized INSERT INTO or UPDATE query to write all the data in one operation."..however it doesn't seem to make any difference (when writing to Netezza in my case)Fixate
I am afraid in my experience setting fast = T or fast = F doesn't change a thing. Records are inserted one by one no matter what. At least in my case with SQL Server 2012.Jura
For me, fast = TRUE, makes the sqlSave function ~ 3 times faster. Also on sql server 2012.Couscous
2

Now you can use dbBulkCopy from the new rsqlserver package:

A typical scenario:

  1. You create a matrix.
  2. You save it as a CSV file.
  3. You call dbBulkCopy to read the file and insert it, internally using the bcp tool of MS SQL Server.

This assumes that your table is already created in the database:

nr <- 10000; nc <- 5                              # example dimensions
dat <- matrix(round(rnorm(nr * nc), 2), nr, nc)
id.file <- "temp_file.csv"
write.csv(dat, file = id.file, row.names = FALSE)
dbBulkCopy(conn, "NEW_BP_TABLE", value = id.file) # conn: open rsqlserver connection
Leiva answered 10/11, 2013 at 23:38 Comment(2)
any reason why rsqlserver is not on cran?Thumbsdown
@MusX because it is under development(specially documentation and tests part) and it uses rClr package which is not on CRAN too. But you are encouraged to use it from GITHUB and will be happy with any feedback.Leiva
2

Using RODBC, the fastest insert we've been able to create (a 260-million-row insert) looks like the following (in R pseudo-code):

ourDataFrame <- sqlQuery(OurConnection, "SELECT myDataThing1, myDataThing2
                                         FROM myData")
ourDF <- doStuff(ourDataFrame)
write.csv(ourDF,ourFile)  
sqlQuery(OurConnection, "CREATE TABLE myTable ( la [La], laLa [LaLa]);
                         BULK INSERT myTable FROM 'ourFile' 
                              WITH YOURPARAMS=yourParams;")

If you're running this between servers, you need a network drive that the R server can write to (e.g. one server with permission to write to the DB uses Rscript to productionize the code) and that the SQL Server can read from.
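A filled-in version of the pseudo-code above, assuming myTable already exists, the share is a UNC path visible to both machines, and the file is comma-delimited; all names and WITH options are illustrative:

```r
library(RODBC)

ch <- odbcConnect("myDSN")                       # placeholder DSN
ourDF <- sqlQuery(ch, "SELECT myDataThing1, myDataThing2 FROM myData")

# Write where BOTH the R server and the SQL Server can see the file
write.csv(ourDF, "\\\\fileserver\\share\\ourFile.csv", row.names = FALSE)

# BULK INSERT executes on the SQL Server, so the path is resolved there
sqlQuery(ch, "
  BULK INSERT myTable
  FROM '\\\\fileserver\\share\\ourFile.csv'
  WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n', FIRSTROW = 2);")

odbcClose(ch)
```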

Subfamily answered 10/3, 2018 at 17:3 Comment(0)
1

From everything I can find, there is NO solution for bulk insert from R to MS SQL, and nothing that works with SSIS, which is why Microsoft included in-database analytics with SQL Server 2016 after buying Revolution Analytics.

I tried to comment on the previous answer but don't have the reputation to do it.

The rsqlserver package needs rClr to run, and neither of those packages is well-behaved, especially because rsqlserver's INSERT functions have poor data-type handling. So if you use it, you'll have no idea what you're looking at in the SQL table, as much of the information in your data.frame will have been transformed.

Considering the RODBC package has been around for 15 years, I'm pretty disappointed that no one has created a bulk insert function...

Subfamily answered 19/8, 2015 at 14:5 Comment(2)
important point on rsqlserver, but for many of us we don't need to 'look at' the data (from an R standpoint). If it's been modeled and shaped and processed in R we just need the result back in the database and it doesnt matter what R transforms the type to in the database (as long as they are reasonable and can be read by other systems)Fixate
@Joe I don't think I'd ever hire someone who says "we don't need to 'look at' the data".Subfamily
1

Our n2khelper package can use bcp (bulk copy) when it is available. When it is not, it falls back to multiple INSERT statements.

You can find the package at https://github.com/INBO-Natura2000/n2khelper

Install it with devtools::install_github("INBO-Natura2000/n2khelper") and look for the odbc_insert() function.

Quartis answered 21/4, 2016 at 10:28 Comment(0)
