Amazon RDS with SQL Server won't allow bulk insert from a CSV source
Asked Answered
S

3

5

I've tried two methods and both fall flat...

BULK INSERT TEMPUSERIMPORT1357081926 
FROM 'C:\uploads\19E0E1.csv' 
WITH (FIELDTERMINATOR = ',',ROWTERMINATOR = '\n')

You do not have permission to use the bulk load statement.

That error means the login needs the bulkadmin server role, but you cannot grant that role on Amazon RDS.

So I tried OPENROWSET instead, but it requires Ad Hoc Distributed Queries to be enabled, which I don't have permission to do either!
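Something along these lines (the provider and connection string here are just illustrative):

SELECT *
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
    'Text;Database=C:\uploads\;HDR=YES;FMT=Delimited',
    'SELECT * FROM [19E0E1.csv]')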

Statant answered 1/1, 2013 at 23:23 Comment(0)
S
3

You can enable Ad Hoc Distributed Queries from the AWS Management Console: go to the RDS section, create a DB parameter group with ad hoc distributed queries set to 1, and then attach that parameter group to your DB instance.

Don't forget to reboot your DB once you have made these changes.
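Once the instance has rebooted, you can verify that the setting took effect with plain T-SQL (nothing RDS-specific here):

-- value_in_use should be 1 once the parameter group is applied
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'Ad Hoc Distributed Queries';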

Here is the source of my information:

http://blogs.lessthandot.com/index.php/datamgmt/dbadmin/turning-on-optimize-for-ad/

Hope this helps you.

Streamlined answered 3/2, 2014 at 9:18 Comment(4)
It's good information, I followed it. But OPENROWSET is still throwing this error: You do not have permission to use the bulk load statement. So I'm not sure this solution will work for anyone.Consecrate
This solution doesn't work. It's stated in AWS docs that this feature (BULK INSERT) is not supported: docs.aws.amazon.com/AmazonRDS/latest/UserGuide/…Expeditious
A reason to use Azure instead of RDS - what use is RDS in a production environment without bulk inserts?Serra
Not only OPENROWSET: linked servers also cannot reach anything outside the same VPC. See the comments on this page.Abscess
S
5

I know this question is really old, but it was the first result when I searched for bulk inserting into an AWS SQL Server RDS instance. Things have changed, and you can now do it after integrating the RDS instance with S3. (I answered this in more detail on another question.) The overall gist is that you set up the instance with the proper role, put your file on S3, and then copy the file over to the RDS instance with the following commands:

exec msdb.dbo.rds_download_from_s3
        @s3_arn_of_file='arn:aws:s3:::bucket_name/bulk_data.csv',
        @rds_file_path='D:\S3\seed_data\data.csv',
        @overwrite_file=1;
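
The copy runs as an asynchronous task and the procedure reports a task ID; if I remember the S3-integration helpers correctly you can poll that ID with the function below, but check the AWS doc referenced at the end for the exact signature:

-- 42 is a made-up task ID taken from the rds_download_from_s3 output
SELECT * FROM msdb.dbo.rds_fn_task_status(NULL, 42);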

Then BULK INSERT will work (the target table name below is a placeholder):

BULK INSERT MySeedTable  -- placeholder: your target table
FROM 'D:\S3\seed_data\data.csv'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)

AWS doc

Simmer answered 5/5, 2020 at 2:5 Comment(0)
K
0

2022

I'm adding this for anyone like me who wants to quickly insert data into RDS from C#.

While RDS allows CSV bulk uploads directly from S3, there are times when you just want to upload data straight from your program.

I've written a C# utility method that uses a StringBuilder to concatenate statements and issue 2,000 inserts per call, which is much faster than an ORM like Dapper doing one insert per call.

This method should handle date, int, double, and varchar fields, but I haven't had to use it for character escaping or anything like that.

// call as (the rows are anonymous objects whose property names match the table's columns)
FastInsert.Insert(MyDbConnection, new object[] { new { someField = "someValue" } }, "my_table");


using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;
using System.Text;
using Dapper; // Execute() on IDbConnection comes from Dapper; it just runs the raw SQL batch

class FastInsert
{
    static int rowSize = 2000;

    internal static void Insert(IDbConnection connection, object[] data, string targetTable)
    {
        // reflect over the first row to discover the column names
        var props = data[0].GetType().GetProperties();
        var names = props.Select(x => x.Name).ToList();

        foreach (var batch in data.Batch(rowSize))
        {
            var sb = new StringBuilder($"insert into {targetTable} ({string.Join(",", names)}) ");
            string lastLine = "";

            foreach (var row in batch)
            {
                sb.Append(lastLine);
                // CreateSQLString already quotes each value, so join with plain commas
                var values = props.Select(prop => CreateSQLString(row, prop));
                lastLine = $"select {string.Join(",", values)} union all ";
            }

            // drop the trailing " union all " after the final row
            // (Oracle would additionally need " from dual" appended here; SQL Server does not accept it)
            lastLine = lastLine.Substring(0, lastLine.Length - " union all ".Length);
            sb.Append(lastLine);
            var fullQuery = sb.ToString();
            connection.Execute(fullQuery);
        }
    }

    private static string CreateSQLString(object row, PropertyInfo prop)
    {
        var value = prop.GetValue(row);
        if (value == null) return "null";

        if (prop.PropertyType == typeof(DateTime))
        {
            return $"'{((DateTime)value).ToString("yyyy-MM-dd HH:mm:ss")}'";
        }

        // everything else (string, int, double) is quoted with embedded quotes escaped;
        // SQL Server converts numeric strings implicitly on insert
        return $"'{value.ToString().Replace("'", "''")}'";
    }
}

static class Extensions
{
    public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> source, int size) //split an IEnumerable into batches
    {
        T[] bucket = null;
        var count = 0;

        foreach (var item in source)
        {
            if (bucket == null)
                bucket = new T[size];

            bucket[count++] = item;

            if (count != size)
                continue;

            yield return bucket;

            bucket = null;
            count = 0;
        }

        // Return the last bucket with all remaining elements
        if (bucket != null && count > 0)
        {
            Array.Resize(ref bucket, count);
            yield return bucket;
        }
    }
}
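
For reference, each batch that FastInsert builds is one multi-row statement; its rough shape on SQL Server looks like this (table and column names are made up):

-- illustrative only: shape of one generated batch (up to 2000 rows per statement)
insert into my_table (someField, createdAt)
select 'someValue', '2022-01-19 20:50:00' union all
select 'otherValue', '2022-01-20 08:15:00'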
Kymry answered 19/1, 2022 at 20:50 Comment(0)
