C# multiple parallel inserts in database
I have a DataTable with around 3,000 rows. Each of those rows needs to be inserted into a database table. Currently, I am running a foreach loop as follows:

obj_AseCommand.CommandText = sql_proc;
obj_AseCommand.CommandType = CommandType.StoredProcedure;
obj_AseCommand.Connection = db_Conn;
obj_AseCommand.Connection.Open();

foreach (DataRow dr in dt.Rows)                
{
    obj_AseCommand.Parameters.AddWithValue("@a", dr["a"]);
    obj_AseCommand.Parameters.AddWithValue("@b", dr["b"]);
    obj_AseCommand.Parameters.AddWithValue("@c", dr["c"]);

    obj_AseCommand.ExecuteNonQuery();
    obj_AseCommand.Parameters.Clear();
}

obj_AseCommand.Connection.Close();

Can you please advise how I can execute the stored procedure in parallel, since the above approach takes about 10 minutes to insert 3,000 rows?

Biafra answered 5/1, 2015 at 6:10 Comment(4)
Just out of curiosity: why are you adding 3,000 rows to the database with a stored procedure? If this is from some input file, why not import it directly into the database with a management tool? – Implement
The DataTable is populated from another database, a master database. My goal is to fetch data from the master database and insert it into my database. The stored procedure is used to insert the data. – Biafra
Why not make use of a bulk insert, or a linked server, and have SQL fetch and insert the data? – Pneumothorax
Harsh, I think it is much better to do this with bulk insert / import / whatever your DBMS calls it. It will save you a lot of trouble and time. – Implement
Edit

In hindsight, using Parallel.ForEach to parallelize DB insertions is somewhat wasteful, as it also consumes a thread for each connection. Arguably, a better parallel solution is to use the asynchronous versions of the System.Data operations, such as ExecuteNonQueryAsync: start the executions concurrently, and then use await Task.WhenAll() to wait for completion. This avoids the thread overhead to the caller, although the overall DB performance likely won't be any quicker.
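As a minimal sketch of that async approach (assuming SQL Server's SqlClient rather than the Sybase AseCommand in the question; `connectionString`, `dt`, and `sql_proc` are placeholders, and the SemaphoreSlim stands in for MaxDegreeOfParallelism):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static async Task InsertRowsAsync(DataTable dt, string connectionString, string procName)
{
    // Throttle concurrency so 3,000 rows don't try to open 3,000 connections at once.
    using var throttle = new SemaphoreSlim(4); // tune as you would MaxDegreeOfParallelism

    var tasks = dt.Rows.Cast<DataRow>().Select(async dr =>
    {
        await throttle.WaitAsync();
        try
        {
            // Connection pooling makes per-row open/close cheap.
            using var con = new SqlConnection(connectionString);
            using var cmd = con.CreateCommand();
            cmd.CommandText = procName;
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@a", dr["a"]);
            cmd.Parameters.AddWithValue("@b", dr["b"]);
            cmd.Parameters.AddWithValue("@c", dr["c"]);

            await con.OpenAsync();
            await cmd.ExecuteNonQueryAsync();
        }
        finally
        {
            throttle.Release();
        }
    }).ToList(); // materialize so all tasks start before awaiting

    await Task.WhenAll(tasks);
}
```

No thread sits blocked on the database round trip; the semaphore, not a thread count, bounds how many commands are in flight.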

Original Answer, multiple Parallel Inserts into Database

You can do this in parallel using the TPL, specifically with the localInit overload of Parallel.ForEach. You will almost certainly want to throttle the degree of parallelism by tweaking MaxDegreeOfParallelism so that you don't inundate your database:

Parallel.ForEach(dt.Rows.Cast<DataRow>(), // DataRowCollection isn't IEnumerable<T>
    // Adjust this for optimum throughput vs minimal impact to your other DB users
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    () =>
    {
        var con = new SqlConnection();
        var cmd = con.CreateCommand();
        cmd.CommandText = sql_proc;
        cmd.CommandType = CommandType.StoredProcedure;
        con.Open();

        cmd.Parameters.Add(new SqlParameter("@a", SqlDbType.Int));
        // NB : Size sensitive parameters must have size
        cmd.Parameters.Add(new SqlParameter("@b", SqlDbType.VarChar, 100));
        cmd.Parameters.Add(new SqlParameter("@c", SqlDbType.Bit));
        // Prepare won't help with SPROCs but can improve plan caching for adhoc sql
        // cmd.Prepare();
        return new {Conn = con, Cmd = cmd};
    },
    (dr, pls, localInit) =>
    {
        localInit.Cmd.Parameters["@a"].Value = dr["a"];
        localInit.Cmd.Parameters["@b"].Value = dr["b"];
        localInit.Cmd.Parameters["@c"].Value = dr["c"];
        localInit.Cmd.ExecuteNonQuery();
        return localInit;
    },
    (localInit) =>
    {
        localInit.Cmd.Dispose();
        localInit.Conn.Dispose();
    });

Notes:

  • Unless you really know what you are doing, in general we should leave the TPL to decide on the degree of parallelism. However, depending on the contention for resources (read: locks, for database work), restricting the upper limit of concurrent tasks may be required. Trial and error can be useful here: try concurrencies of 4, 8, 16 tasks, etc. to see which gives the most throughput, and monitor the locking and CPU load on your SQL Server.
  • Similarly, leaving TPL's default partitioner is usually good enough to partition the DataRows across the tasks.
  • Each Task will need its own separate Sql Connection.
  • Rather than creating and disposing the command on each call, create it once per task and then keep reusing the same Command, just updating parameters each time.
  • Use the LocalInit / Local Finally lambdas to do per task set up and cleanup, like Disposing commands and connections.
  • You could also consider using .Prepare() if you are using ad hoc SQL or SQL Server versions prior to 2005.
  • I'm assuming enumerating a DataTable's rows is thread safe. You'll want to double check this of course.

Side Note:

10 minutes for 3000 rows is excessive even with a wide table and a single thread. What does your proc do? I've assumed the processing isn't trivial, hence the need for the SPROC, but if you are just doing simple inserts, as per @3dd's comment, SqlBulkCopy will yield inserts of ~ 1M rows per minute on a reasonably narrow table.

Hot answered 5/1, 2015 at 6:21 Comment(2)
Hi Stuart, thanks for the input. I tried it with AseBulkCopy, since we have a Sybase database: AseBulkCopy obj_AseBulkCopy = new AseBulkCopy(db_Conn); obj_AseBulkCopy.DestinationTableName = "db_table"; obj_AseBulkCopy.BatchSize = 1000; db_Conn.Open(); obj_AseBulkCopy.WriteToServer(dt); db_Conn.Close(); However, it still took the same time to execute. Am I missing something in this code? – Biafra
There is something very strange going on with your RDBMS if even a BulkCopy takes 10 minutes for 3,000 rows. I'm guessing you have triggers on the inserted table which are doing a lot of logic, or there is a lot of lock contention, or possibly you have a lot of constraints, rules, and indexes and the table is extremely wide. You'll need a DBA to have a serious look at the table; insertion technologies and lack of parallelism aren't the bottleneck here :( – Hot
It is better to pass the whole DataTable to the database as a table-valued parameter:

obj_AseCommand.CommandText = sql_proc;
obj_AseCommand.CommandType = CommandType.StoredProcedure;
obj_AseCommand.Connection = db_Conn;
obj_AseCommand.Connection.Open();
obj_AseCommand.Parameters.AddWithValue("@Parametername", DataTable);
obj_AseCommand.ExecuteNonQuery();

In the database you have to create a table type that exactly matches your DataTable:

CREATE TYPE EmpType AS TABLE 
(
    ID INT, Name VARCHAR(3000), Address VARCHAR(8000), Operation SMALLINT -- your columns
)

In the stored procedure you can do something like this:

CREATE PROCEDURE demo
    @Details EmpType READONLY -- it must be READONLY
AS
BEGIN
    INSERT INTO yourtable -- insert the data
    SELECT * FROM @Details
END
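On the client side, a hedged sketch of the call (this assumes SQL Server's SqlClient; the Sybase AseCommand shown above does not support SQL Server table-valued parameters, and `connectionString` and `dt` are placeholders). The key detail is that the DataTable parameter must be marked Structured and given the table type's name:

```csharp
using System.Data;
using System.Data.SqlClient;

static void InsertViaTvp(DataTable dt, string connectionString)
{
    using var con = new SqlConnection(connectionString);
    using var cmd = new SqlCommand("demo", con)
    {
        CommandType = CommandType.StoredProcedure
    };

    // Column order and types in dt must exactly match the EmpType table type.
    var p = cmd.Parameters.AddWithValue("@Details", dt);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.EmpType";

    con.Open();
    cmd.ExecuteNonQuery(); // one round trip for all rows
}
```

A single round trip replaces 3,000 individual executions, which is usually where the time goes.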
Seligman answered 5/1, 2015 at 6:22 Comment(1)
This is possible from SQL Server 2008 onwards. – Lightsome
You can use SqlBulkCopy. See the sample code below. The WriteToServer method writes the DataTable to the database, provided the column mappings match.

using (SqlBulkCopy bulkCopy = new SqlBulkCopy(ConSQL))
{
    if (ConSQL.State == ConnectionState.Closed)
    {
        ConSQL.Open();
    }

    bulkCopy.ColumnMappings.Add(0, 0);
    bulkCopy.ColumnMappings.Add(1, 1);
    bulkCopy.ColumnMappings.Add(2, 2);

    bulkCopy.DestinationTableName = "dbo.TableName";

    bulkCopy.WriteToServer(dataTable);

    bulkCopy.Close(); // redundant - the using block will dispose the object
}
Abbatial answered 5/1, 2015 at 6:32 Comment(0)
You can use SqlBulkCopy

guide is here

Bizarre answered 5/1, 2015 at 6:26 Comment(3)
This is more a comment than an answer. – Pneumothorax
Ok, sorry, I am a new user :) – Bizarre
It's a valid answer (although maybe not a good one); it could be improved by adding some explanation of why and how. – Creak
