Edit
In hindsight, using a Parallel.ForEach
to parallelize DB insertions is slightly wasteful, as it will also consume a thread for each Connection. Arguably, an even better parallel solution would be to use the asynchronous versions of the System.Data
Db Operations, such as ExecuteNonQueryAsync , start the executions (concurrently), and then use await Task.WhenAll()
to wait upon completion - this will avoid the Thread overhead to the caller, although the overall Db performance won't likely be any quicker. More here
Original Answer, multiple Parallel Inserts into Database
You can do this in parallel using TPL, e.g. specifically with the localInit
overload of Parallel.ForEach. You will almost certainly want to look at throttling the amount of parallelism by tweaking MaxDegreeOfParalelism so that you don't inundate your database:
Parallel.ForEach(dt.Rows,
// Adjust this for optimum throughput vs minimal impact to your other DB users
new ParallelOptions { MaxDegreeOfParallelism = 4 },
() =>
{
var con = new SqlConnection();
var cmd = con.CreateCommand();
cmd.CommandText = sql_proc;
cmd.CommandType = CommandType.StoredProcedure;
con.Open();
cmd.Parameters.Add(new SqlParameter("@a", SqlDbType.Int));
// NB : Size sensitive parameters must have size
cmd.Parameters.Add(new SqlParameter("@b", SqlDbType.VarChar, 100));
cmd.Parameters.Add(new SqlParameter("@c", SqlDbType.Bit));
// Prepare won't help with SPROCs but can improve plan caching for adhoc sql
// cmd.Prepare();
return new {Conn = con, Cmd = cmd};
},
(dr, pls, localInit) =>
{
localInit.Cmd.Parameters["@a"] = dr["a"];
localInit.Cmd.Parameters["@b"] = dr["b"];
localInit.Cmd.Parameters["@c"] = dr["c"];
localInit.Cmd.ExecuteNonQuery();
return localInit;
},
(localInit) =>
{
localInit.Cmd.Dispose();
localInit.Conn.Dispose();
});
Notes:
- Unless you really know what you are doing, in general we should leave TPL to decide on the degree of parallelism. However, depending on how much contention (read: locks for database work) for resources, restricting the upper limit of concurrent tasks may be required (trial and error may be useful, e.g. try with concurrencies of 4, 8, 16 concurrent tasks etc to see which gives most throughput, and monitor the locking and CPU load on your Sql Server.
- Similarly, leaving TPL's default partitioner is usually good enough to partition the DataRows across the tasks.
- Each Task will need its own separate Sql Connection.
- Rather than creating and disposing the command on each call, create it once per task and then keep reusing the same Command, just updating parameters each time.
- Use the LocalInit / Local Finally lambdas to do per task set up and cleanup, like Disposing commands and connections.
- You could also consider using
.Prepare()
if you are using AdHoc Sql or Sql versions prior to 2005
- I'm assuming enumerating a
DataTable's
rows is thread safe. You'll want to double check this of course.
Side Note:
10 minutes for 3000 rows is excessive even with a wide table and a single thread. What does your proc do? I've assumed the processing isn't trivial, hence the need for the SPROC, but if you are just doing simple inserts, as per @3dd's comment, SqlBulkCopy will yield inserts of ~ 1M rows per minute on a reasonably narrow table.