How should I insert multiple records?

I have a class named Entry declared like this:

class Entry{
    string Id {get;set;}
    string Name {get;set;}
}  

and then a method that will accept multiple such Entry objects for insertion into the database using ADO.NET:

static void InsertEntries(IEnumerable<Entry> entries){
    //build a SqlCommand object
    using(SqlCommand cmd = new SqlCommand()){
        ...
        const string refcmdText = "INSERT INTO Entries (id, name) VALUES (@id{0},@name{0});";
        int count = 0;
        string query = string.Empty;
        //build a large query
        foreach(var entry in entries){
            query += string.Format(refcmdText, count);
            cmd.Parameters.AddWithValue(string.Format("@id{0}",count), entry.Id);
            cmd.Parameters.AddWithValue(string.Format("@name{0}",count), entry.Name);
            count++;
        }
        cmd.CommandText=query;
        //and then execute the command
        ...
    }
}  

And my question is this: should I keep using the above way of sending multiple insert statements (build a giant string of insert statements and their parameters and send it over the network), or should I keep an open connection and send a single insert statement for each Entry like this:

using(SqlCommand cmd = new SqlCommand()){
    using(SqlConnection conn = new SqlConnection()){
        //assign connection string and open connection
        ...
        cmd.Connection = conn;
        foreach(var entry in entries){
            cmd.CommandText= "INSERT INTO Entries (id, name) VALUES (@id,@name);";
            cmd.Parameters.AddWithValue("@id", entry.Id);
            cmd.Parameters.AddWithValue("@name", entry.Name);
            cmd.ExecuteNonQuery();
        }
    }
 }  

What do you think? Will there be a performance difference in SQL Server between the two? Are there any other consequences I should be aware of?

Infrequency answered 4/6, 2010 at 9:45 Comment(2)
Thanks for all your suggestions! I'll take @Giorgi's answer because it more or less answers the original question. - Infrequency
You can use a user-defined table type in SQL Server to pass a DataTable to the server: fourthbottle.com/2014/09/… - Chrysoprase
E
29

If I were you I would not use either of them.

The disadvantage of the first one is that the parameter names might collide if the list contains duplicate values.

The disadvantage of the second one is that you are creating the command and parameters for each entity.

The best way is to construct the command text and parameters once (use Parameters.Add to add the parameters), then change their values in the loop and execute the command. That way the statement will be prepared only once. You should also open the connection before you start the loop and close it after the loop ends.

Ewe answered 4/6, 2010 at 10:2 Comment(5)
The SqlCommand is optimized for the process that Giorgi describes. The underlying connection will be maintained, as Tim points out. I would also use a transaction, as recommended by Tim. - Surgy
So cmd.ExecuteNonQuery() should also be inside the loop? - Mo
@Ewe What should be done when the number of parameters can exceed the limit of 2100? - Wellheeled
Who would do multiple inserts if SqlBulkCopy could be used? See the answer below. - Reagan
The downside of this approach is that you are executing the command each time through the loop. That means for n rows you are calling .ExecuteNonQuery n times. This is horrendous for performance. At the very least, parameterize 5 rows into one batch with the VALUES clause. Hell, even generating a batch with 5 inserts would be better than executing 5 separate INSERT statements. - Arytenoid
S
69
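// Note: this snippet assumes each item exposes Key and Value properties
// (the Entry class in the question has Id and Name instead).
static void InsertSettings(IEnumerable<Entry> settings) {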
    using (SqlConnection oConnection = new SqlConnection("Data Source=(local);Initial Catalog=Wip;Integrated Security=True")) {
        oConnection.Open();
        using (SqlTransaction oTransaction = oConnection.BeginTransaction()) {
            using (SqlCommand oCommand = oConnection.CreateCommand()) {
                oCommand.Transaction = oTransaction;
                oCommand.CommandType = CommandType.Text;
                oCommand.CommandText = "INSERT INTO [Setting] ([Key], [Value]) VALUES (@key, @value);";
                oCommand.Parameters.Add(new SqlParameter("@key", SqlDbType.NChar));
                oCommand.Parameters.Add(new SqlParameter("@value", SqlDbType.NChar));
                try {
                    foreach (var oSetting in settings) {
                        oCommand.Parameters[0].Value = oSetting.Key;
                        oCommand.Parameters[1].Value = oSetting.Value;
                        if (oCommand.ExecuteNonQuery() != 1) {
                            // handle as needed;
                            // this snippet throws an exception to force a rollback
                            throw new InvalidProgramException();
                        }
                    }
                    oTransaction.Commit();
                } catch (Exception) {
                    oTransaction.Rollback();
                    throw;
                }
            }
        }
    }
}
Surgy answered 5/6, 2010 at 7:53 Comment(2)
I'd rather reference parameters by their name than by int index. Still, +1. Old answer, still applicable. - Ekaterina
True, but in this case, the statements adding parameters and then indexing them are close together in the same execution flow, so there is no confusion. Also, using the index is the fastest way to access the parameter. I wrote this with performance in mind and based it on some existing Microsoft library source code. - Surgy
A
28

The truly terrible way to do it is to execute each INSERT statement as its own batch:

Batch 1:

INSERT INTO Entries (id, name) VALUES (1, 'Ian Boyd');

Batch 2:

INSERT INTO Entries (id, name) VALUES (2, 'Bottlenecked');

Batch 3:

INSERT INTO Entries (id, name) VALUES (3, 'Marek Grzenkowicz');

Batch 4:

INSERT INTO Entries (id, name) VALUES (4, 'Giorgi');

Batch 5:

INSERT INTO Entries (id, name) VALUES (5, 'AMissico');

Note: Parameterization, error checking, and any other nit-picks elided for expository purposes.

This is a truly horrible, terrible way to do things. It gives truly awful performance, because you suffer the network round-trip time for every statement.

A much better solution is to batch all the INSERT statements into one batch:

Batch 1:

INSERT INTO Entries (id, name) VALUES (1, 'Ian Boyd');
INSERT INTO Entries (id, name) VALUES (2, 'Bottlenecked');
INSERT INTO Entries (id, name) VALUES (3, 'Marek Grzenkowicz');
INSERT INTO Entries (id, name) VALUES (4, 'Giorgi');
INSERT INTO Entries (id, name) VALUES (5, 'AMissico');

This way you only suffer one round trip. This version has huge performance wins; on the order of 5x faster.

Even better is to use the VALUES clause:

INSERT INTO Entries (id, name)
VALUES 
(1, 'Ian Boyd'),
(2, 'Bottlenecked'),
(3, 'Marek Grzenkowicz'),
(4, 'Giorgi'),
(5, 'AMissico');

This gives you some performance improvements over the 5 separate INSERTs version; it lets the server do what it's good at: operating on sets:

  • each trigger only has to operate once
  • foreign keys are only checked once
  • unique constraints are only checked once

SQL Server loves to operate on sets of data; it's where it's a viking!

Parameter limit

The above T-SQL examples have all the parameterization stuff removed for clarity. But in reality you want to parameterize queries:

  • Not because you want to avoid SQL injection; because you're already a good developer who's using QuotedString(firstName)
  • Not because you want the performance bonus of saving the server from having to compile each T-SQL batch (Although, during a high-speed bulk-import, saving the parsing time really adds up)
  • but because you want to avoid flooding the server's query plan cache with gibibytes upon gibibytes of ad-hoc query plans. (I've seen SQL Server's working set, i.e. RAM usage, not memory usage, be 2 GB of just unparameterized SQL query plans)

But Bruno has an important point: SQL Server's driver only lets you include 2,100 parameters in a batch. The above query has two parameters per row:

@id, @name

If you import 1,051 rows in a single batch, that's 2,102 parameters - you'll get the error:

Too many parameters were provided in this RPC request

That is why I generally insert 5 or 10 rows at a time. Adding more rows per batch doesn't improve performance; there are diminishing returns.

It keeps the number of parameters low, so it doesn't get anywhere near the T-SQL batch size limit. There's also the fact that a VALUES clause is limited to 1000 tuples anyway.
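
The two limits above can be combined into a rough upper bound on rows per batch. This is only a sketch: BatchSizing, MaxRowsPerBatch and safetyMargin are hypothetical names, and the constants are simply the 2,100-parameter and 1,000-tuple figures quoted in this answer. In practice a much smaller batch (5 or 10 rows) is usually enough.

```csharp
using System;

static class BatchSizing
{
    // Figures quoted in the answer above, not values read from the server.
    const int MaxParametersPerCommand = 2100;
    const int MaxRowsPerValuesClause = 1000;

    // Returns the largest row count that stays within both limits.
    // safetyMargin (hypothetical) reserves some parameters as headroom.
    public static int MaxRowsPerBatch(int parametersPerRow, int safetyMargin = 0)
    {
        if (parametersPerRow <= 0)
            throw new ArgumentOutOfRangeException(nameof(parametersPerRow));

        int byParameters = (MaxParametersPerCommand - safetyMargin) / parametersPerRow;
        return Math.Max(1, Math.Min(byParameters, MaxRowsPerValuesClause));
    }
}
```

For the two-column Entries table, the parameter limit alone would allow 1,050 rows, so the 1,000-tuple limit is the one that applies.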

Implementing it

Your first approach is good, but you do have the issues of:

  • parameter name collisions
  • unbounded number of rows (possibly hitting the 2100 parameter limit)

So the goal is to generate a string such as:

INSERT INTO Entries (id, name) VALUES
(@p1, @p2),
(@p3, @p4),
(@p5, @p6),
(@p7, @p8),
(@p9, @p10)

I'll change your code by the seat of my pants

IEnumerable<Entry> entries = GetStuffToInsert();

//cmd.Connection must be set to an open SqlConnection before executing
SqlCommand cmd = new SqlCommand();
StringBuilder sb = new StringBuilder();
Int32 batchSize = 0; //how many rows we have built up so far
Int32 p = 1; //the number of the next parameter name (i.e. "@p1") we're going to use

foreach(var entry in entries)
{
   //Build the names of the parameters
   String pId =   String.Format("@p{0}", p);   //the "Id" parameter name (i.e. "@p1")
   String pName = String.Format("@p{0}", p+1); //the "Name" parameter name (i.e. "@p2")
   p += 2;

   //Build a single "(@p1, @p2)" row
   String row = String.Format("({0}, {1})", pId, pName); //a single values tuple

   //Add the row to our running SQL batch
   if (batchSize > 0)
      sb.AppendLine(",");
   sb.Append(row);
   batchSize += 1;

   //Add the parameter values for this row (SqlDbType.Int assumes a numeric id)
   cmd.Parameters.Add(pId,   System.Data.SqlDbType.Int     ).Value = entry.Id;
   cmd.Parameters.Add(pName, System.Data.SqlDbType.NVarChar).Value = entry.Name;

   if (batchSize >= 5)
   {
       String sql = "INSERT INTO Entries (id, name) VALUES"+"\r\n"+
                    sb.ToString();
       cmd.CommandText = sql;
       cmd.ExecuteNonQuery();

       //reset everything for the next batch
       cmd.Parameters.Clear();
       sb.Clear();
       batchSize = 0;
       p = 1;
   }
}

//handle the last few stragglers
if (batchSize > 0)
{
    String sql = "INSERT INTO Entries (id, name) VALUES"+"\r\n"+
                 sb.ToString();
    cmd.CommandText = sql;
    cmd.ExecuteNonQuery();
}
Arytenoid answered 5/5, 2020 at 23:26 Comment(5)
Hey, thank you very much for going to the trouble of answering such an old question! I've since become experienced and batching is my first instinct as well. Plus your suggestion to limit items in a batch will also play well with the push-back I get from our DBAs: they insist on sending as few statements as possible to avoid longer transaction times and give a fairer share of time to all clients reading/writing on the same tables and rows. I don't really know how much water that holds (I'm not intimately familiar with SQL internals), but it sounds like good advice, so I follow it :) - Infrequency
@Infrequency If you're inserting a few thousand rows at a time, it's not going to be a very long transaction. But it would be bad if you i) started a transaction, ii) inserted or updated a few million rows, taking 6 or 7 hours to do it. Not only would you (potentially) lock out other users (due to blocking), but you'd have a transaction log that had to grow-and-grow to accommodate everything. But generally the way you should write such things is do the import into a staging table, do all the data massaging there, and once everything's ready: do the MERGE into the read table. - Arytenoid
This feels like the correct solution to me. Thank you for writing it up! Though once I start writing ADO.NET to this level of complexity, my eye starts wandering over to Dapper. - Cantata
Great explanation! It worked as intended. - Rockrose
Works for me, though I have found that the 2100 parameter limit sometimes fires prematurely. - Clemens
H
13

Following up on @Tim Mahy: there are two possible ways to feed SqlBulkCopy, via a DataReader or via a DataTable. Here is the code for the DataTable:

DataTable dt = new DataTable();
dt.Columns.Add(new DataColumn("Id", typeof(string)));
dt.Columns.Add(new DataColumn("Name", typeof(string)));
foreach (Entry entry in entries)
    dt.Rows.Add(new string[] { entry.Id, entry.Name });

using (SqlBulkCopy bc = new SqlBulkCopy(connection))
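{   // the destination table name must be set before calling WriteToServer
    bc.DestinationTableName = "Entries";
    // the two column mappings might not be necessary if the columns line up by position
    bc.ColumnMappings.Add("Id", "Id");
    bc.ColumnMappings.Add("Name", "Name");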

    bc.WriteToServer(dt);
}
Hyperesthesia answered 22/2, 2018 at 14:11 Comment(2)
Any idea how to do this with an OleDb connection? - Twinflower
As far as I know (and I've spent some time researching), there's no bulk copy functionality for OleDb. So open your connection once, do your inserts one at a time, and close the connection afterwards. - Hyperesthesia
F
10

You should execute the command on every iteration of the loop instead of building a huge command text (by the way, StringBuilder is made for this). The underlying connection will not close and re-open for each iteration; let the connection pool manager handle this. Have a look at this link for more information: Tuning Up ADO.NET Connection Pooling in ASP.NET Applications

If you want to ensure that every command is executed successfully, you can use a transaction and roll back if needed.

Fatwitted answered 4/6, 2010 at 9:59 Comment(2)
Or better, use Linq2Sql and let Linq2Sql handle this. - Feller
The SqlCommand is optimized for the process that Giorgi describes. The underlying connection will be maintained, as Tim points out. I would also use a transaction, as recommended by Tim. - Surgy
P
3

When there are a lot of entries, consider using SqlBulkCopy. It is much faster than a series of single inserts.

Pandowdy answered 5/6, 2010 at 8:7 Comment(1)
What is "a lot"? - Trike
D
1

You can directly insert a DataTable if it is created correctly.

First make sure that the Access table columns have the same names and similar types as the DataTable columns. Then you can use this function, which I believe is very fast and elegant.

public void AccessBulkCopy(DataTable table)
{
    foreach (DataRow r in table.Rows)
        r.SetAdded();

    var myAdapter = new OleDbDataAdapter("SELECT * FROM " + table.TableName, _myAccessConn);

    var cbr = new OleDbCommandBuilder(myAdapter);
    cbr.QuotePrefix = "[";
    cbr.QuoteSuffix = "]";
    cbr.GetInsertCommand(true);

    myAdapter.Update(table);
}
Daugavpils answered 6/9, 2016 at 21:42 Comment(0)
R
0

Just format the query string to include all the sets of values to be inserted.

Something like this -

for (int i = 0; i < nimbusUserIds.Count; i++)
{
    parameterValues[i] =
        $"(0, 0, SYSDATETIME(),0, SYSDATETIME(), SYSDATETIME(),SYSDATETIME(), '{nimbusUserIds[i]}')";
}

string query =
    string.Format(@"INSERT INTO [dbo].[NimbusUserEmailInviteStatus] 
([AdRegistrationStatus],
[EmailSentStatus],
[EmailSentDateTime],
[InvitationStatus],
[InvitationAcceptedDateTime],
[CreatedDateTime],
[UpdatedDateTime],
[NimbusUserId]) VALUES {0}", string.Join(", ", parameterValues));
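
Because the snippet above splices the user ids straight into the SQL text, a value containing a quote would break the statement (or worse). A hedged variant, not the answerer's code, is to generate placeholders and carry the values separately. ValuesClauseSketch and Build are hypothetical names, and the caller is assumed to add each returned pair to the command's Parameters collection and to respect the parameter limit discussed in the other answers.

```csharp
using System;
using System.Collections.Generic;

static class ValuesClauseSketch
{
    // Builds "(..., @p0), (..., @p1), ..." plus the matching name/value pairs.
    // The fixed column values mirror the snippet above.
    public static (string ValuesSql, List<KeyValuePair<string, object>> Parameters)
        Build(IReadOnlyList<string> userIds)
    {
        var rows = new List<string>();
        var parameters = new List<KeyValuePair<string, object>>();

        for (int i = 0; i < userIds.Count; i++)
        {
            string name = "@p" + i; // one placeholder per row
            rows.Add("(0, 0, SYSDATETIME(), 0, SYSDATETIME(), SYSDATETIME(), SYSDATETIME(), " + name + ")");
            parameters.Add(new KeyValuePair<string, object>(name, userIds[i]));
        }

        return (string.Join(", ", rows), parameters);
    }
}
```

The returned ValuesSql would go where {0} is in the format string above, and the values never appear in the command text.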
Restore answered 17/4, 2021 at 11:11 Comment(0)
E
0

Consider using TransactionScope. Any connection and query execution inside the scope will automatically be wrapped into the transaction and all you need to do is call scope.Complete() at the end. If something goes wrong all the executions inside will be rolled back. Much simpler and nicer code. Read the MS doc and see the illustrated example.

https://learn.microsoft.com/en-us/dotnet/api/system.transactions.transactionscope?view=net-8.0
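
A minimal sketch of the pattern, assuming the inserts are passed in as a delegate. ScopeSketch and RunInTransaction are hypothetical names; the database work itself is left to the caller, and a SqlConnection would need to be opened inside the scope to enlist in the ambient transaction.

```csharp
using System;
using System.Transactions;

static class ScopeSketch
{
    // Runs the given work inside an ambient transaction.
    // The transaction commits only if work returns normally.
    public static void RunInTransaction(Action work)
    {
        using (var scope = new TransactionScope())
        {
            work();           // e.g. open the connection here and execute the INSERTs
            scope.Complete(); // skipped if work throws, so disposing the scope rolls back
        }
    }
}
```

If the delegate throws, the exception propagates to the caller after the scope has been disposed without Complete().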

Eras answered 14/4, 2023 at 14:39 Comment(0)
R
-3

Stored procedure to insert multiple records using single insertion:

ALTER PROCEDURE [dbo].[Ins]
@i varchar(50),
@n varchar(50),
@a varchar(50),
@i1 varchar(50),
@n1 varchar(50),
@a1 varchar(50),
@i2 varchar(50),
@n2 varchar(50),
@a2 varchar(50) 
AS
INSERT INTO t1
SELECT     @i AS Expr1, @i1 AS Expr2, @i2 AS Expr3
UNION ALL
SELECT     @n AS Expr1, @n1 AS Expr2, @n2 AS Expr3
UNION ALL
SELECT     @a AS Expr1, @a1 AS Expr2, @a2 AS Expr3
RETURN

Code behind:

protected void Button1_Click(object sender, EventArgs e)
{
    cn.Open();
    SqlCommand cmd = new SqlCommand("Ins",cn);
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.AddWithValue("@i",TextBox1.Text);
    cmd.Parameters.AddWithValue("@n",TextBox2.Text);
    cmd.Parameters.AddWithValue("@a",TextBox3.Text);
    cmd.Parameters.AddWithValue("@i1",TextBox4.Text);
    cmd.Parameters.AddWithValue("@n1",TextBox5.Text);
    cmd.Parameters.AddWithValue("@a1",TextBox6.Text);
    cmd.Parameters.AddWithValue("@i2",TextBox7.Text);
    cmd.Parameters.AddWithValue("@n2",TextBox8.Text);
    cmd.Parameters.AddWithValue("@a2",TextBox9.Text);
    cmd.ExecuteNonQuery();
    cn.Close();
    Response.Write("inserted");
    clear();
}
Rothberg answered 2/12, 2011 at 5:39 Comment(1)
What if I have 2 million entries? - Pia
M
-5
ClsConectaBanco bd = new ClsConectaBanco();

StringBuilder sb = new StringBuilder();
sb.Append("  INSERT INTO FAT_BALANCETE ");
sb.Append(" ([DT_LANCAMENTO]           ");
sb.Append(" ,[ID_LANCAMENTO_CONTABIL]  ");
sb.Append(" ,[NR_DOC_CONTABIL]         ");
sb.Append(" ,[TP_LANCAMENTO_GERADO]    ");
sb.Append(" ,[VL_LANCAMENTO]           ");
sb.Append(" ,[TP_NATUREZA]             ");
sb.Append(" ,[CD_EMPRESA]              ");
sb.Append(" ,[CD_FILIAL]               ");
sb.Append(" ,[CD_CONTA_CONTABIL]       ");
sb.Append(" ,[DS_CONTA_CONTABIL]       ");
sb.Append(" ,[ID_CONTA_CONTABIL]       ");
sb.Append(" ,[DS_TRIMESTRE]            ");
sb.Append(" ,[DS_SEMESTRE]             ");
sb.Append(" ,[NR_TRIMESTRE]            ");
sb.Append(" ,[NR_SEMESTRE]             ");
sb.Append(" ,[NR_ANO]                  ");
sb.Append(" ,[NR_MES]                  ");
sb.Append(" ,[NM_FILIAL])              ");
sb.Append(" VALUES                     ");
sb.Append(" (@DT_LANCAMENTO            ");
sb.Append(" ,@ID_LANCAMENTO_CONTABIL   ");
sb.Append(" ,@NR_DOC_CONTABIL          ");
sb.Append(" ,@TP_LANCAMENTO_GERADO     ");
sb.Append(" ,@VL_LANCAMENTO            ");
sb.Append(" ,@TP_NATUREZA              ");
sb.Append(" ,@CD_EMPRESA               ");
sb.Append(" ,@CD_FILIAL                ");
sb.Append(" ,@CD_CONTA_CONTABIL        ");
sb.Append(" ,@DS_CONTA_CONTABIL        ");
sb.Append(" ,@ID_CONTA_CONTABIL        ");
sb.Append(" ,@DS_TRIMESTRE             ");
sb.Append(" ,@DS_SEMESTRE              ");
sb.Append(" ,@NR_TRIMESTRE             ");
sb.Append(" ,@NR_SEMESTRE              ");
sb.Append(" ,@NR_ANO                   ");
sb.Append(" ,@NR_MES                   ");
sb.Append(" ,@NM_FILIAL)               ");

SqlCommand cmd = new SqlCommand(sb.ToString(), bd.CriaConexaoSQL());
bd.AbrirConexao();

cmd.Parameters.Add("@DT_LANCAMENTO", SqlDbType.Date);
cmd.Parameters.Add("@ID_LANCAMENTO_CONTABIL", SqlDbType.Int);
cmd.Parameters.Add("@NR_DOC_CONTABIL", SqlDbType.VarChar,255);
cmd.Parameters.Add("@TP_LANCAMENTO_GERADO", SqlDbType.VarChar,255);
cmd.Parameters.Add("@VL_LANCAMENTO", SqlDbType.Decimal);
cmd.Parameters["@VL_LANCAMENTO"].Precision = 15;
cmd.Parameters["@VL_LANCAMENTO"].Scale = 2;
cmd.Parameters.Add("@TP_NATUREZA", SqlDbType.VarChar, 1);
cmd.Parameters.Add("@CD_EMPRESA",SqlDbType.Int);
cmd.Parameters.Add("@CD_FILIAL", SqlDbType.Int);
cmd.Parameters.Add("@CD_CONTA_CONTABIL", SqlDbType.VarChar, 255);
cmd.Parameters.Add("@DS_CONTA_CONTABIL", SqlDbType.VarChar, 255);
cmd.Parameters.Add("@ID_CONTA_CONTABIL", SqlDbType.VarChar,50);
cmd.Parameters.Add("@DS_TRIMESTRE", SqlDbType.VarChar, 4);
cmd.Parameters.Add("@DS_SEMESTRE", SqlDbType.VarChar, 4);
cmd.Parameters.Add("@NR_TRIMESTRE", SqlDbType.Int);
cmd.Parameters.Add("@NR_SEMESTRE", SqlDbType.Int);
cmd.Parameters.Add("@NR_ANO", SqlDbType.Int);
cmd.Parameters.Add("@NR_MES", SqlDbType.Int);
cmd.Parameters.Add("@NM_FILIAL", SqlDbType.VarChar, 255);
cmd.Prepare();

 foreach (dtoVisaoBenner obj in lista)
 {
     cmd.Parameters["@DT_LANCAMENTO"].Value = obj.CTLDATA;
     cmd.Parameters["@ID_LANCAMENTO_CONTABIL"].Value = obj.CTLHANDLE.ToString();
     cmd.Parameters["@NR_DOC_CONTABIL"].Value = obj.CTLDOCTO.ToString();
     cmd.Parameters["@TP_LANCAMENTO_GERADO"].Value = obj.LANCAMENTOGERADO;
     cmd.Parameters["@VL_LANCAMENTO"].Value = obj.CTLANVALORF;
     cmd.Parameters["@TP_NATUREZA"].Value = obj.NATUREZA;
     cmd.Parameters["@CD_EMPRESA"].Value = obj.EMPRESA;
     cmd.Parameters["@CD_FILIAL"].Value = obj.FILIAL;
     cmd.Parameters["@CD_CONTA_CONTABIL"].Value = obj.CONTAHANDLE.ToString();
     cmd.Parameters["@DS_CONTA_CONTABIL"].Value = obj.CONTANOME.ToString();
     cmd.Parameters["@ID_CONTA_CONTABIL"].Value = obj.CONTA;
     cmd.Parameters["@DS_TRIMESTRE"].Value = obj.TRIMESTRE;
     cmd.Parameters["@DS_SEMESTRE"].Value = obj.SEMESTRE;
     cmd.Parameters["@NR_TRIMESTRE"].Value = obj.NRTRIMESTRE;
     cmd.Parameters["@NR_SEMESTRE"].Value = obj.NRSEMESTRE;
     cmd.Parameters["@NR_ANO"].Value = obj.NRANO;
     cmd.Parameters["@NR_MES"].Value = obj.NRMES;
     cmd.Parameters["@NM_FILIAL"].Value = obj.NOME;
     cmd.ExecuteNonQuery();
     rowAffected++;
 }
Monaxial answered 12/9, 2012 at 14:8 Comment(0)
