Correct method of deleting over 2100 rows (by ID) with Dapper
Asked Answered
F

3

16

I am trying to use Dapper support my data access for my server app.

My server app has another application that drops records into my database at a rate of 400 per minute.

My app pulls them out in batches, processes them, and then deletes them from the database.

Since data continues to flow into the database while I am processing, I don't have a good way to say delete from myTable where allProcessed = true.

However, I do know the PK value of the rows to delete. So I want to do a delete from myTable where Id in @listToDelete

Problem is that if my server goes down for even 6 mintues, then I have over 2100 rows to delete.

Since Dapper takes my @listToDelete and turns each one into a parameter, my call to delete fails. (Causing my data purging to get even further behind.)

What is the best way to deal with this in Dapper?

NOTES: I have looked at Tabled Valued Parameters but from what I can see, they are not very performant. This piece of my architecture is the bottle neck of my system and I need to be very very fast.

Fishman answered 30/3, 2012 at 15:45 Comment(3)
@marc_s - I don't need to pass that many parameters... But I do need to delete that many rows by PK Id. However I do that is fine by me. I am telling dapper to delete every row in my @list. It is Dapper that is making parameters of every item in my list.Fishman
Hard to tell from this little information, but why can't you use teh selection criteria for batch as your criteria for a delete. Or have a processed flag in MyTable, set it in the "processing" and then use that. It's not brill, but it will be much quicker than deleting them one by one. It's isn't going to be shockingly bad even with 10,000 records.Reichenberg
My batch criteria is not very performant. So I would prefer not to use that. I could add a processed flag, but to do that I would need a way to call out the rows to add the processed flag to. If I can call them out, I may as well delete them. (I could add a flag to say, "batched" at select time. But I would rather not.)Fishman
F
26

One option is to create a temp table on the server and then use the bulk load facility to upload all the IDs into that table at once. Then use a join, EXISTS or IN clause to delete only the records that you uploaded into your temp table.

Bulk loads are a well-optimized path in SQL Server and it should be very fast.

For example:

  1. Execute the statement CREATE TABLE #RowsToDelete(ID INT PRIMARY KEY)
  2. Use a bulk load to insert keys into #RowsToDelete
  3. Execute DELETE FROM myTable where Id IN (SELECT ID FROM #RowsToDelete)
  4. Execute DROP TABLE #RowsToDelte (the table will also be automatically dropped if you close the session)

(Assuming Dapper) code example:

conn.Open();

var columnName = "ID";

conn.Execute(string.Format("CREATE TABLE #{0}s({0} INT PRIMARY KEY)", columnName));

using (var bulkCopy = new SqlBulkCopy(conn))
{
    bulkCopy.BatchSize = ids.Count;
    bulkCopy.DestinationTableName = string.Format("#{0}s", columnName);

    var table = new DataTable();                    
    table.Columns.Add(columnName, typeof (int));
    bulkCopy.ColumnMappings.Add(columnName, columnName);

    foreach (var id in ids)
    {
        table.Rows.Add(id);
    }

    bulkCopy.WriteToServer(table);
}

//or do other things with your table instead of deleting here
conn.Execute(string.Format(@"DELETE FROM myTable where Id IN 
                                   (SELECT {0} FROM #{0}s", columnName));

conn.Execute(string.Format("DROP TABLE #{0}s", columnName));
Flashy answered 30/3, 2012 at 16:53 Comment(1)
Added example code that actually uses SqlBulkCopy, could be adapted for wider tables by adding more columns to temp table and data table.Perpetua
F
7

To get this code working, I went dark side.

Since Dapper makes my list into parameters. And SQL Server can't handle a lot of parameters. (I have never needed even double digit parameters before). I had to go with Dynamic SQL.

So here was my solution:

string listOfIdsJoined = "("+String.Join(",", listOfIds.ToArray())+")";
connection.Execute("delete from myTable where Id in " + listOfIdsJoined);

Before everyone grabs the their torches and pitchforks, let me explain.

  • This code runs on a server whose only input is a data feed from a Mainframe system.
  • The list I am dynamically creating is a list of longs/bigints.
  • The longs/bigints are from an Identity column.

I know constructing dynamic SQL is bad juju, but in this case, I just can't see how it leads to a security risk.

Fishman answered 30/3, 2012 at 16:40 Comment(2)
@Pure.Krome presumably his listOfIds is of type List<int> therefore no need to sanitise the inputsAlcock
Probably ... but assumptions are the root of all evil. Secondly, it's the general suggestion => creating a sql script, on the fly and into a string ... could end very badly :)Nostradamus
A
1

Dapper request the List of object having parameter as a property so in above case a list of object having Id as property will work.

connection.Execute("delete from myTable where Id in (@Id)", listOfIds.AsEnumerable().Select(i=> new { Id = i }).ToList());

This will work.

Approval answered 1/3, 2016 at 6:2 Comment(2)
This will create separate query for every Id in the listOfIds.Serotonin
For Sql Server: There is a maximum of 2100 parameters that can be passed to SQL in the IN clause. If you are going to do it that way than it is best to chunk it out.Wiring

© 2022 - 2024 — McMap. All rights reserved.