.NET Entity Framework Insert vs Bulk Insert
When I use my xxxContext object and issue several Adds to a table, then SaveChanges(), how does Entity Framework resolve this to SQL? Will it just loop, doing an INSERT INTO xxx for each row, or if there are hundreds of rows, is it smart enough to issue a bulk insert command?

Bonus question: if it doesn't issue a bulk insert, is there a way to force it to, so my DB performance isn't killed by separate inserts? Or to bulk insert into a temp table and then merge into the original table, like an upsert?

Bobbobb answered 12/6, 2013 at 14:31 Comment(4)
Use the SqlBulkCopy APIPiane
possible duplicate of Fastest Way of Inserting in Entity FrameworkPhilina
Entity Framework is very slow for this. I found a way to do it much faster: you bulk insert into a temp table, then issue a MERGE from there to the main table. I've explained the technique on my blog here: jarloo.com/c-bulk-upsert-to-sql-server-tutorial It describes the technique and shows the code.Bobbobb
To Upsert it's much faster to do this: jarloo.com/c-bulk-upsert-to-sql-server-tutorialBobbobb

The downfall of any ORM tool is that it is "chatty". Most times this is good enough. Sometimes it is not.

The short answer is "no".

Which is why I still sometimes pick IDataReader over EF or NHibernate, etc. And for bulk insert operations, I send XML to the stored procedure, shred it there, and bulk insert/update or merge from that point.

So even when I use an ORM, I create a domain library that is not EF (or NHibernate) dependent, so I have a "safety valve" to bypass the ORM in certain situations.
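As an illustration of that kind of safety valve, the staging-table idea can be sketched roughly like this. Note this variant uses SqlBulkCopy instead of the XML-shredding described above, and the table and column names (dbo.PersonStaging, dbo.Person) are placeholders, not from the original:

```csharp
using System.Data;
using System.Data.SqlClient;

public static class BulkUpsertSketch
{
    public static void Run(DataTable rows, string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // 1. Stream all rows into the staging table in one bulk operation.
            using (var bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.PersonStaging";
                bulk.WriteToServer(rows);
            }

            // 2. One set-based MERGE moves staging rows into the real table,
            //    so indexes are maintained once rather than per row.
            using (var cmd = conn.CreateCommand())
            {
                cmd.CommandText = @"
MERGE dbo.Person AS target
USING dbo.PersonStaging AS source ON target.Id = source.Id
WHEN MATCHED THEN
    UPDATE SET target.FirstName = source.FirstName,
               target.LastName  = source.LastName
WHEN NOT MATCHED THEN
    INSERT (Id, FirstName, LastName)
    VALUES (source.Id, source.FirstName, source.LastName);
TRUNCATE TABLE dbo.PersonStaging;";
                cmd.ExecuteNonQuery();
            }
        }
    }
}
```

This is a sketch only; in practice you would also want the staging load and the MERGE inside a transaction so a failed merge doesn't leave orphaned staging rows.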

Kos answered 12/6, 2013 at 14:36 Comment(3)
Thanks, what I thought. Surprised there isn't any OpenSource solution out there that deals with this. My project writes thousands of records at once to the db and this happens every 15 seconds so I guess I'll stick to plain sql where I can control it better.Bobbobb
Here is the key thing: index maintenance is the slow-down. When you insert RBAR (row by agonizing row), the index has to be updated after every insert. So you do something set-based, like passing XML, shredding it, then doing an INSERT INTO dbo.MyTable SELECT (bunch of rows) FROM something. Now, you could do some voodoo where you insert into a staging table and then issue a "move from staging to real" (in a bulk manner), but it's not very EF'ish. But again, the rule of thumb (or one of the main ones) is "how are my indexes being maintained when I do it this way?" Aka, the forgotten item.Kos
The majority of what I do is basically an upsert. (update or insert) I need them to be fast and the tables I upsert to have billions (almost trillions) of rows. The fastest approach I've found is inserting to a temp table then doing a merge. I've documented it here: jarloo.com/c-bulk-upsert-to-sql-server-tutorialBobbobb

There is opportunity for several improvements in Entity Framework:

Set:

yourContext.Configuration.AutoDetectChangesEnabled = false;
yourContext.Configuration.ValidateOnSaveEnabled = false;

Do SaveChanges() in batches of 100 inserts... then try 1000 and see the difference.

Since the context accumulates tracked entities during all these inserts, you can also recreate your context object every 1000 inserts: var yourContext = new YourContext();

Making these improvements in a data-import process of mine took it from 7 minutes to 6 seconds.

The right numbers may not be 100 or 1000 in your case; try it and tweak it.
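The advice above can be put together as a minimal sketch, assuming an EF 6-style DbContext named YourContext with a DbSet&lt;Person&gt; called Persons (all names are placeholders):

```csharp
public static void BatchedInsert(IEnumerable<Person> people, int batchSize = 1000)
{
    YourContext context = null;
    try
    {
        context = new YourContext();
        context.Configuration.AutoDetectChangesEnabled = false;
        context.Configuration.ValidateOnSaveEnabled = false;

        var count = 0;
        foreach (var person in people)
        {
            context.Persons.Add(person);

            if (++count % batchSize == 0)
            {
                context.SaveChanges();

                // Recreate the context so the change tracker does not
                // accumulate thousands of tracked entities.
                context.Dispose();
                context = new YourContext();
                context.Configuration.AutoDetectChangesEnabled = false;
                context.Configuration.ValidateOnSaveEnabled = false;
            }
        }

        context.SaveChanges(); // flush the final partial batch
    }
    finally
    {
        if (context != null) context.Dispose();
    }
}
```

The batch size is the knob to tune: too small and you pay per-round-trip overhead, too large and the change tracker slows every Add.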

Bloodworth answered 25/8, 2013 at 22:46 Comment(1)
I have a fairly complex migration where I'm also using a UoW/repository layer, and this saved a significant amount of migration time for me. I'm not sure I can make use of EntityFramework.BulkInsert given the flow of obtaining and saving data in my console app.Golanka

If your insert queries are ANSI SQL, or you don't care about supporting multiple databases with your codebase, you still have the backdoor of getting an ADO.NET connection from EF and executing some raw SQL calls:

https://mcmap.net/q/299108/-is-it-possible-to-run-native-sql-with-entity-framework

I would do something like this:

private void BulkInsert(IEnumerable<Person> persons)
{
    // use the information in the link to get your connection (and open it)
    DbConnection conn = ...
    using (DbCommand cmd = conn.CreateCommand())
    {
        var sb = new StringBuilder();
        sb.Append("INSERT INTO person (firstname, lastname) VALUES ");
        var count = 0;
        foreach (var person in persons)
        {
            if (count != 0) sb.Append(",");
            sb.Append(GetInsertCommand(person, count++, cmd));
        }

        if (count > 0)
        {
            cmd.CommandText = sb.ToString();
            cmd.ExecuteNonQuery();
        }
    }
}

private string GetInsertCommand(Person person, int count, DbCommand cmd)
{
    var firstname = "@firstname" + count;
    var lastname = "@lastname" + count;

    // DbCommand has no Parameters.Add(name, value) overload,
    // so create the parameters explicitly
    var firstnameParam = cmd.CreateParameter();
    firstnameParam.ParameterName = firstname;
    firstnameParam.Value = person.Firstname;
    cmd.Parameters.Add(firstnameParam);

    var lastnameParam = cmd.CreateParameter();
    lastnameParam.ParameterName = lastname;
    lastnameParam.Value = person.Lastname;
    cmd.Parameters.Add(lastnameParam);

    return String.Format("({0},{1})", firstname, lastname);
}

I must admit I haven't tested it, but this should be a quick and dirty method to bypass EF for some bulk inserts until bulk insert is part of the core.

Update

Just a quick idea: have you tried the AddOrUpdate method from the Migrations namespace? Maybe this one does bulk inserts; I haven't looked into it, but it is worth a try:

private void BatchInsert(IEnumerable<Person> persons)
{
    context.Persons.AddOrUpdate(persons);
}

I know this method can be slow if you define a key column like AddOrUpdate(p => p.Firstname, persons), but I would guess that without specifying one, it should be all inserts (not guaranteed).

Gide answered 12/6, 2013 at 15:13 Comment(1)
AddOrUpdate without the lambda parameter uses the key property to query, so it's basically identical with AddOrUpdate(p => p.Id, persons). It will still be one query (SingleOrDefault by the key) per person in the collection before it inserts.Seumas

You can use the EntityFramework.BulkInsert extension.

Usage:

using EntityFramework.BulkInsert.Extensions;

context.BulkInsert(myEntities);

with DbContext:

using (var ctx = GetContext())
{
    using (var transactionScope = new TransactionScope())
    {
        // some stuff in dbcontext    
        ctx.BulkInsert(entities);    
        ctx.SaveChanges();
        transactionScope.Complete();
    }
}
Fideicommissary answered 7/7, 2014 at 23:4 Comment(0)

I'm afraid EF does not support bulk insert or update. As you said, currently EF will generate a bunch of INSERT commands and execute them separately (but all wrapped in a single transaction). There were some plans to implement batching; I'm not sure if there has been progress recently. Hopefully in EF6, but I somehow doubt it.

You can read more in this discussion.

Complect answered 12/6, 2013 at 14:37 Comment(0)

An ASP.NET Core version: a fast insert method from a repository.

public virtual void AddRangeFastAndCommit(IEnumerable<T> entities)
{
    // a using block guarantees the context is disposed even on failure
    using (var localContext = new MyDbContext(_context.Options))
    {
        localContext.ChangeTracker.AutoDetectChangesEnabled = false;

        foreach (var entity in entities)
        {
            localContext.Add(entity);
        }

        localContext.SaveChanges();
    }
}
Orangeade answered 19/4, 2017 at 14:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.