Entity framework remove object from context, but not from database
Asked Answered
M

2

9

I am working on a batch process which dumps ~800,000 records from a slow legacy database (1.4-2ms per record fetch time...it adds up) into MySQL which can perform a little faster. To optimize this, I have been loading all of the MySQL records into memory which puts usage to about 200MB. Then, I start dumping from the legacy database and updating the records.

Originally, when this would complete updating the records I would then call SaveContext which would then make my memory jump from ~500MB-800MB to 1.5GB. Very soon, I would get out of memory exceptions (the virtual machine this is running on has 2GB of RAM) and even if I were to give it more RAM, 1.5-2GB is still a little excessive and that would be just putting a band-aid on the problem. To remedy this, I started calling SaveContext every 10,000 records which helped things along a bit and since I was using delegates to fetch the data from the legacy database and update it in MySQL I didn't receive too horrible a hit in performance since after the 5 second or so wait while it was saving it would then run through the update in memory for the 3000 or so records that had backed up. However, the memory usage still keeps going up.

Here are my potential issues:

  • The data comes out of the legacy database in any order, so I can't chunk the updates and periodically release the ObjectContext.
  • If I don't grab all of the data out of MySQL beforehand and instead look it up during the update process by record, it is incredibly slow. I instead grab it all beforehand, cast it to a dictionary indexed by the primary key, and as I update the data I remove the records from the dictionary.

One possible solution I thought of is to somehow free the memory being used by entities that I know I will never touch again since they have already been updated (like clearing the cache, but only for a specific item), but I don't know if that is even possible with Entity Framework.

Does anyone have any thoughts?

Mitra answered 29/6, 2012 at 0:19 Comment(1)
EF does not seem to be the right tool for the job. Why don't you just migrate the data directly?Hendiadys
E
6

You can call the Detach method on the context passing it the object you no longer need: http://msdn.microsoft.com/en-us/library/system.data.objects.objectcontext.detach%28v=vs.90%29.aspx

Exeat answered 29/6, 2012 at 1:15 Comment(1)
Although I may end up moving away from the entity framework for speed, this did fix the memory problem. I started saving the context every 30,000 records and then I detached the objects I had updated from the context and it stopped using so much memory.Mitra
H
2

I'm wondering if your best bet isn't another tool as previously suggested or just forgoing the use of Entity Framework. If you instead do the code without an ORM, you can:

  1. Tune the SQL statements to improve performance
  2. Easily control, and change, the scope of transactions to get the best performance.
  3. You can batch the updates so that you aren't calling the server to accomplish multiple updates instead of them being performed one at a time.
Howbeit answered 29/6, 2012 at 2:8 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.