Is using a (sequential) GUID the only viable alternative to a database generated ID?
Asked Answered
S

3

2

We are migrating our MS-Access database to SQL Server Compact 4.0 using the Entity Framework 5 Code First approach. We found that using Database generated Integer ID's is very slow and to make it worse, the delay increases exponentially with the size of the database. This makes using an Identity column impossible and it seems there is a bad implementation of this feature in SQL Server Compact 4.0 paired with the Entity Framework.

So, we ran some tests and found that using a client side generated key speeds op insertion by at least 20 times, the exponential increase in insertion disappears.

Now we are looking at a the best way to generate client side ID's. Using GUID's seems the most secure option, but I read that this negatively impacts read actions. Is there a strategy in using auto-incremented Integers that are client side generated?

EDIT: I will investigate the underlying problem that lead to the question further. In the mean time can my real question be answered please? :-)

EDIT2: It is pretty exasperating that nobody seems to believe the assertion that using auto-id's with EF and SQL Server compact 4.0 is so slow. I posted a separate question about this with a proof of concept that should be easily reproducible.

Seoul answered 6/2, 2013 at 23:14 Comment(9)
I suspect there is some other problem either in your application logic or your testing methodology. Generating integer IDs using IDENTITY columns should not be slow nor should it make using identity columns impossible.Downstage
Yet again I agree with @AaronBertrand, it seems very unlikely that they got something this basic that wrongCentavo
@AaronBertrand Likewise, it seems difficult to get a test that confirms this wrong :-). See also: social.msdn.microsoft.com/Forums/en/adodotnetentityframework/…Seoul
This was fixed in version 4.0 of SQL Server CompactHandball
@Handball No it is not. If you take some time to look for the problem on this very site you will find that questions about this topic actually abound. You participated even in some of them.Seoul
What I meantvwas that in version 3,5 no engine generated keys were supportedHandball
Why can't you turn the identity column off in SQL Server, copy the data with the auto incremented integer IDs and then turn the identity column on in SQL? You can tell SQL where to pick up again with the auto incrementing integer.Guido
Have you tested both 32-bit and 64-bit SQLCE runtimes, or are you focused on one bitness particularly?Dearr
@Ryan, didnt know you could do that with Squirrel Server. Thanks +1Campobello
H
1

If you are moving large amounts of data with EF, you are doing it wrong. Use ADO.NET, and for example a BULK COPY approach instead (with SQL CE use SqlCeUpdateableRecord). You could use my SqlCeBulkCopy library to save some coding effort.

Handball answered 7/2, 2013 at 7:36 Comment(3)
Actually I use your BulkCopy approach now. It does not update the generated ID's in the Entities though. And what is "a large amount"? EF5 + SQL Server Compact need 30 seconds to write 250 records in our databasetable with an Identity column, this drops to below a second when I generate the ID on the client. That cannot be good. Using your bulkkcopy the time needed is neglegible.Seoul
Have you tried the KeepIdentity option with the BulkCopy API?Handball
That fails: as the ID's are zero for all entities a duplicate primary key Exception is thrown.Seoul
C
1

I dont think the way of identity generation is the source of performance problem.
I think if you want to get a better performance during migration process, before conversion process, you can disable Primary keys and foreign keys and other constraint on your main tables. (this could be done by scripting or manualy)
However data integrity will be your new concern and you conversion code must be strong so after conversion process, enabling the constraints could be done.
hope this helps.

Chimene answered 10/2, 2013 at 19:20 Comment(1)
Thanks for your input, but your assumption about the constraints is wrong. Follow the link to my other question which proves that insertion using store generated ID's is very slow when using EF5 + SQLServer Compact 4.0Seoul
C
0

Solution as I see it.

a)Try Fix Performance problem. My suggestions (dont use large numbers of entities inside context.) try as few as the business problem will allow. Dont use merge tracking etc... See EF performance tips. http://blogs.msdn.com/b/wriju/archive/2011/03/15/ado-net-entity-framework-performance-tips.aspx and http://msdn.microsoft.com/en-au/library/cc853327.aspx

b)Use a Guid. Allocate externally (its not sequential, but fast)

c)Use a customer Integer generator, that runs in memory, can allocate many keys at once and can persist the current state. This technique is used by SAP. They call it "number range". Can be very fast but not as fast as b).

btw I use GUIDs and not DB generated IDs to make partial DB copies and migrations Easy/easier. :-)

Campobello answered 11/2, 2013 at 15:6 Comment(2)
Using non sequential guids within an indexed table can cause re-balance the index tree every time a row is written!Sonni
@Bernd I strongly prefer evenly distributed Guids. Especially in Large volume scenarios . By large I mean > 100 million rows. It is very important when you get > 1 billion rows. for more on why in case interested #13149639 . Btw if a table is partitioned for growth and 50% of the table use is Insert, then evenly spread Guids wins hands down. Having been through this exercise with DB2 experts at a bank where more than 100 million rows are to one table alone per month.Campobello

© 2022 - 2024 — McMap. All rights reserved.