Is there a .NET equivalent to SQL Server's newsequentialid()
Asked Answered
P

11

38

We use GUIDs for primary key, which you know is clustered by default.

When inserting a new row into a table it is inserted at a random page in the table (because GUIDs are random). This has a measurable performance impact because the DB will split data pages all the time (fragmentation). But the main reason I what a sequential GUID is because I want new rows to be inserted as the last row in the table... which will help when debugging.

I could make a clustered index on CreateDate, but our DB is auto-generated and in development, we need to do something extra to facilitate this. Also, CreateDate is not a good candidate for a clustered index.

Back in the day, I used Jimmy Nielsons COMB's, but I was wondering if there is something in the .NET framework for this. In SQL 2005 Microsoft introduced newsequentialid() as an alternative to newid(), so I was hoping that they made a .NET equivalent because we generate the ID in the code.

PS: Please don't start discussing if this is right or wrong, because GUIDs should be unique etc.

Photopia answered 17/10, 2008 at 9:1 Comment(3)
following @edg's comments, I really wonder why keeping records "ordered" makes sense or is of any interest to you. Aren't you trying here to solve an artificial problem or constraint?Ferreous
I'm pretty sure that the second paragraph (Third if you count "Hi") established exactly why he wants them in order. I wish people would stick to answering the actual question.Lentha
+1 for the last sentence.Lien
R
26

It should be possible to create a sequential GUID in c# or vb.net using an API call to UuidCreateSequential. The API declaration (C#) below has been taken from Pinvoke.net where you can also find a full example of how to call the function.

[DllImport("rpcrt4.dll", SetLastError=true)]
static extern int UuidCreateSequential(out Guid guid);

The MSDN article related to the UuidCreateSequential function can be found here which includes the prerequisites for use.

Radiant answered 17/10, 2008 at 11:10 Comment(4)
I marked this as the correct answer. This is not what I was after though. I wanted a COMB, and I orginally thougt that SequentialId's was the same as comb. That is... the Question my description is not aligned. But it seams that this is the correct answer to the question.Photopia
Beware that this sequential guid is not sequential in SQL Server as it uses another ordering algorithm, so using COMBs is a better optionCollude
I tested what Marc Climent said and can confirm itObscene
Here's a great .net core library to generate COMB guid's. github.com/richardtallent/RT.CombManna
H
23

Update 2018: Also check my other answer

This is how NHibernate generate sequantial IDs:

NHibernate.Id.GuidCombGenerator

/// <summary>
/// Generate a new <see cref="Guid"/> using the comb algorithm.
/// </summary>
private Guid GenerateComb()
{
    byte[] guidArray = Guid.NewGuid().ToByteArray();

    DateTime baseDate = new DateTime(1900, 1, 1);
    DateTime now = DateTime.Now;

    // Get the days and milliseconds which will be used to build the byte string 
    TimeSpan days = new TimeSpan(now.Ticks - baseDate.Ticks);
    TimeSpan msecs = now.TimeOfDay;

    // Convert to a byte array 
    // Note that SQL Server is accurate to 1/300th of a millisecond so we divide by 3.333333 
    byte[] daysArray = BitConverter.GetBytes(days.Days);
    byte[] msecsArray = BitConverter.GetBytes((long) (msecs.TotalMilliseconds / 3.333333));

    // Reverse the bytes to match SQL Servers ordering 
    Array.Reverse(daysArray);
    Array.Reverse(msecsArray);

    // Copy the bytes into the guid 
    Array.Copy(daysArray, daysArray.Length - 2, guidArray, guidArray.Length - 6, 2);
    Array.Copy(msecsArray, msecsArray.Length - 4, guidArray, guidArray.Length - 4, 4);

    return new Guid(guidArray);
}
Huth answered 25/9, 2012 at 9:28 Comment(2)
Like that it's based on Nilson's implementation and it's used by a well used libraryEogene
@Gian Marco Gherardi:Why does the algorithm only changes the latest section of guid?Ingres
V
18

It's important to note that the UUIDs generated by UuidCreateSequential will not be sequential when ordered by SQL Server.

  • SQL Server follows the RFC when it comes to sorting UUIDs
  • the RFC got it wrong
  • UuidCreateSequential did it right
  • but UuidCreateSequential creates something different from what SQL Server expects

Background

The Type 1 UUIDs created by UuidCreateSequential don't sort in SQL Server.

SQL Server's NewSequentialID uses UuidCreateSequential, with some byte shuffling applied. From the Books Online:

NEWSEQUENTIALID (Transact-SQL)

NEWSEQUENTIALID is a wrapper over the Windows UuidCreateSequential function, with some byte shuffling applied

which then references an MSDN blog post:

How to Generate Sequential GUIDs for SQL Server in .NET (archive)

public static Guid NewSequentialId()
{
   Guid guid;
   UuidCreateSequential(out guid);
   var s = guid.ToByteArray();
   var t = new byte[16];
   
   t[3] = s[0];
   t[2] = s[1];
   t[1] = s[2];
   t[0] = s[3];
   
   t[5] = s[4];
   t[4] = s[5];
   t[7] = s[6];
   t[6] = s[7];
   t[8] = s[8];
   t[9] = s[9];
   t[10] = s[10];
   t[11] = s[11];
   t[12] = s[12];
   t[13] = s[13];
   t[14] = s[14];
   t[15] = s[15];
   
   return new Guid(t);
}

It all starts with the number of ticks since 1582-10-15 00:00:00 (October 15, 1592, the date of Gregorian reform to the Christian calendar). Ticks is the number of 100 ns intervals.

For example:

  • 12/6/2017 4:09:39 PM UTC
  • = 137,318,693,794,503,714 ticks
  • = 0x01E7DA9FDCA45C22 ticks

The RFC says that we should split this value into three chunks:

  • UInt32 low (4 bytes)
  • Uint16 mid (2 bytes)
  • UInt32 hi (2 bytes)

So we split it up:

0x01E7DA9FDCA45C22

|   Hi   |   Mid  |    Low     |
|--------|--------|------------|
| 0x01E7 | 0xDA9F | 0xDCA45C22 |

And then the RFC says that these three integers should be written out in the order of:

  • Low: 0xDCA45C22
  • Mid: 0xDA9F
  • High: 0x01E7

If you follow the RFC, these values must be written in big-endian (aka "network byte order"):

DC A4 5C 22 DA 9F x1 E7 xx xx xx xx xx xx xx xx

This was a bad design, because you cannot take the first 8 bytes of the UUID and treat them either as a big-endian UInt64, nor as a little-endian UInt64. It's a totally dumb encoding.

UuidCreateSequential gets it right

Microsoft followed all the same rules so far:

  • Low: 0xDCA45C22
  • Mid: 0xDA9F
  • High: 0x1E7

But they write it out in Intel little-endian order:

22 5C A4 DC 9F DA E7 x1 xx xx xx xx xx xx xx xx

If you look at that, you've just written out a little-endian Int64:

225CA4DC9FDAE701

Meaning:

  • if you wanted to extract the timestamp
  • or sort by the timestamp

it's trivial; just treat the first 8 bytes as a UInt64.

With the RFC, you have no choice but to perform all kinds of bit fiddling. Even on big-endian machines, you can't treat the 64-bit timestamp as a 64-bit timestamp.

How to reverse it

Given a little-endian guid from UuidCreateSequential:

DCA45C22-DA9F-11E7-DDDD-FFFFFFFFFFFF

with the raw bytes of:

22 5C A4 DC 9F DA E7 11 DD DD FF FF FF FF FF FF

This decodes into:

Low      Mid  Version High
-------- ---- ------- ---- -----------------
DCA45C22-DA9F-1       1E7 -DDDD-FFFFFFFFFFFF
  • Low: 0xDCA45C22
  • Mid: 0xDA9F
  • High: 0x1E7
  • Version: 1 (type 1)

We can write this back out in RFC big-endian order:

DC A4 5C 22 DA 9F 11 E7 DD DD FF FF FF FF FF FF

Short version

               |   Swap      | Swap  | Swap  | Copy as-is
Start index    |  0  1  2  3 |  4  5 |  6  7 | 
End index      |  3  2  1  0 |  5  4 |  7  6 | 
---------------|-------------|-------|-------|------------------------ 
Little-endian: | 22 5C A4 DC | 9F DA | E7 11 | DD DD FF FF FF FF FF FF
Big-endian:    | DC A4 5C 22 | DA 9F | 11 E7 | DD DD FF FF FF FF FF FF
Valuer answered 6/12, 2017 at 20:19 Comment(6)
Why would anyone want to sort by Guid?Slovenia
@MariuszJamro You want newer rows to be physically stored after older rows. When older data occupies separate pages in the database, the limited RAM can more effectively be used to cache currently used data. See also: UUID "Version 6": The version RFC 4122 forgotValuer
Ahh so it's all about how SQL Server works internally. Thanks for clarificationSlovenia
@MariuszJamro Not quite. UUIDs are a glorified 128-bit timestamp. And generally people like being able to sort timestamps chronologically. Every database, or spreadsheet, or text file, like storing new rows at the end.Valuer
Something not mentioned enough, when this question comes up, is database indexing. Whether you use MySql, SQLServer, or PostgreSQL. Version 4 unique guids perform much slower than sequential guids because they are difficult to index. Why do you want to use a GUID? There are a few use cases including sharding and cross-regional database merging. Distributed systems are difficult and guids can be a nice way to maintain referential integrity across environments.Mola
@Mola Version 4 UUIDs are no more difficult to index than Version 1 UUIDs. We like using UUIDs because we know the primary key value without having to issue a (buffy, fragile, easily broken, many edge cases) query to read the PK back.Valuer
M
9

Perhaps a simple way to determine the order in which rows have been added would be to add an IDENTITY column to the table, avoiding the need to keep your GUIDS in order and hence avoiding the performance hit of maintaining a clustered index on the GUID column.

I can't help but wonder how keeping these rows in order helps you when debugging. Could you expand that a bit?

Meatman answered 17/10, 2008 at 9:9 Comment(3)
Wouldn't that add an extra column to my table? A column which is of no other use, and not a good choice for a clustered index! Regarding debugging: When I select * form my table I get the rows in random order. If I get the latest row in the bottom, I don't have do do any sorting every time I do so.Photopia
Why the fixation on having a Clustered Index? Please explain how this helps you with debugging. SELECT * FROM TABLE ORDER BY my_ident_column is not so hard, is it?Meatman
The standard way of handling this is to have an IDENTITY column. Yes, it'll add an extra column to your table, but that shouldn't be a big deal. It takes up 4 to 8 bytes a row depending on your setup. It's actually the perfect choice for a clustered index!Bacolod
H
7

You can use the tiny NewId library for this.

Install it via NuGet:

Install-Package NewId

And use it like this:

Guid myNewSequentialGuid =  NewId.NextGuid();

See Project Page on GitHub

Huth answered 13/3, 2018 at 12:56 Comment(0)
S
3

Here is the C# code to generate a COMB GUID.

byte[] guidArray = System.Guid.NewGuid().ToByteArray();

DateTime baseDate = new DateTime(1900, 1, 1);
DateTime now = DateTime.Now;

// Get the days and milliseconds which will be used to build the byte string 
TimeSpan days = new TimeSpan(now.Ticks - baseDate.Ticks);
TimeSpan msecs = new TimeSpan(now.Ticks - (new DateTime(now.Year, now.Month, now.Day).Ticks));

// Convert to a byte array 
// Note that SQL Server is accurate to 1/300th of a millisecond so we divide by 3.333333 
byte[] daysArray = BitConverter.GetBytes(days.Days);
byte[] msecsArray = BitConverter.GetBytes((long)(msecs.TotalMilliseconds / 3.333333));

// Reverse the bytes to match SQL Servers ordering 
Array.Reverse(daysArray);
Array.Reverse(msecsArray);

// Copy the bytes into the guid 
Array.Copy(daysArray, daysArray.Length - 2, guidArray, guidArray.Length - 6, 2);
Array.Copy(msecsArray, msecsArray.Length - 4, guidArray, guidArray.Length - 4, 4);

return new System.Guid(guidArray);
Schoolfellow answered 21/8, 2009 at 21:6 Comment(5)
I marked this as the "accepted answer" even though I haven't tested it. But it looks right. If anyone can tell me that this is wrong please, write a comment and I will undo this. PS: It seams to me that many don't se the benefit of GUID's as primary keys. Here is a two: 1. When createing objects like Order and OrderLine in the same Unit Of Work, one can set the OrderLine.OrderId before ever talking to the database. Thus mark OrderId as immutable. 2. If you have a project where you need to merge or migrate databases guid are great because you don't have to do integer manipulation.Photopia
I use this in my current project ComicsInventory.com and it works like a charm. Also I believe another benefit of using GUIDs is that you can use them in your client side code and not have to worry about someone figuring out the next id.Schoolfellow
This does not create a globally unique ID because it does not involve the network card. John's answer below is correct.Shavian
Ok, but is there a way to find out is algorithm for generating sequential guid is the same as used in SQL Server (NEWSEQUENTIALGUID)?Rhebarhee
I compared agains NEWSEQUENTIALID and it does not match at all: ms sql: {E275CCE3-AA11-EE11-BD14-782BCBAE1BCE}, {E375CCE3-AA11-EE11-BD14-782BCBAE1BCE}, this: {e3a005d6-c2b0-4441-9980-b02a00c210f6}, {621011f3-4065-48ab-b504-b02a00c210f7}Arbitrate
E
1

Unfortunatley, no there isn't a .NET equivalent to newsequentialid(). You could continue using a Comb. I actually have a C# implementation of a Comb somewhere...I'll see if I can dig it up.

Eichmann answered 17/10, 2008 at 9:26 Comment(0)
P
1

You can use these Snowflake style sequential id generators. They offer the same benefits of a Guid while having better clustered index compatibility that is similar to that of a sequential guid.

RT.COMB - .NET Core sequential GUIDs generator for MSSQL or PostgreSql (+DI support)

IdGen - Twitter Snowflake-alike ID generator for .Net Core (+DI support)

Flakey - Twitter Snowflake-alike ID generator for .Net

A point to note regarding dates in Guids

It's a good idea to always use UTC date/time values for COMB Guids, so they remain highly sequential regardless of server locale or daylight savings. More so important if the values will be generated on multiple machines that may differ in time zone.

Pseudohermaphroditism answered 2/1 at 13:10 Comment(0)
S
0

I've been lead to believe that random Guids can be beneficial to performance in some use cases. Apparently inserting to random pages can avoid contention that would otherwise occur in the end page when multiple people are trying to insert at the same time.

John's PInvoke suggestions is probably the closest to SQL's version but the UUidCreateSequential docs state that you shouldn't use it to identify an object that it's strictly local to the machine generating the Guid.

I'd be measuring the actual use case performance hit with realistic data in realistic quantities before I investigated sequential Guid generation any further.

Seaworthy answered 17/10, 2008 at 11:41 Comment(1)
I think "avoiding contention" in the last page pales in comparison to the complete and utter disaster this makes of any attempts to index a table using the uuid as the clustered index.Lentha
P
0

About the selected answer. The docs says... The generated Guid will not give you uniqueId between computers if they don't have ehternet access.

If you must know the guid when inserting, couldn't you let Sql-server return a block of sequential guids that you assign to your data before you insert them?

declare @ids table(id uniqueidentifier default NEWSEQUENTIALID(), dummy char(1))

declare @c int
set @c = 0;
while (@c < 100)
begin
    insert into @ids (dummy) values ('a');
    set @c += 1;
end

select id from @ids
Pagurian answered 14/10, 2010 at 18:5 Comment(1)
From what I read SQL Server doesn't generate really unique Guids .Obscene
D
-1

The key problem is knowing what the last value was in a .NET application. SQL Server keeps track of this for you. You will need to hold the last value yourself and use the Guid constructor with a byte array containing the next value. Of course, on a distributed application this probably isn't going to help and you may have to use the randomised Guids. (Not that I see anything wrong with this.)

http://msdn.microsoft.com/en-us/library/90ck37x3.aspx

Doerrer answered 17/10, 2008 at 9:9 Comment(1)
I didn't realize what the newsequentialId() is really doing. I don't need them to be 100% sequential. I just need the guid created at one point in time to have a lower sorting than one created later. Jimmy's COMB is using current time to do this and therefore I don't have to keep track of the ids.Photopia

© 2022 - 2024 — McMap. All rights reserved.