Generate a unique id
S

7

89

I am a university student and our task is to create a search engine. I am having difficulty generating a unique id to assign to each URL when it is added to the frontier. I have tried the SHA-256 hashing algorithm as well as Guid. Here is the code that I used to implement the Guid approach:

public string generateID(string url_add)
{
    long i = 1;

    foreach (byte b in Guid.NewGuid().ToByteArray())
    {
        i *= ((int)b + 1);
    }

    string number = String.Format("{0:d9}", (DateTime.Now.Ticks / 10) % 1000000000);

    return number;
}
Scrivens answered 3/7, 2012 at 14:33 Comment(6)
A GUID is bound to be globally unique (hence the name), so I don't understand the problem. - Alexandrite
I think his concern is that he wants the ID to be unique based on the URL, so a one-way hash of the URL to a unique ID. In which case, SHA1 would work. - Imbue
There's always object.GetHashCode(), although I don't think that's guaranteed to be unique. - Agouti
@Agouti that's pretty much guaranteed not to be unique. - Oosperm
Easy answer: return url_add; - Oosperm
As mentioned in the answers, check out Guid.NewGuid(). - Venial
N
140

Why not just use ToString?

public string generateID()
{
    return Guid.NewGuid().ToString("N");
}
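
As a small sketch of what the format specifier changes: "N" produces 32 hexadecimal digits with no hyphens, while the default "D" format produces 36 characters including hyphens. The class and method names below are made up for illustration.

```csharp
using System;

// Hypothetical helper; the names are illustrative, not from the answer.
public static class GuidFormats
{
    // "N": 32 hexadecimal digits, no hyphens or braces.
    public static string NewCompactId()
    {
        return Guid.NewGuid().ToString("N");
    }

    // "D" (the default for ToString()): 36 characters, hyphen-separated.
    public static string NewHyphenatedId()
    {
        return Guid.NewGuid().ToString("D");
    }
}
```

Because both lengths are fixed, either form can be stored in a fixed-width column.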

If you would like it to be based on a URL, you could simply do the following:

public string generateID(string sourceUrl)
{
    return string.Format("{0}_{1:N}", sourceUrl, Guid.NewGuid());
}

If you want to hide the URL, you could use some form of SHA1 on the sourceURL, but I'm not sure what that might achieve.

Nannana answered 3/7, 2012 at 14:36 Comment(9)
This worked... I initially wanted the id to be based on the URL, but this seems to work fine. Will it be able to generate large numbers of unique keys? The search engine will be working with a large quantity of URLs. - Scrivens
This will be able to produce approximately 5,316,911,983,139,663,491,615,228,241,121,400,000 unique values. - Nannana
Thanks a lot! That's more than enough, because URLs are removed as they are retrieved from the frontier. - Scrivens
Thanks! Basing it on the URL worked as well! I figure that basing it on the URL will make it more unique and leave less chance of a collision! Thanks a lot! - Scrivens
Oh no! I tested the code based on the URL again and it didn't work... But the first one works perfectly fine! - Scrivens
I did mistakenly have it as string.format instead of string.Format... was that the source of your issue? - Nannana
Regarding generating the id without basing it on the URL: will the result always have a fixed length? I am asking for the purpose of the db structure. - Scrivens
Yes, a Guid is a well-defined structure. The string returned by Guid.ToString("N") will be 32 characters long. - Nannana
For great justice, use String.Format("{0}_{1:N}", sourceUrl, Guid.NewGuid()) - Apostate
A
40

Why not use a GUID?

Guid guid = Guid.NewGuid();
string str = guid.ToString();
Apostate answered 3/7, 2012 at 14:34 Comment(0)
S
36

Here is a 'YouTube-video-id' like id generator e.g. "UcBKmq2XE5a"

StringBuilder builder = new StringBuilder();
Enumerable
    .Range(65, 26)
    .Select(e => ((char)e).ToString())
    .Concat(Enumerable.Range(97, 26).Select(e => ((char)e).ToString()))
    .Concat(Enumerable.Range(0, 10).Select(e => e.ToString()))
    .OrderBy(e => Guid.NewGuid())
    .Take(11)
    .ToList().ForEach(e => builder.Append(e));
string id = builder.ToString();

It creates random ids that are 11 characters long. You can increase or decrease the length by changing the argument passed to the Take method.

0.001% duplicates in 100 million.
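
Note that the snippet above shuffles the 62 characters and takes the first 11, so no character can appear twice in one id. A different technique, sketched below under the assumption that .NET Core 3.0 or later is available (for RandomNumberGenerator.GetInt32), picks each character independently, which allows repeats and therefore a larger id space. The ShortId class name is hypothetical, and this is not the code the duplicate figure above was measured with.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical alternative sketch: independent random picks per character.
public static class ShortId
{
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    public static string Generate(int length = 11)
    {
        if (length <= 0) throw new ArgumentOutOfRangeException(nameof(length));

        var chars = new char[length];
        for (int i = 0; i < length; i++)
        {
            // GetInt32(n) returns a uniformly distributed value in [0, n).
            chars[i] = Alphabet[RandomNumberGenerator.GetInt32(Alphabet.Length)];
        }
        return new string(chars);
    }
}
```

Ids produced this way are still random, so collisions remain possible; a uniqueness check or unique constraint in the data store is still needed if duplicates are unacceptable.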

Stitching answered 7/7, 2017 at 0:27 Comment(3)
Do you think it's OK to use this as an order number for an e-commerce site? Is there a chance that two orders will get the same id using this method, considering that there may be 1K or 10K orders per day? - Argentite
I wouldn't recommend using the above approach in your case. The best option is to use Guid. Also have a look at this: github.com/dotnet/aspnetcore/blob/master/src/Servers/Kestrel/… - Stitching
Well, to rephrase: in my case I need something exactly like your solution (an alphanumeric string of about 8 characters) to use as an OrderNo in an e-commerce app. I just added your solution to my project and check for duplicates against the DB; if one is found, I generate a new id. Does that CorrelationIdGenerator class fit my scenario? - Argentite
P
9

Why not build a unique id as shown below?

We can combine DateTime.Now.Ticks and Guid.NewGuid().ToString() to make a unique id.

Because DateTime.Now.Ticks is included, we can recover the date and time at which the unique id was created.

Please see the code.

var ticks = DateTime.Now.Ticks;
var guid = Guid.NewGuid().ToString();
var uniqueSessionId = ticks.ToString() + '-' + guid; // id created by combining ticks and guid

var datetime = new DateTime(ticks);//for checking purpose
var datetimenow = DateTime.Now;    //both these date times are different.

We can even extract the ticks part of the unique id later and look up the date and time for future reference.
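
A minimal sketch of that round trip follows. The TimestampedId class and its method names are hypothetical, and it assumes UTC (DateTime.UtcNow) rather than DateTime.Now so that the recovered time does not depend on the local time zone.

```csharp
using System;
using System.Globalization;

// Hypothetical helper illustrating the ticks + GUID idea.
public static class TimestampedId
{
    public static string Create()
    {
        long ticks = DateTime.UtcNow.Ticks; // assumption: UTC instead of local time
        return ticks.ToString(CultureInfo.InvariantCulture)
               + "-" + Guid.NewGuid().ToString("N");
    }

    // Reads the ticks prefix (everything before the first '-') back into a DateTime.
    public static DateTime GetCreationTimeUtc(string id)
    {
        if (id == null) throw new ArgumentNullException(nameof(id));

        int separator = id.IndexOf('-');
        if (separator <= 0) throw new FormatException("Expected '<ticks>-<guid>'.");

        long ticks = long.Parse(id.Substring(0, separator), CultureInfo.InvariantCulture);
        return new DateTime(ticks, DateTimeKind.Utc);
    }
}
```

The uniqueness still comes from the GUID part; the ticks prefix only adds a readable creation time.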

Prissie answered 30/5, 2017 at 7:54 Comment(0)
P
8

This question seems to be answered; however, for completeness I would add another approach.

You can use a unique ID number generator based on Twitter's Snowflake id generator. A C# implementation can be found here.

var id64Generator = new Id64Generator();

// ...

public string generateID(string sourceUrl)
{
    return string.Format("{0}_{1}", sourceUrl, id64Generator.GenerateId());
}

Note that one very nice feature of this approach is the possibility of having multiple generators on independent nodes (probably useful for a search engine), all generating globally unique identifiers in real time.

// node 0
var id64Generator = new Id64Generator(0);

// node 1
var id64Generator = new Id64Generator(1);

// ... node 10
var id64Generator = new Id64Generator(10);
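
To illustrate why distinct node numbers keep ids from colliding, here is a rough sketch of a Snowflake-style layout: a millisecond timestamp in the high bits, then 10 bits of node id, then a 12-bit per-millisecond sequence. This is not the Id64Generator implementation; the SimpleFlakeGenerator class, its custom epoch, and its handling of clock skew are assumptions made for illustration.

```csharp
using System;

// Hypothetical illustration of a Snowflake-style id; not the library's code.
public sealed class SimpleFlakeGenerator
{
    private const int NodeBits = 10;
    private const int SequenceBits = 12;
    private const long MaxSequence = (1L << SequenceBits) - 1;

    // Assumed custom epoch; any fixed past instant works.
    private static readonly DateTime Epoch =
        new DateTime(2012, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    private readonly object _gate = new object();
    private readonly long _nodeId;
    private long _lastMillis = -1;
    private long _sequence;

    public SimpleFlakeGenerator(long nodeId)
    {
        if (nodeId < 0 || nodeId >= (1L << NodeBits))
            throw new ArgumentOutOfRangeException(nameof(nodeId));
        _nodeId = nodeId;
    }

    public long NextId()
    {
        lock (_gate)
        {
            long now = CurrentMillis();
            if (now < _lastMillis) now = _lastMillis; // clock went backwards: reuse last timestamp

            if (now == _lastMillis)
            {
                _sequence = (_sequence + 1) & MaxSequence;
                if (_sequence == 0)
                {
                    // Sequence exhausted for this millisecond: spin until the clock advances.
                    while ((now = CurrentMillis()) <= _lastMillis) { }
                }
            }
            else
            {
                _sequence = 0;
            }

            _lastMillis = now;
            return (now << (NodeBits + SequenceBits)) | (_nodeId << SequenceBits) | _sequence;
        }
    }

    private static long CurrentMillis()
    {
        return (long)(DateTime.UtcNow - Epoch).TotalMilliseconds;
    }
}
```

Two generators constructed with different node ids differ in the node bits of every id they emit, so they cannot produce the same value even within the same millisecond.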
Perturbation answered 18/7, 2015 at 9:21 Comment(3)
Thanks for the tip! Exactly what I was looking for. - Georgenegeorges
There's a NuGet with code at github.com/RobThree/IdGen that also does similar snowflake-based ids. Is the codeplex code for FlakeId owned by you? I'd like to get it to github and do a nuget if that's ok? - Georgenegeorges
@dotnetguy, yes, I own that one. Sure, you can follow with github migration and nuget package. - Perturbation
D
6

If you want to use SHA-256 (a GUID would be faster), then you would need to do something like this:

SHA256 shaAlgorithm = new SHA256Managed();
byte[] shaDigest = shaAlgorithm.ComputeHash(ASCIIEncoding.ASCII.GetBytes(url));
return BitConverter.ToString(shaDigest);

Of course, the encoding doesn't have to be ASCII, and any other hashing algorithm could be used as well.
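
A self-contained sketch of the same idea, assuming UTF-8 instead of ASCII so that non-ASCII characters in a URL are not all mapped to '?'. The UrlId class and FromUrl method are hypothetical names.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: deterministic id derived from the URL text.
public static class UrlId
{
    public static string FromUrl(string url)
    {
        if (url == null) throw new ArgumentNullException(nameof(url));

        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(url));

            // BitConverter yields "AB-CD-..."; strip hyphens for a 64-character hex string.
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

The same URL string always yields the same id, which makes it usable for detecting a URL already in the frontier. The sketch does not normalize URLs, so strings that differ only in case or in a trailing slash produce different ids.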

Dunsany answered 3/7, 2012 at 14:45 Comment(3)
I'd avoid ASCII in favor of some Unicode encoding. It's trivial to find collisions given your code. - Oosperm
I know, it's because I'm working with a legacy system at the moment, so I'm wired for ASCII :) - Dunsany
I want the id to be unique based on the URL. That's the way that I thought of generating the code. - Scrivens
T
-3

We can do something like this:

string TransactionID = "BTRF"+DateTime.Now.Ticks.ToString().Substring(0, 10);
Telescopy answered 2/12, 2019 at 10:35 Comment(0)
