Generate a unique value for a combination of two numbers
D

9

14

Suppose I have two numbers, 1023232 and 44. I want to generate a unique number representing this combination of numbers. How can I generate it?

Requirement

f(x,y) = f(y,x) and f(x,y) is unique for every (x,y) or (y,x)

Detta answered 19/11, 2010 at 15:13 Comment(4)
Your question is not clear. Do you mean a unique number that is somehow derived from any two given numbers? Or a number that is unique each time you have two numbers as above and has no bearing on what those two numbers actually are?Deathly
does the same input repeated need to yield the same output?Lightship
I've edited the question. For those two numbers (in any order), I should generate the same unique number every time.Detta
@Detta BTW: My initial answer didn't handle negative numbers correctly. I fixed that now.Humbuggery
H
13

If those are two ints, you could just do this:

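ulong F(int x, int y) {
    // Always put the larger value in the high 32 bits and the smaller one
    // in the low 32 bits, so that F(x, y) == F(y, x).
    ulong id = x > y ? (uint)y | ((ulong)x << 32) :
                       (uint)x | ((ulong)y << 32);
    return id;
}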

If you need to generate a truly unique value for two variables of a given size, the result needs about double the size of each variable (slightly less, now that f(x,y) == f(y,x)).

You could also get your original values back by reversing the same operation.
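A minimal sketch of that reversal, assuming the same packing as `F` above with 32-bit `int` inputs. The names `PairCodec`, `Encode`, and `Decode` are illustrative and not part of the original answer, and the tuple syntax assumes C# 7 or later. Because the larger value always goes in the high half, decoding recovers the pair but not the order in which the arguments were passed.

```csharp
using System;

static class PairCodec
{
    // Same packing as F: larger value in the high 32 bits, smaller in the low 32 bits.
    public static ulong Encode(int x, int y)
    {
        unchecked // keeps the casts valid for negative inputs even if overflow checking is on
        {
            int hi = Math.Max(x, y), lo = Math.Min(x, y);
            return ((ulong)(uint)hi << 32) | (uint)lo;
        }
    }

    // Splits the 64-bit id back into (larger, smaller).
    public static (int Larger, int Smaller) Decode(ulong id)
    {
        unchecked
        {
            return ((int)(uint)(id >> 32), (int)(uint)(id & 0xFFFFFFFF));
        }
    }
}
```

For example, `PairCodec.Decode(PairCodec.Encode(44, 1023232))` and `PairCodec.Decode(PairCodec.Encode(1023232, 44))` both yield `(1023232, 44)`.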

Humbuggery answered 19/11, 2010 at 15:16 Comment(33)
Can you explain you answer? I'm not able to understand it.Detta
An int has 32 bits, a long has 64 bits. By combining the two numbers and shifting the second one 32 bits to the left, each number gets 32 bits of the resulting long variable.Humbuggery
@NLV: On the assumption that your two numbers are 32 bit, he is creating a single 64 bit number by putting the first number in the lower 32 bits and the second number in the upper 32 bits.Seve
I'm getting this compiler error (using C#). "Bitwise-or operator used on a sign-extended operand; consider casting to a smaller unsigned type first."Detta
Hmm, i did not get this warning in LINQPad. You can wrap the method body in an unchecked { ... } block. That should help.Humbuggery
@Humbuggery could you explain what's the reason for this question?Nucleon
@Nucleon Why don't you ask the user that asked that question? I never worked with PHP. #Downvoter: Any comment?Humbuggery
@Humbuggery I asked you because he used your answer. He is asking on your answer. The reason is it doesn't give unique output for two numbers 2,9 and 10,11.Nucleon
@Nucleon it does give unique output in C#, the language this question here is about. This is NOT an answer to a PHP question and is not meant to be used for PHP. You cannot take this answer out of context and apply it to all programming languages in the world. See the original question above. It's about C#.Humbuggery
@Humbuggery Anything written in PHP need not vary from that written in C# when its about numbers, numbers have universally just one language and they dont lie. Of course in C# too it gives the same result for 10,1, 9,2, 8,3 etc. Did you try?Nucleon
Yes i did try. The outputs in order are 0xA00000001, 0x900000002, and 0x800000003. I confused a bracket on my last edit a few minutes ago, did you try the latest edit?Humbuggery
Works now! Canceled down vote. Sorry I missed that typo. May not be that efficient though..Nucleon
@Nucleon Numbers are also not always the same when it comes to programming languages. In math, yes, then numbers have only one language. But between programming languages there are things that vary, such as operator evaluation and the size of the data types. ulong in C# is 64 bit and unsigned, while an integer in PHP is signed and its size is determined by the CPU (32/64 bit). So there might indeed be a problem if the PHP code runs on a 32 bit machine.Humbuggery
@Nucleon Looks quite efficient to me so far (AFAIK CPUs are very fast when it comes to bitwise operations). Do you have a suggestion on how to improve it?Humbuggery
+1 I agree, but since they both (PHP and your former C# code) was giving the same result, I felt his adaptation was a direct port of your former code (in the sense PHP variable was equivalent to your long)x, like PHP used 64 bit by default or so). I do not know PHP.Nucleon
@Humbuggery bitwise operations are fast, but how fast? I do not know to make your function faster, but you could make one side of your expression shorter like this (long)x << 32 | (uint)y. I said its inefficient considering the space (also time) it takes to pair two fairly small numbers. This is fast for small numbers. but as size grows the speed hit will be noticeable. I posted a solution which is much faster.Nucleon
@Nucleon Bitwise operations often take just 1 CPU cycle. I don't know why the size of the numbers would make any difference. Also you know what they say about premature optimization :) Thanks for the hint, i removed unnecessary casts. The solution you used in your answer looks rather verbose to me. Have you actually tested the performance difference? Also, does it matter at all?Humbuggery
@Humbuggery I deal with this kind of algorithms a lot in my program and I am particular about speed and space, but yes wouldn't matter for most. My speed tests were skewed a bit the moment I wrote my question. Your method indeed seems to be faster just for the calculation sake (but not when I had to process the obtained variable - thats what I did to test speed differences initially - may be because in my application ulong is slower than long which my function returns). Nevertheless this is less space efficient. Just saying.Nucleon
@Humbuggery one final thing. Your code can break if a negative int is passed (unlike u say under question's comments). The cast to unsigned type cant happen there. Please change the method definition to unsigned input type which will earn a +1 from me :PNucleon
@Nucleon I've tested it with negative ints (like F(int.MinValue, int.MinValue) and F(-2, -99)), and i always got correct results. Did you leave the method body unchanged?Humbuggery
@Humbuggery How is that even possible? How can you cast -2 to (uint)? Either you have a try/catch, or an unchecked wrapper (ok, kidding), or you will have to turn on "Check for arithmetic overflow/underflow" from VS project properties -> Build -> Advanced.Nucleon
@Nucleon For non-constant expressions such as (uint)x, the C# compiler by default doesn't use overflow checking. Constant expressions such as (uint)-2 though, are checked at compile time. That's why it works using the method.Humbuggery
Yes by default, but to test it ideally, you should check arithmetic overflows. Do you know how does VS suppress overflows by default? What magic it performs?Nucleon
@Nucleon If someone is using another setting than the default, then obviously an unchecked block would be needed. But this answer applies to the default conditions. As most answers on stackoverflow, it is not meant to provide a all-encompassing Framework-Ready solution for all possible compiler configurations in the world, but rather to provide guidance on a possible way to solve the problem that the author of the question faces.Humbuggery
@Humbuggery quite agreed. Just my indignation towards compiler tricks and such which are not pure mathematical solutions! :)Nucleon
Do I really need to shift the larger number to the left and then fit in the smaller number? Guess it works either way as long as I decode them in the reverse manner.Selenodont
@Humbuggery Thanks for the quick reply. 'Sure' means essentially I do not need the 'x > y' check? I can just do encoded = (uint)x | ((ulong)y << 32) to encode, and to decode, y = encoded >> 32 and x = encoded & 0xFFFFFFFF. Am I right?Selenodont
@Selenodont No, you'll have to do some check. The numbers need to be combined in a fixed order, either the larger one or the smaller one on the left side. That's because the OP said that f(x,y) = f(y,x). And if the numbers would not be ordered, you would get a different value for e.g. f(3,5) and f(5,3).Humbuggery
@Humbuggery If my requirement is that f(3,5) != f(5,3) then I guess I do not need the check or?Selenodont
@Selenodont Yes, in this case you shouldn't do the check.Humbuggery
This is wrong: for F(1,2) and F(2,1) you get the same value.Nonstriated
@AliYousefie And that behavior is what the question asks for: "f(x,y) = f(y,x)"Humbuggery
What does "f(x,y) = f(y,x) and f(x,y) is unique for every (x,y) or (y,x)" mean? I don't know whether he needs those to be unique or not.Nonstriated
G
1

Use the fact that length of Int32 as string is <= 10, store the length of the first int's string representation modulo 10 as the last digit of an Int64:

int num1 = 1023232202;
int num2 = 44;
string encoded = num1.ToString() + num2.ToString() +  
    (num1.ToString().Length % 10).ToString();
Int64 result = Convert.ToInt64(encoded);

encoded = "1023232202440"

result = 1023232202440

To decode this you just need to extract the last digit of the string representation (encoded) and then convert the other digits back to int using two calls to Convert.ToInt32(Substring).

encoded = result.ToString();
int firstDigits = Convert.ToInt32(encoded[encoded.Length - 1] - '0');
if (firstDigits == 0)
{
    firstDigits = 10;
}
num1 = Convert.ToInt32(encoded.Substring(0, firstDigits));
num2 = Convert.ToInt32(encoded.Substring(firstDigits, 
    encoded.Length - firstDigits - 1));

To handle negatives - since # of digits <= 10, you could add two more data bits in the last digit to store a sign for each of your ints - 1 for positive, 0 for negative. Also - result won't fit in Int64 if both of your ints are very large, you would have to use BigInteger from System.Numerics

Gouda answered 19/11, 2010 at 15:21 Comment(2)
Yes i do. I can have negative values.Detta
@Detta - then my code won't work for you as is, sorry. Since # of digits <= 10, you could add two more data bits in the last digit to store a sign for each of your ints - 1 for positive, 0 for negative. Also - result won't fit in Int64 if both of your ints are very large, you would have to use BigInteger from System.Numerics.Gouda
H
1

If you are using ints and don't mind the result being a long, this should work:

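// Cast before shifting: for an int, C# masks the shift count to 5 bits, so int << 32 shifts by 0.
((long)Math.Max(x, y) << 32) | (uint)Math.Min(x, y)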

The fact that the numbers are stored in the high and low dwords of the result gets you your uniqueness constraint.

The fact that the higher number is always in the high dword gets you the symmetry you wanted.

Highfalutin answered 19/11, 2010 at 15:22 Comment(0)
N
1

You could use the function given here. This is the most space efficient I have seen and also doesn't involve any string approaches. The native function in the link won't work for negative integers though. But you can modify it as shown below to make it work for negative integers.

This will give back negative results too. For more on it and other options see this SO answer.

public static long GetHashCode_OrderIrrelevant(int a, int b)
{
    return GetHashCode(Math.Min(a, b), Math.Max(a, b));
}

public static long GetHashCode(int a, int b)
{
    var A = (ulong)(a >= 0 ? 2 * (long)a : -2 * (long)a - 1);
    var B = (ulong)(b >= 0 ? 2 * (long)b : -2 * (long)b - 1);
    var C = (long)((A >= B ? A * A + A + B : A + B * B) / 2);
    return a < 0 && b < 0 || a >= 0 && b >= 0 ? C : -C - 1;
}
Nucleon answered 14/12, 2012 at 7:16 Comment(2)
What would be the modified (for signed integers) unhash function?Frankenstein
@NoeticJun, I dont have it now. I will post it when I'm free. It wont be very difficult. Just reverse the process.Nucleon
R
0

Botz3000 gives the "correct" solution. I'd just add: to solve the problem, you must know the maximum possible size of each number, and an acceptable result must be as large as the sum of the sizes of the two numbers. That is, if each number is guaranteed to fit in 32 bits, as Botz3000 assumes, then the result will require 64 bits. If that is not acceptable -- if, say, you have a requirement that the input will be two 32-bit numbers and the output must fit in 32 bits -- then the problem is not solvable, because there aren't enough possible different answers.

If that's not clear, consider a trivial case: suppose the inputs are each 1 bit, 0 or 1. So there are two possible values for each number, 2x2=4 possible combinations. Therefore your output must be at least 2 bits. As you say that f(x,y)=f(y,x), you reduce the total number of possible answers by a factor somewhat less than 2. Again, in the 1 bit example, there are only 3 distinct possibilities: 0,0; 0,1; and 1,1. 1,0 isn't a distinct possibility because it's the same as 0,1.
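The counting argument can be checked with a short sketch. The names `PairCount`, `UnorderedPairs`, and `BitsNeeded` are made up for illustration; the only assumption is that each input has a fixed number of bits. With N = 2^bits possible values per input, there are N(N+1)/2 unordered pairs.

```csharp
using System;
using System.Numerics;

static class PairCount
{
    // Number of distinct unordered pairs {x, y} when each value has `bits` bits.
    public static BigInteger UnorderedPairs(int bits)
    {
        BigInteger n = BigInteger.One << bits;   // possible values per input
        return n * (n + 1) / 2;                  // pairs with x <= y
    }

    // Smallest number of output bits that can give every pair its own label.
    public static int BitsNeeded(BigInteger count)
    {
        int b = 0;
        while ((BigInteger.One << b) < count) b++;
        return b;
    }
}
```

For 1-bit inputs this gives 3 pairs and 2 output bits, matching the example above. For 32-bit inputs the count is 2^63 + 2^31, which is just over 2^63, so the symmetry saves slightly less than one bit and a 64-bit result is still required.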

Rodgers answered 19/11, 2010 at 18:14 Comment(0)
N
0

First, you have to know that you cannot pack two values as large as int.MaxValue into one int and keep the result unique. Also, @Botz3000's answer doesn't give distinct values for F(1,2) and F(2,1), so you can use this method:

public static long GetFixedCode(int x, int y)
{
    return BitConverter.ToInt64(BitConverter.GetBytes(x).Concat(BitConverter.GetBytes(y)).ToArray(), 0);
}

This will work for any input, and you can change the result and parameters to short, ushort, int, or uint, or the result to ulong, because it works with bytes. You just need to change the BitConverter method to match the type you want to convert to.

Example that yields a smaller value (two small ints give a small long):

public static ulong GetFixedCode(uint x, uint y)
{
    var array1 = BitConverter.GetBytes(x);
    var array2 = BitConverter.GetBytes(y);
    List<byte> resultArray = new List<byte>();
    resultArray.AddRange(array1.ToList().GetRange(0, 2));
    resultArray.AddRange(array2.ToList().GetRange(0, 2));
    resultArray.AddRange(array1.ToList().GetRange(2, 2));
    resultArray.AddRange(array2.ToList().GetRange(2, 2));

    return BitConverter.ToUInt64(resultArray.ToArray(), 0);
}
Nonstriated answered 15/5, 2018 at 7:42 Comment(0)
L
-1

If you can represent it as a string, this should work:

Hash((int1 | int2).ToString());

Like so:

public static string Hash(string plaintext)
{
    var hashAlgorithm = new SHA1CryptoServiceProvider();
    var unhashedBuffer = Encoding.Default.GetBytes(plaintext);
    var hashedBuffer = hashAlgorithm.ComputeHash(unhashedBuffer);
    return Convert.ToBase64String(hashedBuffer);
}
Lightship answered 19/11, 2010 at 15:18 Comment(4)
the problem is that there would be no difference between 102323 & 244 and 1023 & 23244Xl
How do i know in what order should i concatenate it?Detta
No, or'ing them still wouldn't work... you'd still have collisions. The only way is to do the bit-shiftXl
-1. How is this even returning a number? Not sure how you can construct a unique number from the byte[] or the encoded string..Nucleon
N
-1

You can combine the two numbers into a string, and generate a hash based on that string using SHA1.

Nihil answered 19/11, 2010 at 15:19 Comment(1)
I can concatenate it either way. How would i know that?Detta
P
-1

If X and Y are ints, add a separator. Always unique.

X = 100, Y = 5 => 100.5

X = 1023232, Y = 44 => 1023232.44

Polo answered 19/11, 2010 at 15:23 Comment(5)
Good idea - how do you guarantee no loss of precision here though?Gouda
Using a datatype which can store 64 bits of dataPolo
-1 since this gives same result for 100,5 and 100,50. Also take into consideration the mess if your numbers were large, the resultant might suffer from precision errors..Nucleon
@Nucleon I love it how you react to a question from two years ago :D.Polo
@rdkleine haha, the questions here remain for ever (hopefully) or as long. We should exercise our suffrage rightfully to alert ppl of good answers and (relatively) poor answers! :)Nucleon
