I'm using the "De Bruijn" algorithm to find the number of binary digits a big number (up to 64 bits) has.
For example:
- 1022 has 10 digits in binary.
- 130 has 8 digits in binary.
I found that a table lookup based on a De Bruijn sequence lets me calculate this about 100 times faster than conventional ways (powers, squares, ...).
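For context, this is the kind of conventional baseline I am comparing against, as a minimal sketch (the method name is hypothetical, and this is not the exact code I benchmarked):

public static int GetBitCount_Pow2(ulong v)
{
    int digits = 1;
    ulong power = 1;               // 2^0
    while (power <= v / 2)         // compare against v/2 so power * 2 cannot overflow
    {
        power *= 2;                // next power of two
        digits++;
    }
    return digits;
}

Each call walks through up to 63 doublings, while the De Bruijn lookup is a single constant-time multiply and table read, which is why it wins so clearly.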
According to this website, an order-6 De Bruijn sequence (2^6 = 64 entries) provides the table needed for 64-bit numbers. This would be the table expressed in C#:
static readonly int[] MultiplyDeBruijnBitPosition2 = new int[64]
{
     0,  1,  2,  4,  8, 17, 34,  5, 11, 23, 47, 31, 63, 62, 61, 59,
    55, 46, 29, 58, 53, 43, 22, 44, 24, 49, 35,  7, 15, 30, 60, 57,
    51, 38, 12, 25, 50, 36,  9, 18, 37, 10, 21, 42, 20, 41, 19, 39,
    14, 28, 56, 48, 33,  3,  6, 13, 27, 54, 45, 26, 52, 40, 16, 32
};
(I don't know if I copied the table from that website correctly.) Then, based on R..'s comment here, I should use the following to index the table with the 64-bit unsigned input:
public static int GetLog2_DeBruijn(ulong v)
{
    // multiply-and-shift hash: the top 6 bits of the 64-bit product index the table
    return MultiplyDeBruijnBitPosition2[(ulong)(v * 0x022fdd63cc95386dull) >> 58];
}
But the C# compiler doesn't allow me to use "0x022fdd63cc95386dull" because it overflows 64 bits, and I have to use "0x022fdd63cc95386d" instead.
Using that code, the problem is that I am not getting the correct result for the given input.
For example, doing 1,000,000 calculations on the number 17012389719861204799 (all 64 bits used), this is the result:
- Using the pow2 method I get the result 64 one million times in 1380 ms.
- Using the De Bruijn method I get the result 40 one million times in 32 ms. (I don't know why 40.)
I'm trying to understand how "De Bruijn" works, how I can fix this, and how to write final C# code to handle numbers up to 64 bits.
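For reference, here is a sketch of what I now believe a working 64-bit version looks like. Two things must hold that the code above does not guarantee: the multiply-and-shift is a perfect hash only when exactly one bit of the input is set (which is why arbitrary inputs give wrong answers like 40), and the table must match the constant. This is my own reconstruction (the names are mine, and it assumes the constant really is an order-6 De Bruijn sequence, as the linked site claims); it builds the table from the constant at startup instead of trusting a transcription:

static class DeBruijnBitLength
{
    const ulong DeBruijn64 = 0x022fdd63cc95386dUL;   // note: UL suffix, not C's "ull"

    // Build the table from the constant itself so the two cannot disagree:
    // for v = 2^i the product is DeBruijn64 << i, and its top 6 bits are unique.
    static readonly int[] Table = BuildTable();

    static int[] BuildTable()
    {
        var table = new int[64];
        for (int i = 0; i < 64; i++)
            table[(DeBruijn64 << i) >> 58] = i;
        return table;
    }

    public static int Log2(ulong v)
    {
        // Smear the most significant bit into every lower position...
        v |= v >> 1;  v |= v >> 2;  v |= v >> 4;
        v |= v >> 8;  v |= v >> 16; v |= v >> 32;
        // ...then keep only the MSB, so exactly one bit is set.
        v ^= v >> 1;
        return Table[(v * DeBruijn64) >> 58];
    }

    // Number of binary digits of v, for v > 0: floor(log2(v)) + 1.
    public static int GetBitLength(ulong v)
    {
        return Log2(v) + 1;
    }
}

With this, GetBitLength(1022) returns 10 and GetBitLength(17012389719861204799) returns 64.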
UPDATE and benchmarks of different solutions
I was looking for the fastest algorithm to get the number of binary digits of a given unsigned 64-bit number in C# (a ulong).
For example:
- 1024 has 11 binary digits: 1024 = 2^10, and log2(1024) + 1 = 11.
- 9223372036854775808 has 64 binary digits: it is 2^63, and log2(2^63) + 1 = 64.
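To sanity-check the digits = log2 + 1 rule above, one can compare any of the methods below against the built-in System.Numerics.BitOperations.Log2 on modern .NET (Core 3.0 and later; it did not exist when I wrote this question):

using System;
using System.Numerics;

class FormulaCheck
{
    static void Main()
    {
        // BitOperations.Log2 returns floor(log2(v)); the digit count is that + 1.
        foreach (ulong v in new ulong[] { 1022, 130, 1024, 9223372036854775808UL })
            Console.WriteLine(v + " has " + (BitOperations.Log2(v) + 1) + " binary digits");
    }
}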
The conventional power-of-2-and-square approach is extremely slow: just 10,000 calculations need 1500 ms to get the answer (100M calculations would need hours).
Here, Niklas B., Jim Mischel, and Spender provided different methods to make this faster:
- SIMD and SWAR Techniques //Provided by Spender (Answer here)
- De_Bruijn Split 32bits //Provided by Jim Mischel (Answer here; see the sketch after this list)
- De_Bruijn 64bits version //Provided by Niklas B. (Answer here)
- De_Bruijn 128bits version //Also provided by Niklas B. (Answer here)
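The split 32-bit idea, as I understand it (this is my own reconstruction, not Jim Mischel's actual answer): test the high 32-bit half once, then run a 32-bit De Bruijn lookup on whichever half holds the most significant bit. 0x077CB531 is the classic order-5 De Bruijn constant, and the table is again built from it at startup:

const uint DeBruijn32 = 0x077CB531u;           // classic B(2,5) De Bruijn sequence
static readonly int[] Table32 = BuildTable32();

static int[] BuildTable32()
{
    var t = new int[32];
    for (int i = 0; i < 32; i++)
        t[(DeBruijn32 << i) >> 27] = i;        // the hash of 2^i maps back to i
    return t;
}

static int Log2_32(uint v)
{
    v |= v >> 1; v |= v >> 2; v |= v >> 4;     // smear the MSB downward...
    v |= v >> 8; v |= v >> 16;
    v ^= v >> 1;                               // ...then keep only the MSB
    return Table32[(v * DeBruijn32) >> 27];
}

public static int GetBitCount_Split(ulong v)
{
    uint hi = (uint)(v >> 32);
    return hi != 0
        ? 33 + Log2_32(hi)                     // MSB in the high half: 32 + log2 + 1
        : 1 + Log2_32((uint)v);                // MSB in the low half: log2 + 1
}

The appeal of splitting is that the multiply and table stay in 32 bits, which was cheaper on 32-bit hardware.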
Testing these methods on a Q6600 CPU overclocked to 3 GHz under Windows 7 (64-bit), it takes just a few seconds to answer 100,000,000 requests correctly, with the De_Bruijn 128bits version being the fastest.
Thanks a lot to all of you; you helped me a lot with this. I hope it helps you too.
Comments:
- "0x022fdd63cc95386dull gives a compiler error. Wouldn't it be 0x022fdd63cc95386dUL? It doesn't overflow 64 bits here." – Pincince
- "…>2^63 thanks to the branch." – Eyla
- "…>2^63 makes all the operations 2 ms slower, but when the number is bigger than 2^63 it is significantly faster (it needs 1/3 of the normal time), so in the end, in this case, at scale, the branch compensates for the lag of having it. But as I was saying, I'm using the 128-bit version, which is always faster than any other. Thank you again." – Eyla