What is the math behind * 1233 >> 12 in this code counting decimal digits

About

Asked 13/12, 2017 at 5:29 Answered 13/12, 2017 at 5:41

Solved c++optimization bit-manipulation unsigned fmt

I am a bit confused how this short function from the C++ {fmt} library works.

inline std::uint32_t digits10_clz(std::uint32_t n) {
  std::uint32_t t = (32 - __builtin_clz(n | 1)) * 1233 >> 12;
  return t - (n < powers_of_10_u32[t]) + 1;
}

I understand the logic that you can approximate the log10 using log2(__builtin_clz) and that you need to adjust for exact value, but the multiplication is a mystery to me.

Estop answered 13/12, 2017 at 5:29 Comment(0)

Recall the formula for changing the base of logarithm from b to d is

log_dx = log_bx / log_bd

In our case, b is 2 (binary), and d is 10 (decimal). Hence, you need to divide by log₂10, which is the same as multiplying by 1/log₂10, i.e by 0.30102999566.

Now recall that shifting by 12 is the same as dividing by 2¹², which is 4096. Dividing 1233 by 4096 yields 0.30102539062, which is a pretty good approximation for the denominator in the base change formula.

Slipsheet answered 13/12, 2017 at 5:41 Comment(9)

I have a further question. This trick of optimizing division is usually given by compiler and doesn't need to be in source code. Any possible reason it appears in source code this time? – Socialite 13/12, 2017 at 5:43

@NickyC It's hard to say - perhaps they copied this from the source of bit tricks, and the author who wrote it there does not trust the compiler to optimize his code. I would replace the shift by division without hesitation. – Slipsheet 13/12, 2017 at 5:50

@NickyC: because it's faster, and reliable. The other option to use floating point arithmetic is slower, and not reliable among platforms. The compiler doesn't do such optimizations, so this have to be in the source code. – Pirn 13/12, 2017 at 9:38

@geza: That's untrue. Compilers have been doing such optimizations for the last 25 years. It's utterly reliable. (Note that the division in question is /4096) – Shellans 13/12, 2017 at 10:59

@MSalters: okay, if we're talking about the /4096 -> >>12 transformation, then of course, this will be done by the compiler (for the unsigned case, they are exactly the same. For the signed case, they differ a little bit). I think I misunderstood Nicky's the question here... I was talking about the transformation of *0.301... -> *1233>>12. – Pirn 13/12, 2017 at 11:12

@MSalters: I misunderstood it, because there's a little typo in the answer. (I thought that Nicky's talking about this divison: "Hence, you need to divide by") – Pirn 13/12, 2017 at 11:18

@dasblinkenlight: There's a little typo in the answer: you either need to multiply by log10(2), or divide by log2(10). – Pirn 13/12, 2017 at 11:19

@Pirn Thanks for the comment! I switched b and d in the formula, and edited the rest of the answer to match. – Slipsheet 13/12, 2017 at 11:30

The author of fmtlib here: as @dasblinkenlight correctly guessed the formula is indeed taken from graphics.stanford.edu/~seander/bithacks.html#IntegerLog10 as pointed out in the comment github.com/fmtlib/fmt/blob/… – Araucaria 30/12, 2017 at 2:20

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags