What is the math behind * 1233 >> 12 in this code counting decimal digits
Asked Answered
E

1

5

I am a bit confused how this short function from the C++ {fmt} library works.

inline std::uint32_t digits10_clz(std::uint32_t n) {
  std::uint32_t t = (32 - __builtin_clz(n | 1)) * 1233 >> 12;
  return t - (n < powers_of_10_u32[t]) + 1;
}

I understand the logic that you can approximate the log10 using log2(__builtin_clz) and that you need to adjust for exact value, but the multiplication is a mystery to me.

Estop answered 13/12, 2017 at 5:29 Comment(0)
S
10

Recall the formula for changing the base of logarithm from b to d is

logdx = logbx / logbd

In our case, b is 2 (binary), and d is 10 (decimal). Hence, you need to divide by log210, which is the same as multiplying by 1/log210, i.e by 0.30102999566.

Now recall that shifting by 12 is the same as dividing by 212, which is 4096. Dividing 1233 by 4096 yields 0.30102539062, which is a pretty good approximation for the denominator in the base change formula.

Slipsheet answered 13/12, 2017 at 5:41 Comment(9)
I have a further question. This trick of optimizing division is usually given by compiler and doesn't need to be in source code. Any possible reason it appears in source code this time?Socialite
@NickyC It's hard to say - perhaps they copied this from the source of bit tricks, and the author who wrote it there does not trust the compiler to optimize his code. I would replace the shift by division without hesitation.Slipsheet
@NickyC: because it's faster, and reliable. The other option to use floating point arithmetic is slower, and not reliable among platforms. The compiler doesn't do such optimizations, so this have to be in the source code.Pirn
@geza: That's untrue. Compilers have been doing such optimizations for the last 25 years. It's utterly reliable. (Note that the division in question is /4096)Shellans
@MSalters: okay, if we're talking about the /4096 -> >>12 transformation, then of course, this will be done by the compiler (for the unsigned case, they are exactly the same. For the signed case, they differ a little bit). I think I misunderstood Nicky's the question here... I was talking about the transformation of *0.301... -> *1233>>12.Pirn
@MSalters: I misunderstood it, because there's a little typo in the answer. (I thought that Nicky's talking about this divison: "Hence, you need to divide by")Pirn
@dasblinkenlight: There's a little typo in the answer: you either need to multiply by log10(2), or divide by log2(10).Pirn
@Pirn Thanks for the comment! I switched b and d in the formula, and edited the rest of the answer to match.Slipsheet
The author of fmtlib here: as @dasblinkenlight correctly guessed the formula is indeed taken from graphics.stanford.edu/~seander/bithacks.html#IntegerLog10 as pointed out in the comment github.com/fmtlib/fmt/blob/…Araucaria

© 2022 - 2024 — McMap. All rights reserved.