Gray code algorithm (32 bit or less)
Asked Answered
L

1

2

I have recently come across the Gray code, and I have been quite stuck trying to wrap my head around the efficient algorithm used to convert Gray code back to binary (32 bits or less).

num = num ^ (num >> 16);
num = num ^ (num >> 8);
num = num ^ (num >> 4);
num = num ^ (num >> 2);
num = num ^ (num >> 1);

This is the code that I am talking. Now here are my questions:

  • What is the difference between this and the normal code (right shift by 1 and XOR until mask == 0)?
  • Why are 16, 8, 4, 2, 1 used specifically as opposed to any other number less than 32 bit?
  • What is the difference if we do it in reverse as in:

    num = num ^ (num >> 1);
    num = num ^ (num >> 2);
    num = num ^ (num >> 4);
    num = num ^ (num >> 8);
    num = num ^ (num >> 16);
    

    I have tried this and it seems to yield the same result.

Lukash answered 5/5, 2016 at 10:15 Comment(1)
I have tried [reverse] and it seems to yield the same result. Can you prove it does? What is the difference between [shift by powers of 2] and the [basic algorithm] (right shift by 1…) Try smaller examples.Azaleeazan
L
5

For example, in bits (let's take 8)

h g^h f^g e^f d^e c^d b^c a^b

so what happens if we apply x ^= x >> 1? We get this

h g f^h e^g d^f c^e b^d a^c

That looks like what we started with only as though it was made by x ^ (x >> 2) instead of x ^ (x >> 1), so same idea to reverse only with a shift of 2:

h g f e d^h c^g b^f a^e

Looking good, and now it's obvious why x ^= x >> 4 will completely turn it back to normal. For more bits, the same pattern just goes on for a while longer.

An other way to look at this, is temporarily reversing the bits, turning the "to gray" into x ⊗ 3 with being multiplication in GF(2k), multiplication by odd numbers is invertible in GF(2k) and the multiplicative inverse of 3 is "all bits set", which you can find as follows:

  • start with y=3 and a temporary inverse i=1
  • kill the first bit in y (not the lsb) by XORing it with 3, set the respective bit in the inverse
  • loop until y=1

So the first steps are y=3, i=1, y=5, i=3, y=9, i=7 etc until you set all bits in i, lets call that final i inv.

Then we have (x ⊗ 3) ⊗ inv = x ⊗ (3 ⊗ inv) = x ⊗ 1 = x

Multiplying by "all bits set" means that every bit ends up being the XOR of itself and all lower bits, which you can do with

x ^= x << 1
x ^= x << 2
x ^= x << 4
...

First all bits are XORed by the bit next to them, then by the two next bits (which were already XORed together so this takes only one step), then by the next four bits etc.

Reverse the bits again to get what you started with.

But now the fun stuff.

why does the order not matter?

(yes in fact you can not only reverse the steps, you can shuffle them arbitrarily)

Ok, reverse the bits and go back to GF(2k). An other way to write every line then is

x = x ⊗ 3
x = x ⊗ 5
x = x ⊗ 17
...

With the end result of course being ((x ⊗ 3) ⊗ 5) ⊗ 17 = x ⊗ (3 ⊗ 5 ⊗ 17) = x ⊗ 127

Multiplication in GF(2k) is pretty nice and commutative, so it could just have well have been done in any order.

what about other numbers

Sure, as long a their product is inv. But all other choices lead to annoying/many multiplicands showing up. For example maybe we want 9 to be a factor, then the rest is 199, which can decompose to 9 ⊗ 63, and so on, but this keeps going for a while until you have maybe 3, 5, 9, 9, 17, 65 and that's just terrible (and note that 9 ⊗ 9 ⊗ 65 = 1 anyway if there are 8 bits so yea, just kick that out and be back to the original 3, 5, 17). Possible, though.

Lungan answered 5/5, 2016 at 10:58 Comment(3)
Thanks for the response, I've heard that the 1->2->4->8->16 has to do something with parallel prefix computations, what role does parallel prefix operations play over here?Lukash
@Lukash that's what they call it when they compute something cumulative in a parallel way, so here we have the prefix-XOR and compute it with bit-level parallelismLungan
Fun fact: gray->binary decoding can be done with one carryless multiply, according to twitter.com/rygorous/status/1050461906073341952 - for a 32-bit int in the bottom of __m128i vx, _mm_clmulepi64_si128(vx, _mm_set_epi32(0,0,1,-2), 0) produces the result in the top half of the low qword - _mm_extract_epi32(prod, 1).Percolation

© 2022 - 2024 — McMap. All rights reserved.