Addition and multiplication in a Galois Field
Asked Answered
C

2

14

I am attempting to generate QR codes on an extremely limited embedded platform. Everything in the specification seems fairly straightforward except for generating the error correction codewords. I have looked at a bunch of existing implementations, and they all try to implement a bunch of polynomial math that goes straight over my head, particularly with regards to the Galois fields. The most straightforward way I can see, both in mathematical complexity and in memory requirements is a circuit concept that is laid out in the spec itself:

circuit diagram

With their description, I am fairly confident I could implement this with the exception of the parts labeled GF(256) addition and GF(256) Multiplication.

They offer this help:

The polynomial arithmetic for QR Code shall be calculated using bit-wise modulo 2 arithmetic and byte-wise modulo 100011101 arithmetic. This is a Galois field of 2^8 with 100011101 representing the field's prime modulus polynomial x^8+x^4+x^3+x^2+1.

which is all pretty much greek to me.

So my question is this: What is the easiest way to perform addition and multiplication in this kind of Galois field arithmetic? Assume both input numbers are 8 bits wide, and my output needs to be 8 bits wide also. Several implementations precalculate, or hardcode in two lookup tables to help with this, but I am not sure how those are calculated, or how I would use them in this situation. I would rather not take the 512 byte memory hit for the two tables, but it really depends on what the alternative is. I really just need help understanding how to do a single multiplication and addition operation in this circuit.

Chanel answered 9/12, 2011 at 3:21 Comment(0)
T
11

In practice only one table is needed. That would be for the GP(256) multiply. Note that all arithmetic is carry-less, meaning that there is no carry-propagation.

Addition and subtraction without carry is equivalent to an xor.

So in GF(256), a + b and a - b are both equivalent to a xor b.

GF(256) multiplication is also carry-less, and can be done using carry-less multiplication in a similar way with carry-less addition/subtraction. This can be done efficiently with hardware support via say Intel's CLMUL instruction set.

However, the hard part, is reducing the modulo 100011101. In normal integer division, you do it using a series of compare/subtract steps. In GF(256), you do it in a nearly identical manner using a series of compare/xor steps.

In fact, it's bad enough where it's still faster to just precompute all 256 x 256 multiplies and put them into a 65536-entry look-up table.

page 3 of the following pdf has a pretty good reference on GF256 arithmetic:

http://www.eecs.harvard.edu/~michaelm/CS222/eccnotes.pdf

Torchwood answered 9/12, 2011 at 3:37 Comment(13)
Reducing addition to xor is great. a 65k lookup table is out of the question when I am working with <2k of ram. I can sacrifice time for memory when necessary, but I am still not clear on how a multiplication would work the long way.Chanel
It's actually very similar to the shift-add binary-multiplication algorithm. Except that you replace all the adds with xors.Torchwood
When does the 100011101 come into play? Furthermore, why is that 9 bits when each byte is only 8?Chanel
It comes in when you do the modulus. The extra bit is implicit. (though it's been a while so I'm not sure how to explain it) I'll see if I can dig up my example codeTorchwood
I found my code that does the multiply-mod in GF(256). But it relies on the CLMUL instruction. I don't have time right now to expand on it, so I'll do it tomorrow. (unless someone gives a better answer by then)Torchwood
looking at the zxing source: they use this formula for multiplication: int logSum = logTable[a] + logTable[b]; return expTable[(logSum % size) + logSum / size]; Is that sufficient? It would require both tables, but that may be acceptable.Chanel
I've never seen that formula before - let alone know how it works. Are you sure that is for GF(256)? It seems like log-tables shouldn't work in modular field. (though I may be completely wrong)Torchwood
Both 256 byte tables are precalculated through a process I don't understand. They are essentially inverses of each other. Source is code.google.com/p/zxing/source/browse/trunk/core/src/com/google/…Chanel
I'll take a look at that link tomorrow after my final. As weak as my math is, this is a topic that fascinates me.Torchwood
@Torchwood - 10010010 is not a valid poly for GF(256), it should be 9 bits, and the least significant bit needs to be 1. It's not an extra bit, the modulo operation divides some n bit result by the 9 bit polynomial, resulting in an 8 bit remainder, and that remainder is the modulus. For 100011101, all non-zero numbers are powers of 1x+0 == 2. Multiply can be done using product = exp[(log[a]+log[b])%255]. The %255 can be done with if (sum >= 255) sum -= 255.Intact
@Intact That's most likely a typo on my part. But given that it's been so many years, I do not recall what I actually used.Torchwood
@Torchwood - reducing modulo can be done using carryless multiplies using the "inverse" of the polynomial. Using P(x) to represent the polynomial, then using 64 bit carryless math, the modulo of A = A xor ( (A x (2^64 / P(x) ) x P(x) ) (along with some shifts), where (2^64 / P(x)) is a pre-computed constant.Intact
@Torchwood - Here is a lengthy example (500+ lines) using xmm registers and pclmulqdq , and here is a Intel pdf doc . Look at the "Barret reduction part", where constants for (2^64 / P(x)) and P(x) are used along with shifts.Intact
P
7

(I'm following up on the pointer to zxing in the first answer, since I'm the author.)

The answer about addition is exactly right; that's why working in this field is convenient on a computer.

See http://code.google.com/p/zxing/source/browse/trunk/core/src/com/google/zxing/common/reedsolomon/GenericGF.java

Yes multiplication works, and is for GF256. a * b is really the same as exp(log(a) + log(b)). And because GF256 has only 256 elements, there are only 255 unique powers of "x", and same for log. So these are easy to put in a lookup table. The tables would "wrap around" at 256, so that is why you see the "% size". "/ size" is slightly harder to explain in a sentence -- it's because really 1-255 "wrap around", not 0-255. So it's not quite just a simple modulus that's needed.

The final piece perhaps is how you reduce modulo an irreducible polynomial. The irreducibly polynomial is x^8 plus some lower-power terms, right -- call it I(x) = x^8 + R(x). And the polynomial is congruent to 0 in the field, by definition; I(x) == 0. So x^8 == -R(x). And, conveniently, addition and subtraction are the same, so x^8 == -R(x) == R(x).

The only time we need to reduce higher-power polynomials is when constructing the exponents table. You just keep multiplying by x (which is a shift left) until it gets too big -- gets an x^8 term. But x^8 is the same as R(x). So you take out the x^8 and add in R(x). R(x) merely has powers up to x^7 so it's all in a byte still, all in GF(256). And you know how to add in this field.

Helps?

Pandich answered 9/12, 2011 at 17:27 Comment(4)
So if I read correctly, simply using bytes with wraparound and adding would not be adequate? Div and mod are rather expensive operations, so an addition only way would be best. I can implement byte addition that wraps 254->255->1->2 if necessary.Chanel
Ah the expert finally steps in... +1Torchwood
I'm actually having issues getting a reference implementation of the circuit concept working, even using the zxing code for the GF addition and multiplication. Maybe I will post details to the zxing forum.Chanel
If you know that the field size is 256, then the / and % operations are really just >>8 and &0xFF. I imagine you could optimize this a little more. But, in the QR code decoding process, the Reed-Solomon error correction process is a very small part. I would be surprised if this is your bottleneck at all.Pandich

© 2022 - 2024 — McMap. All rights reserved.