Hardware implementation of square root?
Asked Answered
S

2

9

I'm trying to find a little bit more information for efficient square root algorithms which are most likely implemented on FPGA. A lot of algorithms are found already but which one are for example from Intel or AMD? By efficient I mean they are either really fast or they don't need much memory.

EDIT: I should probably mention that the question is generally a floating point number and since most of the hardware implements the IEEE 754 standard where the number is represented as: 1 sign bit, 8 bits biased exponent and 23 bits mantissa.

Thanks!

Sharpen answered 15/1, 2012 at 17:1 Comment(11)
#1529227 has detailed information.Genip
Why not implement this? You only do shifts and adds and no extra memory is needed for things like look-up tables. Looks like a good candidate for an FPGA.Femmine
Thanks for the comment @Alex. I'll try to find some more resources, because I still don't how can I implement that in VHDL. One more question, doesn't that find just the integer part of sqrt?Sharpen
Do you want to solve one square root as fast as possible, or solve a continuous stream of square roots as fast as possible?Jadeite
@DimitarPetrov: right, that particular piece of code calculates the integer square root of an integer. But you can reuse it for floating-point values too because sqrt(mantissa*2^exponent)=sqrt(mantissa)*2^(exponent/2) and you can always represent your number as an integer mantissa times some even power of 2. You should have included the details about floating-point square root and VHDL in the question. Actually, the VHDL probably deserves a separate question.Femmine
@GregS: Doesn't it mean that solving one as fast as possible will solve a continious stream also fast as possible? I was asking for one square root. Alex: The VHDL part is not so important right now, so that's why I didn't add that to the question. I'll try to implement it later.Sharpen
@DimitarPetrov: No. For example one may take 8 ticks per square root, while the other may take 16 ticks before the first answer comes out of the pipeline, but then each successive answer comes out one tick later.Jadeite
Thanks for the naswer @GregS. So any recommendations for algorithm calculating fast single square root?Sharpen
@DimitarPetrov: No, I'm useless in that regard, though I think Paul S has the correct answer.Jadeite
@Alex: "sqrt(mantissa*2^exponent)=sqrt(mantissa)*2^(exponent/2)" is true, but sqrt(mantissa) might be outside of the range 0.25->0.5. If that's the case you'll need to re-normalise it, and have to adjust the exponent +/- 1. This is just a final stage on the output though.Krakow
@PaulS: Of course one will need to make sure the result is properly normalized if possible.Femmine
K
6

Not a full solution, but a couple of pointers.

I assume you're working in floating point, so point 1 is remember that floating point is stored as a mantissa and exponent. The exponent of the square root will be approximately half the exponent of the original number thanks to logarithms.

Then the mantissa can be approximated with a look-up table, and then you can use a couple of newton-raphson rounds to give some accuracy to the result from the LUT.

I haven't implemented anything like this for about 8 years, but I think this is how I did it and was able to get a result in 3 or 4 cycles.

Krakow answered 16/1, 2012 at 2:9 Comment(3)
Thanks Paul! Can you point me to particular algorithm? Yep, I'm working in floating point and just editted my question... or could you expand a little bit your explanation because I didn't quite understand everyhing :)Sharpen
Unfortunately it's been so long that I can't remember the precise details, and it was for a previous employer so I can't look it up either. If you have specific questions I'll do my best to answer them.Krakow
Thanks @Paul. I've marked your question as best answer, since that gave me some ideas and point me (hopefully) in the right direction. ThanksSharpen
J
2

This is a great one for fast inverse-quare root.
Have a look at it here. Notice it's pretty much about the initial guess, rather amazing document :)

Jotham answered 15/1, 2012 at 17:7 Comment(4)
Thanks a lot! I've seen that already and looks pretty impresive, however the "magic number" scared me a little bit in the beginning. I will take a look again :)Sharpen
I don't feel this answer is really relevant to the question. This algorithm only computes a rough approximation of the inverse of the square root, and definitely is not was is implemented on FPGA'sOverwind
Thanks Sven. Does most implementations on FPGAs just use some variant on the Newton-Raphson method? Like some inversion to get rid from the division, which itself is a expensive operation?Sharpen
@DimitarPetrov The magic number is only required for the inverse square root. The square root can be achieved using a right shift and a few rounds of Newton-Raphson.Elmaelmajian

© 2022 - 2024 — McMap. All rights reserved.