Floats vs rationals in arbitrary precision fractional arithmetic (C/C++)

There are two ways of implementing an AP fractional number: one is to emulate the storage and behavior of the double data type, only with more bytes, and the other is to use an existing integer APA implementation to represent a fractional number as a rational, i.e. as a pair of integers, numerator and denominator. Which of the two is more likely to deliver efficient arithmetic in terms of performance? (Memory usage is really of minor concern.)

I'm aware of the existing C/C++ libraries, some of which offer fractional APA with "floats" and others with rationals (none of them features fixed-point APA, however). Of course I could benchmark a library that relies on a "float" implementation against one that uses a rational implementation, but the results would largely depend on the implementation details of the particular libraries I would have to pick, more or less at random, from the nearly ten available ones. So it's the theoretical pros and cons of the two approaches that I'm interested in (or three, if fixed-point APA is taken into consideration).

Schwejda answered 3/8, 2012 at 15:23 Comment(6)
Do you really need a floating-point number? Floating-point generally refers to representations that provide non-uniform precision: more "concentrated" around 0, dropping as we get farther away from 0, and catastrophically "sparse" at the remote ends of the range. Do you really need this property? Or did you just use floating-point as a generic term for any fractional number? In other words, why isn't fixed-point arithmetic considered?Cinder
@Desmond Hume: The reason there are no fixed-point libraries, as I said in my answer, is that fixed-point does not require a dedicated "fixed-point" library. Fixed-point arithmetic is just integer arithmetic after all data has been multiplied by some constant factor (with some minor adjustments). In other words, any big-integer library (again: the key word being integer) serves at the same time as a fixed-point fractional library.Cinder
@DesmondHume: fixed point is merely a special case/optimization of fractional/rational.Coldhearted
@AndreyT: There are definite pros to having at least a fixed-point wrapper, though. Multiplication/division of fixed-point values is not just multiplying them. (Also I/O and conversions and such.)Coldhearted
@Mooing Duck: You are formally right about mul and div. However, in many practical cases it is often necessary to calculate values using a balanced number of muls and divs, like x = y * a / b. In such cases the post-adjustment for mul and div is not necessary. This is the beauty of fixed point, since in such cases one can easily optimize and just use full efficiency of integer operations. In my applications (computational geometry) perfectly balanced muls/divs are actually encountered more often than non-balanced ones, which is what makes fixed-point especially attractive.Cinder
Floats are rationals too by the way, with the denominator constrained to powers of two, which makes normalization cheap (shifts instead of dividing by the gcd).Eldreeda

The question is what you mean by the arbitrary precision you mention in the title. Does it mean "arbitrary, but pre-determined at compile-time and fixed at run-time"? Or does it mean "infinite, i.e. extendable at run-time to represent any rational number"?

In the former case (precision customizable at compile-time, but fixed afterwards) I'd say that one of the most efficient solutions would actually be fixed-point arithmetic (i.e. none of the two you mentioned).

Firstly, fixed-point arithmetic does not require any dedicated library for basic arithmetic operations. It is just a concept overlaid over integer arithmetic. This means that if you really need a lot of digits after the dot, you can take any big-integer library, multiply all your data, say, by 2^64 and you basically immediately get fixed-point arithmetic with 64 binary digits after the dot (at least as far as arithmetic operations are concerned, with some extra adjustments for multiplication and division). This is typically significantly more efficient than floating-point or rational representations.

Note also that in many practical applications multiplication operations are often accompanied by division operations (as in x = y * a / b) that "compensate" for each other, meaning that often it is unnecessary to perform any adjustments for such multiplications and divisions. This also contributes to efficiency of fixed-point arithmetic.

Secondly, fixed-point arithmetic provides uniform precision across the entire range. This is not true for either floating-point or rational representations, which in some applications could be a significant drawback for the latter two approaches (or a benefit, depending on what you need).

So, again, why are you considering floating-point and rational representations only? Is there something that prevents you from considering fixed-point representation?

Cinder answered 3/8, 2012 at 16:47 Comment(1)
I'll note that when you're dealing with arbitrary precision numbers, there is effectively no (performance) difference between floating-point and fixed-point numbers.Hynes

Since no one else seemed to mention this, rationals and floats represent different sets of numbers. The value 1/3 can be represented precisely with a rational, but not with a float. Even an arbitrary-precision float would take infinitely many mantissa bits to represent 1/3, because its binary expansion repeats forever. This is because a float is effectively a rational whose denominator is constrained to be a power of 2. An arbitrary-precision rational can represent everything that an arbitrary-precision float can and more, because the denominator can be any integer instead of just a power of 2. (That is, unless I've horribly misunderstood how arbitrary-precision floats are implemented.)

This is in response to your prompt for theoretical pros and cons.

I know you didn't ask about memory usage, but here's a theoretical comparison in case anyone else is interested. Rationals, as mentioned above, specialize in numbers that can be represented simply in fractional notation, like 1/3 or 492113/203233, while floats specialize in numbers that are simple to represent in scientific notation with powers of 2, like 5*2^45 or 91537*2^203233. The amount of ASCII typing needed to write the numbers in their respective human-readable forms is roughly proportional to their memory usage.

Please correct me in the comments if I've gotten any of this wrong.

Yate answered 28/9, 2015 at 15:16 Comment(0)

Either way, you'll need multiplication of arbitrary-size integers. This will be the dominant factor in your performance, since its complexity is worse than O(n*log(n)). Things like aligning operands, and adding or subtracting large integers, are O(n), so we'll neglect those.

For simple addition and subtraction, you need no multiplications for floats* and 3 multiplications for rationals. Floats win hands down.

For multiplication, you need one multiplication for floats and 2 multiplications for rational numbers. Floats have the edge.

Division is a little bit more complex, and rationals might win out here, but it's by no means a certainty. I'd say it's a draw.

So overall, IMHO, the fact that addition is at least O(n*log(n)) for rationals and O(n) for floats clearly gives the win to a floating-point representation.

*It is possible that you might need one multiplication to perform addition if your exponent base and your digit base are different. Otherwise, if you use a power of 2 as your base, then aligning the operands takes a bit shift. If you don't use a power of two, then you may also have to do a multiplication by a single digit, which is also an O(n) operation.

Mudra answered 3/8, 2012 at 16:19 Comment(5)
You can't claim that you need no multiplications for float addition because you may in fact have to relatively normalize prior to the addition.Avuncular
Are you sure that 2 integer multiplications efficiently implemented with, say, Karatsuba's algorithm, would be slower than 1 "float" multiplication?Schwejda
@MarkB: That can be done with shifts if you're clever. Desmond: I think the "float" multiplication requires two integer multiplies.Coldhearted
@MooingDuck Floating-point multiplications can be slightly cheaper than integer multiplies. That's because you often don't need the bottom half of the product - which allows room for optimizations.Hynes
@MarkB As MooingDuck says: you can do it by shifting. Even if you don't use a power of 2 as a base, you can still get away with one multiplication by a single digit, which is also O(n).Mudra

Rational numbers don't give arbitrary precision, but rather the exact answer. They are, however, more expensive in terms of storage; certain operations with them become costly, and some operations are not allowed at all, e.g. taking square roots, since they do not necessarily yield a rational answer.

Personally, I think in your case AP floats would be more appropriate.

Endplay answered 3/8, 2012 at 16:5 Comment(6)
Although, if you have a good rational library, you can use it to work on continued fractions.Unbecoming
I don't see a big problem in taking a square root from a rational using one of the well-established algorithms that rely purely on (a subset of) the four basic arithmetic operations.Schwejda
@DesmondHume It is impossible to represent sqrt(2), or Pi as a ratio, because they are irrational numbers. For most polynomials, it is possible only to approximate an answer as a rational number. Those algorithms you mention are infinite series, and it would take an infinite number of steps to get the exact value (thus, it is impossible to get numerically).Ipoh
@Ipoh No amount of memory would be enough to represent Pi in full precision, either it's rational or "float", they are both just approximations. What really prevents one in setting the maximum number of storage bytes after which sqrt(2) stops computing? The number of bytes would be arbitrary, which perfectly agrees with the idea of arbitrary precision arithmetic.Schwejda
@DesmondHume Sure, and how much amount of memory would take to represent sqrt(361/9409) in full precision? It would take infinite memory using the "float" representation, but only 2 small integers using a ratio. What @Endplay should have said is that some operations are not allowed unless you still want exact results. This is so because the range of numbers you can represent with a fraction is greater than what you can do with a "float". By using rationals and avoiding operations that needs approximation, you can easily guarantee that no errors are incurred and your result is always exact.Ipoh
@Ipoh that's what I did mean, even if I didn't say that explicitly. The only point in using the rational numbers is to get the exact answers. If one wants generic arbitrary precision, there's no sense in taking the harder route. And while it's true that you can get a rational approximation of PI or sqrt(2) or whatever, those operations are not implemented for rational numbers in most implementations. It is not impossible, it just makes no sense.Endplay

You are effectively asking the question: "I need to participate in a race with my chosen animal. Should I choose a turtle or a snail?"

The first proposal, "emulating double", sounds like staggered precision: using an array of doubles whose sum is the represented number. There is a paper by Douglas M. Priest, "Algorithms for Arbitrary Precision Floating Point Arithmetic", which describes how to implement this arithmetic. I implemented it, and my experience was very bad: the overhead necessary to make it run drops the performance 100-1000 times! The other method, using fractionals, has severe disadvantages too: you need to implement gcd and lcm, and unfortunately every prime in your numerator or denominator has a good chance of blowing up your numbers and killing your performance.

So from my experience they are the worst choices one can make for performance.

I recommend the use of the MPFR library which is one of the fastest AP packages in C and C++.

Anthropomorphize answered 16/10, 2012 at 14:38 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.