How to properly round to the nearest integer in double-double arithmetic
I have to analyse a large amount of data using Python 3 (the PyPy implementation), doing some operations on quite large floats, and I must check whether the results are close enough to integers.

To exemplify, say I'm generating random pairs of numbers and checking if they form Pythagorean triples (i.e., are the legs of right triangles whose hypotenuse is also an integer):

from math import hypot
from pprint import pprint
from random import randrange
from time import time

def gen_rand_tuples(start, stop, amount):
    '''
    Generates random integer pairs and converts them to tuples of floats.
    '''
    for _ in range(amount):
        yield (float(randrange(start, stop)), float(randrange(start, stop)))

t0 = time()
## Results are those pairs that result in integer hypotenuses, or
## at least come very close, to within 1e-12.
results = [t for t in gen_rand_tuples(1, 2**32, 10_000_000) if abs((h := hypot(*t)) - int(h)) < 1e-12]
print('Results found:')
pprint(results)
print('finished in:', round(time() - t0, 2), 'seconds.')

Running it I got:

Python 3.9.17 (a61d7152b989, Aug 13 2023, 10:27:46)
[PyPy 7.3.12 with GCC 13.2.1 20230728 (Red Hat 13.2.1-1)] on linux
Type "help", "copyright", "credits" or "license()" for more information.
>>> 
===== RESTART: /home/user/Downloads/pythagorean_test_floats.py ====
Results found:
[(2176124225.0, 2742331476.0),
 (342847595.0, 3794647043.0),
 (36.0, 2983807908.0),
 (791324089.0, 2122279232.0)]
finished in: 2.64 seconds.

Fun: it ran fast, processing 10 million data points in a bit over 2 seconds, and I even found some matching data. The hypotenuse is apparently an integer:

>>> pprint([hypot(*x) for x in results])
[3500842551.0, 3810103759.0, 2983807908.0, 2265008378.0]

But not really: if we check the results using the arbitrary-precision decimal module, we see they are not actually close enough to integers:

>>> from decimal import Decimal
>>> pprint([(x[0]*x[0] + x[1]*x[1]).sqrt() for x in (tuple(map(Decimal, x)) for x in results)])
[Decimal('3500842551.000000228516418075'),
 Decimal('3810103758.999999710375341513'),
 Decimal('2983807908.000000217172157183'),
 Decimal('2265008377.999999748566051441')]

So I think the problem is that the numbers are large enough to fall into the range where Python floats lack precision, and false positives are returned.
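That can be confirmed directly: math.ulp (Python 3.9+) gives the spacing between adjacent doubles, which at the magnitude of these hypotenuses dwarfs the 1e-12 tolerance:

```python
import math

# gap between adjacent doubles at the scale of the false positives above;
# anything within half of this collapses onto the same float
print(math.ulp(3500842551.0))  # 2**-21, about 4.8e-7 -- far above 1e-12
```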

Now, we can just change the program to use arbitrary precision decimals everywhere:

from decimal import Decimal
from pprint import pprint
from random import randrange
from time import time

def dec_hypot(x, y):
    return (x*x + y*y).sqrt()

def gen_rand_tuples(start, stop, amount):
    '''
    Generates random integer pairs and converts them to tuples of decimals.
    '''
    for _ in range(amount):
        yield (Decimal(randrange(start, stop)), Decimal(randrange(start, stop)))

t0 = time()
## Results are those pairs that result in integer hypotenuses, or
## at least come very close, to within 1e-12.
results = [t for t in gen_rand_tuples(1, 2**32, 10_000_000) if abs((h := dec_hypot(*t)) - h.to_integral_value()) < Decimal(1e-12)]
print('Results found:')
pprint(results)
print('finished in:', round(time() - t0, 2), 'seconds.')

Now we don't get any false positives, but we take a large performance hit. What previously took a bit over 2 s now takes over 100 s. It appears decimals are not JIT-friendly:

====== RESTART: /home/user/Downloads/pythagorean_test_dec.py ======
Results found:
[]
finished in: 113.82 seconds.

I found this answer to the question CPython and PyPy Decimal operation performance, suggesting double-double precision numbers as a faster, JIT-friendly alternative to decimals, with better precision than built-in floats. So I pip-installed the doubledouble third-party module and changed the program accordingly:
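For context, a double-double represents a value as an unevaluated sum hi + lo of two floats, where lo holds the rounding error of hi. The classic building block is Knuth's error-free two-sum; this is a generic sketch, not the doubledouble module's actual code:

```python
def two_sum(a, b):
    # Knuth's error-free transformation: s + err equals a + b exactly
    s = a + b
    bb = s - a                        # portion of b absorbed into s
    err = (a - (s - bb)) + (b - bb)   # rounding error of the addition
    return s, err

s, err = two_sum(1e16, 2.5)
# s alone rounds the 2.5 to an ulp boundary; err keeps the lost part,
# so the pair (s, err) still represents 1e16 + 2.5 exactly
```

Keeping the error term around is what gives double-double its roughly 32 significant digits while staying on hardware floats.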

from doubledouble import DoubleDouble
from decimal import Decimal
from pprint import pprint
from random import randrange
from time import time

def dd_hypot(x, y):
    return (x*x + y*y).sqrt()

def gen_rand_tuples(start, stop, amount):
    for _ in range(amount):
        yield (DoubleDouble(randrange(start, stop)), DoubleDouble(randrange(start, stop)))

t0 = time()
print('Results found:')
results = [t for t in gen_rand_tuples(1, 2**32, 10_000_000) if abs((h := dd_hypot(*t)) - int(h)) < DoubleDouble(1e-12)]
pprint(results)
print('finished in:', round(time() - t0, 2), 'seconds.')

But I get this error:

======= RESTART: /home/user/Downloads/pythagorean_test_dd.py ======
Results found:
Traceback (most recent call last):
  File "/home/user/Downloads/pythagorean_test_dd.py", line 24, in <module>
    results = [t for t in gen_rand_tuples(1, 2**32, 10_000_000) if abs((h := dd_hypot(*t)) - int(h)) < DoubleDouble(1e-12)]
  File "/home/user/Downloads/pythagorean_test_dd.py", line 24, in <listcomp>
    results = [t for t in gen_rand_tuples(1, 2**32, 10_000_000) if abs((h := dd_hypot(*t)) - int(h)) < DoubleDouble(1e-12)]
TypeError: int() argument must be a string, a bytes-like object or a number, not 'DoubleDouble'

I think the problem is that the module doesn't provide an int conversion or a round-to-nearest-integer method. The best I could come up with is an extremely contrived "int" function that rounds a double-double to the nearest integer by doing a round trip through Decimal and back to DoubleDouble:

def contrived_int(dd):
    ## sum the high and low words exactly in Decimal, then round
    rounded = (Decimal(dd.x) + Decimal(dd.y)).to_integral_value()
    hi = float(rounded)
    lo = float(rounded - Decimal(hi))
    return DoubleDouble(hi, lo)

But it's very roundabout, defeats the purpose of sidestepping decimals, and makes the program even slower than the full-decimal version.
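For what it's worth, a lighter-weight sketch that skips Decimal entirely would round the high word first and fold the low word in afterwards. This operates on a raw (hi, lo) pair rather than the DoubleDouble API, and relies on hi - round(hi) being computed exactly:

```python
def dd_nearest_int(hi, lo):
    # Hypothetical helper on a raw (hi, lo) pair whose value is hi + lo.
    # round(hi) is exactly representable, so hi - r is exact; adding lo
    # then exposes the true fractional part without any Decimal round trip.
    r = round(hi)
    r += round(hi - r + lo)  # note: Python rounds exact halves to even
    return r

# the low word can tip the result to the adjacent integer
assert dd_nearest_int(10.0, 0.6) == 11
assert dd_nearest_int(3810103759.0, -2.9e-07) == 3810103759
```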

So I ask: is there a fast way to round a double-double precision number to the nearest integer directly, without intermediate steps through decimals or strings?

Fleda answered 12/10, 2023 at 21:58 Comment(4)
Maybe you should not use floating-point arithmetic for this problem, which is about integers? There are some ways to detect if a number is a perfect square using integers, or just do something like "y=math.sqrt(x); round y to nearest integer; check if y*y==x", which should work until you go past the precision limit of doubles (something like 2**50)Jolly
Thank you, @ArminRigo. The values I'm interested in can certainly go over 2^50, up to 10^20 ≈ 2^66, even if not in the particular examples I used, hence the interest in double-double precision. I should have made the question clearer about that. Also, despite starting with integers, they go through a square-root step that generates floats anyway, so I can't use pure integer arithmetic everywhere.Fleda
If you need to work with large numbers, perhaps mpmath.org, a library for floating-point arithmetic with arbitrary precision, might be of interest to you.Hutchings
Perfect example of an XY problem. Congratulations on providing enough context to get good answers.Doth
Since Python integers have no upper limit, and you're looking for integral results, you should stick to integer inputs and integer operations. In your example, you can use math.isqrt to perform an integer square root and avoid the imprecision of floating-point numbers altogether:

results = [
    (x, y) for x, y in gen_rand_tuples(1, 2 ** 32, 10_000_000)
    if (s := x * x + y * y) == math.isqrt(s) ** 2
]

In testing this is about as fast as your first attempt with floating-point operations, but without any imprecision:

Demo: Try it online!
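A self-contained illustration of the same idea (the helper name is mine):

```python
import math

def is_integer_hypotenuse(x, y):
    # all-integer test: exact for arbitrarily large inputs,
    # since math.isqrt works on Python's unbounded ints
    s = x * x + y * y
    return math.isqrt(s) ** 2 == s

assert is_integer_hypotenuse(3, 4)        # the 3-4-5 triple
assert is_integer_hypotenuse(20, 21)      # 20-21-29
assert not is_integer_hypotenuse(2, 3)
assert is_integer_hypotenuse(10**20, 0)   # fine far beyond float precision
```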

Farewell answered 16/10, 2023 at 2:53 Comment(1)
Nice! I had no idea math had a square root function just for integers. Great tip.Fleda
Not an answer to the question you directly asked, but here's at least a way to check whether an integer of any size is a perfect square (I'm sure there are faster ways, but this should always work and has logarithmic complexity):

def is_square(n):
    low = 0
    high = 1
    while high * high <= n:
        low = high
        high *= 2
    while low < high:
        mid = (low + high) >> 1
        if mid * mid == n:
            return True
        if mid * mid > n:
            high = mid
        else:
            low = mid + 1
    return False

It's just doing a binary search.
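As a sanity check, the bisection agrees with math.isqrt; the function is repeated here verbatim so the snippet runs standalone:

```python
import math

def is_square(n):
    # exponential search for an upper bound, then binary search
    low, high = 0, 1
    while high * high <= n:
        low = high
        high *= 2
    while low < high:
        mid = (low + high) >> 1
        if mid * mid == n:
            return True
        if mid * mid > n:
            high = mid
        else:
            low = mid + 1
    return False

# agrees with math.isqrt on small and very large inputs alike
for n in [*range(200), 10**40, 10**40 + 1, (10**20 + 3) ** 2]:
    assert is_square(n) == (math.isqrt(n) ** 2 == n)
```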

Jolly answered 15/10, 2023 at 6:6 Comment(2)
I'm impressed by how fast simple guess-and-bisect can do the trick; I never would have thought of that. I didn't understand your comment at first, but with your example and blhsing's it became clear: the float square root I was taking in my initial version was unnecessary. I didn't mark your answer as accepted because blhsing was a bit faster and pointed me to a standard-library feature I didn't know, but the bisect trick impressed me the most.Fleda
NP! I didn't know about math.isqrt(), actually.Jolly
math.modf splits a float into its fractional and integral parts, so you can compare the fractional part to the threshold. You can also move this check into gen_rand_tuples to decrease the overhead (not by much, but still something):
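As a quick aside, this is what modf does on plain floats, and also why residues at this magnitude are invisible to double precision (the large literal is taken from the question's Decimal output):

```python
import math

# modf returns the (fractional, integral) parts of a float
assert math.modf(3.75) == (0.75, 3.0)

# near 3e9 the float grid is ~4.8e-7 wide, so a residue of ~2.2e-7
# rounds away before modf ever sees it
assert math.modf(2983807908.000000217)[0] == 0.0
```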

import math
from pprint import pprint
from random import randrange
from time import time

from doubledouble import DoubleDouble


def dd_hypot(x, y):
    return (x * x + y * y).sqrt()


def gen_rand_tuples(start, stop, amount):
    t = DoubleDouble(1e-12)
    for _ in range(amount):
        x, y = DoubleDouble(randrange(start, stop)), DoubleDouble(randrange(start, stop))
        if math.modf(dd_hypot(x, y))[0] < t:
            yield float(x), float(y)


t0 = time()
results = gen_rand_tuples(1, 2 ** 32, 10_000_000)
print('results found in', round(time() - t0, 2), 'seconds:')
pprint([t for t in results])
print('finished in:', round(time() - t0, 2), 'seconds.')

Output:

results found in 0.0 seconds:
[(680368648.0, 3711917722.0),
 (3725230685.0, 4052331950.0),
 (3105505826.0, 4185910333.0),
 (4149112881.0, 1954134663.0),
 (2526797500.0, 3295693164.0),
 (1386817952.0, 1040113474.0)]
finished in: 49.76 seconds.
Creatine answered 15/10, 2023 at 10:34 Comment(1)
It seems this solution doesn't actually work, because it returns false positives, as you can check with the decimal module via pprint([(x[0]*x[0] + x[1]*x[1]).sqrt() for x in (tuple(map(Decimal, x)) for x in results)]). The results aren't exact integers. It's probably because modf relies on plain floats and inherits their limitations. But thank you.Fleda
One thing you can try is to use multiprocessing to both generate the random pairs and to test them.

In the following code a call to multiprocessing.cpu_count() is made to determine pool_size, the number of processes that should be in our multiprocessing pool. Then we submit pool_size tasks passing to each n_pairs, the number of pairs the task should generate and test (N_TUPLETS // pool_size where N_TUPLETS is the total number of pairs to be generated and tested). The results from these pool_size tasks are accumulated into results.

Clearly the more CPU cores you have, the greater the reduction in time. Also, multiprocessing will improve performance only if the number of pairs each task is generating and testing is sufficiently large such that the overhead incurred by using multiprocessing is more than compensated for by running the tasks in parallel, which is the case here:

from decimal import Decimal
from random import randrange
from pprint import pprint
from time import time


N_TUPLETS = 10_000_000

def dec_hypot(x, y):
    return (x*x + y*y).sqrt()

def gen_rand_tuples(start, stop, amount):
    '''
    Generates random integer pairs and converts them to tuples of decimals.
    '''
    for _ in range(amount):
        yield (Decimal(randrange(start, stop)), Decimal(randrange(start, stop)))

def generate_and_test_tuplets(n_pairs):
    ## Results are those pairs that result in integer hypotenuses, or
    ## at least come very close, to within 1e-12.
    return [t for t in gen_rand_tuples(1, 2**32, n_pairs) if abs((h := dec_hypot(*t)) - h.to_integral_value()) < Decimal(1e-12)]

def serial_test():
    t0 = time()

    results = generate_and_test_tuplets(N_TUPLETS)

    print('Serial results found:')
    pprint(results)
    print('finished in:', round(time() - t0, 2), 'seconds.')

def parallel_test():
    from multiprocessing import Pool, cpu_count

    t0 = time()

    pool_size = cpu_count()
    print('The pool size is', pool_size)
    n_pairs_list = [N_TUPLETS // pool_size] * (pool_size - 1)
    n_pairs_list.append(N_TUPLETS - sum(n_pairs_list))

    with Pool(pool_size) as pool:
        results = []
        for result in pool.imap_unordered(generate_and_test_tuplets, n_pairs_list):
            results.extend(result)
        print('Parallel results found:')
        pprint(results)
        print('finished in:', round(time() - t0, 2), 'seconds.')

if __name__ == '__main__':
    serial_test()
    print()
    parallel_test()

Prints:

Serial results found:
[]
finished in: 78.92 seconds.

The pool size is 8
Parallel results found:
[]
finished in: 16.83 seconds.

With 8 logical cores (4 physical cores), the multiprocessing version reduced the running time by a factor of approximately 5. Nothing prevents you from also incorporating improvements suggested in other answers to your post.

Hallucination answered 15/10, 2023 at 11:44 Comment(1)
Nice tip to speed it up! Thank you.Fleda

© 2022 - 2024 — McMap. All rights reserved.