Fast algorithm/formula for serial range of modulo of co-prime numbers
Asked Answered
U

1

6

In my project, one part of problem is there. But to simplify, here the problem is being formulated. There are two positive co-prime integers: a and b, where a < b. Multiples of a from 1 through b-1 is listed followed by modulus operation by b.

a mod b , 2*a mod b , 3*a mod b, ... , (b-1)*a mod b

Now, there is another integer, say n ( 1 <= n < b). Through the first n numbers in the list, we have to find how many numbers is less than, say m (1 <= m < b). This can be done in brute force approach, thereby giving a O(n).

An example:

a=6, b=13, n=8, m=6

List is:

6, 12, 5, 11, 4, 10, 3, 9, 2, 8, 1, 7

This is a permutation of the numbers from 1 to 12 because modulus operation of any two co-primes produces a permutation of numbers if we include another number, that is, 0. If we take a= 2, b=13, then the list would have been 2, 4, 6, 8, 10, 12, 1, 3, 5, 7, 9, 11, which gives a pattern. Whereas if a and b are very large (in my project they can go up to 10^20), then I have no idea how to deduce a pattern of such large numbers.

Now getting back to the example, we take the first n = 8 numbers from the list, which gives

6, 12, 5, 11, 4, 10, 3, 9

Applying the less-than operator with m = 6, it gives the total number of numbers less than m being 3 as explained below in the list

0, 0, 1, 0, 1, 0, 1, 0

where 0 refers to not being less than m and 1 refers to being less than m.

Since, the algorithm above is a O(n), which is not acceptable for the range of [0, 10^20], so can the community give a hint/clue/tip to enable me to reach a O(log n ) solution, or even better O(1) solution?

Underpinnings answered 11/9, 2014 at 5:20 Comment(0)
L
1

(Warning: I got a little twitchy about the range of multipliers not being [0, n), so I adjusted it. It's easy enough to compensate.)

I'm going to sketch, with tested Python code, an implementation that runs in time O(log max {a, b}). First, here's some utility functions and a naive implementation.

from fractions import gcd
from random import randrange


def coprime(a, b):
    return gcd(a, b) == 1


def floordiv(a, b):
    return a // b


def ceildiv(a, b):
    return floordiv(a + b - 1, b)


def count1(a, b, n, m):
    assert 1 <= a < b
    assert coprime(a, b)
    assert 0 <= n < b + 1
    assert 0 <= m < b + 1
    return sum(k * a % b < m for k in range(n))

Now, how can we speed this up? The first improvement is to partition the multipliers into disjoint ranges such that, within a range, the corresponding multiples of a are between two multiples of b. Knowing the lowest and highest values, we can count via a ceiling division the number of multiples less than m.

def count2(a, b, n, m):
    assert 1 <= a < b
    assert coprime(a, b)
    assert 0 <= n < b + 1
    assert 0 <= m < b + 1
    count = 0
    first = 0
    while 0 < n:
        count += min(ceildiv(m - first, a), n)
        k = ceildiv(b - first, a)
        n -= k
        first = first + k * a - b
    return count

This isn't fast enough. The second improvement is to replace most of the while loop with a recursive call. In the code below, j is the number of iterations that are "complete" in the sense that there is a wraparound. term3 accounts for the remaining iteration, using logic that resembles count2.

Each of the complete iterations contributes floor(m / a) or floor(m / a) + 1 residues under the threshold m. Whether we get the + 1 depends on what first is for that iteration. first starts at 0 and changes by a - (b % a) modulo a on each iteration through the while loop. We get the + 1 whenever it's under some threshold, and this count is computable via a recursive call.

def count3(a, b, n, m):
    assert 1 <= a < b
    assert coprime(a, b)
    assert 0 <= n < b + 1
    assert 0 <= m < b + 1
    if 1 == a:
        return min(n, m)
    j = floordiv(n * a, b)
    term1 = j * floordiv(m, a)
    term2 = count3(a - b % a, a, j, m % a)
    last = n * a % b
    first = last % a
    term3 = min(ceildiv(m - first, a), (last - first) // a)
    return term1 + term2 + term3

The running time can be analyzed analogously to the Euclidean GCD algorithm.

Here's some test code to prove evidence for my claims of correctness. Remember to delete the assertions before testing performance.

def test(p, f1, f2):
    assert 3 <= p
    for t in range(100):
        while True:
            b = randrange(2, p)
            a = randrange(1, b)
            if coprime(a, b):
                break
        for n in range(b + 1):
            for m in range(b + 1):
                args = (a, b, n, m)
                print(args)
                assert f1(*args) == f2(*args)


if __name__ == '__main__':
    test(25, count1, count2)
    test(25, count1, count3)
Lamasery answered 15/9, 2014 at 21:59 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.