Count of divisors of numbers till N in O(N)?

Asked 10/11, 2017 at 18:45 Answered 25/11, 2017 at 5:19

So, we can count divisors of each number from 1 to N in O(NlogN) algorithm with sieve:

int n;
cin >> n;
for (int i = 1; i <= n; i++) {
    for (int j = i; j <= n; j += i) {
        cnt[j]++; //// here cnt[x] means count of divisors of x
    }
}

Is there way to reduce it to O(N)? Thanks in advance.

Girlish answered 10/11, 2017 at 18:45 Comment(2)

all of them in O(N)?? – Feticide 10/11, 2017 at 18:55

@coderredoc Yes. – Girlish 10/11, 2017 at 18:56

How about this? Start with the prime 2 and keep a list of tuples, (k, d_k), where d_k is the number of divisors of k, starting with (1,1):

for each prime, p (ascending and lower than or equal to n / 2):
  for each tuple (k, d_k) in the list:
    if k * p > n:
      remove the tuple from the list
      continue
    power = 1
    while p * k <= n:
      add the tuple to the list if k * p^power <= n / p
      k = k * p
      output (k, (power + 1) * d_k)
      power = power + 1
  the next number the output has skipped is the next prime
  (since clearly all numbers up to the next prime are
   either smaller primes or composites of smaller primes)

The method above also generates the primes, relying on O(n) memory to keep finding the next prime. Having a more efficient, independent stream of primes could allow us to avoid appending any tuples (k, d_k) to the list, where k * next_prime > n, as well as free up all memory holding output greater than n / next_prime.

Python code

Subplot answered 21/11, 2017 at 22:45 Comment(14)

easiest thing to be sure about this is to test it vs a simple straightforward algorithm and compare the results; and then, measure the empirical orders of growth for both. – Fervency 22/11, 2017 at 16:52

but from "list of primes up to n in O(n) time" I can see right away that n is limited by a cache size, because to have O(n) you need a contiguous array, AFAIK (i.e. "Euler's sieve"). Eratosthenes's runs in n log log n. – Fervency 22/11, 2017 at 16:57

@WillNess I'm confused - are you saying saving a list of primes up to n is not doable in O(n) time? I just assumed that part from a little reading on the web. Sieve of Atkin apparently. – Imputation 22/11, 2017 at 16:59

Eratsothenes runs in n log log n. I saw stuff about "n / log log n with simple optimizations" in the Wikipedia but this was added by a mathematician; "simple" for them isn't quite as simple for me. he based it on another mathematician's articles which speak of unwinding spirals of wheels which have enormous size requirements if done simply, which are bound to impact performance. The only practical O(n) I know of, not as simple to do itself, is Euler's (or Gries&Misra's, too); and it can't be segmentized AFAIK. – Fervency 22/11, 2017 at 17:5

"simple" for me is 4-5 lines of Haskell, tops. preferably no more than 2. Atkin, there's lots of critique about, on SO, by user GordonBGood. – Fervency 22/11, 2017 at 17:9

@WillNess looking at Euler's sieve inspired me to slightly modify my code to easily incorporate the prime generation as well :) See here: repl.it/repls/MistyroseTruthfulWoodcock – Imputation 23/11, 2017 at 0:0

good, I'll try to understand it (later). my feeling was all along to augment Euler's sieve to this problem somehow; but I haven't thought this through. yet. :) (don't you just love those quirky repl.it link names, huh. wonder if they had SqueamishOssifrage just yet.) – Fervency 23/11, 2017 at 0:2

Seems fastest algorithm here, but it is not fully O(N), there is some constant, though it is not a big constant. You got my upvote :) – Girlish 23/11, 2017 at 14:32

Each del operation in the middle of a list is O(size of list) so this is actually a O(n^2) implementation. But you can fix that by switching your list to 2 queues. On each pass you read from one queue then deciding whether to insert into the other. When the pass finishes you switch the roles of the queues. But the idea works, you get my vote! – Kreiker 23/11, 2017 at 16:0

@Kreiker thanks for checking it out and for your suggestion about the queues! – Imputation 23/11, 2017 at 16:5

The version now has an endless loop. :-( – Kreiker 24/11, 2017 at 19:44

@Kreiker Could you explain what you mean? All the links and code seem to work for me. – Imputation 24/11, 2017 at 19:51

When I cut and paste it into a terminal it was finding 2 over and over again. It now works. But there are a number of optimizations that are missing. I'll add another answer. – Kreiker 25/11, 2017 at 4:5

@WillNess I have a practical algorithm for finding the primes up to N in time O(n/log(log(n)). Start by running this algorithm to find all of the primes up to log(n)/2. Construct a wheel for the product of all of those primes (which will be reasonably close to length sqrt(n)). Run this algorithm (except with flags not counts) on the values relatively prime to all of those primes. That will run in time O(n log(log(n))). – Kreiker 25/11, 2017 at 5:7

Here is a simple optimization on @גלעד ברקן's solution. Rather than use sets, use arrays. This is about 10x as fast as the set version.

n = 100

answer = [None for i in range(0, n+1)]
answer[1] = 1

small_factors = [1]
p = 1
while (p < n):
    p = p + 1
    if answer[p] is None:
        print("\n\nPrime: " + str(p))
        limit = n / p
        new_small_factors = []
        for i in small_factors:
            j = i
            while j <= limit:
                new_small_factors.append(j)
                answer[j * p] = answer[j] + answer[i]
                j = j * p
        small_factors = new_small_factors

print("\n\nAnswer: " + str([(k,d) for k,d in enumerate(answer)]))

It is worth noting that this is also a O(n) algorithm for enumerating primes. However with the use of a wheel generated from all of the primes below size log(n)/2 it can create a prime list in time O(n/log(log(n))).

Kreiker answered 25/11, 2017 at 5:19 Comment(6)

I like how succinct the code is, too; and the difference in calculating the number of divisors using addition rather than multiplication. – Imputation 25/11, 2017 at 5:45

I added an additional optimization that brought mine down to 8 seconds and yours to 6 seconds for 10,000,000 on my laptop. p = p + 2 :) – Imputation 25/11, 2017 at 11:23

By the way, this: while (p < n) could be this: while (p < n / 2) since anything multiplied by a greater p is out of range (shaved another 0.4 sec on your version). – Imputation 25/11, 2017 at 11:23

@גלעדברקן You can find another speedup by noting that everything below min(p, n/p) is in small_factors so can be eliminated from that list and traversed with addition. From my point of view the p < n/2 actually produces wrong answers because it puts None instead of 2. Though it could be special cased. – Kreiker 26/11, 2017 at 5:20

"special cased?" yes, how about we call them "prime numbers?" ;) I'm not sure what you mean by traversed with addition, could you please explain? – Imputation 26/11, 2017 at 9:53

@גלעדברקן What I mean is that if p is prime, you can traverse 1*p, 2*p, 3*p, ..., (p-1)*p just by adding p over and over again and keeping a counter for where you are. And the counts you are writing in are always doubled. As for special cased, I meant in output to print None as 2. – Kreiker 26/11, 2017 at 21:45

Consider the total of those counts, sum(phi(i) for i=1,n). That sum is O(N log N), so any O(N) solution would have to bypass individual counting.

This suggests that any improvement would need to depend on prior results (dynamic programming). We already know that phi(i) is the product of each prime degree plus one. For instance, 12 = 2^2 * 3^1. The degrees are 2 and 1, respective. (2+1)*(1+1) = 6. 12 has 6 divisors: 1, 2, 3, 4, 6, 12.

This "reduces" the question to whether you can leverage the prior knowledge to get an O(1) way to compute the number of divisors directly, without having to count them individually.

Think about the given case ... divisor counts so far include:

Is there an O(1) way to get phi(12) = 6 from these figures?

Smelser answered 10/11, 2017 at 19:59 Comment(2)

is it correct to say that it is essential for this enumeration to be done on one, contiguous array from 1 to N? that is what I have in mind anyway, I can't think of any way to do an O(1) enumeration starting from some upper number. – Fervency 12/11, 2017 at 21:41

@WillNess That's my gut feeling. I aced number theory in college, but I don't see the solution from here. If you can identify any factorisation into relative primes (such as 3*4 above), then you can simply multiple their phi values to get the new one. However, I don't know of an O(1) way to identify two such factors. Also, you need a different method (although simple) for a power of a prime number. For instance, 8 does not have such a factorisation. – Smelser 13/11, 2017 at 18:36

Here is an algorithm that is theoretically better than O(n log(n)) but may be worse for reasonable n. I believe that its running time is O(n lg*(n)) where lg* is the https://en.wikipedia.org/wiki/Iterated_logarithm.

First of all you can find all primes up to n in time O(n) using the Sieve of Atkin. See https://en.wikipedia.org/wiki/Sieve_of_Atkin for details.

Now the idea is that we will build up our list of counts only inserting each count once. We'll go through the prime factors one by one, and insert values for everything with that as the maximum prime number. However in order to do that we need a data structure with the following properties:

We can store a value (specifically the count) at each value.
We can walk the list of inserted values forwards and backwards in O(1).
We can find the last inserted number below i "efficiently".
Insertion should be "efficient".

(Quotes are the parts that are hard to estimate.)

The first is trivial, each slot in our data structure needs a spot for the value. The second can be done with a doubly linked list. The third can be done with a clever variation on a skip-list. The fourth falls out from the first 3.

We can do this with an array of nodes (which do not start out initialized) with the following fields that look like a doubly linked list:

value The answer we are looking for.
prev The last previous value that we have an answer for.
next The next value that we have an answer for.

Now if i is in the list and j is the next value, the skip-list trick will be that we will also fill in prev for the first even after i, the first divisible by 4, divisible by 8 and so on until we reach j. So if i = 81 and j = 96 we would fill in prev for 82, 84, 88 and then 96.

Now suppose that we want to insert a value v at k between an existing i and j. How do we do it? I'll present pseudocode starting with only k known then fill it out for i = 81, j = 96 and k = 90.

k.value := v
for temp in searching down from k for increasing factors of 2:
    if temp has a value:
        our_prev := temp
        break
    else if temp has a prev:
        our_prev = temp.prev
        break
our_next := our_prev.next
our_prev.next := k
k.next := our_next
our_next.prev := k
for temp in searching up from k for increasing factors of 2:
    if j <= temp:
        break
    temp.prev = k
k.prev := our_prev

In our particular example we were willing to search downwards from 90 to 90, 88, 80, 64, 0. But we actually get told that prev is 81 when we get to 88. We would be willing to search up to 90, 92, 96, 128, 256, ... however we just have to set 92.prev 96.prev and we are done.

Now this is a complicated bit of code, but its performance is O(log(k-i) + log(j-k) + 1). Which means that it starts off as O(log(n)) but gets better as more values get filled in.

So how do we initialize this data structure? Well we initialize an array of uninitialized values then set 1.value := 0, 1.next := n+1, and 2.prev := 4.prev := 8.prev := 16.prev := ... := 1. And then we start processing our primes.

When we reach prime p we start by searching for the previous inserted value below n/p. Walking backwards from there we keep inserting values for x*p, x*p^2, ... until we hit our limit. (The reason for backwards is that we do not want to try to insert, say, 18 once for 3 and once for 9. Going backwards prevents that.)

Now what is our running time? Finding the primes is O(n). Finding the initial inserts is also easily O(n/log(n)) operations of time O(log(n)) for another O(n). Now what about the inserts of all of the values? That is trivially O(n log(n)) but can we do better?

Well first all of the inserts to density 1/log(n) filled in can be done in time O(n/log(n)) * O(log(n)) = O(n). And then all of the inserts to density 1/log(log(n)) can likewise be done in time O(n/log(log(n))) * O(log(log(n))) = O(n). And so on with increasing numbers of logs. The number of such factors that we get is O(lg*(n)) for the O(n lg*(n)) estimate that I gave.

I haven't shown that this estimate is as good as you can do, but I think that it is.

So, not O(n), but pretty darned close.

Kreiker answered 20/11, 2017 at 16:6 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags