Generating integers in ascending order using a set of prime numbers
Asked Answered
H

4

14

I have a set of prime numbers and I have to generate integers using only those prime factors in increasing order.

For example, if the set is p = {2, 5} then my integers should be 1, 2, 4, 5, 8, 10, 16, 20, 25, …

Is there any efficient algorithm to solve this problem?

Hirundine answered 12/4, 2012 at 15:2 Comment(3)
Better to ask this on math.stackexchange.comCatullus
@HighPerformanceMark yes, but in increasing orderHirundine
Check out this related question. The accepted answer there gives O(n) Python code similar to my answer here, which can be adapted to arbitrary "bases" (primes set).Oxus
K
8

The basic idea is that 1 is a member of the set, and for each member of the set n so also 2n and 5n are members of the set. Thus, you begin by outputting 1, and push 2 and 5 onto a priority queue. Then, you repeatedly pop the front item of the priority queue, output it if it is different from the previous output, and push 2 times and 5 times the number onto the priority queue.

Google for "Hamming number" or "regular number" or go to A003592 to learn more.

----- ADDED LATER -----

I decided to spend a few minutes on my lunch hour to write a program to implement the algorithm described above, using the Scheme programming language. First, here is a library implementation of priority queues using the pairing heap algorithm:

(define pq-empty '())
(define pq-empty? null?)

(define (pq-first pq)
  (if (null? pq)
      (error 'pq-first "can't extract minimum from null queue")
      (car pq)))

(define (pq-merge lt? p1 p2)
  (cond ((null? p1) p2)
        ((null? p2) p1)
        ((lt? (car p2) (car p1))
          (cons (car p2) (cons p1 (cdr p2))))
        (else (cons (car p1) (cons p2 (cdr p1))))))

(define (pq-insert lt? x pq)
  (pq-merge lt? (list x) pq))

(define (pq-merge-pairs lt? ps)
  (cond ((null? ps) '())
        ((null? (cdr ps)) (car ps))
        (else (pq-merge lt? (pq-merge lt? (car ps) (cadr ps))
                            (pq-merge-pairs lt? (cddr ps))))))

(define (pq-rest lt? pq)
  (if (null? pq)
      (error 'pq-rest "can't delete minimum from null queue")
      (pq-merge-pairs lt? (cdr pq))))

Now for the algorithm. Function f takes two parameters, a list of the numbers in the set ps and the number n of items to output from the head of the output. The algorithm is slightly changed; the priority queue is initialized by pushing 1, then the extraction steps start. Variable p is the previous output value (initially 0), pq is the priority queue, and xs is the output list, which is accumulated in reverse order. Here's the code:

(define (f ps n)
  (let loop ((n n) (p 0) (pq (pq-insert < 1 pq-empty)) (xs (list)))
    (cond ((zero? n) (reverse xs))
          ((= (pq-first pq) p) (loop n p (pq-rest < pq) xs))
          (else (loop (- n 1) (pq-first pq) (update < pq ps)
                      (cons (pq-first pq) xs))))))

For those not familiar with Scheme, loop is a locally-defined function that is called recursively, and cond is the head of an if-else chain; in this case, there are three cond clauses, each clause with a predicate and consequent, with the consequent evaluated for the first clause for which the predicate is true. The predicate (zero? n) terminates the recursion and returns the output list in the proper order. The predicate (= (pq-first pq) p) indicates that the current head of the priority queue has been output previously, so it is skipped by recurring with the rest of the priority queue after the first item. Finally, the else predicate, which is always true, identifies a new number to be output, so it decrements the counter, saves the current head of the priority queue as the new previous value, updates the priority queue to add the new children of the current number, and inserts the current head of the priority queue into the accumulating output.

Since it is non-trivial to update the priority queue to add the new children of the current number, that operation is extracted to a separate function:

(define (update lt? pq ps)
  (let loop ((ps ps) (pq pq))
    (if (null? ps) (pq-rest lt? pq)
      (loop (cdr ps) (pq-insert lt? (* (pq-first pq) (car ps)) pq)))))

The function loops over the elements of the ps set, inserting each into the priority queue in turn; the if returns the updated priority queue, minus its old head, when the ps list is exhausted. The recursive step strips the head of the ps list with cdr and inserts the product of the head of the priority queue and the head of the ps list into the priority queue.

Here are two examples of the algorithm:

> (f '(2 5) 20)
(1 2 4 5 8 10 16 20 25 32 40 50 64 80 100 125 128 160 200 250)
> (f '(2 3 5) 20)
(1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36)

You can run the program at http://ideone.com/sA1nn.

Katelynnkaterina answered 12/4, 2012 at 15:12 Comment(4)
Your algorithm is inefficient in that it over-produces the sequence past the end, and the use of PQ which is growing in size also incurs additional costs per number produced, which are greater than O(1) it seems. I've posted an answer without these two problems. BTW do you have complexity estimate for your pq-rest? pq-insert is O(1) always, and pq-rest seems to be O(size-of-pq) in worst case, but what about amortized?Oxus
measuring your algorithm interpreted, in MIT-Scheme, it runs at about O(n^1.12) empirical complexity (between n=6k, 12k). The efficient algorithm with back-pointers should run at O(n). btw I could speed your code up by almost 20% (interpreted) with (define (update lt? pq ps) (pq-merge lt? (pq-rest lt? pq) (pq-from-ordlist (map (lambda(p)(* (pq-first pq) p)) ps)))) and (define (pq-from-ordlist xs) (cons (car xs) (map list (cdr xs)))).Oxus
I've checked it now in Haskell interpreter (GHCi) and the "classic" algorithm indeed runs in O(n) between n=40k, 80k.Oxus
sorry, didn't mention that I tested your (f '(2 3 5) N) in Scheme. btw between n=12k and n=24k the empirical complexity was O(n^1.08) so it does look like O(n log n) complexity. I measure empirical complexity as log(t2/t1) / log(n2/n1), where t_i is run time and n_i is problem size.Oxus
O
14

Removing a number and reinserting all its multiples (by the primes in the set) into a priority queue is wrong (in the sense of the question) - i.e. it produces correct sequence but inefficiently so.

It is inefficient in two ways - first, it overproduces the sequence; second, each PriorityQueue operation incurs extra cost (the operations remove_top and insert are not usually both O(1), certainly not in any list- or tree-based PriorityQueue implementation).

The efficient O(n) algorithm maintains pointers back into the sequence itself as it is being produced, to find and append the next number in O(1) time. In pseudocode:

  return array h where
    h[0]=1; n=0; ps=[2,3,5, ... ]; // base primes
    is=[0 for each p in ps];       // indices back into h
    xs=[p for each p in ps]        // next multiples: xs[k]==ps[k]*h[is[k]]
    repeat:
      h[++n] := minimum xs
      for each ref (i,x,p) in (is,xs,ps):
        if( x==h[n] )
          { x := p*h[++i]; }       // advance the minimal multiple/pointer

For each minimal multiple it advances its pointer, while at the same time calculating its next multiple value. This too effectively implements a PriorityQueue but with crucial distinctions - it is before the end point, not after; it doesn't create any additional storage except for the sequence itself; and its size is constant (just k numbers, for k base primes) whereas the size of past-the-end PriorityQueue grows as we progress along the sequence (in the case of Hamming sequence, based on set of 3 primes, as n2/3, for n numbers of the sequence).


The classic Hamming sequence in Haskell is essentially the same algorithm:

h = 1 : map (2*) h `union` map (3*) h `union` map (5*) h

union a@(x:xs) b@(y:ys) = case compare x y of LT -> x : union  xs  b
                                              EQ -> x : union  xs  ys
                                              GT -> y : union  a   ys

We can generate the smooth numbers for arbitrary base primes using the foldi function (see Wikipedia) to fold lists in a tree-like fashion for efficiency, creating a fixed sized tree of comparisons:

smooth base_primes = h   where       -- strictly increasing base_primes  NB!
    h = 1 : foldi g [] [map (p*) h | p <- base_primes]
    g (x:xs) ys = x : union xs ys

foldi f z []     = z
foldi f z (x:xs) = f x (foldi f z (pairs f xs))
 
pairs f (x:y:t)  = f x y : pairs f t
pairs f t        = t

It is also possible to directly calculate a slice of Hamming sequence around its nth member in O(n2/3) time, by direct enumeration of the triples and assessing their values through logarithms, logval(i,j,k) = i*log 2+j*log 3+k*log 5. This Ideone.com test entry calculates 1 billionth Hamming number in 1.12 0.05 seconds (2016-08-18: main speedup due to usage of Int instead of the default Integer where possible, even on 32-bit; additional 20% thanks to the tweak suggested by @GordonBGood, bringing band size complexity down to O(n1/3)).

This is discussed some more in this answer where we also find its full attribution:

slice hi w = (c, sortBy (compare `on` fst) b) where  -- hi is a top log2 value
  lb5=logBase 2 5 ; lb3=logBase 2 3                  -- w<1 (NB!) is (log2 width)
  (Sum c, b) = fold                                  -- total count, the band
      [ ( Sum (i+1),                                 -- total triples w/this j,k
          [ (r,(i,j,k)) | frac < w ] )               -- store it, if inside the band
        | k <- [ 0 .. floor ( hi   /lb5) ],  let p = fromIntegral k*lb5,
          j <- [ 0 .. floor ((hi-p)/lb3) ],  let q = fromIntegral j*lb3 + p,
          let (i,frac) = pr  (hi-q)      ;       r = hi - frac    -- r = i + q
      ]    -- (sum . map fst &&& concat . map snd)
  pr = properFraction 

This can be generalized for k base primes as well, probably running in O(n(k-1)/k) time.


(see this SO entry for an important later development. also, this answer is interesting. and another related answer.)

Oxus answered 15/4, 2012 at 6:32 Comment(2)
I just discovered Hamming numbers today. This answer is brilliant! I went ahead and implemented your pseudocode in C++11 syntax here in case any future reader is interested.Koy
@Koy thank you very much, I spent too much time on this stuff too many years ago... :)Oxus
K
8

The basic idea is that 1 is a member of the set, and for each member of the set n so also 2n and 5n are members of the set. Thus, you begin by outputting 1, and push 2 and 5 onto a priority queue. Then, you repeatedly pop the front item of the priority queue, output it if it is different from the previous output, and push 2 times and 5 times the number onto the priority queue.

Google for "Hamming number" or "regular number" or go to A003592 to learn more.

----- ADDED LATER -----

I decided to spend a few minutes on my lunch hour to write a program to implement the algorithm described above, using the Scheme programming language. First, here is a library implementation of priority queues using the pairing heap algorithm:

(define pq-empty '())
(define pq-empty? null?)

(define (pq-first pq)
  (if (null? pq)
      (error 'pq-first "can't extract minimum from null queue")
      (car pq)))

(define (pq-merge lt? p1 p2)
  (cond ((null? p1) p2)
        ((null? p2) p1)
        ((lt? (car p2) (car p1))
          (cons (car p2) (cons p1 (cdr p2))))
        (else (cons (car p1) (cons p2 (cdr p1))))))

(define (pq-insert lt? x pq)
  (pq-merge lt? (list x) pq))

(define (pq-merge-pairs lt? ps)
  (cond ((null? ps) '())
        ((null? (cdr ps)) (car ps))
        (else (pq-merge lt? (pq-merge lt? (car ps) (cadr ps))
                            (pq-merge-pairs lt? (cddr ps))))))

(define (pq-rest lt? pq)
  (if (null? pq)
      (error 'pq-rest "can't delete minimum from null queue")
      (pq-merge-pairs lt? (cdr pq))))

Now for the algorithm. Function f takes two parameters, a list of the numbers in the set ps and the number n of items to output from the head of the output. The algorithm is slightly changed; the priority queue is initialized by pushing 1, then the extraction steps start. Variable p is the previous output value (initially 0), pq is the priority queue, and xs is the output list, which is accumulated in reverse order. Here's the code:

(define (f ps n)
  (let loop ((n n) (p 0) (pq (pq-insert < 1 pq-empty)) (xs (list)))
    (cond ((zero? n) (reverse xs))
          ((= (pq-first pq) p) (loop n p (pq-rest < pq) xs))
          (else (loop (- n 1) (pq-first pq) (update < pq ps)
                      (cons (pq-first pq) xs))))))

For those not familiar with Scheme, loop is a locally-defined function that is called recursively, and cond is the head of an if-else chain; in this case, there are three cond clauses, each clause with a predicate and consequent, with the consequent evaluated for the first clause for which the predicate is true. The predicate (zero? n) terminates the recursion and returns the output list in the proper order. The predicate (= (pq-first pq) p) indicates that the current head of the priority queue has been output previously, so it is skipped by recurring with the rest of the priority queue after the first item. Finally, the else predicate, which is always true, identifies a new number to be output, so it decrements the counter, saves the current head of the priority queue as the new previous value, updates the priority queue to add the new children of the current number, and inserts the current head of the priority queue into the accumulating output.

Since it is non-trivial to update the priority queue to add the new children of the current number, that operation is extracted to a separate function:

(define (update lt? pq ps)
  (let loop ((ps ps) (pq pq))
    (if (null? ps) (pq-rest lt? pq)
      (loop (cdr ps) (pq-insert lt? (* (pq-first pq) (car ps)) pq)))))

The function loops over the elements of the ps set, inserting each into the priority queue in turn; the if returns the updated priority queue, minus its old head, when the ps list is exhausted. The recursive step strips the head of the ps list with cdr and inserts the product of the head of the priority queue and the head of the ps list into the priority queue.

Here are two examples of the algorithm:

> (f '(2 5) 20)
(1 2 4 5 8 10 16 20 25 32 40 50 64 80 100 125 128 160 200 250)
> (f '(2 3 5) 20)
(1 2 3 4 5 6 8 9 10 12 15 16 18 20 24 25 27 30 32 36)

You can run the program at http://ideone.com/sA1nn.

Katelynnkaterina answered 12/4, 2012 at 15:12 Comment(4)
Your algorithm is inefficient in that it over-produces the sequence past the end, and the use of PQ which is growing in size also incurs additional costs per number produced, which are greater than O(1) it seems. I've posted an answer without these two problems. BTW do you have complexity estimate for your pq-rest? pq-insert is O(1) always, and pq-rest seems to be O(size-of-pq) in worst case, but what about amortized?Oxus
measuring your algorithm interpreted, in MIT-Scheme, it runs at about O(n^1.12) empirical complexity (between n=6k, 12k). The efficient algorithm with back-pointers should run at O(n). btw I could speed your code up by almost 20% (interpreted) with (define (update lt? pq ps) (pq-merge lt? (pq-rest lt? pq) (pq-from-ordlist (map (lambda(p)(* (pq-first pq) p)) ps)))) and (define (pq-from-ordlist xs) (cons (car xs) (map list (cdr xs)))).Oxus
I've checked it now in Haskell interpreter (GHCi) and the "classic" algorithm indeed runs in O(n) between n=40k, 80k.Oxus
sorry, didn't mention that I tested your (f '(2 3 5) N) in Scheme. btw between n=12k and n=24k the empirical complexity was O(n^1.08) so it does look like O(n log n) complexity. I measure empirical complexity as log(t2/t1) / log(n2/n1), where t_i is run time and n_i is problem size.Oxus
R
3

This 2-dimensional exploring algorithm is not exact, but works for the first 25 integers, then mixes up 625 and 512.

Powers of 2 and 5

n = 0
exp_before_5 = 2
while true
  i = 0
  do
    output 2^(n-exp_before_5*i) * 5^Max(0, n-exp_before_5*(i+1))
    i <- i + 1
  loop while n-exp_before_5*(i+1) >= 0
  n <- n + 1
end while
Roily answered 12/4, 2012 at 15:34 Comment(3)
the thing to do here is to draw a line at atan( log(5)/log(2) ) * 180 / pi = 66.69958829239839 degrees angle to the horizontal axis and collect the points that cross it as we slide it out away from the top left point.Oxus
Can you provide an algorithm for that ?Hydraulic
I thought I did, in the comment above. :) No, I don't have working code yet. One thing to notice is log 5/log 2 = 2.321928094887362 and '7/3 = 2.333333333333333`.Oxus
L
3

Based on user448810's answer, here's a solution that uses heaps and vectors from the STL.
Now, heaps normally output the largest value, so we store the negative of the numbers as a workaround (since a>b <==> -a<-b).

#include <vector>
#include <iostream>
#include <algorithm>

int main()
{
    std::vector<int> primes;
    primes.push_back(2);
    primes.push_back(5);//Our prime numbers that we get to use

    std::vector<int> heap;//the heap that is going to store our possible values
    heap.push_back(-1);
    std::vector<int> outputs;
    outputs.push_back(1);
    while(outputs.size() < 10)
    {
        std::pop_heap(heap.begin(), heap.end());
        int nValue = -*heap.rbegin();//Get current smallest number
        heap.pop_back();
        if(nValue != *outputs.rbegin())//Is it a repeat?
        {
            outputs.push_back(nValue);
        }
        for(unsigned int i = 0; i < primes.size(); i++)
        {
            heap.push_back(-nValue * primes[i]);//add new values
            std::push_heap(heap.begin(), heap.end());
        }
    }
    //output our answer
    for(unsigned int i = 0; i < outputs.size(); i++)
    {
        std::cout << outputs[i] << " ";
    }
    std::cout << std::endl;
}

Output:

1 2 4 5 8 10 16 20 25 32
Livraison answered 12/4, 2012 at 15:50 Comment(1)
(I don't remember if I commented here previously, if so, my apologies) Using heaps leads to overproduction past the desired element and heapify takes additional time, usually O(log n), leading to O(n log n) behaviour. Edsger Dijkstra once shown an O(n) solution, check out the pseudocode in my answer. :) Take e.g. 400. The linear algorithm keeps just two look-back pointers, one to 80, another to 200. But when the priority queue algorithm gets to 400, it has 500,625,640,800,1000,1250,1280,1600,500,512,640 in its heap, past the point of interest.Oxus

© 2022 - 2024 — McMap. All rights reserved.