A positive_integer/1 predicate that works for big numbers

Asked 31/8, 2016 at 15:1 Answered 31/8, 2016 at 22:20

In my Prolog-inspired language Brachylog, there is the possibility to label CLP(FD)-equivalent variables that have potentially infinite domains. The code that does this labelization can be found here (thanks to Markus Triska @mat).

This predicate requires the existence of a predicate positive_integer/1, which must have the following behavior:

?- positive_integer(X).
X = 1 ;
X = 2 ;
X = 3 ;
X = 4 ;
…

This is implemented as such in our current solution:

positive_integer(N) :- length([_|_], N).

This has two problems that I can see:

This becomes slow fairly quickly:

?- time(positive_integer(100000)).
% 5 inferences, 0.000 CPU in 0.001 seconds (0% CPU, Infinite Lips)

?- time(positive_integer(1000000)).
% 5 inferences, 0.000 CPU in 0.008 seconds (0% CPU, Infinite Lips)

?- time(positive_integer(10000000)).
% 5 inferences, 0.062 CPU in 0.075 seconds (83% CPU, 80 Lips)

This ultimately returns an Out of global stack error for numbers that are a bit too big:
```
?- positive_integer(100000000).
ERROR: Out of global stack
```

This is obviously due to the fact that Prolog needs to instantiate the list, which is bad if its length is big.

How can we improve this predicate such that this works even for very big numbers, with the same behavior?

Rodmun answered 31/8, 2016 at 15:1 Comment(1)

Since "Brachylog's interpreter is entirely written in Prolog" meaning SWI-Prolog, you can use between/3 with the second argument bound to inf. Like positive_integer(X):- between(0,inf,X).. Can you time that and see if that helps? – Howlet 31/8, 2016 at 16:56

There are already many good ideas posted, and they work to various degrees.

Additional test case

@vmg has the right intuition: between/3 does not mix well with constraints. To see this, I would like to use the following query as an additional benchmark:

?- X #> 10^30, positive_integer(X).

Solution

With the test case in mind, I suggest the following solution:

positive_integer(I) :-
        I #> 0,
        (   var(I) ->
            fd_inf(I, Inf),
            (   I #= Inf
            ;   I #\= Inf,
                positive_integer(I)
            )
        ;   true
        ).

The key idea is to use the CLP(FD) reflection predicate fd_inf/2 to reason about the smallest element in the domain of a variable. This is the only predicate you will need to change when you port the solution to further Prolog systems. For example, in SICStus Prolog, the predicate is called fd_min/2.

Main features

portable to several Prolog systems with minimal changes
fast in the shown cases
works also in the test case above
advertises and uses the full power of CLP(FD) constraints.

It is of course very clear which of these points is most important.

Sample queries

creatio ex nihilo:

?- positive_integer(X).
X = 1 ;
X = 2 ;
X = 3 .

fixed integer:

?- X #= 12^42, time(positive_integer(X)).
% 4 inferences, 0.000 CPU in 0.000 seconds (68% CPU, 363636 Lips)
X = 2116471057875484488839167999221661362284396544.

constrained integer:

?- X #> 10^30, time(positive_integer(X)).
% 124 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 3647059 Lips)
X = 1000000000000000000000000000001 ;
% 206 inferences, 0.000 CPU in 0.000 seconds (93% CPU, 2367816 Lips)
X = 1000000000000000000000000000002 ;
% 204 inferences, 0.000 CPU in 0.000 seconds (92% CPU, 2428571 Lips)
X = 1000000000000000000000000000003 .

Other comments

First, make sure to check out Brachylog and the latest Brachylog solutions on Code Golf. Thanks to Julien's efforts, a language inspired by Prolog is now increasingly often hosting some of the most concise and elegant programs that are posted there. Awesome work Julien!
Please refrain from using implementation-specific anomalies of between/3: These destroy important semantic properties of the predicate and are not portable to other systems.
If you ignore (2), please use infinite instead of inf. In the context of CLP(FD), inf denotes the infimum of the set of integers, which is the exact opposite of positive infinity.
In the context of CLP(FD), I recommend to use CLP(FD) constraints instead of between/3 and other predicates that don't take constraints into account.
In fact, I recommend to use CLP(FD) constraints instead of all low-level predicates that reason over integers. This can at most make your programs more general, never more specific.

Many thanks for your interest in this question and the posted solutions! I hope you find the test case above useful for your variants, and find ways to take CLP(FD) constraints into account in your versions so that they run faster and we can all upvote them!

Midshipman answered 31/8, 2016 at 22:20 Comment(2)

Main feature: No usage of such old-fashioned notions like addition! – Skirmish 1/9, 2016 at 13:45

BTW, should fd_inf(X,Inf). really succeed without constraining X? Possible alternate behaviours: instantiation_error, X #= X. – Skirmish 1/9, 2016 at 13:49

Since "Brachylog's interpreter is entirely written in Prolog" meaning SWI-Prolog, you can use between/3 with the second argument bound to inf.

Comparing your positive_integer with

positive_integer_b(X):- between(1,inf,X).

Tests on my machine:

?- time(positive_integer(10000000)).
% 5 inferences, 0.062 CPU in 0.072 seconds (87% CPU, 80 Lips)
true.
9 ?- time(positive_integer_b(10000000)).
% 2 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
true.

And showing "Out of global stack":

13 ?- time(positive_integer(100000000)).
% 5 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
ERROR: Out of global stack

14 ?- time(positive_integer_b(100000000)).
% 2 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
true.

I don't think between is pure prolog though.

Howlet answered 31/8, 2016 at 17:1 Comment(0)

In case you want your code to be runnable also on other systems, consider:

positive_integer(N) :-
   (  nonvar(N), % strictly not needed, but clearer
      integer(N),
      N > 0
   -> true
   ;  length([_|_], N)
   ).

This version produces exactly the same errors as your first try.

You have indeed spotted a bit towards a weakness in current length/2 implementations. Ideally, a goal like length([_|_], 1000000000000000) would take some time, but at least does not consume more than constant memory. On the other hand, I am not too sure if this is worth optimizing. After all, I do not see an easy way to solve the runtime problem for such cases.

Note that the version of between/3 in SWI-Prolog is highly specific to SWI. It makes termination arguments much more complex. In other systems like SICStus, you know for sure that between/3 is terminating, regardless of the arguments. In SWI you would have to prove, that the atom inf will not be encountered which raises the burden of proof obligation.

Skirmish answered 31/8, 2016 at 17:42 Comment(0)

without between/3, and ISO compliant (I think)

positive_integer(1).
positive_integer(X) :-
 var(X),
 positive_integer(Y),
 X is Y + 1.
positive_integer(X) :-
 integer(X),
 X > 0.

Nam answered 31/8, 2016 at 21:25 Comment(0)

Additional test case

Solution

Main features

Sample queries

Other comments

Recommended topics

Hot tags