Using \==/2 or dif/2
Asked Answered
R

3

18

If I want to make sure that two variables do not instantiate to the same term, what is the preferred way to do it?

Let's say I need to find directed edges in a graph, and a node cannot have an edge to itself:

node(a, x, y). node(b, z, x). node(c, y, y).

(the edges here are a -> c, b -> a, but not c -> c)

The following works:

edge(A, B) :- node(A, _, X), node(B, X, _), A \== B.

This works too [swi-prolog]:

edge(A, B) :- dif(A, B), node(A, _, X), node(B, X, _).

This does not work, apparently (because neither A nor B are instantiated yet?):

edge(A, B) :- A \== B, node(A, _, X), node(B, X, _).

I guess my problem with the first solution is that, with a more complex node predicate, a lot of unnecessary unifications might take place before edge fails. The dif on the other hand is in a library, which suggests that it is not meant to be used in such a simple case (although it has the exact function that I seem to be looking for).

Ratoon answered 7/12, 2012 at 5:8 Comment(2)
Must be a graph in hyper space aka a hyper graph, I have never seen node/3 defining a graph. Terminology here isnt standard, node is usually only a the set of nodes (vertice, point) in the graph and edge (arc, line) is then a binary relation over the nodes. en.wikipedia.org/wiki/Graph_theorySpurn
@j4nbur53 Thank you for the info. I am right now trying to figure out what the node/3 in my question is doing and I am having trouble. Who knows what I was smoking when I wrote this question. It seems that I was solving some problem in which I know something about the graph but not its actual topology yet: this is somehow encoded in node/3. I could not track down the motivation for this question.Ratoon
M
16

For elegance and didactic reasons alone, dif/2 is clearly preferable here and also in the vast majority of other cases, since as you already note "a lot of unnecessary unifications might take place" otherwise, and also because dif/2 is a pure and nicely declarative predicate that can be used in all directions and at any place in the clause body without changing the meaning of the program, in contrast to (\==)/2. dif/2 is also an autoloaded predicate in SWI-Prolog, meaning that you need not import any library explicitly to use it, and dif/2 is available like any built-in predicate.

If you use dif/2 you can reason much more easily about your code. For example, in your case, you start with:

edge(A, B) :- node(A, _, X), node(B, X, _), dif(A, B).

and then, as you know that dif/2 is a completely pure predicate, you know that you can also write this as:

edge(A, B) :- dif(A, B), node(A, _, X), node(B, X, _).

Further, since you know that dif/2 always terminates, you know that this change can at most improve the termination properties of your program.

Like all constraints, dif/2 is meant to be used. I highly recommend it instead of impure predicates that are not commutative.

In case you are worried about performance, here is a small comparison, just comparing dif/2 against the non-declarative (\==)/2 in a use case where the two predicates can be used interchangeably:

?- N = 1_000_000, time((between(1,N,_),dif(a,b),false)).
% 11,000,005 inferences, 0.352 CPU in 0.353 seconds (100% CPU, 31281029 Lips)

?- N = 1_000_000, time((between(1,N,_),a\==b,false)).
%@ % 3,000,001 inferences, 0.107 CPU in 0.107 seconds (99% CPU, 28167437 Lips)

So, there are sometimes performance benefits when using (\==)/2. However, there are also much more severe drawbacks when using such a low-level predicate: It is harder to understand, more error-prone, and not declarative.

I therefore recommend to simply use dif/2 to express that two terms are different.

Misesteem answered 7/12, 2012 at 8:20 Comment(8)
Thank you for your answer. After looking more carefully at the "Coroutining" part of the SWI-Prolog manual it makes sense. I am still slightly confused that such a simple Prolog program actually requires procedural considerations when using only the built-in predicates.Ratoon
I do consider dif/2 a built-in predicate. It is available in all serious Prolog implementations (SICStus, YAP, SWI, ...) just like any other built-in. Procedural considerations are mostly neccessary when you use impure predicates.Misesteem
why is \== impure? I can't think of any side-effects that a comparison could have. Or do you mean something else by "impure"?Ratoon
?- X \== Y, X = Y. succeeds, but ?- X = Y, X \== Y. fails. From pure logical relations, we expect that the order of goals does not change the truth of a conjunction.Misesteem
"From pure logical relations, we expect that the order of goals does not change the truth of a conjunction." which is the source of all my confusion. With your help it has cleared away. Thank you again for taking the time!Ratoon
In my coverage code here: swish.swi-prolog.org/p/Coverage%20using%20Constraints%20%20.pl Line 98 If I swap \== for dif I get a different answer..Crinose
@user27815: I cannot reproduce this: I get exactly the same answer when I use dif/2. However, please include example queries in the SWISH snippet. See the other example programs on how to embed example queries in the your source file, so that they can be selected in SWISH.Misesteem
@Misesteem dif version swish.swi-prolog.org/p/… \== version swish.swi-prolog.org/p/Coverage%20using%20Constraints%20%20.pl ?-pos(Ps),neg(Ns),start_search(Ps,Ns,Result).Crinose
D
7

First of all, dif/2 and (\==)/2 mean the same when both arguments are ground, that is variable free. So if you can ensure that the arguments will be ground — or rather sufficiently instantiated such that further instantiations will not affect the outcome of (\==)/2 — then it does not make a difference.

In your example, we would need to know for sure that answers for node/3 contain always a ground first argument. In that case, the (\==)/2 program is fine. In rare cases it might be less efficient than the dif/2 version. Think of the goal edge(X, X).

In many situations, the (\==)/2 or even (\=)/2 is significantly more efficient. On the other hand, how important is efficiency when compared to correctness?

Another way of seeing this, is to consider (\==)/2 and (\=)/2 as approximations from two sides: Only if both agree, do we have a safe final outcome.

Historically, dif/2 is one of the oldest built-in predicates. It was present in the very first Prolog system which is sometimes called Prolog 0 to distinguish it from the next version which is often perceived to be the first Prolog — the Marseille Prolog — Prolog 1. Prolog 1 did no longer have dif/2 and it is in this shape that Prolog came to Edinburgh. Also,dif/2 is not part of the ISO standard (currently) since it requires some coroutining-like mechanism. And many (rather older) Prolog systems do not have such a mechanism. However, even in ISO Prolog one could do better:

dif_si(X, Y) :-
   X == Y,
   !,
   fail.
dif_si(X, Y) :-
   X \= Y,
   !.
dif_si(X, Y) :-
   throw(error(instantiation_error,iso_dif/2)).

(Here is another, probably more efficient implementation)

Note how the problematic cases are covered by an error that stops the entire computation.

Current Prolog systems that support dif/2 right out of the box are B, SICStus, SWI, YAP. It is in a library of IF, Ciao, XSB, Scryer.

See also this answer.


To support my claim about the overheads, here is a test in various Prologs on the same machine. In SWI, there is an overhead of a factor of 10, in B, there is no overhead. As has been noted by @nfz, numbers are slightly different when compiling things. So your mileage may vary.

SWI 6.3.4-55

?- 1000=I,time(( dif(X,Y), between(1,I,X), between(1,I,Y), false )).
% 22,999,020 inferences, 5.162 CPU in 5.192 seconds (99% CPU, 4455477 Lips)
false.

?- 1000=I,time(( between(1,I,X), between(1,I,Y), X \== Y, false )).
% 2,000,001 inferences, 0.511 CPU in 0.521 seconds (98% CPU, 3912566 Lips)
false.

B 7.8

| ?- 1000=I,time(( dif(X,Y), between(1,I,X), between(1,I,Y), false )).
CPU time 0.364 seconds.
no
| ?- 1000=I,time(( between(1,I,X), between(1,I,Y), X \== Y, false )).   
CPU time 0.356 seconds.
no

YAP

?- 1000=I,time(( dif(X,Y), between(1,I,X), between(1,I,Y), false )).
% 2.528 CPU in 2.566 seconds ( 98% CPU)
no
?- 1000=I,time(( between(1,I,X), between(1,I,Y), X \== Y, false )).
% 0.929 CPU in 0.963 seconds ( 96% CPU)
no
Daven answered 7/12, 2012 at 19:47 Comment(7)
Also, how much is significantly more efficient? I measured a factor of 2.4 between dif/2 and (\==)/2 in SWI-Prolog in a tight loop that consists only of these predicates. This is not much difference in most practical applications given that (\==)/2 is hardly ever a realistic bottleneck. If you are writing a performance-critical library that provides high-level predicates to other users, consider using (\==)/2 internally, but for real user-level application code, I think dif/2 is by far the better choice in almost all cases.Misesteem
@mat: Can you reproduce my numbers above?Daven
Please consider the case of ground terms, which is (implicitly) the case under discussion. I measured a factor of 2.4 for N = 1000000, time((between(1,N,_),dif(a,b),false)). vs N = 1000000, time((between(1,N,_),a\=b,false)). with SWI-Prolog. For terms involving variables, take into account that dif/2 also gives you significantly more functionality in comparison to (\==)/2, for example, the ability to actually ask existential queries.Misesteem
@mat: I clearly disagree: The case I gave is about ground terms in much the same way as OP has done it: Initially they are not ground, and due to node/3 and between/3 they are instantiated to ground terms.Daven
I also disagree: In the very first case of OP's posting, commented with "the following works", (\==)/2 is used when both arguments are ground.Misesteem
@mat: Nobody writes dif/2 as the last goal. If dif/2 is used, it is written as early as possible, after all, that is the very reason to use it in the first place. A compiler might push it to the right, but this is often very tricky, as it requires termination proofs.Daven
I see no reason not to use dif/2 as the last goal instead of (\==)/2, except for a performance overhead that is negligible in almost all cases, including the one under discussion.Misesteem
B
7

The queries are meta-interpreted and the overhead may outweigh the differences of dif(X,Y) and X\==Y. You should compare these two predicates:

t1:-
    1000=I,
    time(t1(I)).


t1(I):-
    dif(X,Y), 
    between(1,I,X), 
    between(1,I,Y), 
    false.

t2:-
    1000=I,
    time(t2(I)).


t2(I):-
    between(1,I,X), 
    between(1,I,Y), 
    X\==Y,
    false.

On B-Prolog, I got the following results:

| ?- cl(t)
Compiling::t.pl
compiled in 0 milliseconds
loading::t.out

yes
| ?- t1
CPU time 0.14 seconds.
no
| ?- t2
CPU time 0.078 seconds.
no
| ?- 1000=I,time(( dif(X,Y), between(1,I,X), between(1,I,Y), false )).
CPU time 0.234 seconds.
no
| ?- 1000=I,time(( between(1,I,X), between(1,I,Y), X \== Y, false )).
CPU time 0.218 seconds.
Beatification answered 9/12, 2012 at 22:8 Comment(1)
Thank you - the absolute speed is still staggering!Daven

© 2022 - 2024 — McMap. All rights reserved.