Remove leading zeros in list in Prolog
Asked Answered
A

6

12

I have a list with an unknown number of zeros at the beginning of it, for example [0, 0, 0, 1, 2, 0, 3]. I need this list to be stripped of leading zeros, so that it would look like [1, 2, 0 , 3].

Here's what I have:

lead([Head | _], _) :- Head =\= 0.
lead([0 | Tail], _) :- 
  lead(Tail, Tail).

The output of which is simply True. Reading the trace shows that it is running until it has a list with no leading zeros, but then the answer doesn't propagate back up the stack. I'm pretty new to Prolog, so I can't figure out how to make it do that.

Amendment answered 30/9, 2016 at 19:50 Comment(0)
L
11

Here is a solution that works in all directions:

lead([],[]).
lead([H|T],[H|T]) :-
    dif(H,0).
lead([0|T],T2) :-
    lead(T,T2).

Some queries:

?- lead([0,0,0,1,2,0,3], L).
L = [1, 2, 0, 3] ;
false.


?- lead(L, []).
L = [] ;
L = [0] ;
L = [0, 0] ;
L = [0, 0, 0] ;
...


?- lead(L0, L).
L0 = L, L = [] ;
L0 = L, L = [_G489|_G490],
dif(_G489, 0) ;
L0 = [0],
L = [] ;
L0 = [0, _G495|_G496],
L = [_G495|_G496],
dif(_G495, 0) ;
L0 = [0, 0],
L = [] ;
L0 = [0, 0, _G501|_G502],
L = [_G501|_G502],
dif(_G501, 0) ;
L0 = [0, 0, 0],
L = [] ;
...

EDIT This predicate actually doesn't work for e.g. lead(L0, [0,1,2]).

Leighton answered 1/10, 2016 at 8:29 Comment(3)
Nice, but it is quite inefficient leaving all those leftover choice points. What about using if_/3?Cockaleekie
@Cockaleekie After reading its implementation, I'm not quite sure how I would use it here. Do you think a question on the usage of that predicate would be useful? (in other words: do you think this predicate should be known to Prolog programmers and has many common uses?)Leighton
Indeed, it should! Look at existing solutions and post your solution as a different answer!Cockaleekie
L
9

With library(reif):

:- use_module(reif).

remove_leading_zeros([], []).
remove_leading_zeros([H|T], Rest) :-
        if_(    H = 0,
                remove_leading_zeros(T, Rest),
                Rest = [H|T]).

Then:

?- remove_leading_zeros([0,0,0,1,2,0,3], R).
R = [1, 2, 0, 3].

?- remove_leading_zeros([2,0,3], R).
R = [2, 0, 3].

?- remove_leading_zeros(L, R).
L = R, R = [] ;
L = [0],
R = [] ;
L = [0, 0],
R = [] ;
L = [0, 0, 0],
R = [] . % and so on
Lunalunacy answered 3/10, 2016 at 15:45 Comment(25)
=(0, H) that is H = 0!Cockaleekie
@Cockaleekie it looks even more like unification. If i have one legitimate criticism about library(reif), it's the choice of names.Lunalunacy
Which other names do you have in mind? if/3 is already taken (by SICStus), if_then_else/3 is too long. So, what else do you suggest?Cockaleekie
@Cockaleekie what is wrong with if_then_else/3? In most cases, the call doesn't fit on one line anyway. And about the =/3, it is not exactly the same as overloading an operator in C++, so making it look too much like unification is a bit misleading. This is maybe just because I am not used to seeing it yet. I can't come up with a good name. same? equal?Lunalunacy
if_then_else/3 is extremely lengthy. As for same, equal we had here a question some years ago. In the meantime the convention is: either equal_t/3 or (=)/3 since it is an operator. It is important to make it look like unification, because it actually unifies! Try if_(A = B, R = true, R = false)!Cockaleekie
@Cockaleekie I am certain that you have put plenty of thought into this. As I said, I am just not that used to seeing it but I am sure I can get used to it.Lunalunacy
Have you seen the benchmarks? In SICStus, it is sometimes even faster than the impure memberchk/2. Maybe that helps!Cockaleekie
@Cockaleekie as I said, I only had a small issue with the naming, nothing else, really.Lunalunacy
This does NOT work for remove_leading_zeros(L0, [0,1,2,3]).. I had tried a lot of approaches similar to yours and have only been able to get the same behaviour as your answer, or to solve this problem but get different ones for LO, L as arguments.Leighton
@Leighton Should this simply fail?Lunalunacy
Well yes it should, this is what we would logically expect (and not an error)Leighton
@Leighton I am anyway not too sure about the current behavior of remove_leading_zeros(L0, L). What use is it to just keep on adding zeros to an empty list.Lunalunacy
I had been able to get the following behavior for L0, L: alternate between all zeroes in L0, and L0 = [0,…,0,H|T] with L = [H|T] and dif(H,0). However I couldn't prevent the error for L0, [0,1,2,3] in that situation.Leighton
@Cockaleekie I don't want to criticize Boris' answer, and I don't really care for reputation, but I don't understand why you would award the bounty to this answer when mine correctly fails for (L0, [0,1,2,3]) (this one loops infinitely) and correctly enumerates the answers for (L0,L) (this one only enumerates the lists of only 0s). Sure it does not work for e.g. (L0, [L0|T]) but so does this one.Leighton
@Leighton I agree with you on this, your solution is basically my solution, but covers also the corner cases that we already discussed in the comments, so it is better in every way. The only explanation would be that false didn't award the bounty, but it got awarded automatically (you can read the rules, I happen to have more plus votes for whatever reason).Lunalunacy
@Boris Possible. I don't care that the bounty stays on your answer, which is more than sufficient for what the OP originally asked anyway (and my answer doesn't even cover all cases), I'm just interested in his reason.Leighton
@Leighton As I said, I don't understand why your solution ended up having less points than mine. I gave yours a plus for sure. Stackoverflow is a strange place, esp. in a less-visited tag like Prolog.Lunalunacy
@Fatalize: In fact, your answer covers more cases than Boris'. However, the means you used are (a) impure. That is, using nonvar/1 and (;)/2-if-then-else directly, requires a lot of reasoning to ensure its pureness. My follow-up question is about this - maybe you find a safe general way? And (b) the bounty was about removing the CPs of your original solution only. In any case, do not worry too much about rep.Cockaleekie
@Cockaleekie (a) I don't see how using impure means can possibly be worse than a predicate which does not work properly. (b) This removes the CPs of my original solution, but this does not enumerate properly the answers for (L0, L), while my original solution did. (c) I don't really care about rep, I just don't understand how your selection process makes any sense.Leighton
@Fatalize: Ad a and working properly: The means you used, very easily produce incorrect solutions - that is, solutions that fail when they should succeed and vice versa. Compared to that, non-termination is a quite benign form of incorrectness: it does not claim something incorrect. And, also your solution does not work for all possible inputs, since it does not terminate for lead(L, L)Cockaleekie
@Cockaleekie The means you used, very easily produce incorrect solutions: but they don't in this particular case so I don't see why this is a problem. non-termination is a quite benign form of incorrectness: it does not claim something incorrect.: Outside the world of theoretical Prolog fanatics, I don't know any programmer that would consider an infinite loop as being "benign". Finally, yes my solution does not work for all possible inputs, but it works as well or better than this one for all possible inputs.Leighton
@Leighton well, non-termination means "I can't find an answer no matter how hard I look", and it will stop, whenever you run out of memory (in the case of a growing list at least), while an incorrect answer is just an incorrect answer. Don't want to take sides any more, just pointing out.Lunalunacy
@Fatalize: 1mo, you claimed that your solution "actually works for all possible inputs", which you now agree is not true - so please correct your answer accordingly.Cockaleekie
@Fatalize: 2do, with your solution using impure mechanism to get pure behaviour, it is e.g. impossible to produce generalizations or specializations by mere syntactic means. If you take a pure, monotonic program, you can always add false wherever you like and you can be sure that the program will be a specialization. No matter what! Similarly for other specializations or generalizations.Cockaleekie
@Fatalize: Note that the purity of if_/3 is only guaranteed as long as you respect this interface. That is, if you look into the definition of if_/3 you have impure behaviour too. But the interface itself remains pure.Cockaleekie
L
6

Here is a solution that actually works for all possible inputs and doesn't leave unnecessary choice points:

lead(L0, L) :-
    (   nonvar(L),
        L = [H|_] ->
        dif(H,0)
        ;
        true
    ),
    lead_(L0, L).

lead_([], []).
lead_([H|T], L) :-
    if_(H \= 0,
        L = [H|T],
        lead_(T,L)).

The initial check for nonvar(L) is the only solution I have been able to come up with that would prevent problems with e.g. lead(L0, [0,1,2,3]), while retaining the behavior of the predicate in all other situations.

This uses if_/3, part of library(reif)

if_(If_1, Then_0, Else_0) :-
    call(If_1, T),
    (  T == true -> Then_0
    ;  T == false -> Else_0
    ;  nonvar(T) -> throw(error(type_error(boolean,T),
                                type_error(call(If_1,T),2,boolean,T)))
    ;  throw(error(instantiation_error,instantiation_error(call(If_1,T),2)))
    ).

This also uses (\=)/3, that I came up with by simple modification of (=)/3 in library(reif).

\=(X, Y, T) :-
    (   X \= Y -> T = true
    ;   X == Y -> T = false
    ;   T = true, dif(X, Y)
    ;   T = false,
        X = Y
    ).

Some queries

?- lead([0,0,0,1,2,0,3],L).              % No choice point
L = [1, 2, 0, 3].


?- lead([1,2,0,3],L).
L = [1, 2, 0, 3].


?- lead([0,0,0,0],L).
L = [].


?- lead([],L).
L = [].


?- lead(L0,[0,1,2,0,3]).                 % Correctly fails
false.


?- lead(L0,[1,2,0,3]).
L0 = [1, 2, 0, 3] ;
L0 = [0, 1, 2, 0, 3] ;
L0 = [0, 0, 1, 2, 0, 3] ;
…


?- lead(L0,L).                           % Exhaustively enumerates all cases:  
L0 = L, L = [] ;                         %   - LO empty
L0 = L, L = [_G2611|_G2612],             %   - L0 contains no leading 0
dif(_G2611, 0) ;
L0 = [0],                                %   - L0 = [0]
L = [] ;
L0 = [0, _G2629|_G2630],                 %   - L0 contains one leading 0
L = [_G2629|_G2630],
dif(_G2629, 0) ;
L0 = [0, 0],                             %   - L0 = [0, 0]
L = [] ;
L0 = [0, 0, _G2647|_G2648],              %   - L0 contains two leading 0s
L = [_G2647|_G2648],
dif(_G2647, 0) ;
…                                        %   etc.
Leighton answered 4/10, 2016 at 7:20 Comment(3)
Why (\=)/3? There is dif/3 for this. But even better exchange branches.Cockaleekie
lead(L,[E|L]) (to be fair, I don't expect this to terminate).Cockaleekie
Same for lead(L,L).Cockaleekie
C
5

Here is a solution that doesn't generate any choice points. Its using freeze/2, in a way that is not anticipated by dif/2. But using freeze/2 here is quite appropriate, since one rule of thumb for freeze/2 is as follows:

Rule of Thumb for freeze/2: Use freeze/2 where the predicate would generate uninstantiated solutions and a lot of choice points. The hope is that a subsequent goal will specify the solution more, and the freeze/2 will be woken up. Unfortunately doesn't work with CLP(FD) or dif/2, since freeze/2 does not react to refinements implied by CLP(FD) or dif/2, only unification will wake it up.

The code is thus:

lead(X, Y) :- var(X), !, freeze(X, lead(X,Y)).
lead([X|Y], Z) :- var(X), !, freeze(X, lead([X|Y],Z)).
lead([0|X], Y) :- !, lead(X, Y).
lead(X, X).

Here are some sample runs (SWI-Prolog without some import, Jekejeke Prolog use Minlog Extension and ?- use_module(library(term/suspend))):

?- lead([0,0,0,1,2,3], X).
X = [1, 2, 3].

?- lead([0,0|X], Y).
freeze(X, lead(X, Y)).

?- lead([0,0|X], Y), X = [0,1,2,3].
X = [0, 1, 2, 3],
Y = [1, 2, 3].

?- lead([Z,0|X], Y), X = [0,1,2,3].
X = [0, 1, 2, 3],
freeze(Z, lead([Z, 0, 0, 1, 2, 3], Y)).

?- lead([Z,0|X], Y), X = [0,1,2,3], Z = 0.
Z = 0,
X = [0, 1, 2, 3],
Y = [1, 2, 3].

In the above lead/2 implemetation only the first argument is handled. To handle multiple arguments simultaneously the predicate when/2 can be used. But for simplicity this is not shown here.

Also when using suspended goals, one might need a labeling like predicate at the end, since suspended goals cannot detect inconsistency among them.

Coen answered 2/10, 2016 at 21:49 Comment(4)
Would you call it a pure solution?Cockaleekie
How can you modify your solution to skip leading s(0) in place of 0?Cockaleekie
See the new question!Cockaleekie
See the new answer for pureness.Coen
A
3

The problem in your code is that the second parameter, your output, is specified as _, so your predicate is true for any output. What you want is a predicate that is true if and only if it is the input minus leading zeroes.

lead([], []).
lead([0 | Tail], Tail2) :- !, lead(Tail, Tail2).
lead([Head | Tail], [Head | Tail]) :- Head =\= 0.

The ! in the first line is optional. It prunes the search tree so Prolog does not consider the second line (which would fail) if the first line matches.

Algy answered 30/9, 2016 at 19:59 Comment(4)
lead([0],L) fails incorrectly since it should return empty list.Jeggar
In addition to what coder said: ?- lead(Ls0, Ls). does not produce a single answer. Ideally, we are able to use predicates also to generate solutions.Volcano
In addition to what @Jeggar and @Volcano said, you have this ambiguity between syntactic equality and arithmetic equality. Not sure if you intend this, but I expect that even if you fix the first two, your program will still fail for lead([0+0,1], Xs).Cockaleekie
@Heinrich: With the new version, we get for the query ?- lead([A], Ls). the single solution A = 0. However, that's only one of many possible cases! For example, ?- lead([1], [1]). succeeds!Volcano
S
2

Here's how I'd phrase it. First, establish constraints: either X or Y must be bound to a list. Anything else fails.

  • If X is bound, we don't care about Y: it can be bound or unbound. We just strip any leading zeros from X and unify the results with Y. This path has a single possible solution.

  • If X is unbound and Y is bound, we shift into generative mode. This path has an infinite number of possible solutions.

The code:

strip_leading_zeros(X,Y) :- listish(X), !, rmv0( X , Y ) .
strip_leading_zeros(X,Y) :- listish(Y), !, add0( Y , X ) .

rmv0( []     , [] ) .
rmv0( [D|Ds] , R  ) :- D \= 0 -> R = [D|Ds] ; rmv0(Ds,R) .

add0( X , X ) .
add0( X , Y ) :- add0([0|X],Y ) .

listish/1 is a simple shallow test for listish-ness. Use is_list/1 if you want to be pedantic about things.

listish( L     ) :- var(L), !, fail.
listish( []    ) .
listish( [_|_] ) .

Edited to note: is_list/1 traverses the entire list to ensure that it is testing is a properly constructed list, that is, a ./2 term, whose right-hand child is itself either another ./2 term or the atom [] (which denotes the empty list). If the list is long, this can be an expensive operation.

So, something like [a,b,c] is a proper list and is actually this term: .(a,.(b,.(c,[]))). Something like [a,b|32] is not a proper list: it is the term .(a,.(b,32)).

Swish answered 5/10, 2016 at 0:50 Comment(2)
Could you explain what the advantage of not using is_list/1 is in this context?Amendment
@bendl: see my amended answer.Swish

© 2022 - 2024 — McMap. All rights reserved.