How to split a sentence in SWI-Prolog

Asked 20/10, 2010 at 10:36 Answered 1/2, 2012 at 11:2

I am trying my hand at SWI-Prolog on Windows XP, and I want to understand how to split a sentence into separate atoms.

For example, say I have a sentence like this:

"this is a string"
Is there any way to get the individual words stored in variables?

Like:

X = this
Y = is
....
and so forth.

Can anyone please explain how this works?

Thanks.

Amherst answered 20/10, 2010 at 10:36 Comment(0)

I would use atomic_list_concat/3. See

http://www.swi-prolog.org/pldoc/man?predicate=atomic_list_concat%2F3

Normally it is used to join atoms with a separator, but it also works in the other direction: when the separator is a non-empty atom and the third argument is bound, it splits that atom on the separator:

?- atomic_list_concat(L, ' ', 'This is a string').
L = ['This', is, a, string].

Of course once the split is done you can play with the elements of the list L.
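To get from the list to individual variables, as the question asks, you can unify the list with a pattern. A minimal sketch, where first_two_words/4 is only a name made up for illustration:

```prolog
% first_two_words(+Sentence, -X, -Y, -Rest)
% Hypothetical helper: split Sentence on single spaces, then bind the
% first two words to X and Y and the remaining words to Rest.
first_two_words(Sentence, X, Y, Rest) :-
    atomic_list_concat(Words, ' ', Sentence),
    Words = [X, Y | Rest].

% ?- first_two_words('this is a string', X, Y, Rest).
% X = this,
% Y = is,
% Rest = [a, string].
```

The pattern [X, Y | Rest] fails if the sentence has fewer than two words, so adjust it to the shape you expect.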

Carpi answered 1/2, 2012 at 0:2 Comment(1)

In newer versions of SWI-Prolog, split_string/4 can be used: swi-prolog.org/pldoc/man?predicate=split_string/4 For example: split_string("Hello, here I am!", " ", " ", Temp). – Roman 1/6, 2015 at 14:3
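A short sketch of the split_string/4 route from the comment above, assuming SWI-Prolog 7 or later (where double-quoted text is a string). The wrapper name sentence_atoms/2 is hypothetical; it converts the resulting strings to atoms so the output resembles what atomic_list_concat/3 gives:

```prolog
% sentence_atoms(+Sentence, -Atoms)
% Hypothetical wrapper: split Sentence on spaces (also stripping spaces
% from each part), then turn each substring into an atom.
sentence_atoms(Sentence, Atoms) :-
    split_string(Sentence, " ", " ", Parts),
    maplist(atom_string, Atoms, Parts).

% ?- split_string("Hello, here I am!", " ", " ", Temp).
% Temp = ["Hello,", "here", "I", "am!"].
%
% ?- sentence_atoms("Hello, here I am!", Atoms).
% Atoms = ['Hello,', here, 'I', 'am!'].
```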

I like the answer of 'pat fats', but you have to convert your string to an atom first:

..., atom_codes(Atom, String), atomic_list_concat(L, ' ', Atom), ...
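Spelled out as a complete predicate, the conversion step might look like this. It is only a sketch: codes_words/2 is an invented name, and it assumes the input is a list of character codes, which is what a double-quoted string was in SWI-Prolog at the time:

```prolog
% codes_words(+Codes, -Words)
% Hypothetical helper: Codes is a list of character codes.
% Convert it to an atom first, then split that atom on single spaces.
codes_words(Codes, Words) :-
    atom_codes(Atom, Codes),
    atomic_list_concat(Words, ' ', Atom).

% ?- atom_codes('this is a string', Codes), codes_words(Codes, Words).
% Words = [this, is, a, string].
```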

If you need to work directly with strings, I have this code in my 'arsenal':

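:- use_module(library(dcg/basics)).   % string//1 (library(http/dcg_basics) in older versions)

%%  splitter(+Sep, -Chunks)//
%
%   Split the input on Sep; minimal implementation.
%   Sep and each chunk are lists of character codes.
%
splitter(Sep, [Chunk|R]) -->
    string(Chunk),                      % take codes lazily, shortest match first
    (   Sep -> !, splitter(Sep, R)      % separator found: commit and split the rest
    ;   [], {R = []}                    % otherwise stop; phrase/2 requires no input left
    ).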

Being a DCG, it must be called through phrase/2, like this:

?- phrase(splitter(" ", L), "this is a string"), maplist(atom_codes, As, L).
L = [[116, 104, 105, 115], [105, 115], [97], [115, 116, 114, 105, 110|...]],
As = [this, is, a, string] .
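In SWI-Prolog 7 and later, double-quoted text is a string object rather than a code list by default, so the query above needs its input converted first. Here is a sketch of one way to wrap that up. split_atoms/3 is a made-up name, and splitter//2 is repeated only so the block loads on its own:

```prolog
:- use_module(library(dcg/basics)).   % provides string//1

% splitter//2 as given in the answer above.
splitter(Sep, [Chunk|R]) -->
    string(Chunk),
    (   Sep -> !, splitter(Sep, R)
    ;   [], {R = []}
    ).

% split_atoms(+Sep, +Text, -Atoms)
% Hypothetical wrapper: accepts atoms, strings or code lists for Sep and
% Text, converts both to code lists, runs the grammar, and returns atoms.
split_atoms(Sep, Text, Atoms) :-
    string_codes(Sep, SepCodes),
    string_codes(Text, Codes),
    once(phrase(splitter(SepCodes, Chunks), Codes)),  % keep the first parse only
    maplist(atom_codes, Atoms, Chunks).

% ?- split_atoms(" ", "this is a string", As).
% As = [this, is, a, string].
```

The once/1 is there because string//1 leaves a choice point behind after the last chunk; without it the query still gives the same answer but does not finish deterministically.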

edit: more explanation

I forgot to explain how that works. DCGs are well explained by @larsman in this other answer; I quote him:

-->, which actually adds two hidden arguments to it. The first of these is a list to be parsed by the grammar rule; the second is "what's left" after the parse. c(F,X,[]) calls c on the list X to obtain a result F, expecting [] to be left, i.e. the parser should consume the entire list X.

Here I have 2 arguments: the first is the separator, the second is the list being built. The builtin string//1 comes from SWI-Prolog's library(http/dcg_basics) (library(dcg/basics) in current versions). It's a very handy building block that matches literally anything on backtracking. Here it's 'eating' each char before the separator or the end-of-string. Having done that, we can recurse...
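To see what "matches anything on backtracking" means in practice, here is a small sketch that collects every prefix string//1 offers, shortest first. The predicate name prefixes/2 is hypothetical, and phrase/3 is used so that the unconsumed remainder is simply ignored:

```prolog
:- use_module(library(dcg/basics)).   % provides string//1

% prefixes(+Text, -Prefixes)
% Hypothetical helper: collect, as atoms, every prefix of Text that
% string//1 yields on backtracking.
prefixes(Text, Prefixes) :-
    string_codes(Text, Codes),
    findall(Prefix,
            ( phrase(string(Cs), Codes, _Rest),   % _Rest: whatever is left over
              atom_codes(Prefix, Cs)
            ),
            Prefixes).

% ?- prefixes("abc", Ps).
% Ps = ['', a, ab, abc].
```

In splitter//2 each of these candidate prefixes is tried in turn until the separator (or the end of the input) follows it.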
