How to split a string at a particular character in Rebol

I haven't figured out how to split a string in a clean way.

 ref: copy/part (find line "#") -15
 rest2: copy/part (skip (find line "#") 1) 450

-15 is there to reach the beginning of the string, and 450 to reach the end.

It isn't nice because I hard-coded those values.

What is the proper solution?

Mab answered 25/4, 2013 at 10:31

A warning up front: there are many different ways to achieve this in Rebol. So you'll probably get quite a few different suggestions.

For a start, let's stick to your original approach of using FIND.

When you use FIND with a series, what you get is a new view onto the underlying series data, positioned at a different offset from the start of the series data.

Let's start with some example data:

>> line: copy "this is foo#and here comes a long bar"
== "this is foo#and here comes a long bar"

Let's FIND the # character within that line, and refer to the result as POS:

>> pos: find line "#"
== "#and here comes a long bar"

As you can see, this basically already gives you the second part (what you called REST2) of your split. You'll only have to skip past the delimiter itself (and then copy the resulting string, to make it independent from the original LINE string):

>> rest: copy next pos
== "and here comes a long bar"

For extracting the initial part, you can use a nice feature of COPY/part. The documentation of the "/part" refinement (try help copy) says: "Limits to a given length or position" (emphasis on "or position" mine). We already have that position handy as POS. So:

>> ref: copy/part line pos
== "this is foo"

And there you go! The complete code:

pos: find line "#"
ref: copy/part line pos
rest: copy next pos
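One thing the snippet above does not cover: FIND returns NONE when the delimiter is missing, and the two COPY calls would then fail. A minimal sketch of a guarded version follows; the function name split-at is hypothetical (not a built-in), and it assumes a single-character delimiter:

```rebol
; split-at is a hypothetical helper name, not part of Rebol itself.
; Assumes DELIM is a single character, e.g. #"#".
split-at: func [
    "Split LINE at the first DELIM; returns a block of two strings, or NONE"
    line [string!]
    delim [char!]
    /local pos
][
    either pos: find line delim [
        ; head part up to POS, tail part after skipping the delimiter
        reduce [copy/part line pos  copy next pos]
    ][
        none  ; delimiter not present in LINE
    ]
]

set [ref rest] split-at "this is foo#and here comes a long bar" #"#"
; ref  is now "this is foo"
; rest is now "and here comes a long bar"
```

Returning NONE for a missing delimiter is just one possible design choice; returning the whole line plus an empty string would work equally well.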

For reference, here's a PARSE-based approach:

parse line [copy ref to "#" skip copy rest to end]

I'll let this stand without further explanation. If you want to know more about PARSE, a good place to start is the venerable "Parsing" chapter in the REBOL/Core Users Guide (originally written for REBOL 2.3, but the basics are still mostly the same in current REBOL 2 and 3 versions).

One ancillary note at the end: instead of a single-item string "#" you could also use a character which is written as #"#" in Rebol.

Lalla answered 25/4, 2013 at 10:54

Very nice and I like the parse approach. Thanks. – Mab 25/4, 2013 at 11:21

If the character you search for is unique in the line string, you can use a shorter form to split the string and assign the two parts in one expression: set [ref rest2] parse line "#" – Yarbrough 25/4, 2013 at 13:33

@Lalla Now how do we do this for multiple occurrences of a given character? I don't see an /all refinement to the "find" function. – Pretentious 24/2, 2021 at 15:2

@ArchanJoshi You either loop the FIND-based approach, or you use the PARSE/all-based splitting suggested in @sqlab's answer, for example: items: parse/all line "#" – Lalla 3/3, 2021 at 10:59
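As an illustration of the "loop the FIND-based approach" suggestion, here is a rough sketch. The names items and start are made up for this example, and the sample line is an assumption:

```rebol
line: copy "a#b#c"

items: copy []      ; collects the pieces
start: line         ; current position within LINE

; FIND returns NONE once no further "#" exists, which ends the loop
while [pos: find start "#"] [
    append items copy/part start pos   ; text between START and the delimiter
    start: next pos                    ; continue just after the delimiter
]
append items copy start                ; whatever remains after the last "#"

; items is now ["a" "b" "c"]
```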

The advice to use

set [ref rest2] parse line "#"

will not give you what you want.
Better use

set [ref rest2] parse/all line "#"

because PARSE without /all is a special case intended for parsing CSV strings or Rebol notation.
Without /all, "#" is simply added to the already defined whitespace delimiters.
You would get this:

== ["this" "is" "foo" "and" "here" "comes" "a" "long" "bar"]

with the first two elements, "this" and "is", assigned to ref and rest2.
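For comparison, a short sketch of the /all variant on the same example line, assuming Rebol 2 (where PARSE accepts a string of delimiters):

```rebol
line: copy "this is foo#and here comes a long bar"

; With /all, only "#" acts as a delimiter; spaces are kept as ordinary text
set [ref rest2] parse/all line "#"

; ref   is now "this is foo"
; rest2 is now "and here comes a long bar"
```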

Ticknor answered 20/6, 2013 at 19:12
