Tcl - split a string on an arbitrary amount of whitespace into a list

Say I have a string like this:

set str "AAA    B C     DFG 142               56"

Now I want to get a list as follows:

{AAA B C DFG 142 56}

I want to use the split command for that, but then I get extra empty elements ({}) in the result. How can I get the list above?

Danley answered 20/4, 2011 at 10:8 Comment(0)
set text "Some arbitrary text which might include \$ or {"
set wordList [regexp -inline -all -- {\S+} $text]

See this: Splitting a String Into Words.
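
To make the difference from a plain split concrete, here is a small sketch on the string from the question. The variable names naive and words are only illustrative.

```tcl
set str "AAA    B C     DFG 142               56"

# split treats every single space as a separator, so each run of
# spaces produces empty elements between the words.
set naive [split $str " "]

# regexp -inline -all returns every match of \S+ (a run of
# non-whitespace characters) as a list, so no empty elements appear.
set words [regexp -inline -all -- {\S+} $str]

puts [llength $naive]
puts [llength $words]
puts $words
```

The first count is much larger than the second because of the empty elements; the second list holds only the six words.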

Renewal answered 20/4, 2011 at 10:31 Comment(4)
Wow, I didn't know regexp could return a list. I would have done this, which is almost as good: [split [regsub -all { {2,}} $string " "] " "]. The regsub replaces every sequence of two or more spaces with a single space (note that -all is needed, otherwise only the first sequence is replaced), then the split splits on that.Conflict
In set wordList [regexp -inline -all -- {\S+} $text], what does "--" mean?Danley
That's necessary in case the regex begins with a "-" character, plus it's just a good habit to get into (for this plus many Tcl commands where the first non-switch argument can be perhaps user-input: file delete -- $file, switch -exact -- $word, ...)Glendoraglendower
BTW see #3369958 and hume.com/html84/mann/regexp.html for -all and -inline flags.Danley
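
A small sketch of why the "--" matters, assuming a made-up input string and a pattern that happens to start with "-":

```tcl
set text "offsets: -1 20 -22"

# The pattern itself starts with "-", so "--" marks the end of the switches.
set negatives [regexp -inline -all -- {-\d+} $text]
puts $negatives

# Without "--", regexp tries to read the pattern as a switch and raises an error.
set failed [catch {regexp -inline -all {-\d+} $text} msg]
puts "failed=$failed: $msg"
```

The first call returns the two negative numbers; the second is caught as an error because the pattern is taken for an unknown switch.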
You can always do the following:

set str "AAA    B C     DFG 142               56"
set newStr [join $str " "]

This sets newStr to the following string, which is also a valid list of six elements:

AAA B C DFG 142 56
Morgun answered 20/4, 2011 at 13:28 Comment(3)
Why does it work? As far as I know, join creates a string by joining list elements together, but your output is a list and your input is not a list. It is strange...Danley
String, list, it's all the same thing (technically). The output that was given can easily be split into separate list elements, etc. You want your output to be a list though, don't you?Morgun
This works because the input can be parsed as a valid list. Had the input been "AAA {B C DFG 142 56", it would fail because the unmatched open brace can't be parsed as a list.Southward
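
A minimal sketch of that failure mode, assuming an input with an unmatched open brace (the string and variable names here are illustrative):

```tcl
# \{ inside double quotes yields a literal open brace.
set bad "AAA \{B C DFG 142 56"

# join must first parse its argument as a list; the unmatched "{" makes that fail.
set failed [catch {join $bad " "} msg]
puts "join failed: $failed ($msg)"

# The regexp approach never parses the string as a list, so it still works.
set words [regexp -inline -all -- {\S+} $bad]
puts [llength $words]
```

So join is fine for input known to be list-shaped, while regexp is the safer choice for arbitrary text.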
The textutil::split module from tcllib has a splitx proc that does exactly what you want:

package require textutil::split
set result [textutil::split::splitx $str]
Glendoraglendower answered 20/4, 2011 at 10:53 Comment(0)
As of Tcl 8.5, the following also works:

list {*}$str

(provided the string is also a proper list, as in the question). The output is the desired list.

Documentation: list, {*} (syntax)
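
The "proper list" condition matters because {*} applies Tcl's list parsing rather than plain whitespace splitting. A short sketch with a made-up string containing a braced group shows where the two approaches differ:

```tcl
set s "a {b c} d"

# {*} expands $s as a list, so the braced group stays one element.
set asList [list {*}$s]
puts [llength $asList]   ;# 3

# regexp works on raw characters, so "{b" and "c}" are separate words.
set asWords [regexp -inline -all -- {\S+} $s]
puts [llength $asWords]  ;# 4
```

For the string in the question there are no braces or quotes, so both give the same six elements.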

Rumormonger answered 7/7, 2016 at 10:40 Comment(0)
I know this is old, but in case others come across this in the future I'll add my solution. I replaced each run of spaces with a single space so that split works correctly, similar to Scott's answer:

set str "AAA    B C     DFG 142               56"
regsub -all { +} $str " " str ; # str is now "AAA B C DFG 142 56"
set splitout [split $str " "] ; # split takes separator characters, not a regex, so split on the single space
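
One thing worth keeping in mind with this approach: the second argument of split is a set of separator characters, not a regular expression. The sketch below illustrates that and wraps the collapse-then-split idea in a helper; splitWords is a hypothetical name, and the added string trim (to avoid empty elements from leading or trailing whitespace) is my own assumption about what is wanted.

```tcl
# split's second argument is a set of separator characters, not a
# regular expression: { +} means "split on a space or on a plus sign".
set chars [split "1+2 3" { +}]
puts $chars

# Hypothetical helper: trim the ends, collapse whitespace runs,
# then split on a single space.
proc splitWords {s} {
    set collapsed [regsub -all {\s+} [string trim $s] " "]
    return [split $collapsed " "]
}

set result [splitWords "  AAA    B C     DFG 142               56  "]
puts $result
```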
Mireille answered 4/2, 2020 at 20:21 Comment(0)
set map [list {*}$str]

This returns your original str as a list of its substrings, ignoring the whitespace (it requires that str can be parsed as a valid list). You can then get the first item in the list using

set first [lindex $map 0]

Whorl answered 24/7 at 14:53 Comment(0)