I needed a String tokenizer in Haskell, but apparently nothing like it is already defined in the Prelude or other modules. There is splitOn in Data.Text, but that's a pain to use because you need to convert the String to Text first.
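For reference, the Data.Text route mentioned above might look roughly like the sketch below. The helper name splitOnText is made up for illustration, and it assumes the text package is available. Note that T.splitOn raises an error if the delimiter is empty.

```haskell
import qualified Data.Text as T

-- Hypothetical helper (not part of any library): split a String on a
-- delimiter String by converting to Text and back again.
-- Assumes the delimiter is non-empty; T.splitOn errors on "".
splitOnText :: String -> String -> [String]
splitOnText delim = map T.unpack . T.splitOn (T.pack delim) . T.pack
```

With this, splitOnText "," "a,b,,c" gives ["a","b","","c"], so adjacent delimiters produce an empty token in between.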
The tokenizer is not too hard to write, so I wrote one (it doesn't handle multiple adjacent delimiters, but it worked well for what I needed). I feel something like this should already be in the modules somewhere.
This is my version:
tokenizer :: Char -> String -> [String]
tokenizer delim str = tokHelper delim str []

-- Accumulates tokens in reverse order, then reverses once at the end.
tokHelper :: Char -> String -> [String] -> [String]
tokHelper d s acc
  | null pos  = reverse (pre:acc)
  | otherwise = tokHelper d (tail pos) (pre:acc)  -- tail drops the delimiter
  where (pre, pos) = span (/=d) s
I searched the internet for more solutions and found some discussions, like this blog post.
The last comment (by Mahee on June 10, 2011) is particularly interesting. Why not make a more generic version of the words function to handle this? I tried searching for such a function but found none.
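To make the idea concrete, a generalised words could look something like the sketch below. The name wordsWhen is hypothetical, not an existing Prelude function. It takes a predicate that decides which characters count as separators.

```haskell
-- Hypothetical name; same shape as 'words' from the Prelude, but the
-- caller supplies the predicate that identifies separator characters.
wordsWhen :: (Char -> Bool) -> String -> [String]
wordsWhen isSep s =
  case dropWhile isSep s of      -- skip any leading separators
    ""   -> []
    rest -> let (w, rest') = break isSep rest
            in w : wordsWhen isSep rest'
```

For example, wordsWhen (== ',') "a,,b,c" gives ["a","b","c"]. Like words, it collapses runs of separators and never returns empty tokens, which differs from a tokenizer that keeps an empty token between adjacent delimiters. Passing isSpace from Data.Char as the predicate should behave like words itself.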
Is there a simpler way to do this, or is 'tokenizing' a string not a very recurring problem? :)
words. Most other parsing tasks beyond words-level are probably going to be complex enough to be worth doing with things like parser combinators (e.g. parsec) instead. – Honeyman