I'm trying to parse a string that can contain escaped characters. Here's an example:
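import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, anyChar, char, satisfy)
import qualified Data.Text as T

exampleParser :: Parser T.Text
exampleParser = T.pack <$> many (char '\\' *> escaped <|> anyChar)
  where escaped = satisfy (\c -> c `elem` ['\\', '"', '[', ']'])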
The parser above creates a String and then packs it into Text. Is there any way to parse a string with escapes like the one above using the efficient string-handling functions that attoparsec provides, such as string, scan, runScanner, takeWhile, ...?
Parsing something like "one \"two\" \[three\]" would produce one "two" [three].
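To make the expected behaviour concrete, here is a minimal sketch that runs the parser on that input with parseOnly. The name `demo` is only for illustration, and the input is written with T.pack so no language extension is assumed.

```haskell
import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, anyChar, char, parseOnly, satisfy)
import qualified Data.Text as T

-- Same parser as in the question, repeated so the snippet compiles on its own.
exampleParser :: Parser T.Text
exampleParser = T.pack <$> many (char '\\' *> escaped <|> anyChar)
  where escaped = satisfy (\c -> c `elem` ['\\', '"', '[', ']'])

-- Illustrative helper (not part of the question): the Haskell literal below
-- denotes the text  one \"two\" \[three\]
demo :: Either String T.Text
demo = parseOnly exampleParser (T.pack "one \\\"two\\\" \\[three\\]")
-- demo == Right (T.pack "one \"two\" [three]")
```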
Update:
Thanks to @epsilonhalbe I was able to come up with a generalized solution that fits my needs. Note that the following function doesn't look for matching escaped delimiters like [..], "..", (..), etc. Also, if it finds a backslash followed by a character that is not a valid escape, it treats the \ as a literal character.
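import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, char, satisfy)
import qualified Data.Attoparsec.Text as Atto
import Data.Text (Text)
import qualified Data.Text as T

takeEscapedWhile :: (Char -> Bool) -> (Char -> Bool) -> Parser Text
takeEscapedWhile isEscapable while = do
  x  <- normal
  xs <- many escaped
  return $ T.concat (x : xs)
  where
    -- A run of unescaped characters accepted by `while`.
    normal = Atto.takeWhile (\c -> c /= '\\' && while c)
    -- One escape (or a literal backslash) followed by a normal run.
    escaped = do
      x  <- (char '\\' *> satisfy isEscapable) <|> char '\\'
      xs <- normal
      return $ T.cons x xs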
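
For anyone who wants to try it, here is a rough usage sketch. The function is repeated so the snippet compiles on its own; `sampleEscapable`, `untilQuote`, and `invalidEscape` are names made up for this example, and the escapable set is simply the one from the original question.

```haskell
import Control.Applicative (many, (<|>))
import Data.Attoparsec.Text (Parser, char, parseOnly, satisfy)
import qualified Data.Attoparsec.Text as Atto
import Data.Text (Text)
import qualified Data.Text as T

takeEscapedWhile :: (Char -> Bool) -> (Char -> Bool) -> Parser Text
takeEscapedWhile isEscapable while = do
  x  <- normal
  xs <- many escaped
  return $ T.concat (x : xs)
  where
    normal = Atto.takeWhile (\c -> c /= '\\' && while c)
    escaped = do
      x  <- (char '\\' *> satisfy isEscapable) <|> char '\\'
      xs <- normal
      return $ T.cons x xs

-- Illustrative escapable set, taken from the original example.
sampleEscapable :: Char -> Bool
sampleEscapable c = c `elem` ['\\', '"', '[', ']']

-- Stop at the first unescaped double quote; escaped quotes are kept.
-- Input text:  say \"hi\" now" rest
untilQuote :: Either String Text
untilQuote =
  parseOnly (takeEscapedWhile sampleEscapable (/= '"'))
            (T.pack "say \\\"hi\\\" now\" rest")
-- untilQuote == Right (T.pack "say \"hi\" now")

-- An invalid escape keeps its backslash.
-- Input text:  a\xb
invalidEscape :: Either String Text
invalidEscape =
  parseOnly (takeEscapedWhile sampleEscapable (const True)) (T.pack "a\\xb")
-- invalidEscape == Right (T.pack "a\\xb")
```

Note that the `while` predicate is only applied to unescaped characters, so an escaped character is consumed even if `while` would reject it. That is what lets the parser run past \" when stopping at a closing quote.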