How to parse comments with FParsec

I'm attempting to parse Lisp-style comments in an s-expression language with FParsec. I got some help with parsing single-line comments in a previous thread, "How to convert an FParsec parser to parse whitespace".

While that was resolved, I still need to parse multiline comments. Here's the current code:

/// Read whitespace character as a string.
let spaceAsStr = anyOf whitespaceChars |>> fun chr -> string chr

/// Read a line comment.
let lineComment = pchar lineCommentChar >>. restOfLine true

/// Read a multiline comment.
/// TODO: make multiline comments nest.
let multilineComment =
    between
        (pstring openMultilineCommentStr)
        (pstring closeMultilineCommentStr)
        (charsTillString closeMultilineCommentStr true System.Int32.MaxValue)

/// Read whitespace text.
let whitespace =
    lineComment <|>
    multilineComment <|>
    spaceAsStr

/// Skip any white space characters.
let skipWhitespace = skipMany whitespace

/// Skip at least one white space character.
let skipWhitespace1 = skipMany1 whitespace

Unfortunately, the multilineComment parser never succeeds. Since it is built from combinators, I can't set breakpoints inside it to analyze why it fails.

Any ideas why it won't work?

Abram answered 6/12, 2011 at 18:41 Comment(0)

Try changing the bool argument of charsTillString (the skipString flag that follows closeMultilineCommentStr) to false:

(charsTillString closeMultilineCommentStr false System.Int32.MaxValue)

Otherwise charsTillString consumes the closeMultilineCommentStr string itself, leaving nothing for the closing pstring in between to match.
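The difference can be checked with a small script. This is only a sketch: the delimiters "#|" and "|#" are assumed for illustration (the question does not show their definitions), and the names broken, fixedComment and succeeds are hypothetical helpers, not part of the original code.

```fsharp
#r "nuget: FParsec"   // only needed when run as an .fsx script
open FParsec

// Assumed delimiters; the question does not show the real definitions.
let openMultilineCommentStr = "#|"
let closeMultilineCommentStr = "|#"

// skipString = true: charsTillString also consumes the closing delimiter,
// so the closing pstring inside `between` finds nothing left to match.
let broken : Parser<string, unit> =
    between
        (pstring openMultilineCommentStr)
        (pstring closeMultilineCommentStr)
        (charsTillString closeMultilineCommentStr true System.Int32.MaxValue)

// skipString = false: parsing stops just before the closing delimiter,
// which the closing pstring then consumes.
let fixedComment : Parser<string, unit> =
    between
        (pstring openMultilineCommentStr)
        (pstring closeMultilineCommentStr)
        (charsTillString closeMultilineCommentStr false System.Int32.MaxValue)

// Hypothetical helper: true when the parser succeeds on the input.
let succeeds (p: Parser<'a, unit>) (input: string) =
    match run p input with
    | Success _ -> true
    | Failure _ -> false

printfn "%b" (succeeds broken "#| hi |#")        // false
printfn "%b" (succeeds fixedComment "#| hi |#")  // true
```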

To make it work with nested comments:

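// The explicit argument `o` (the char stream) lets the parser refer to itself recursively.
let rec multilineComment o =
    // Read characters up to, but not including, the delimiter x.
    let ign x = charsTillString x false System.Int32.MaxValue
    between
        (pstring openMultilineCommentStr)
        (pstring closeMultilineCommentStr)
        (attempt (ign openMultilineCommentStr >>. multilineComment >>. ign closeMultilineCommentStr) <|>
        ign closeMultilineCommentStr) <| o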
Rugged answered 6/12, 2011 at 19:27 Comment(4)
Ah, wonderful! I was distracted by thinking it would be some deep underlying parser issue, but it turns out it was a boolean I specified thoughtlessly! Thank you! - Abram
Holy-moly that looks difficult! I'm glad I asked! I will be trying to grok that until I can get it :) Thanks again! - Abram
To think F# and FParsec were around since 2011 and I didn't know until recently... anyway, that's a smart way to support nested comment blocks, but most (if not all) languages don't support nested block comments, instead ending the block on the first close comment token. A point to consider for anyone who ends up here by searching. - Cobbett
It seems that this recursive version accepts both of these cases as valid: (* foo (* bar *) and (* foo *) bar *). That is not consistent with most ML languages (I believe the same goes for the Scheme/Lisp-y languages). To make it work, simply drop >>. ign closeMultilineCommentStr in the attempt. - Veterinarian
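For readers who need strictly balanced nesting, one alternative is to build the comment body one step at a time: each step is either a complete nested comment or a single character that does not start a closing delimiter. The following is a minimal sketch of that idea, not the answer author's code. It assumes "#|" and "|#" as delimiters, returns unit rather than the comment text (so it would need adapting to fit the string-returning whitespace parser in the question), and the names nestedComment, commentBody and accepts are hypothetical.

```fsharp
#r "nuget: FParsec"   // only needed when run as an .fsx script
open FParsec

// Assumed delimiters; the question does not show the real definitions.
let openMultilineCommentStr = "#|"
let closeMultilineCommentStr = "|#"

// Forward reference so the comment body can refer to the comment parser itself.
let nestedComment, nestedCommentRef = createParserForwardedToRef<unit, unit> ()

// Body: zero or more steps, each either a whole nested comment or one
// character that is not the start of the closing delimiter.
let commentBody : Parser<unit, unit> =
    skipMany (
        nestedComment
        <|> (notFollowedByString closeMultilineCommentStr >>. skipAnyChar))

nestedCommentRef.Value <-
    between
        (skipString openMultilineCommentStr)
        (skipString closeMultilineCommentStr)
        commentBody

// Hypothetical helper: true when the whole input is exactly one comment.
let accepts (input: string) =
    match run (nestedComment .>> eof) input with
    | Success _ -> true
    | Failure _ -> false

printfn "%b" (accepts "#| foo #| bar |# baz |#")  // true
printfn "%b" (accepts "#| foo #| bar |#")         // false
printfn "%b" (accepts "#| foo |# bar |#")         // false
```

Because every opening delimiter must be matched by its own closing delimiter before the enclosing comment can end, an unterminated inner comment is reported as an error, and a stray closing delimiter is left for whatever parser runs next.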
