PEG.js - how to parse c-style comments?

While implementing a peg.js based parser, I got stuck adding code to handle C-style comments /* like this */.

I need to find the end marker without eating it.

This does not work:

multi = '/*' .* '*/'

The message is:

line: 14
Expected "*/" or any character but end of input found.

I do understand why this is not working, but unfortunately I have no clue how to make comment parsing functional.
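
For readers less familiar with PEG semantics, the failure can be imitated in plain JavaScript. This is only a sketch: matchGreedyComment is a hypothetical helper written for illustration, not part of PEG.js. It mirrors the rule above step by step: `.*` consumes every remaining character and never gives any back, so the closing '*/' finds nothing left to match.

```javascript
// Hand-rolled imitation of the rule: multi = '/*' .* '*/'
// (matchGreedyComment is a hypothetical helper, not a PEG.js API.)
function matchGreedyComment(input) {
  let pos = 0;

  // '/*' : the opening marker must be present
  if (!input.startsWith('/*', pos)) return null;
  pos += 2;

  // .* : consume any character until the end of input.
  // A PEG repetition is greedy and does not backtrack.
  while (pos < input.length) pos += 1;

  // '*/' : we are already at the end of input, so this always fails
  if (!input.startsWith('*/', pos)) return null;
  return input.slice(0, pos + 2);
}

console.log(matchGreedyComment('/* like this */')); // null
```

The match fails even on a well-formed comment, which corresponds to the "end of input found" message.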

Here's the code so far:

start = item*

item = comment / content_line

content_line = _ p:content _ {return ['CONTENT',p]}

content = 'some' / 'legal' / 'values'

comment = _ p:(single / multi) {return ['COMMENT',p]}

single = '//' p:([^\n]*) {return p.join('')}

multi = 'TODO'


_ = [ \t\r\n]* {return null}

And some sample input:

// line comment, no problems here

/*
  how to parse this ??
*/

values

// another comment

some legal
Nonunion answered 24/10, 2014 at 21:30

Use a negative lookahead predicate that checks, before each character is matched, that "*/" does not come next in the character stream:

comment
 = "/*" (!"*/" .)* "*/"

The (!"*/" .) part could be read as follows: when there's no '*/' ahead, match any character.

This will therefore successfully match comments like this: /* ... **/
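
The same idea can be sketched in plain JavaScript to show what the predicate does at each position. This is a minimal illustration under stated assumptions, not how PEG.js generates its parsers: matchComment and its return shape ({ text, end }) are hypothetical names chosen for this example.

```javascript
// Hand-rolled equivalent of the rule: comment = "/*" (!"*/" .)* "*/"
// (matchComment is a hypothetical helper, not a PEG.js API.)
function matchComment(input, start = 0) {
  let pos = start;

  // "/*" : the opening marker must be present
  if (!input.startsWith('/*', pos)) return null;
  pos += 2;

  // (!"*/" .)* : while "*/" is NOT next, consume exactly one character
  while (pos < input.length && !input.startsWith('*/', pos)) pos += 1;

  // "*/" : the loop stopped right before the end marker without eating it
  if (!input.startsWith('*/', pos)) return null; // unterminated comment

  return { text: input.slice(start + 2, pos), end: pos + 2 };
}

console.log(matchComment('/* ... **/'));      // { text: ' ... *', end: 10 }
console.log(matchComment('/* never closed')); // null
```

Because the lookahead is checked before every character, the loop stops at the first "*/" it sees, so a stray "/*" inside the comment is kept as ordinary text and comments do not nest.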

Carpentaria answered 24/10, 2014 at 21:53

Comment from Nonunion: Working! Thanks a lot. I'll post the complete code.

Complete code:

Parser:

start = item*

item = comment / content_line

content_line = _ p:content _ {return ['CONTENT',p]}

content = 'all' / 'legal' / 'values' / 'Thanks!'

comment = _ p:(single / multi) {return ['COMMENT',p]}

single = '//' p:([^\n]*) {return p.join('')}

multi = "/*" inner:(!"*/" i:. {return i})* "*/" {return inner.join('')}

_ = [ \t\r\n]* {return null}

Sample:

all  

// a comment

values

// another comment

legal

/*12
345 /* 
*/

Thanks!

Result:

[
    ["CONTENT","all"],
    ["COMMENT"," a comment"],
    ["CONTENT","values"],
    ["COMMENT"," another comment"],
    ["CONTENT","legal"],
    ["COMMENT","12\n345 /* \n"],
    ["CONTENT","Thanks!"]
]
Nonunion answered 24/10, 2014 at 22:33
