I'm trying to wrap my head around PEG by entering simple grammars into the PEG.js playground.
Example 1:
- Input:
"abcdef1234567ghijklmn8901opqrs"
- Desired output:
["abcdef", "1234567", "ghijklmn", "8901", "opqrs"]
- Actual output:
["abcdef", ["1234567", ["ghijklmn", ["8901", ["opqrs", ""]]]]]
This example pretty much works, but can I get PEG.js to not nest the resulting array to a million levels? I assume the trick is to use concat() instead of join() somewhere, but I can't find the spot.
start
  = Text

Text
  = Numbers Text
  / Characters Text
  / EOF

Numbers
  = numbers: [0-9]+ {return numbers.join("")}

Characters
  = text: [a-z]+ {return text.join("")}

EOF
  = !.
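
For reference, the nesting appears to come from the recursive sequence `Numbers Text` (and `Characters Text`), which returns a two-element array of the form [head, tail]. The sketch below is plain JavaScript, not PEG.js API: `nested` and `flat` are hypothetical stand-ins for what the rules return, and the labelled actions quoted in the comments are one possible fix, assuming the grammar is otherwise unchanged.

```javascript
// Stand-ins for rule results; none of this is PEG.js API.

// Original grammar: the sequence `Numbers Text` yields the pair [head, tail],
// and EOF (`!.`) shows up as "" in the output quoted above.
function nested(tokens) {
  if (tokens.length === 0) return "";
  return [tokens[0], nested(tokens.slice(1))];
}

// Possible fix (an assumption, not the original grammar):
//   Text
//     = head:Numbers tail:Text    { return [head].concat(tail); }
//     / head:Characters tail:Text { return [head].concat(tail); }
//     / EOF                       { return []; }
function flat(tokens) {
  if (tokens.length === 0) return [];
  return [tokens[0]].concat(flat(tokens.slice(1)));
}

const tokens = ["abcdef", "1234567", "ghijklmn", "8901", "opqrs"];
console.log(JSON.stringify(nested(tokens)));
console.log(JSON.stringify(flat(tokens)));
```

The key point is that concat() belongs in an action on the Text alternatives, not in Numbers or Characters, and that the EOF alternative has to return an empty array so the last concat() has something to append.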
Example 2:
Same problem and code as Example 1, but change the Characters rule to the following, which I expected would produce the same result.
Characters
  = text: (!Numbers .)+ {return text.join("")}
The resulting output is:
[",a,b,c,d,e,f", ["1234567", [",g,h,i,j,k,l,m,n", ["8901", [",o,p,q,r,s", ""]]]]]
Why do I get all these empty matches?
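
A likely explanation, sketched in plain JavaScript: each iteration of `(!Numbers .)` is a two-element sequence, so `text` is an array of pairs rather than an array of characters. The sketch assumes the successful `!Numbers` predicate contributes an empty string (some PEG.js versions may return undefined instead, which join() treats the same way); `matches` is a hypothetical stand-in for the value bound to `text`.

```javascript
// Stand-in for `text` after matching "abc" with (!Numbers .)+ :
// one [predicateResult, character] pair per iteration.
const matches = ["a", "b", "c"].map((ch) => ["", ch]);

// join("") stringifies each inner pair with the default "," separator,
// so ["", "a"] becomes ",a" before the pieces are glued together.
const joined = matches.join("");
console.log(joined); // ",a,b,c"

// Keeping only the character from each pair, which is what an inner action
// such as (!Numbers ch:. { return ch; })+ would do, gives the plain string.
const fixed = matches.map((pair) => pair[1]).join("");
console.log(fixed); // "abc"
```

So the commas are not extra matches; they are the separators join() inserts when it converts each inner pair to a string.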
Example 3:
Last question. This doesn't work at all. How can I make it work? And for bonus points, any pointers on efficiency? For example, should I avoid recursion if possible?
I'd also appreciate a link to a good PEG tutorial. I've read (http://www.codeproject.com/KB/recipes/grammar_support_1.aspx), but as you can see I need more help ...
- Input:
'abcdefghijklmnop"qrstuvwxyz"abcdefg'
- Desired output:
["abcdefghijklmnop", "qrstuvwxyz", "abcdefg"]
- Actual output:
"abcdefghijklmnop\"qrstuvwxyz\"abcdefg"
start
  = Words

Words
  = Quote
  / Text
  / EOF

Quote
  = quote: ('"' .* '"') Words {return quote.join("")}

Text
  = text: (!Quote . Words) {return text.join("")}

EOF
  = !.
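
One thing to note about this grammar: PEG repetition is greedy and does not backtrack, so `.*` consumes the rest of the input and the closing '"' can never match. The sketch below is a hand-written JavaScript equivalent of a non-recursive alternative; the grammar quoted in its comments and the `splitWords` helper are assumptions for illustration, not the original code and not PEG.js API.

```javascript
// Hand-written equivalent of a non-recursive grammar (an assumption):
//   Words = (Quote / Text)*
//   Quote = '"' chars:[^"]* '"' { return chars.join(""); }
//   Text  = chars:[^"]+         { return chars.join(""); }
function splitWords(input) {
  const parts = [];
  let pos = 0;
  while (pos < input.length) {
    if (input[pos] === '"') {
      // Quote: stop at the next quote instead of using a greedy `.*`.
      const end = input.indexOf('"', pos + 1);
      if (end === -1) throw new Error("unterminated quote at offset " + pos);
      parts.push(input.slice(pos + 1, end));
      pos = end + 1;
    } else {
      // Text: take everything up to the next quote or the end of input.
      let end = input.indexOf('"', pos);
      if (end === -1) end = input.length;
      parts.push(input.slice(pos, end));
      pos = end;
    }
  }
  return parts;
}

const words = splitWords('abcdefghijklmnop"qrstuvwxyz"abcdefg');
console.log(JSON.stringify(words));
```

The loop plays the role of the `*` repetition, which also avoids one level of recursion per token; whether that matters for performance in PEG.js would need measuring.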
text() which references the complete text string that has been matched, as in: digits = [0-9]+ { return text() }
– Albertson