How best to parse a comma-separated list in a PEG grammar

I'm trying to parse a comma separated list. To simplify, I'm just using digits. These expressions would be valid:

(1, 4, 3)

()

(4)

I can think of two ways to do this and I'm wondering why exactly the failed example does not work. I believe it is a correct BNF, but I can't get it to work as PEG. Can anyone explain why exactly? I'm trying to get a better understanding of the PEG parsing logic.

I'm testing using the online browser parser generator here: https://pegjs.org/online

This does not work:

list = '(' some_digits? ')'
some_digits = digit / ', ' some_digits
digit = [0-9]

(Actually, the grammar itself compiles fine and accepts () or (1), but it doesn't recognize (1, 2).)

But this does work:

list = '(' some_digits? ')'
some_digits = digit another_digit*
another_digit = ', ' digit
digit = [0-9]

Why is that? (Grammar novice here)

Gisela answered 12/6, 2019 at 2:10 Comment(0)

Cool question. After digging around in their docs for a second, I found that the / operator means:

Try to match the first expression, if it does not succeed, try the second one, etc. Return the match result of the first successfully matched expression. If no expression matches, consider the match failed.
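To make that rule concrete, here is a minimal sketch of ordered choice in Python. It is not how PEG.js is implemented; the names literal and choice are made up for illustration, and a "parser" here is assumed to be a function that returns the new position or None on failure.

```python
from typing import Callable, Optional

# Assumption for this sketch: a parser takes (text, pos) and returns the
# position after the match, or None if it does not match at pos.
Parser = Callable[[str, int], Optional[int]]


def literal(s: str) -> Parser:
    """Match the exact string s at the current position."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def choice(*alternatives: Parser) -> Parser:
    """PEG-style '/': try alternatives in order, keep the first success."""
    def parse(text: str, pos: int) -> Optional[int]:
        for alt in alternatives:
            end = alt(text, pos)
            if end is not None:
                return end  # later alternatives are never tried
        return None
    return parse


# Same two alternatives, different order, different result on "ab".
short_first = choice(literal("a"), literal("ab"))
long_first = choice(literal("ab"), literal("a"))

print(short_first("ab", 0))  # 1: matched only "a"
print(long_first("ab", 0))   # 2: matched "ab"
```

The point is that the choice commits to the first alternative that succeeds at that position, even if a later alternative would have consumed more input.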

So this led me to the solution:

list = '(' some_digits? ')'
some_digits = digit ', ' some_digits / digit
digit = [0-9]

The reason this works:

input: (1, 4)

  • eats '('
  • tries the optional some_digits
  • some_digits, first alternative:
    • eats '1'
    • eats ', '
    • some_digits (recursive call), first alternative:
      • eats '4'
      • fails to eat ', ', so it backtracks to before '4'
    • some_digits (recursive call), second alternative:
      • eats '4'
      • succeeds
    • succeeds
  • eats ')'
  • succeeds

If you reverse the order of the some_digits alternatives, the first number it comes across gets eaten by digit, the choice succeeds right away, and no recursion occurs. The parser then throws an error because it expects ')' but finds ',' instead.
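Here is a rough hand-written recognizer in Python that mimics both orderings of some_digits, so you can see the difference on (1, 4). It is only a sketch of the PEG behaviour, not generated by PEG.js; the recursive_first flag and the function names are my own invention.

```python
from typing import Optional


def digit(text: str, pos: int) -> Optional[int]:
    # digit = [0-9]
    if pos < len(text) and text[pos] in "0123456789":
        return pos + 1
    return None


def some_digits(text: str, pos: int, recursive_first: bool) -> Optional[int]:
    def recursive(p: int) -> Optional[int]:
        # digit ', ' some_digits
        after_digit = digit(text, p)
        if after_digit is None or not text.startswith(", ", after_digit):
            return None  # fail; the caller retries from the original pos
        return some_digits(text, after_digit + 2, recursive_first)

    def single(p: int) -> Optional[int]:
        # digit
        return digit(text, p)

    # Ordered choice: the order of this tuple is the order around '/'.
    alternatives = (recursive, single) if recursive_first else (single, recursive)
    for alt in alternatives:
        end = alt(pos)
        if end is not None:
            return end  # commit to the first alternative that matches
    return None


def parse_list(text: str, recursive_first: bool = True) -> bool:
    # list = '(' some_digits? ')'
    if not text.startswith("("):
        return False
    pos = 1
    end = some_digits(text, pos, recursive_first)
    if end is not None:  # '?' means a failed some_digits consumes nothing
        pos = end
    # ')' must be next and must be the last character of the input
    return text.startswith(")", pos) and pos + 1 == len(text)


for sample in ["(1, 4, 3)", "()", "(4)", "(1, 4)"]:
    print(sample, parse_list(sample, True), parse_list(sample, False))
```

With recursive_first=False the single-digit alternative wins immediately on (1, 4), so the list rule is left looking for ')' while the next character is ','.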

Fronnia answered 12/6, 2019 at 2:30 Comment(3)
On a side note, for line two this works as well: some_digits = digit (', ' some_digits)? - Fronnia
@LeonStarr Oh, did you want me to answer the second part of your post? Sorry, I forgot. - Fronnia
No need James, your answer clarified the PEG parsing logic for me (which was my real problem). But thanks for asking. - Gisela

In one line:

list = '(' (digit (', ' digit)*)? ')'

It depends on what you want to do with the values and on the PEG implementation, but extracting them might be easier this way, since the repetition gives you a flat sequence instead of nested recursive matches.
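As a rough illustration of why the repetition form is convenient for extraction, here is a hand-written Python equivalent that collects the digits in a loop. This is a sketch and not PEG.js output; the function name parse_digit_list is made up, and I'm assuming the empty list () should be accepted, as in the question's examples.

```python
from typing import List

DIGITS = "0123456789"


def parse_digit_list(text: str) -> List[int]:
    """Hand-written equivalent of: '(' (digit (', ' digit)*)? ')'"""
    if not text.startswith("("):
        raise ValueError("expected '(' at position 0")
    pos = 1
    values: List[int] = []

    # Optional first digit
    if pos < len(text) and text[pos] in DIGITS:
        values.append(int(text[pos]))
        pos += 1
        # (', ' digit)* : each iteration appends one more value
        while (
            text.startswith(", ", pos)
            and pos + 2 < len(text)
            and text[pos + 2] in DIGITS
        ):
            values.append(int(text[pos + 2]))
            pos += 3

    # The closing ')' must be the only thing left
    if text[pos:] != ")":
        raise ValueError(f"expected ')' at position {pos}")
    return values


print(parse_digit_list("(1, 4, 3)"))
print(parse_digit_list("()"))
print(parse_digit_list("(4)"))
```

The loop maps directly onto the * in the rule, so the result is already a flat list and nothing needs to be flattened afterwards.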

Tag answered 15/2, 2022 at 0:19 Comment(0)
