POSIX sh EBNF grammar

4

12

Is there an existing POSIX sh grammar available or do I have to figure it out from the specification directly?

Note I'm not so much interested in a pure sh; an extended but conformant sh is also more than fine for my purposes.

Weymouth answered 24/3, 2013 at 13:11 Comment(0)
8

I have done some more digging and found these resources:

  1. An sh tutorial located here

  2. A Bash book containing Bash 2.0's BNF grammar (gone from here) with the relevant appendix still here

I have looked through the sources of bash, pdksh, and posh but haven't found anything remotely at the level of abstraction I need.

Weymouth answered 24/3, 2013 at 14:48 Comment(1)
@ceving found another copy and saved it to the Wayback Machine, which should improve the link's longevity. - Weymouth
8

The POSIX standard defines the grammar for the POSIX shell. The definition includes an annotated Yacc grammar. As such, it can be converted to EBNF more or less mechanically.
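To give an idea of what "more or less mechanically" could look like, here is a minimal Python sketch that rewrites one Yacc rule (with any semantic actions already removed) into ISO-style EBNF notation. The function name `yacc_rule_to_ebnf` and the sample rule are my own illustration, written in the style of the grammar's `pipe_sequence` rule. The sketch only changes the notation; it keeps left recursion as-is rather than turning it into EBNF repetition.

```python
import re

# Tokens: a quoted literal such as '|', a bare alternation bar, or a symbol name.
TOKEN = re.compile(r"'[^']*'|\||[^\s|']+")


def yacc_rule_to_ebnf(rule: str) -> str:
    """Rewrite one action-free Yacc rule as ISO-style EBNF text (hypothetical helper)."""
    head, _, body = rule.strip().rstrip(";").partition(":")
    alternatives = [[]]
    for tok in TOKEN.findall(body):
        if tok == "|":
            # A bare bar starts a new alternative; a quoted '|' stays a terminal.
            alternatives.append([])
        else:
            alternatives[-1].append(tok)
    # EBNF uses '=' for definition and ',' for concatenation.
    rhs = " | ".join(" , ".join(alt) for alt in alternatives)
    return f"{head.strip()} = {rhs} ;"


example = "pipe_sequence : command | pipe_sequence '|' linebreak command ;"
print(yacc_rule_to_ebnf(example))
# pipe_sequence = command | pipe_sequence , '|' , linebreak , command ;
```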

If you want a 'real' grammar, then you have to look harder. Choose your 'real shell' and find the source and work out what the grammar is from that.

Note that EBNF is not used widely. It is of limited practical value, not least because there are essentially no tools that support it. Therefore, you are unlikely to find an EBNF grammar (of almost anything) off-the-shelf.

Holmann answered 24/3, 2013 at 15:18 Comment(1)
BNF is used easily: Grako derives ASTs from BNF grammars (bash) in Python, etc. - Malcolm
3

I've made several attempts at writing my own full-blown Bash interpreter over the past year, and at some point I also found the book appendix referenced in the marked answer (#2). It is not completely correct or up to date: for example, it defines no production rules using the 'coproc' reserved word, and it has a duplicate production rule for a redirection using '<&'. There may be more problems, but those are the ones I noticed.

  1. The best way I've found is to go to http://ftp.gnu.org/gnu/bash/
  2. Download the sources of the current bash version
  3. Open the parse.y file (the Yacc file that contains essentially all of the parsing logic bash uses) and copy the lines between the '%%' markers into your favorite text editor; those lines define the grammar's production rules
  4. Then, using a little bit of regex (which I'm terrible at, by the way), delete the extra code between '{...}' to make the grammar look more BNF-like.
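If you'd rather script step 3 than copy and paste, something like the following Python sketch should do. It assumes the usual Yacc file layout (declarations, then '%%', then rules, then an optional second '%%' and trailing C code). The function name `rules_section` and the tiny sample input are made up for illustration; they are not taken from bash's parse.y.

```python
import re


def rules_section(yacc_source: str) -> str:
    """Return the text between the '%%' marker lines of a Yacc file (hypothetical helper)."""
    # Split on lines that consist only of '%%' (optionally followed by blanks).
    parts = re.split(r"^%%[ \t]*$", yacc_source, flags=re.MULTILINE)
    if len(parts) < 2:
        raise ValueError("no '%%' marker found; is this a Yacc file?")
    # parts[0] is the declarations, parts[1] the rules, parts[2] (if any) user code.
    return parts[1]


# Made-up miniature Yacc file, only to show the call.
sample = "%token WORD\n%%\nlist : WORD ;\n%%\nint main(void) { return 0; }\n"
print(rules_section(sample).strip())
# list : WORD ;
```

With the real file you would read parse.y from disk and pass its contents to the function.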

The regex I used for step 4 was:

(\{(\s+.*?)+\})\s+([;|])

It non-greedily matches any text (.*?), including spaces and newlines (\s+), between curly braces, ending at the last closing brace before a ; or | character. I then replaced each match with \3 (i.e. the third capturing group, which is either ; or |).
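The regex depends on how the whitespace around the braces is laid out, so as an alternative (not what I actually used) here is a rough Python sketch that strips the actions by counting brace depth instead. It skips quoted literals such as '{' so that braces appearing as grammar tokens survive. The function name `strip_actions` and the sample rules (including `make_pipe` and `make_group`) are invented for illustration. One known limitation: an apostrophe inside a C comment in an action would confuse the quote handling.

```python
def strip_actions(rules: str) -> str:
    """Remove { ... } semantic actions from Yacc rules text (hypothetical helper)."""
    out = []
    depth = 0
    i, n = 0, len(rules)
    while i < n:
        ch = rules[i]
        if ch in "'\"":
            # Skip a quoted literal whole so '{' or "}" is not counted as an action brace.
            j = i + 1
            while j < n and rules[j] != ch:
                if rules[j] == "\\":
                    j += 1  # step over the escaped character
                j += 1
            if depth == 0:
                out.append(rules[i:j + 1])  # keep literals that belong to the grammar
            i = j + 1
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif depth == 0:
            out.append(ch)  # keep only text outside of actions
        i += 1
    return "".join(out)


# Made-up rules, only to show the call.
sample = """
pipeline : pipeline '|' command
            { $$ = make_pipe($1, $3); }
         | command
            { $$ = $1; }
         ;
brace_group : '{' list '}'
            { $$ = make_group($2); }
         ;
"""
print(" ".join(strip_actions(sample).split()))
# pipeline : pipeline '|' command | command ; brace_group : '{' list '}' ;
```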

Here's the grammar definition that I managed to extract at the time of posting: https://pastebin.com/qpsK4TF6

Microtome answered 6/5, 2019 at 1:39 Comment(0)
1

I'd expect that sh, csh, ash, and bash would all contain parsers. GNU versions of these are open source; you might just go check there.

Midpoint answered 24/3, 2013 at 13:47 Comment(1)
Not pure EBNF, but Yacc's variation on it. You can find the grammar rules if you look. Yes, they are buried in among the rest of the Yacc/Lex definition. Welcome to real grammar definitions for working tools. - Midpoint
