Is Rust's syntactical grammar context-free or context-sensitive?
Asked Answered
B

2

7

The syntactical grammar of hardly any programming language is regular, as they allow arbitrarily deeply nested parenthesis. Rust does, too:

let x = ((((()))));

But is Rust's syntactical grammar at least context-free? If not, what element makes the grammar context-sensitive? Or is the grammar even recursively enumerable, like C++'s syntactical grammar?


Related: Is Rust's lexical grammar regular, context-free or context-sensitive?

Bonny answered 28/4, 2017 at 10:13 Comment(3)
Given that an Antlr grammar exists for Rust and that Antlr is restricted to context-free grammars, it looks like Rust's grammar has to be context-free.Stenotypy
@peter (1) as ira baxter says in your link, ANTLR allows arbitrary predicates in a production, so it most certainly can recognize some non-CF grammars. (2) The fact that someone writes a file with the name rust.g4 is not prima facie evidence that its contents actually correspond to the language commonly known as rust. Particularly when the file is 4 years old, and the language is infamous for its repeated syntax changes.Cyrstalcyrus
Except for the license, this project has not been updated in 4 years. Rust changed a lot before 1.0, I don't think this is a good enough argument :]Decode
C
7

Rust includes a macro processor, whose operation is highly context-sensitive.

You could attempt to skate around this issue by only doing syntactic analysis up to but not including macro expansion -- possible, but not particularly useful -- or by assuming that the macro expansion is done by some intermediate tool which is given a free pass to allow it to be Turing complete.

But I'm inclined to say that it simply means that the Rust language is recursively enumerable.

There are a number of restrictions on the validity of macro definitions which probably make the language (at least) context-sensitive, even if you settle for not performing the macro expansions as part of syntactic analysis.

This doesn't mean that a context-free grammar cannot be useful as part of the syntactic analysis of Rust. It's probably essential, and it could even be useful to use a parser generator such as bison or Antlr (and examples of both exist). Like most programming languages, there is a simple superset of Rust which is context-free, and which can be usefully analysed with context-free grammar tools; however, in the end there are texts which will need to be rejected at compile-time as invalid even though they are part of the CF superset.

Cyrstalcyrus answered 29/4, 2017 at 7:51 Comment(0)
H
5

Answer straight from the source code for Rust:

Rust's lexical grammar is not context-free. Raw string literals are the source of the problem. Informally, a raw string literal is an r, followed by N hashes (where N can be zero), a quote, any characters, then a quote followed by N hashes. Critically, once inside the first pair of quotes, another quote cannot be followed by N consecutive hashes. e.g. r###""###"### is invalid.

Hubby answered 30/6, 2019 at 4:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.