No programming language is (completely) context-free (I would say including CSS), even though context-free grammars (CFGs) may be used to define the language and to generate compilers/parsers for it.
The simple fact (for example) that variables need to be defined before they are used, or that declarations involving identifiers should be unique, makes the language "context-sensitive".
A grammar for a (programming) language is supposed to describe (and generate) only those strings which are valid programs in that language (syntactically, but also semantically). Yet a CFG can describe and generate strings which are not valid programs (given the language semantics and specification). Conditions which describe valid programs (for example: 1. a class needs to be defined before using new class(), 2. ids must match, etc.) require context-sensitivity.
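To make the "defined before used" condition concrete, here is a minimal sketch of the kind of check that sits outside the CFG. The flat statement format (objects with kind and name) and the function name findUndeclaredUses are assumptions made up for this illustration, not part of any real parser; a real compiler would run something similar over its syntax tree, with scoping.

```javascript
// Hypothetical flat "program": each statement either declares or uses a name.
// This stands in for whatever a real parser would hand to a semantic pass.
function findUndeclaredUses(statements) {
  const declared = new Set(); // the "context": names declared so far
  const errors = [];
  for (const stmt of statements) {
    if (stmt.kind === "declare") {
      declared.add(stmt.name);
    } else if (stmt.kind === "use" && !declared.has(stmt.name)) {
      // The use is syntactically fine, but invalid given what came before it.
      errors.push(stmt.name);
    }
  }
  return errors;
}

const program = [
  { kind: "declare", name: "x" },
  { kind: "use", name: "x" },
  { kind: "use", name: "y" }, // used but never declared
];
console.log(findUndeclaredUses(program)); // reports y as undeclared
```

The point of the sketch is that the set of declared names is state carried along the input, which is exactly the context a CFG production cannot see.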
No CFG (with any finite number of productions) can generate exactly the valid strings of this language: { a^n b^n c^n : n >= 1 }, where n should be the same for a, b, c (it should match). Note that one can indeed define a CFG for a superset of this language, but it will accept non-valid strings along with the valid ones, which then have to be filtered out by other means. This is not what a grammar specification for a language is supposed to do: it should accept only the valid strings and reject the non-valid ones. In an analogy with statistics, one could say that a grammar specification for a language should eliminate/minimise both Type-I (reject valid strings) and Type-II (accept non-valid strings) errors, not just one of them.
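The "superset plus filter" approach mentioned above can be sketched as follows. The function name isAnBnCn is made up for this example. The regular expression accepts the superset a+b+c+ (which is regular, hence also context-free), and the length comparison is the extra, non-context-free step that filters out the strings where the counts do not match.

```javascript
// Recognise { a^n b^n c^n : n >= 1 } in two steps.
function isAnBnCn(s) {
  // Step 1: the superset a+ b+ c+ (any positive counts, in the right order).
  const m = /^(a+)(b+)(c+)$/.exec(s);
  if (m === null) return false;
  // Step 2: the context-sensitive condition, all three counts must be equal.
  return m[1].length === m[2].length && m[2].length === m[3].length;
}

console.log(isAnBnCn("aabbcc")); // true
console.log(isAnBnCn("aabbc"));  // false: passes step 1, fails step 2
console.log(isAnBnCn("abcabc")); // false: fails step 1
```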
Let me give a simple example in the context of JavaScript (since variables may seem to pose no problem for JavaScript).
In JavaScript, a duplicate named function declaration is not valid in some contexts (for instance inside a block in strict mode, or at the top level of a module). So this is not valid there:
function duplicateFunc(){}
function duplicateFunc(){} // duplicate named function declaration
So the program is not correct, yet a CFG cannot handle this type of condition.
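A rough sketch of the extra check this needs, done outside the grammar: collect the declared function names and flag any name seen twice. The function findDuplicateFunctionNames is hypothetical, and the regular expression is a deliberately crude stand-in for a real parser (it ignores scopes, and it would also match text inside strings or comments), so treat it as an illustration of the bookkeeping only.

```javascript
// Crude scan for "function <name>(" and report names declared more than once.
function findDuplicateFunctionNames(source) {
  const seen = new Set();       // names declared so far (the "context")
  const duplicates = new Set(); // names declared at least twice
  const pattern = /\bfunction\s+([A-Za-z_$][\w$]*)\s*\(/g;
  for (const match of source.matchAll(pattern)) {
    const name = match[1];
    if (seen.has(name)) duplicates.add(name);
    seen.add(name);
  }
  return [...duplicates];
}

const code = "function duplicateFunc(){}\nfunction duplicateFunc(){}";
console.log(findDuplicateFunctionNames(code)); // reports duplicateFunc
```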
Even turning on strict mode itself is context-sensitive, although a subset of strict mode rules can be handled by splitting the CFG into cases and parsing accordingly, as per @Bergi's answer (strict mode examples removed).
[UPDATE]
I will try to give a couple of examples of non-context-free JavaScript code which do not require "strict mode" (open to suggestions/corrections).
The use of reserved words/keywords is an extension (or limitation) on the grammar. It is an extraneous feature, so the following examples should count as examples of non-CF behaviour.
var var; // identifier using reserved name
var function; // identifier using reserved name
obj.var; // reserved name used as (explicit) property
obj["var"]; // this is fine!!
Object++; // built-in type used as numeric variable
[/UPDATE]
So the context plays a part in the correct parsing of the program. As the saying goes, "context is everything"!
However, this context-sensitivity can (hopefully) be handled by only slight extensions to context-free grammars (for example attribute grammars, affix grammars, tree-adjoining grammars (TAGs) and so on), which still allow efficient parsing (meaning in polynomial time).
[UPDATE]
"I would say including CSS"
To elaborate a little on this statement: CSS1 would be CF, but as the CSS specification adds more features, including variable support (e.g. css-counters), it makes CSS code context-sensitive in the sense described above (e.g. variables need to be defined before they are used). So the following CSS code would be parsed by the browser (and ignored as it is not valid), but it cannot be described by a CFG:
body { }
h3::before {
counter-increment: section; /* no counter section has been defined, not valid css code */
content: "Section" counter(section) ": "; /* Display the counter */
}
[/UPDATE]