If a language has an LR(k)
grammar, then it has an LR(1)
grammar which can be generated mechanically from the LR(k)
grammar; furthermore, the original parse tree can be recreated from the LR(1)
parse tree. A proof of this theorem appears in Section 6.7 of Parsing Theory Vol. II, Sippu & Soisalon-Soininen; they attribute it to "On the complete covering problem for LR(k) grammars" (1976) by MD Mickunas, in JACM 23:17-30.
So there's no language parseable as LR(k)
for k>1
which is not also parseable as LR(1)
, and consequently there really isn't such as thing as an LR(k)
language for values of k
other than 0 and 1.
However, there are languages for which an LR(2)
grammar is much easier. One classic example is the Posix grammar for yacc
, which does not require productions to be terminated with a ;
. This is unambiguous, because a production must start with an identifier followed by a colon. Written that way, the grammar is clearly LR(2)
; by the above theorem, an LR(1)
grammar exists but it's nowhere near as clean.