C#, ANTLR, ECMAScript grammar troubles
I'm trying to parse JavaScript (ECMAScript) with C#.

I found the following instruction on how to create new project: http://www.antlr.org/wiki/pages/viewpage.action?pageId=557075

So I've downloaded ANTLRWorks, ANTLR v3, unpacked ANTLR, created a VS2010 project (.NET4), added references, checked and generated the grammar.

Then I received a lot of compilation errors:

The type or namespace name 'AstParserRuleReturnScope' could not be found (are you missing a using directive or an assembly reference?)

The type or namespace name 'GrammarRule' could not be found (are you missing a using directive or an assembly reference?)

I searched Stack Overflow for them and found a solution: antlr c# errors when integrating into VS2008

So I downloaded the new runtime, overwrote the old one, recompiled the project, and got

The name 'HIDDEN' does not exist in the current context d:\Workspace.1\ScriptParser\ScriptParser\TestLexer.cs

OK, I changed HIDDEN to Hidden, as recommended in the following conversation: [antlr-interest] How viable is the Csharp3 target? (more specific questions)

Now I'm trying to parse the input. I found a few examples and wrote the following code:

using Antlr.Runtime;
namespace ScriptParser
{
    class Program
    {
        static void Main(string[] args)
        {
            var stream = new ANTLRStringStream("1+2");
            var lexer = new TestLexer(stream);
            var tokenStream = new CommonTokenStream(lexer);
            var parser = new TestParser(tokenStream);
            // what exactly should be here???
        }
    }
}

My goal is to parse a JavaScript file with ANTLR, but it seems that it will not be as easy as I thought...

Update:

As suggested in "Why are antlr3 c# parser methods private?", I modified the Test.g grammar by adding the "public" modifier before the expr rule:

public expr : mexpr (PLUS^ mexpr)* SEMI! 
; 

and then regenerated the code, replaced HIDDEN with Hidden (again), and modified the code as follows:

var stream = new ANTLRStringStream("1+2");
var lexer = new TestLexer(stream);
var tokenStream = new CommonTokenStream(lexer);
var parser = new TestParser(tokenStream);
var result = parser.expr();
var tree = (CommonTree)result.Tree;

And now it crashes on the line

root_0 = (object)adaptor.Nil(); 

in the following generated code

try { DebugEnterRule(GrammarFileName, "expr");
DebugLocation(7, 0);
try
{
    // d:\\Workspace.1\\ScriptParser\\ScriptParser\\Test.g:7:13: ( mexpr ( PLUS ^ mexpr )* SEMI !)
    DebugEnterAlt(1);
    // d:\\Workspace.1\\ScriptParser\\ScriptParser\\Test.g:7:15: mexpr ( PLUS ^ mexpr )* SEMI !
    {
    root_0 = (object)adaptor.Nil(); 

    DebugLocation(7, 15);
    PushFollow(Follow._mexpr_in_expr31);

with a NullReferenceException, because adaptor is null.

I've resolved it by adding

parser.TreeAdaptor = new CommonTreeAdaptor();
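
For reference, the steps above can be collected into one program. This is only a sketch: it assumes the TestLexer and TestParser classes generated from the Test.g grammar in this question (with expr made public), and that CommonTreeAdaptor and CommonTree come from the Antlr.Runtime.Tree namespace of the updated runtime. The trailing semicolon in the input is an assumption based on the expr rule ending with SEMI; whether the rest of the grammar accepts the input is not verified here.

```csharp
using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;

namespace ScriptParser
{
    class Program
    {
        static void Main(string[] args)
        {
            // Assumption: the expr rule ends with SEMI, so the input carries a trailing semicolon.
            var stream = new ANTLRStringStream("1+2;");
            var lexer = new TestLexer(stream);
            var tokenStream = new CommonTokenStream(lexer);
            var parser = new TestParser(tokenStream);

            // Set the adaptor before calling any rule; otherwise adaptor.Nil()
            // in the generated code throws a NullReferenceException.
            parser.TreeAdaptor = new CommonTreeAdaptor();

            // expr() is callable here only because the rule was marked public in Test.g.
            var result = parser.expr();
            var tree = (CommonTree)result.Tree;

            // Print the AST in LISP-like form to inspect what was parsed.
            Console.WriteLine(tree.ToStringTree());
        }
    }
}
```

Printing tree.ToStringTree() is the same check used in Update 2 below, so it gives a quick way to see whether the parse produced a tree or an error node.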

Update 2:

So, finally I've started with my primary task: parse JavaScript.

The ANTLR site highlights the ECMAScript grammar by Chris Lambrou.

So I generated the lexer and parser from it and ran them on this very simple JavaScript code:

var f = function () { };

and the parsing fails with the following output from tree.ToStringTree():

<error: var q = function () { };>
Microstructure answered 8/2, 2012 at 17:6 Comment(4)
Check out this previous Q&A: https://mcmap.net/q/273543/-using-antlr-3-3 - Anishaaniso
Thank you for the excellent tutorial. As I understand it, the rules in the grammar should be converted to methods. I use the grammar from antlr.org/wiki/pages/viewpage.action?pageId=557075, and ANTLRWorks generates the "expr", "atom" and other methods, but they are private. - Microstructure
Thanks Alex. W.r.t. the private methods (parser rules), see another Q&A: #6412020 - Anishaaniso
Thanks for posting this. I was getting the exact same issues, and each of your solutions worked perfectly. +1. - Stepmother

Your grammar rule says that there should be a semicolon at the end of the expression, but the input in your main function:

var stream = new ANTLRStringStream("1+2");

is missing a semicolon. Shouldn't it be "1+2;"?

Diagonal answered 4/9, 2012 at 15:14 Comment(1)
Certainly "1+2;" is valid JavaScript. But then so is "1+2" (or at least "1+2\n"). How does the ANTLR grammar handle the missing ";" question? - Countrydance
